HTML manipulation and transformations


How to programmatically rewrite HTML in Plone.


It is recommended to use the lxml library for all HTML DOM manipulation in Python.

Plone is no exception.
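
As an illustration of the kind of DOM rewriting lxml makes easy, here is a minimal sketch that adds rel="nofollow" to every link in an HTML fragment. The function name add_nofollow and the sample markup are hypothetical and chosen only for this example; it assumes the lxml package is installed.

```python
import lxml.html


def add_nofollow(html):
    """Return the HTML fragment with rel="nofollow" set on every <a>."""
    # Parse the fragment into an element tree (a single root element here)
    root = lxml.html.fromstring(html)

    # Walk all <a> elements and set the attribute in place
    for link in root.iter("a"):
        link.set("rel", "nofollow")

    # Serialize back to a text string instead of bytes
    return lxml.html.tostring(root, encoding="unicode")


result = add_nofollow('<p>See <a href="http://plone.org">Plone</a></p>')
```

The same parse, modify, serialize pattern applies to other rewrites, such as changing image sources or removing unwanted elements.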

Converting HTML to plain text

The most common use case is to override SearchableText() so that HTML content is returned as plain text for portal_catalog to index.
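
To show what such a conversion involves, here is a minimal standalone sketch that strips markup using only the Python standard library, so it can be tried outside a Plone site. The names TextExtractor and html_to_text are hypothetical helpers, not part of Plone. Inside Plone you would more likely rely on portal_transforms or lxml, and this sketch does not filter out script or style content.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes and drop all markup (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags
        self.chunks.append(data)


def html_to_text(html):
    """Return the text of an HTML fragment with whitespace normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join chunks with a space so words in adjacent elements stay
    # separate, then collapse repeated whitespace for the indexer
    return " ".join(" ".join(parser.chunks).split())


text = html_to_text("<p>Hello <b>world</b></p><p>Second &amp; last</p>")
# text is now "Hello world Second & last"
```

Joining chunks with a space favors indexing, where keeping words separate matters more than preserving the exact original spacing.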

Converting plain text to HTML

You can use portal_transforms to do plain text -> HTML conversion.

Below is an example of how to render a Description field with newline support.

from five import grok
from zope.interface import Interface
from Products.CMFCore.utils import getToolByName

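class DescriptionHelper(grok.CodeView):
    """A helper view which exports the Dublin Core description with newline
    support, allowing several paragraphs in Plone's description field.
    """

    # Register for all content under the name used in the page template
    grok.context(Interface)
    grok.name('description-helper')

    def render(self):
        """Get a content item description with newline support.

        Transform hard line breaks to <br /> elements in HTML.
        """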

        # Call archetypes accessor
        text = self.context.Description()

        # Transform the plain text description with ASCII newlines
        # to HTML with <br /> line breaks
        portal_transforms = getToolByName(self.context, 'portal_transforms')

        # Output here is a single <p> which contains <br /> for newline
        data = portal_transforms.convertTo('text/html', text, mimetype='text/x-web-intelligent')
        html = data.getData()
        return html
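
To make the effect of this transform concrete, here is a rough standalone approximation of the newline handling described in the comment above: the text is escaped, newlines become <br />, and the result is wrapped in a single <p>. This is a sketch under those assumptions, not the code that portal_transforms runs. The function name newlines_to_html is hypothetical, and the real transform may do more, such as turning URLs into links.

```python
from html import escape


def newlines_to_html(text):
    """Escape plain text and turn newlines into <br /> inside one <p>."""
    # Escape first so user-entered markup is shown literally, not rendered
    escaped = escape(text, quote=False)
    # Normalize Windows and old Mac line endings to plain newlines
    escaped = escaped.replace("\r\n", "\n").replace("\r", "\n")
    # Mark every remaining newline with an explicit line break
    return "<p>" + escaped.replace("\n", "<br />") + "</p>"


html = newlines_to_html("a < b\nsecond line")
# html is now "<p>a &lt; b<br />second line</p>"
```

Escaping before inserting the <br /> elements matters: doing it the other way around would escape the line breaks themselves.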

Now you can use the view in your page template:

<metal:main-macro define-macro="main">

    <div tal:replace="structure provider:plone.abovecontenttitle" />

    <h1 metal:use-macro="here/kss_generic_macros/macros/generic_title_view">
        Title or id
    </h1>

    <div tal:replace="structure provider:plone.belowcontenttitle" />

    <div class="documentDescription">
       <tal:desc replace="structure context/@@description-helper" />
    </div>

</metal:main-macro>


