==================
Tweaking HTML text
==================

>>> from cybertools.util.html import sanitize, stripComments

>>> input = """
...

... Text, and more ...

... """

Sanitize HTML
-------------

>>> sanitize(input, validAttrs=['style'])
u'\n

\nText, and more\n

\n'

>>> sanitize(input, ['p', 'b'], ['class'])
u'\n

\nText, and more\n

\n'

All comments are stripped from the HTML input.

>>> input2 = """
...

text

... ...

text

"""

>>> sanitize(input2)
u'\n

text

\n\n

text

'

It's also possible to remove only the comments from the HTML input.

>>> stripComments(input2)
u'\n

text

\n\n

text

'

It is also possible to strip all HTML tags from the input string.

>>> from cybertools.util.html import stripAll
>>> stripAll(input)
u'Text, and more'

Extract first part of an HTML text
----------------------------------

>>> from cybertools.util.html import extractFirstPart
>>> extractFirstPart(input)
u'

\nText, and more\n

'

>>> extractFirstPart(input2)
u'

text

'
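
For readers without cybertools at hand, the comment stripping and tag stripping demonstrated earlier can be approximated with the Python 3 standard library alone. The following is a minimal sketch under stated assumptions, not the cybertools implementation: the names `strip_comments_sketch`, `strip_all_sketch`, and `_TextCollector` are hypothetical, the sample markup is invented for illustration, and the results are not guaranteed to match `stripComments` or `stripAll` on every input.

```python
import re
from html.parser import HTMLParser

# Hypothetical helpers for illustration only; these are NOT the
# cybertools functions and may differ from them in edge cases.

# Non-greedy match, DOTALL so that comments spanning lines are covered.
COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)


def strip_comments_sketch(text):
    """Remove HTML comments, leaving all other markup untouched."""
    return COMMENT_RE.sub('', text)


class _TextCollector(HTMLParser):
    """Collect character data only; tags and comments are ignored."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_all_sketch(text):
    """Return the text content of an HTML fragment, whitespace-trimmed."""
    collector = _TextCollector()
    collector.feed(text)
    collector.close()  # flush any buffered trailing text
    return ''.join(collector.chunks).strip()


# Invented sample markup (an assumption, not taken from the examples above).
sample = '<p class="x"><!-- note --><b>Text</b>, and more</p>'
print(strip_comments_sketch(sample))
print(strip_all_sketch(sample))
```

Note that this sketch keeps the contents of `script` and `style` elements as text and does no attribute or tag whitelisting, so it is not a substitute for `sanitize` when handling untrusted input.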