=================================================
Text Transformations, e.g. for Full-text Indexing
=================================================

  ($Id$)

If a required converter program is not available, a warning is written to
Zope's server log. To be able to test this, we register a log handler:

  >>> from zope.testing.loggingsupport import InstalledHandler
  >>> log = InstalledHandler('zope.server')

The test files are in a subdirectory of the text package:

  >>> import os
  >>> from cybertools import text
  >>> testdir = os.path.join(os.path.dirname(text.__file__), 'testfiles')

PDF Files
---------

Let's start with a PDF file:

  >>> from cybertools.text.pdf import PdfTransform
  >>> transform = PdfTransform(None)
  >>> f = open(os.path.join(testdir, 'mary.pdf'))

This will be transformed to plain text:

  >>> result = transform(f)

Let's check the log; it should be empty:

  >>> print log

So what is in the plain text result?

  >>> words = result.split()
  >>> len(words)
  89
  >>> u'lamb' in words
  True

Word Documents
--------------

The same checks apply to a Word document:

  >>> from cybertools.text.doc import DocTransform
  >>> transform = DocTransform(None)
  >>> f = open(os.path.join(testdir, 'mary.doc'))
  >>> result = transform(f)
  >>> print log

  >>> words = result.split()
  >>> len(words)
  89
  >>> u'lamb' in words
  True
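
Missing Converter Programs (Sketch)
-----------------------------------

The introduction states that a warning goes to Zope's server log when a
converter program is not available. The following is a minimal sketch of how
such a check might look; it is not the cybertools implementation. The names
``ExternalTransform`` and ``RecordingHandler`` are hypothetical, the
executable name is a placeholder, and the code targets current Python 3
(``shutil.which``, ``subprocess.run``) rather than the Python 2 dialect of
the doctests. Only the logger name ``zope.server`` is taken from the text.

```python
import io
import logging
import shutil
import subprocess

# Logger name taken from the doctest setup; everything else is hypothetical.
logger = logging.getLogger('zope.server')


class ExternalTransform(object):
    """Hypothetical transform that pipes a file through an external converter."""

    def __init__(self, program, args=()):
        self.program = program    # name of the converter executable (assumed)
        self.args = list(args)    # extra command-line arguments (assumed)

    def available(self):
        # shutil.which returns None when the program is not found on PATH.
        return shutil.which(self.program) is not None

    def __call__(self, f):
        if not self.available():
            # Log a warning and fall back to empty text instead of raising.
            logger.warning('%s is not available', self.program)
            return u''
        # Feed the file's bytes to the converter and read text from stdout.
        proc = subprocess.run([self.program] + self.args,
                              input=f.read(),
                              stdout=subprocess.PIPE,
                              check=False)
        return proc.stdout.decode('utf-8', 'replace')


class RecordingHandler(logging.Handler):
    """Collects log records in a list so that a test can inspect them."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.records = []

    def emit(self, record):
        self.records.append(record)


# Usage: a converter that is assumed not to exist on this machine.
handler = RecordingHandler()
logger.addHandler(handler)
transform = ExternalTransform('no-such-converter')
result = transform(io.BytesIO(b'dummy input'))
print(repr(result))
print([record.getMessage() for record in handler.records])
```

Returning an empty string instead of raising lets indexing continue for other
documents, while the log records which converter was missing.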