cybertools/text
helmutm f2ec195a55 added package cybertools.text
git-svn-id: svn://svn.cy55.de/Zope3/src/cybertools/trunk@1383 fd906abe-77d9-0310-91a1-e0d9ade77398
2006-10-04 08:53:58 +00:00
testfiles added package cybertools.text 2006-10-04 08:53:58 +00:00
__init__.py added package cybertools.text 2006-10-04 08:53:58 +00:00
base.py added package cybertools.text 2006-10-04 08:53:58 +00:00
interfaces.py added package cybertools.text 2006-10-04 08:53:58 +00:00
pdf.py added package cybertools.text 2006-10-04 08:53:58 +00:00
README.txt added package cybertools.text 2006-10-04 08:53:58 +00:00
tests.py added package cybertools.text 2006-10-04 08:53:58 +00:00

=================================================
Text transformations, e.g. for full-text indexing
=================================================

  ($Id$)

  >>> import os
  >>> from cybertools import text
  >>> directory = os.path.dirname(text.__file__)
  >>> fn = os.path.sep.join((directory, 'testfiles', 'mary.pdf'))
  >>> f = open(fn, 'rb')

  >>> from cybertools.text.pdf import PdfTransform
  >>> transform = PdfTransform(None)
  >>> words = transform(f).split()
  >>> f.close()
  >>> len(words)
  89
  >>> u'lamb' in words
  True
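
The word list produced by a transform can be fed into a full-text index.
The following is only a minimal sketch of that idea, not part of
cybertools.text: ``build_index`` and the ``documents`` mapping are
hypothetical names, and the sample strings are stand-ins for transform
output rather than the actual content of ``mary.pdf``.

```python
from collections import defaultdict


def build_index(documents):
    """Return a mapping from lower-cased word to the set of ids of the
    documents that contain it (a simple inverted index)."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():
            index[word.lower()].add(doc_id)
    return index


# Hypothetical stand-ins for the text returned by a transform.
documents = {
    'doc1': u'Mary had a little lamb',
    'doc2': u'The lamb was white',
}

index = build_index(documents)
hits = sorted(index[u'lamb'])  # ids of all documents containing 'lamb'
```

Splitting on whitespace mirrors the ``split()`` call used above. A real
index would typically also strip punctuation and normalize the words
further.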