
================================================
Agents for Job Execution and Communication Tasks
================================================

  ($Id$)

  >>> config = '''
  ... controller(names=['core.sample'])
  ... scheduler(name='core')
  ... logger(name='default', standard=30)
  ... '''
  >>> from cybertools.agent.main import setup
  >>> master = setup(config)
  Starting agent application...
  Using controllers core.sample.
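The configuration above is a small Python-based declaration language. As a
rough illustration of how such declarations could be collected, here is a
hedged sketch assuming an exec-based evaluator; the recorder functions and
the ``parseConfig`` helper are made up for this example and are not the
actual cybertools loader.

```python
# Hypothetical sketch: evaluate a declaration-style configuration by
# executing it with recorder functions bound in the namespace.
# Illustration of the pattern only, not cybertools' real mechanism.

def parseConfig(text):
    declarations = []
    def record(kind):
        def declare(**kw):
            declarations.append((kind, kw))
        return declare
    namespace = dict(controller=record('controller'),
                     scheduler=record('scheduler'),
                     logger=record('logger'))
    exec(text, namespace)
    return declarations

config = '''
controller(names=['core.sample'])
scheduler(name='core')
logger(name='default', standard=30)
'''
decls = parseConfig(config)
```

Each call in the configuration simply records its keyword arguments, so the
setup code can later act on the collected declarations in a controlled order.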


Crawler
=======

The agent uses Twisted's cooperative multitasking model.

Crawler is the base class for all derived crawlers, such as the filesystem
crawler and the mail crawler. The SampleCrawler returns a deferred whose
callback has already been fired, so it returns immediately.

Running a crawler returns a deferred that must be supplied with a callback
method (and in most cases also an errback method).
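The "already fired" behaviour can be pictured with a minimal stand-in for a
deferred. This is a sketch of the pattern only; Twisted's real
``twisted.internet.defer.Deferred`` is considerably richer.

```python
# Minimal sketch of a deferred whose result may already be available when
# a callback is attached -- illustrating why the SampleCrawler "returns
# at once". Not Twisted's implementation, just the idea.

class MiniDeferred:
    _NO_RESULT = object()

    def __init__(self):
        self.result = self._NO_RESULT
        self.callbacks = []

    def addCallback(self, fn):
        if self.result is not self._NO_RESULT:
            # already fired: run the callback immediately
            self.result = fn(self.result)
        else:
            self.callbacks.append(fn)
        return self

    def callback(self, result):
        self.result = result
        for fn in self.callbacks:
            self.result = fn(self.result)
        self.callbacks = []

# A crawler-like function that fires its deferred before returning it:
def sampleCollect():
    d = MiniDeferred()
    d.callback([])          # the sample collects an empty result list
    return d

collected = []
sampleCollect().addCallback(collected.append)
```

Because the deferred has already fired, the appended callback runs
synchronously and ``collected`` holds the result as soon as the call returns.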

We create the sample crawler via the master's controller. The sample
controller provides a simple method for this purpose.

  >>> controller = master.controllers[0]
  >>> controller.createAgent('crawl.sample', 'crawler01')
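Creating an agent from a dotted name like ``'crawl.sample'`` suggests a
name-to-factory lookup behind the scenes. The following is a hypothetical
sketch of such a registry; the names, classes, and structure are assumptions
for illustration and do not reflect cybertools' actual wiring.

```python
# Hypothetical agent registry keyed by dotted type names such as
# 'crawl.sample'. The real master/controller machinery is more involved.

class SampleCrawler:
    def __init__(self, name):
        self.name = name

agentFactories = {'crawl.sample': SampleCrawler}

def createAgent(agentType, name):
    factory = agentFactories[agentType]   # raises KeyError for unknown types
    return factory(name)

agent = createAgent('crawl.sample', 'crawler01')
```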

In the next step we request the start of a job, again via the controller.

  >>> controller.enterJob('sample', 'crawler01')

The job is not executed immediately - we have to hand over control to
the Twisted reactor first.

  >>> from cybertools.agent.tests import tester
  >>> tester.iterate()
  SampleCrawler is collecting.
  Job 00001 completed; result: [];
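The hand-over to the reactor can be pictured as a cooperative loop draining a
queue of pending work: nothing runs until control is explicitly given to the
loop. This is a drastically simplified sketch and assumes nothing about
Twisted's actual reactor internals.

```python
# Drastically simplified sketch of cooperative scheduling: jobs are queued
# and only executed when control is handed to the loop via iterate().
# Not Twisted's reactor, just the idea behind the test sequence above.

class MiniReactor:
    def __init__(self):
        self.pending = []

    def callLater(self, fn, *args):
        self.pending.append((fn, args))   # queued, not executed yet

    def iterate(self):
        pending, self.pending = self.pending, []
        for fn, args in pending:
            fn(*args)

reactor = MiniReactor()
log = []
reactor.callLater(log.append, 'SampleCrawler is collecting.')
before = list(log)   # still empty: the job has not run yet
reactor.iterate()    # now the queued job executes
```

This mirrors the doctest: entering the job only queues it, and the crawler's
output appears only once ``tester.iterate()`` gives the reactor a turn.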