Labrador::Crawler::Manager::Partitioned
Provides the crawler with all queuing operations. This manager supports URL partitioning.
Called automatically by the constructor. Loads the URLState, URLAlloc, and Partitioning modules.
Returns the next URL to be fetched.
Returns the number of seconds until the next URL is ready to be fetched.
Mark $url as finished.
Mark $url as finished. $HTTPresponse contains the HTTP::Response object, which can be used to examine the reason for the failure and requeue the URL appropriately.
Enqueue @urls, all of which were found in the page $url.
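As an illustration of examining the HTTP::Response object to decide whether a failed URL is worth requeuing, here is a minimal sketch. The helper name should_requeue and the retry policy (retry on 408, 429 and 5xx) are assumptions made for this example, not part of this module; only HTTP::Response->new and ->code come from the HTTP::Response API.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper, not part of this manager: decide whether a failed
# fetch looks transient enough to be queued again.
sub should_requeue {
    my ($response) = @_;
    my $code = $response->code;
    return 1 if $code == 408 || $code == 429;   # request timeout / rate limited
    return 1 if $code >= 500 && $code < 600;    # server-side errors
    return 0;                                   # e.g. 404: retrying will not help
}

my $res = HTTP::Response->new(503, 'Service Unavailable');
print should_requeue($res) ? "requeue\n" : "drop\n";
```

A real crawler would probably also cap the number of retries per URL so that a permanently broken server does not keep its URLs in the queue forever.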
Returns a string depicting the status of the queues. Useful for displaying when the crawler has no work.
Returns a hash of statistics. The keys are master, delay and partition_size.
Not intended to be called from outside the class. Only documented for completeness.
Load the module called $name.
Used for spider trap detection. Returns a URL with the fragment and query string removed.
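The stripping step can be sketched with the URI module as follows. This is only an illustration of the idea under stated assumptions: the helper name strip_query_and_fragment is hypothetical, and the module's own implementation may differ.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper, not the module's own code: reduce a URL to its
# scheme, host and path so that variants differing only in query string
# or fragment compare equal.
sub strip_query_and_fragment {
    my ($url) = @_;
    my $uri = URI->new($url);
    $uri->fragment(undef);    # drop the "#..." part
    $uri->query(undef);       # drop the "?..." part
    return $uri->canonical->as_string;
}

print strip_query_and_fragment('http://example.com/cal?month=3#top'), "\n";
```

Counting how often the same stripped URL recurs is one way to notice a trap such as an endless calendar, where only the query string changes from page to page.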
Fetches any available URLs from the dispatcher.
Mark $url as finished. Contains common code extracted from the finished and failure events.
$Revision: 1.16 $