NAME

Labrador::Common::URLState

SYNOPSIS

	use Labrador::Common::URLState::Normal;
	my $states = Labrador::Common::URLState::Normal->new($data);
	$states->url('http://www.gla.ac.uk/#', time);
	print "Seen http://www.gla.ac.uk/# before" if $states->url_exists('http://www.gla.ac.uk/#');

DESCRIPTION

Records the URLs that have been seen. This is an abstract class and must be implemented by a subclass (the synopsis uses Labrador::Common::URLState::Normal). Some papers recommend using a Bloom filter or digests of the URLs to save space.
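
The digest idea mentioned above can be illustrated with a short sketch. This is not how any Labrador class stores its data; the helper url_key() and the %seen hash are hypothetical names, and MD5 is chosen here only because Digest::MD5 ships with Perl.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper, not part of Labrador: map a URL to a fixed-size key.
sub url_key {
    my ($url) = @_;
    return md5_hex($url);    # always 32 hex characters, whatever the URL length
}

my %seen;    # digest => state value
$seen{ url_key('http://www.gla.ac.uk/#') } = time;

# Look-ups hash the candidate URL the same way before testing for the key.
my $hit  = exists $seen{ url_key('http://www.gla.ac.uk/#') }      ? 1 : 0;
my $miss = exists $seen{ url_key('http://www.gla.ac.uk/other') } ? 1 : 0;
print "hit=$hit miss=$miss\n";
```

A hex digest only saves space for URLs longer than 32 bytes; the binary form returned by md5() is 16 bytes. The trade-off is that the original URL cannot be recovered from the key.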

STATES

	-1	is in the master queue
	-2	is in the crawler queue
	-3	is with the crawler
	-4	failed
	>0	is the time we were informed by the crawler that it finished crawling the URL
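
The state values above might be handled in calling code roughly as follows. The constant names and describe_state() are hypothetical and do not exist in Labrador; only the numeric values are taken from the list.

```perl
use strict;
use warnings;

# Hypothetical constant names; only the numeric values come from the STATES list.
use constant {
    STATE_MASTER_QUEUE  => -1,
    STATE_CRAWLER_QUEUE => -2,
    STATE_WITH_CRAWLER  => -3,
    STATE_FAILED        => -4,
};

# Turn a stored state value into a human-readable description.
sub describe_state {
    my ($state) = @_;
    return 'unknown'                   unless defined $state;
    return 'in the master queue'       if $state == STATE_MASTER_QUEUE;
    return 'in the crawler queue'      if $state == STATE_CRAWLER_QUEUE;
    return 'with the crawler'          if $state == STATE_WITH_CRAWLER;
    return 'failed'                    if $state == STATE_FAILED;
    return "finished at epoch $state"  if $state > 0;    # positive values are timestamps
    return 'unknown';
}

print describe_state(-3), "\n";
```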

METHODS

new($data)

Constructs a new URLState object

init()

Initialises this module. Called automatically by new().

url($url, [$value])

Adds $url to the hash if it does not already exist. If it was already there, returns the value it had.

url_exists($url)

Returns 1 if the URL has been seen before.
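
As an illustration of the interface described above, here is a minimal in-memory sketch. The package name My::URLState::InMemory is hypothetical and the class does not inherit from Labrador::Common::URLState, so that it runs on its own. It also assumes that a value passed to url() is stored even when the URL is already present, which the method descriptions do not state explicitly.

```perl
use strict;
use warnings;

package My::URLState::InMemory;    # hypothetical name, not a Labrador class

sub new {
    my ($class, $data) = @_;
    my $self = bless { data => $data }, $class;
    $self->init();                 # new() calls init(), as documented
    return $self;
}

sub init {
    my ($self) = @_;
    $self->{urls} = {};            # url => state value
    return;
}

sub url {
    my ($self, $url, $value) = @_;
    my $previous = $self->{urls}{$url};    # undef if never seen
    # Assumption: a supplied value is stored, replacing any earlier one.
    $self->{urls}{$url} = $value if defined $value;
    return $previous;
}

sub url_exists {
    my ($self, $url) = @_;
    return exists $self->{urls}{$url} ? 1 : 0;
}

package main;

my $states = My::URLState::InMemory->new({});
my $first  = $states->url('http://www.gla.ac.uk/#', -1);    # undef: not seen before
my $second = $states->url('http://www.gla.ac.uk/#', -3);    # -1: the previous value
print "seen\n" if $states->url_exists('http://www.gla.ac.uk/#');
```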

REVISION

	$Revision: 1.4 $