Labrador::Common::RobotsCache
    use Labrador::Common::RobotsCache;

    my $robotscache = Labrador::Common::RobotsCache->new($config);

    my @file;
    # An empty list means the host's robots.txt is not in the cache
    if (! (@file = $robotscache->get_file('www.gla.ac.uk'))) {
        # fetch http://www.gla.ac.uk/robots.txt using HTTP
        # ..
        # save to cache
        $robotscache->set_file('www.gla.ac.uk', @file);
    }
Implements an on-disk cache of robots.txt files, keyed by hostname. Cached files expire after 25 days by default.
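The expiry rule can be illustrated with a small age check on a cached file. This is only a sketch under stated assumptions: the helper name cache_entry_expired and the use of the file's modification time are hypothetical, and the module's real expiry logic is not shown in this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Labrador: reports whether a cached
# file is older than the allowed age. $max_age_days defaults to 25,
# matching the default expiry described above.
sub cache_entry_expired {
    my ($path, $max_age_days) = @_;
    $max_age_days = 25 unless defined $max_age_days;

    my $mtime = (stat $path)[9];      # last-modification time, epoch seconds
    return 1 unless defined $mtime;   # a missing file counts as expired

    my $age_days = (time - $mtime) / 86_400;
    return $age_days > $max_age_days ? 1 : 0;
}
```

Treating a missing file as expired lets a caller use one test for both "never cached" and "cached too long ago".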
Behaviour can be altered by the following configuration file options:
Constructor. Calls init() automatically.
Initialises the class, loading the appropriate directives from the configuration file.
Returns a boolean indicating whether the cache contains the robots.txt file for the given $hostname.
Retrieves the robots.txt file for $hostname. Note that an empty array signifies that the file was not found in the cache, whereas a single comment line ('#') signifies that the given host has no robots.txt file.
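A caller has to tell three outcomes apart when reading from the cache. The sketch below shows one way to do that; robots_cache_status is a hypothetical helper, not part of the module, and it assumes the placeholder line may or may not carry a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper: interprets the list returned by get_file()
# according to the convention described above.
sub robots_cache_status {
    my @file = @_;
    return 'not_cached'    if !@file;                                 # empty list
    return 'no_robots_txt' if @file == 1 && $file[0] =~ /^\s*#\s*$/;  # lone '#'
    return 'cached';                                                  # real content
}
```

For example, robots_cache_status($robotscache->get_file($hostname)) would return 'not_cached' when the file still has to be fetched over HTTP.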
Updates the cache with the robots.txt file for $hostname.
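Because an empty list reads back as "not in the cache", a host without a robots.txt file needs the '#' placeholder stored instead. The following sketch prepares the list handed to set_file(); robots_lines_for_cache and its $found and $body parameters are hypothetical, and whether the module expects the caller to store the placeholder is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: turns the result of an HTTP fetch into the list
# passed to set_file(). $found is true when the host served a robots.txt;
# $body is the content of that file.
sub robots_lines_for_cache {
    my ($found, $body) = @_;

    # No file, or an empty one: store the '#' placeholder so that a later
    # get_file() does not look like a cache miss.
    return ('#') unless $found && defined $body && length $body;

    return split /^/, $body;    # one element per line, newlines kept
}
```

A caller could then write $robotscache->set_file($hostname, robots_lines_for_cache($found, $body)) after each fetch.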
$Revision: 1.7 $