Labrador::Crawler::Document::HTML
    use Labrador::Crawler::Document;
    my $doc = Labrador::Crawler::Document->new(\$data, $req);
    # $doc will be instantiated with the appropriate subclass if one exists
This is the custom subclass of Labrador::Crawler::Document for HTML documents. It provides two ways to extract links from an HTML document: using HTML::LinkExtor (which is part of the standard HTML::Parser distribution) or, if it is available, HTML::LinkExtractor. HTML::LinkExtractor is preferred because it also extracts the anchor text of each link.
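The preference described above can be sketched as a run-time check for which module loads. This is only an illustration and not necessarily how the class itself decides; the helper name pick_link_extractor is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Labrador): return the name of the first
# link-extraction module that can be loaded, preferring HTML::LinkExtractor.
sub pick_link_extractor {
    for my $module (qw(HTML::LinkExtractor HTML::LinkExtor)) {
        # String eval so that 'require' resolves the module name at run time.
        return $module if eval "require $module; 1";
    }
    return;    # neither module is installed
}

my $extractor = pick_link_extractor();
print defined $extractor
    ? "Using $extractor\n"
    : "No link extractor available\n";
```

Checking at run time keeps HTML::LinkExtractor an optional dependency: the code still works with only HTML::Parser installed, but loses anchor text.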
Initialise the class.
Extract a list of links from the page.
Extract links from the HTML document using HTML::LinkExtor.
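A minimal sketch of link extraction with HTML::LinkExtor follows. The function name extract_links_linkextor, the sample HTML and the base URL are assumptions made for illustration; the actual method in this class may differ. Passing a base URL makes HTML::LinkExtor return absolute URIs.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical function: return the href of every <a> tag as an absolute URL.
sub extract_links_linkextor {
    my ($html_ref, $base) = @_;

    # No callback is given, so links accumulate inside the parser object.
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($$html_ref);
    $parser->eof;

    my @urls;
    for my $link ($parser->links) {
        # Each entry is [ $tag, $attr_name => $url, ... ]
        my ($tag, %attrs) = @$link;
        next unless $tag eq 'a' && defined $attrs{href};
        push @urls, "$attrs{href}";    # stringify the URI object
    }
    return @urls;
}

my $sample = '<a href="page.html">A page</a> <img src="logo.png">';
print "$_\n" for extract_links_linkextor(\$sample, 'http://example.com/dir/');
```

HTML::LinkExtor reports only tag and attribute values, so the anchor text ("A page" in the sample) is not available through this route.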
Extract links using HTML::LinkExtractor, which also captures the anchor text of each link.
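A comparable sketch with HTML::LinkExtractor is shown below, assuming that module is installed. The function name extract_links_with_text and the sample HTML are hypothetical. The third constructor argument asks the module to strip markup from the anchor text, which it stores under the _TEXT key.

```perl
use strict;
use warnings;
use HTML::LinkExtractor;

# Hypothetical function: return [ $url, $anchor_text ] pairs for <a> tags.
sub extract_links_with_text {
    my ($html_ref, $base) = @_;

    # Arguments: callback (none), base URL (may be undef), strip tags from _TEXT.
    my $lx = HTML::LinkExtractor->new(undef, $base, 1);
    $lx->parse($html_ref);

    my @pairs;
    for my $link (@{ $lx->links }) {
        next unless $link->{tag} eq 'a' && defined $link->{href};
        my $text = defined $link->{_TEXT} ? $link->{_TEXT} : '';
        $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @pairs, [ "$link->{href}", $text ];
    }
    return @pairs;
}

my $sample = 'See <a href="http://perl.com/"> the Perl site </a> for more.';
for my $pair (extract_links_with_text(\$sample)) {
    print "$pair->[0] => $pair->[1]\n";
}
```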
$Revision: 1.4 $