PrePAN

WWW::Fetch - Fetch WWW pages with caching, last modified, compression

Author: benkasminbullock@github
Status: In Review

Synopsis

    use WWW::Fetch; # the proposed module
    my $fetch = WWW::Fetch->new ();
    $fetch->get ("http://metacpan.org");

Description

Get world-wide web pages (make HTTP requests) with caching to the local file system, correct If-Modified-Since handling, and correct handling of compressed responses.

There are a lot of web scrapers on CPAN, including Scrappy, Web::Scraper, and some others. However, these don't do what I want, which is to get a web page only if necessary, use a local cache if possible, and always handle gzip compression. This would be built on top of LWP::UserAgent and friends, with the option to use another user agent module if necessary.
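To make the intended behaviour concrete, here is a rough sketch of the kind of thing the module might do internally. It is not the proposed WWW::Fetch interface. The names fetch_cached, read_bytes and write_bytes, and the choice of an MD5 of the URL as the cache file name, are assumptions made up for this illustration. It takes any object with an LWP::UserAgent-style get method, sends If-Modified-Since when a cached copy exists, advertises the encodings it can decode, and stores the decompressed body.

```perl
use strict;
use warnings;
use Digest::MD5 'md5_hex';
use File::Spec;
use HTTP::Date 'time2str';
use HTTP::Message;
use LWP::UserAgent;

# Hypothetical helper: read a whole file as raw bytes.
sub read_bytes
{
    my ($file) = @_;
    open my $in, '<:raw', $file or die "Cannot read $file: $!";
    local $/;
    my $bytes = <$in>;
    close $in;
    return $bytes;
}

# Hypothetical helper: write raw bytes to a file.
sub write_bytes
{
    my ($file, $bytes) = @_;
    open my $out, '>:raw', $file or die "Cannot write $file: $!";
    print {$out} $bytes;
    close $out or die "Cannot close $file: $!";
}

# Hypothetical helper: get $url via $ua, using $cache_dir as a cache.
# Returns the body as bytes, or undef if the request failed.
sub fetch_cached
{
    my ($ua, $url, $cache_dir) = @_;
    my $file = File::Spec->catfile ($cache_dir, md5_hex ($url));
    # Tell the server which content encodings (gzip etc.) we can undo.
    my @headers = ('Accept-Encoding' => scalar HTTP::Message::decodable ());
    if (-f $file) {
        # Only ask for the page if it changed since our copy was stored.
        my $mtime = (stat ($file))[9];
        push @headers, 'If-Modified-Since' => time2str ($mtime);
    }
    my $response = $ua->get ($url, @headers);
    if ($response->code () == 304) {
        # Not modified: the cached copy is still good.
        return read_bytes ($file);
    }
    if (! $response->is_success ()) {
        return undef;
    }
    # Undo Content-Encoding but leave the character set alone, so that
    # what goes into the cache is the uncompressed bytes.
    my $content = $response->decoded_content (charset => 'none');
    if (! defined $content) {
        return undef;
    }
    write_bytes ($file, $content);
    # If the server sent Last-Modified, stamp the cache file with it so
    # that the next If-Modified-Since uses the server's own date.
    my $modified = $response->last_modified ();
    if ($modified) {
        utime ($modified, $modified, $file);
    }
    return $content;
}
```

Calling fetch_cached (LWP::UserAgent->new (), "http://metacpan.org", $dir) twice with an existing directory $dir should make one full request and then one conditional request, assuming the server honours If-Modified-Since. Passing the user agent in as an argument is one possible way to allow swapping in another user agent module.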

Unlike other modules, this would not handle HTML parsing but just get the page from the web.

I'm hoping I don't have to write this module but will suddenly find that a solution already exists on CPAN which I've somehow overlooked.

Comments

You could probably tweak WWW::Mechanize::Cached to work for you, or better yet, make a feature request so you don't have to do the work :)
Isn’t URI::Fetch the thing you’re describing?
@ap thanks I will definitely try that.