XML::LibXML::Proxy - Force LibXML to use a proxy for HTTP(S) external entities



use XML::LibXML;
use XML::LibXML::Proxy;


# Use XML::LibXML normally...


This is my first expedition into CPAN territory. :) For the name, I have the blessing of Shlomi Fish, the current maintainer of XML::LibXML, so that part's settled.

Motivation: my project validates against potentially a lot of different DTDs, so I needed to set up some form of local caching. Libxml2's "catalogs" are not appropriate for this because my project doesn't know in advance which versions it'll have to validate against (it's not something popular like HTML), and I didn't want to reinvent an entire caching mechanism from scratch. I looked into using XML::LibXML::InputCallback, of course, but it cannot be used to override Libxml2's built-in "nanohttp" client for HTTP; it only allows adding new protocols. So I resorted to the extreme: XML::LibXML::externalEntityLoader().

With that, I just use LWP::UserAgent to request the URL through the defined proxy, which in my specific case is a local Nginx caching forward proxy.
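
For readers who want to see the shape of the idea, here is a minimal sketch. It is not the module's actual source: the proxy address, the make_proxy_loader() name and the timeout are assumptions made for illustration. It builds a loader sub that fetches each external entity's system ID with LWP::UserAgent through a proxy, then registers that sub with XML::LibXML::externalEntityLoader().

```perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::LibXML;

# Hypothetical address of a local caching forward proxy; adjust as needed.
my $PROXY_URL = 'http://127.0.0.1:3128/';

# Build a loader sub around any object that behaves like LWP::UserAgent.
# (make_proxy_loader is an illustrative name, not part of any module.)
sub make_proxy_loader {
    my ($ua) = @_;
    return sub {
        # libxml2 passes the system ID (URL) and the public ID.
        my ($system_id, $public_id) = @_;
        my $response = $ua->get($system_id);
        die "Cannot fetch $system_id: " . $response->status_line . "\n"
            unless $response->is_success;
        # Return raw bytes so the parser applies the entity's own
        # encoding declaration.
        return $response->decoded_content(charset => 'none');
    };
}

my $ua = LWP::UserAgent->new(timeout => 10);
$ua->proxy([qw(http https)], $PROXY_URL);

# Global, catch-all registration: every external entity now goes through it.
XML::LibXML::externalEntityLoader(make_proxy_loader($ua));
```

After registration, any later parse that has to fetch an external DTD or entity should call this sub instead of the built-in client. Passing the user agent in as an argument is only a design choice that makes the loader easy to exercise without a network.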

I documented in KNOWN BUGS that using this quick hack breaks support for "file:///" and other schemes, because externalEntityLoader() is a catch-all. A future version could handle those, but I simply had no need for them in my project and wanted to keep things simple.
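
One way a future version might lift that limitation is to dispatch on the URL scheme inside the loader. The sketch below is purely hypothetical and not what the module does: make_dispatching_loader() and its $http_loader argument are names invented here, and treating a scheme-less system ID as a local path is an assumption (Windows drive letters, for example, would be misread as schemes).

```perl
use strict;
use warnings;
use URI;

# Wrap an HTTP(S) loader so that file: URLs and plain paths are read
# from disk, and anything else is rejected explicitly.
sub make_dispatching_loader {
    my ($http_loader) = @_;    # code ref taking (system ID, public ID)
    return sub {
        my ($system_id, $public_id) = @_;
        my $uri    = URI->new($system_id);
        my $scheme = $uri->scheme;    # undef when there is no scheme

        return $http_loader->($system_id, $public_id)
            if defined $scheme && ($scheme eq 'http' || $scheme eq 'https');

        if (!defined $scheme || $scheme eq 'file') {
            # Assumption: a scheme-less system ID is a local path.
            my $path = defined $scheme ? $uri->file : $system_id;
            open my $fh, '<:raw', $path
                or die "Cannot open $path: $!\n";
            local $/;                 # slurp the whole file
            return scalar <$fh>;
        }

        die "Unsupported scheme '$scheme' for $system_id\n";
    };
}
```

The returned sub has the same two-argument shape as any other entity loader, so it could be handed to XML::LibXML::externalEntityLoader() in place of an HTTP-only one.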

The core of it is really just a few lines, and I found no elegant tests to include, so "t/" only includes the basics.

