WWW::Fetch Fetch WWW pages with caching, last modified, compression
Get world-wide web pages (make HTTP requests) with caching to the local file system, correct If-Modified-Since handling, and correct handling of compressed responses.
There are a lot of web scrapers on CPAN, including Scrappy, Web::Scraper, and some others. However, these don't do what I want, which is to get a web page only if necessary, use a local cache if possible, and always handle gzip-compressed responses. This would be built on top of LWP::UserAgent and friends, with the option to use another user agent module if necessary.
Unlike other modules, this would not handle HTML parsing but just get the page from the web.
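As a rough sketch of the "only if necessary" part, the helpers below build the request headers for a conditional, gzip-aware GET from the modification time of a cached file. The names http_date, conditional_headers and cache_is_fresh are hypothetical and belong to no existing module; only core Perl is used so that the snippet runs stand-alone.

```perl
use strict;
use warnings;

# Hypothetical helper: format an epoch time as an HTTP date (always GMT).
# The names are spelled out by hand so the result does not depend on locale.
sub http_date {
    my ($epoch) = @_;
    my @day = qw(Sun Mon Tue Wed Thu Fri Sat);
    my @mon = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    my ($s, $m, $h, $mday, $month, $year, $wday) = gmtime $epoch;
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $day[$wday], $mday, $mon[$month], $year + 1900, $h, $m, $s;
}

# Hypothetical helper: headers for a conditional GET that accepts gzip.
# If-Modified-Since is only sent when a cached copy already exists.
sub conditional_headers {
    my ($cache_file) = @_;
    my %headers = ('Accept-Encoding' => 'gzip');
    if (defined $cache_file && -e $cache_file) {
        my $mtime = (stat $cache_file)[9];
        $headers{'If-Modified-Since'} = http_date($mtime);
    }
    return \%headers;
}

# A 304 (Not Modified) status means the cached copy is still current.
sub cache_is_fresh {
    my ($status) = @_;
    return $status == 304;
}

print http_date(0), "\n";    # Thu, 01 Jan 1970 00:00:00 GMT
```

With LWP::UserAgent itself, the mirror method already sends If-Modified-Since based on the local file's modification time, and decoded_content on the response undoes a gzip Content-Encoding, so a real implementation could lean on those rather than on hand-built headers.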
I'm hoping I don't have to write this module but will suddenly find that a solution already exists on CPAN which I've somehow overlooked.
MVC::Neaf Not Even A (Web) Framework
Neaf (pronounced [ni:f]) stands for Not Even A Framework.
Much like Dancer, it splits an application into a set of handler subroutines associated with URI paths. Unlike Dancer, however, it doesn't export anything (except one tiny auxiliary sub) into the application namespace. Instead, a know-it-all Request object is fed to a handler when serving a request, as in object-oriented CGI.pm or Kelp.
The response is expected in the form of an unblessed hash reference, which is in turn fed to the view object for rendering (Template Toolkit and JSON/JSONP are currently supported, plus Data::Dumper for debugging). The return value may also contain some dash-prefixed switches altering the behavior of Neaf itself: an awful-looking, yet visible and simple way of doing it without going for a more complex structure.
Unlike anything I've seen so far, and much like Perl's own -T switch, it offers no (easy) way to get user inputs without validation, either through a regexp or through a form validator. (A regexp-based validator is in stock, Validator::LIVR is also supported, and more are planned.)
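To make the two ideas above concrete, here is a toy sketch in plain Perl. It is not Neaf's actual API: the ToyRequest class, its param method, the split_reply helper and the -status key are all hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

package ToyRequest;    # hypothetical stand-in for a request object

sub new {
    my ($class, %params) = @_;
    return bless { params => {%params} }, $class;
}

# The only accessor: a value is returned only if it fully matches the
# supplied regexp, so unvalidated user input never reaches the handler.
sub param {
    my ($self, $name, $regex, $default) = @_;
    die "param() requires a validation regexp\n"
        unless ref($regex) eq 'Regexp';
    my $value = $self->{params}{$name};
    return $default unless defined $value && $value =~ /\A(?:$regex)\z/;
    return $value;
}

package main;

# A handler takes the request and returns an unblessed hash reference.
# Dash-prefixed keys are switches for the framework; the rest is view data.
sub hello {
    my ($req) = @_;
    my $name  = $req->param(name => qr/\w+/, 'stranger');
    my %reply = (-status => 200, greeting => "Hello, $name!");
    return \%reply;
}

# Hypothetical dispatcher step: separate switches from data before rendering.
sub split_reply {
    my ($reply) = @_;
    my (%switch, %data);
    for my $key (keys %$reply) {
        if ($key =~ /\A-/) { $switch{$key} = $reply->{$key} }
        else               { $data{$key}   = $reply->{$key} }
    }
    return (\%switch, \%data);
}

my ($switch, $data) = split_reply( hello( ToyRequest->new(name => 'World') ) );
print $switch->{-status}, ': ', $data->{greeting}, "\n";
```

A value that fails validation, such as name => '<script>', falls back to the default instead of being passed through.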
My not-so-impressive feature list so far:
- GET, POST, and HEAD requests, query parameters (multivalues not done yet), uploads, cookies, and redirects;
- Template::Toolkit and JSON::XS views out of the box;
- Can serve raw generated content (e.g. images) and static files (like css/js);
- CLI-based debugging that can simulate posts, uploads, cookies etc;
- Can serve delayed or unspecified length replies, or do custom actions after the request is finished;
- Cookie-based sessions (no storage drivers available out of the box yet, though);
- Form validation and resubmission;
- Same program can work as a CGI script, PSGI app, or Apache mod_perl handler;
- Half-decent example directory and 79% overall test coverage.
I mostly wrote it for my own education, and to explore possible ways around the hurdles that plagued me throughout my last two jobs. Now I'd like to share it, but I am still in doubt whether CPAN needs another framework.
OSA Modules supporting the Online Social Advocacy standards and APIs
Being social on the Internet has become a centralized activity that defies the natural manner of physical, real-world communications. The currently evolved online communication paradigm creates notable technical incongruence, fosters dramatic security incidents and, most importantly, strips both sender and recipient of control over their data.
At TekAdvocates we are designing an online communication strategy based on what we call "Social Normalcy." This strategy is based on standards and APIs we intend to make open once reasonably baselined, not on a single vertical application interface that locks anyone using it into a particular company's proprietary product set.
We have developed standards and APIs enough to have created a fully functioning prototype of our theories to demonstrate their viability. All our current code is written in Perl. (Though any language suffices provided the resulting program adheres to standards.) It follows naturally then that we would convert our efforts to a suite of modules supporting the referenced standards and APIs to aid all interested developers in quickly creating compliant applications.
Our efforts are called the "Online Social Advocate" standards and APIs because the end product is a localized service that centralizes a household's, business's, or other organization's data transfer needs within the confines of their personal space. The centralized service is known as that organization's data transfer advocate in that it manages routing, control and access of passed information within and outside the organization/household.
By definition of our effort, this is an entirely new code set and category of module on CPAN. As near as we can tell from reading the informational pages on submissions and searching CPAN, a top level OSA namespace is warranted. While the "OSA" name may mean nothing to most, it would be instantly recognizable to anyone looking to develop code compliant to these standards. The intention is for the OSA modules to evolve with the standards and as broader ways to use those standards are realized.
Our intention is to initially create 3 major namespaces:
- OSA: base modules that would be useful with any application using the OSA standards.
- OSA::App: modules useful when writing applications to initiate and receive data transfers.
- OSA::OCE: modules useful when authoring an "Online Communication Engine", the actual "advocate" server that manages and routes data.
We are currently planning to submit simply under the "OSA" umbrella. We are of course wide open to any advice offered in this regard. Other considerations were to submit individual submissions of "OSA", "OSA::App" and "OSA::OCE", but being that everything is under the new space of "OSA" that seemed to be the appropriate package level. Downloading "OSA" would be pointless without either OSA::App or OSA::OCE to create an application in which to use the base modules. One would never use OSA::App or OSA::OCE without the base modules of OSA.
Our thought is that eventually it could make sense to make OSA::App available separately from OSA::OCE, since OSA::App will likely be used significantly more often. However, the size of OSA::OCE is going to be insignificant initially, so packaging everything together should not be an issue.
We have never submitted modules to CPAN before, so any feedback oriented toward helping us get this right would be appreciated.
PDLx::Algorithm::Center Various ways of centering a dataset
This module collects various algorithms for determining the center of a dataset into one place. It accepts data stored as PDL variables (piddles).
Currently it contains a single function, sigma_clip, an iterative algorithm that successively removes outliers by clipping data whose distances from the current center are greater than a given number of standard deviations.
sigma_clip finds the center of a data set by:
- ignoring the data whose distance to the current center is greater than a specified number of standard deviations
- calculating a new center by performing a (weighted) centroid of the remaining data
- calculating the standard deviation of the distance from the data to the center
- repeating from the first step until either a convergence tolerance has been met or the iteration limit has been exceeded
The initial center may be explicitly specified, or may be calculated by performing a (weighted) centroid of the data.
The initial standard deviation is calculated using the initial center and either the entire dataset, or from a clipped region about the initial center.
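The loop described above can be sketched in plain Perl for the simplest case: unweighted, one-dimensional data held in an ordinary array rather than a piddle. This is an illustration under stated assumptions, not the module's code: the function name sigma_clip_1d, its option names, the defaults, and the use of the root-mean-square distance as the "standard deviation" are all choices made for the example.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical 1-D, unweighted sketch of iterative sigma clipping.
sub sigma_clip_1d {
    my ($data, %opt) = @_;
    my $nsigma  = $opt{nsigma}  // 3;       # clipping radius in sigmas
    my $tol     = $opt{tol}     // 1e-6;    # convergence tolerance on the center
    my $maxiter = $opt{maxiter} // 10;      # iteration limit

    # Initial center and spread come from the entire dataset.
    my @kept   = @$data;
    my $center = sum(@kept) / scalar(@kept);
    my $sigma  = sqrt( sum( map { ($_ - $center)**2 } @kept ) / scalar(@kept) );

    for my $iter (1 .. $maxiter) {
        # Ignore data farther than nsigma * sigma from the current center.
        my @next = grep { abs($_ - $center) <= $nsigma * $sigma } @$data;
        last unless @next;    # everything was clipped: keep the previous result

        # New center (centroid) and spread of the surviving data.
        my $new_center = sum(@next) / scalar(@next);
        $sigma = sqrt( sum( map { ($_ - $new_center)**2 } @next ) / scalar(@next) );

        my $shift = abs($new_center - $center);
        ($center, @kept) = ($new_center, @next);
        last if $shift < $tol;    # converged
    }
    return ($center, $sigma, scalar @kept);
}

my ($center, $sigma, $n) =
    sigma_clip_1d([10, 10.1, 9.9, 10.2, 9.8, 100], nsigma => 2);
printf "center=%.3f kept=%d\n", $center, $n;
```

In this toy dataset the single outlier inflates the initial standard deviation so much that a 3-sigma clip would not reject it, which is why the usage example clips at 2 sigma.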
sigma_clip can center sparse (e.g., input is a list of coordinates) or dense datasets (input is a hyper-rectangle) with or without weights. It accepts a mask which directs it to use only certain elements in the dataset.
The coordinates may be transformed using PDL::Transform (https://metacpan.org/pod/PDL::Transform). This is mostly useful for dense datasets, where coordinates are generated from the indices of the passed hyper-rectangle. This functionality is not currently documented, as tests for it have not yet been written.
More information is available at the GitHub repository page, https://github.com/djerius/PDLx-Algorithm-Center
Geo::OLC API for Google's Open Location Codes
Open Location Codes are Google's open-sourced geohashing algorithm. Google provides a nice set of APIs at https://github.com/google/open-location-code, but none for Perl.
Despite having worked with Perl since the Eighties, I've never contributed to CPAN, so I'm open to any recommendations about naming, packaging, code style, etc.
There is a module on GitHub that implements the same API (discovered after I wrote mine...), but it was apparently never submitted to CPAN: https://github.com/nkwhr/Geo-OpenLocationCode
Test::DocClaims Help assure documentation claims are tested
A module should have documentation that defines its interface. All claims in that documentation should have corresponding tests to verify that they are true. Test::DocClaims is designed to help assure that those tests are written and maintained.
It would be great if software could read the documentation, enumerate all of the claims made and then generate the tests to assure that those claims are properly tested. However, that level of artificial intelligence does not yet exist. So, humans must be trusted to enumerate the claims and write the tests.
How can Test::DocClaims help? As the code and its documentation evolve, the test suite can fall out of sync, no longer testing the new or modified claims. This is where Test::DocClaims can assist. First, a copy of the POD documentation must be placed in the test suite. Then, after each claim, a test of that claim should be inserted. Test::DocClaims compares the documentation in the code with the documentation in the test suite and reports discrepancies. This will act as a trigger to remind the human to update the test suite. It is up to the human to actually edit the tests, not just sync up the documentation.
The comparison is done line by line. Trailing white space is ignored. Any white space sequence matches any other white space sequence. Blank lines as well as "=cut" and "=pod" lines are ignored. This allows tests to be inserted even in the middle of a paragraph by placing a "=cut" line before and a "=pod" line after the test.
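The comparison rules above can be pictured with a small normalization routine. This is only a sketch of the stated rules, not Test::DocClaims' implementation; normalize_pod_lines and same_docs are hypothetical names, and the input is assumed to be POD lines that have already been extracted from each file.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce POD lines to the form in which they are compared.
sub normalize_pod_lines {
    my (@lines) = @_;
    my @out;
    for my $line (@lines) {
        (my $text = $line) =~ s/\s+\z//;        # trailing white space is ignored
        next if $text eq '';                    # blank lines are ignored
        next if $text =~ /\A=(?:cut|pod)\b/;    # so are "=cut" and "=pod" lines
        $text =~ s/\s+/ /g;                     # any white space run matches any other
        push @out, $text;
    }
    return @out;
}

# Hypothetical helper: true when both sets of lines agree after normalization.
sub same_docs {
    my ($doc_lines, $test_lines) = @_;
    return join("\n", normalize_pod_lines(@$doc_lines))
        eq join("\n", normalize_pod_lines(@$test_lines));
}

my @doc  = ("The add function will", "add  two numbers.");
my @test = ("The add function will  ", "=cut", "=pod", "add two\tnumbers.");
print same_docs(\@doc, \@test) ? "in sync\n" : "out of sync\n";
```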
Additionally, a special marker, of the form "=for DC_TODO", can be placed in the test suite in lieu of writing a test. This serves as a reminder to write the test later, but allows the documentation to be in sync so the Test::DocClaims test will pass with a todo warning. Any text on the line after DC_TODO is ignored and can be used as a comment.
Especially in the SYNOPSIS section, it is common practice to include example code in the documentation. In the test suite, if this code is surrounded by "=begin DC_CODE" and "=end DC_CODE", it will be compared as if it were part of the POD, but can run as part of the test. For example, if this is in the documentation:

    Here is an example:

      $obj->process("this is some text");

this could be in the test:

    Here is an example:

    =begin DC_CODE

    =cut

    $obj->process("this is some text");

    =end DC_CODE

Example code that uses print or say and has a comment at the end will also match a call to is() in the test. For example, this in the documentation POD:

    The add function will add two numbers:

      say add(1,2);      # 3
      say add(50,100);   # 150

will match this in the test:

    The add function will add two numbers:

    =begin DC_CODE

    =cut

    is(add(1,2), 3);
    is(add(50,100), 150);

    =end DC_CODE

When comparing code inside DC_CODE markers, all leading white space is ignored.
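One way to picture the print/say matching rule is as a rewrite of the documentation line into the is() form before comparison. The sketch below is an assumption about how such a rule could work, not the module's code; example_to_is is a hypothetical name, and the expected value is copied from the comment verbatim, so string results would need to be quoted in the comment.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "say EXPR; # EXPECTED" into "is(EXPR, EXPECTED);".
# Lines that do not have that shape are returned with leading space removed.
sub example_to_is {
    my ($line) = @_;
    $line =~ s/\A\s+//;    # leading white space is ignored for code
    if ($line =~ /\A(?:say|print)\s+(.+?)\s*;\s*#\s*(.+?)\s*\z/) {
        return "is($1, $2);";
    }
    return $line;
}

print example_to_is('  say add(1,2);      # 3'), "\n";
print example_to_is('  say add(50,100);   # 150'), "\n";
```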
When the documentation file type does not support POD (such as Markdown files, *.md), the entire file is assumed to be documentation and must match the POD in the test file. For these files, leading white space is ignored. This allows a leading space to be added in the POD if necessary.
Bitcoin::Client Implements bitcoin-cli methods
A module for bootstrapping Bitcoin Core RPC client calls (bitcoin-cli).
The idea is that someone can install the module from CPAN and immediately start coding against a bitcoind instance in an OO way, with syntax similar to bitcoin-cli, without compiling or installing many Perl dependencies (just Moo and JSON::RPC::Client, and I'm thinking about taking out Moo).
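To illustrate what "similar syntax to bitcoin-cli" could look like, the sketch below maps any method call onto a JSON-RPC request body using AUTOLOAD. It is not the module's code: the package name BitcoinRpcSketch is hypothetical, nothing is sent over the network, and the envelope fields (jsonrpc, id, method, params) are an assumption about the request shape that bitcoind expects. Only core modules are used.

```perl
use strict;
use warnings;

package BitcoinRpcSketch;    # hypothetical name, for illustration only
use JSON::PP ();

our $AUTOLOAD;

sub new {
    my ($class, %args) = @_;
    return bless { id => 0, %args }, $class;
}

sub DESTROY { }    # keep object destruction out of AUTOLOAD

# Any other method name becomes the RPC method; arguments become params.
# The request is returned as a JSON string instead of being sent.
sub AUTOLOAD {
    my ($self, @params) = @_;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return JSON::PP->new->canonical->encode({
        jsonrpc => '1.0',
        id      => ++$self->{id},
        method  => $method,
        params  => \@params,
    });
}

package main;

my $btc = BitcoinRpcSketch->new;
print $btc->getblockcount, "\n";    # same name as "bitcoin-cli getblockcount"
```

A real client would post this body to the node's RPC endpoint with authentication and decode the result; that part is left out here.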
Right now the module is just named BTC. But I think Bitcoin::Client or Bitcoin::Cli would be more appropriate.
There are a couple of other bitcoin modules that are similar but the syntax is not as simple and there are many more dependencies.