PrePAN

Profile

User's Modules

File::CombineSorted: Efficiently combine two sorted files

I haven't yet written this as a module, but I realized I've written code to do this several times in the last couple of weeks. I searched CPAN and could find nothing like it...

My use-case is that the input files share a "key" column, and are already sorted by the values in that column, but due to various issues, the files are now in different formats and have different data. To make matters even more interesting, these files are huge - forget slurping them into an array!

The goal of this module is to get rid of all the repetitive boilerplate of this kind of processing and hide the details of reading the input and finding matching lines, detecting errors, etc.
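To make the boilerplate in question concrete, here is a minimal sketch of the usual merge-style walk over two key-sorted inputs. It is not the proposed module's interface: the function name `combine_sorted`, its argument order, and the callback shape are all hypothetical. It assumes keys are unique within each file and that both files are sorted in plain string order (the order `cmp` uses).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's real API.
# Holds only one line from each input in memory at a time.
sub combine_sorted {
    my ($fh_a, $key_a, $fh_b, $key_b, $on_match) = @_;

    # Read one chomped line, or undef at end of file.
    my $next = sub { my $l = readline($_[0]); chomp $l if defined $l; $l };

    my ($la, $lb) = ($next->($fh_a), $next->($fh_b));
    while (defined $la && defined $lb) {
        my $cmp = $key_a->($la) cmp $key_b->($lb);
        if    ($cmp < 0) { $la = $next->($fh_a) }   # key only in A: skip it
        elsif ($cmp > 0) { $lb = $next->($fh_b) }   # key only in B: skip it
        else {
            $on_match->($la, $lb);                  # same key in both files
            ($la, $lb) = ($next->($fh_a), $next->($fh_b));
        }
    }
    return;
}

# Two in-memory "files" in different formats, both sorted by an id column.
my $csv = "1,alice\n2,bob\n4,dave\n";
my $tsv = "x\t2\tengineer\nx\t3\tnobody\nx\t4\tmanager\n";
open my $fh_a, '<', \$csv or die $!;
open my $fh_b, '<', \$tsv or die $!;

my @out;
combine_sorted(
    $fh_a, sub { (split /,/,  $_[0])[0] },   # key is column 0 of the CSV
    $fh_b, sub { (split /\t/, $_[0])[1] },   # key is column 1 of the TSV
    sub {
        my ($la, $lb) = @_;
        push @out, join '|', (split /,/, $la)[1], (split /\t/, $lb)[2];
    },
);
print "$_\n" for @out;   # prints "bob|engineer" then "dave|manager"
```

Passing the key extractors as code references is one way to cope with the two files being in different formats; numerically sorted keys would need `<=>` in place of `cmp`.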

I want to know if other people have a need for this sort of thing, and if the interface is what other people want. Is the interface too dumb? Too clever? Is there functionality I should add or remove?

For example, in addition to the 'single-function' interface from the synopsis, I'd like to offer an iterator-like interface - though it would take some thought to figure out how best to do it. (Suggestions welcome!)
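One possible shape for such an iterator, offered only as a discussion starter, is a closure that returns the next matching pair each time it is called. The name `combined_iterator` and the `[$line_a, $line_b]` return value are assumptions made for illustration, and the same caveats apply as for any merge walk: unique keys per file, both inputs sorted in string order.

```perl
use strict;
use warnings;

# Hypothetical constructor: returns a closure that yields the next matching
# pair as [$line_a, $line_b], or undef once either input is exhausted.
# Note that it reads the first line of each input immediately.
sub combined_iterator {
    my ($fh_a, $key_a, $fh_b, $key_b) = @_;
    my $next = sub { my $l = readline($_[0]); chomp $l if defined $l; $l };
    my ($la, $lb) = ($next->($fh_a), $next->($fh_b));

    return sub {
        while (defined $la && defined $lb) {
            my $cmp = $key_a->($la) cmp $key_b->($lb);
            if    ($cmp < 0) { $la = $next->($fh_a) }
            elsif ($cmp > 0) { $lb = $next->($fh_b) }
            else {
                my $pair = [$la, $lb];
                ($la, $lb) = ($next->($fh_a), $next->($fh_b));
                return $pair;
            }
        }
        return;   # no more matches
    };
}

my $left  = "a 1\nb 2\nd 4\n";
my $right = "b:two\nc:three\nd:four\n";
open my $fh_a, '<', \$left  or die $!;
open my $fh_b, '<', \$right or die $!;

my $it = combined_iterator(
    $fh_a, sub { (split ' ', $_[0])[0] },
    $fh_b, sub { (split /:/, $_[0])[0] },
);

my @pairs;
while (my $pair = $it->()) {
    push @pairs, "$pair->[0] + $pair->[1]";
}
print "$_\n" for @pairs;   # prints "b 2 + b:two" then "d 4 + d:four"
```

A closure keeps the caller in control of the loop, which makes it easy to stop early or interleave other work between matches.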

I'll put something up on GitHub soon*, but I'm hoping for some constructive input before uploading to CPAN.

*soon eq 'in the next week or so...';

Hercynium@github 3 comments

Tie::Handle::CountChars: Make your file handles keep track of the characters read and written

This module allows you to transparently keep track of the characters written to and read from a Perl "HANDLE" variable (or IO::Handle object). It seems to work for anything that behaves as a HANDLE, including anything that inherits from IO::Handle. I wrote it because I found no good, simple way to do this transparently, particularly on handles that I was not creating directly (for example, the filehandle that is passed to your "accept" callback in AnyEvent's tcp_server).
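For readers unfamiliar with tied handles, the general idea can be sketched with Perl's built-in `tie` mechanism. This is not the module's code: the package name `CountingHandle` and the accessors `chars_read` and `chars_written` are made up for illustration, and only `PRINT` and `READLINE` are implemented. The sketch also ignores `$,` and `$\`, so the written count would be off if either is set.

```perl
use strict;
use warnings;

package CountingHandle;   # hypothetical name, not the proposed module

sub TIEHANDLE {
    my ($class, $fh) = @_;
    return bless { fh => $fh, read => 0, written => 0 }, $class;
}

# Called for: print TIED_HANDLE LIST
sub PRINT {
    my ($self, @items) = @_;
    my $out = join '', @items;          # simplification: ignores $, and $\
    $self->{written} += length $out;
    return print { $self->{fh} } $out;
}

# Called for: <TIED_HANDLE>, in scalar or list context
sub READLINE {
    my ($self) = @_;
    my $fh = $self->{fh};
    if (wantarray) {
        my @lines = <$fh>;
        $self->{read} += length for @lines;
        return @lines;
    }
    my $line = <$fh>;
    $self->{read} += length $line if defined $line;
    return $line;
}

sub chars_read    { $_[0]{read} }
sub chars_written { $_[0]{written} }

package main;

# Writing: wrap an in-memory handle and count what goes through it.
my $sink = '';
open my $out_fh, '>', \$sink or die $!;
tie *OUT, 'CountingHandle', $out_fh;
print OUT "hello, ", "world\n";
my $written = tied(*OUT)->chars_written;   # 13

# Reading: the same class counts characters returned by readline.
my $source = "one\ntwo\n";
open my $in_fh, '<', \$source or die $!;
tie *IN, 'CountingHandle', $in_fh;
my $first = <IN>;
my @rest  = <IN>;
my $read  = tied(*IN)->chars_read;         # 8

print "written=$written read=$read\n";
```

`length` counts characters rather than bytes when the data has been decoded, which matches the "characters" wording above. A complete wrapper would also need `READ`, `GETC`, `WRITE`, `PRINTF`, and `CLOSE`.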

Hercynium@github 0 comments