I deleted the Twitter account associated with this profile. I had only set it up to get started here when GitHub was not available.

Twitter: @benkbullock GitHub: benkasminbullock PAUSE ID: BKB URL:

User's Modules

Unicode::Confuse Deal with confusable unicode characters

Parser/interface to

It would probably parse the above file into JSON, and distribute the JSON file with the module.

benkasminbullock@github 0 comments

Convert::Maker Make a conversion generator for fixed keys

A conversion generator which outputs C code. The C code output converts hash keys to values in the style of Data::Munge's list2re or lex. Construct automata to translate a fixed list of inputs into a certain set of outcomes.

The projected use of this is in creating very fast converters for translating symbol tables into other symbols. For example, to convert "ASCII IPA" into Unicode symbols very rapidly, to convert Chinese characters into Pinyin, etc.
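As a rough illustration of the automaton idea (in Python rather than generated C, and not the proposed module's design), a fixed key list can be compiled into a trie which is then walked to find the longest matching key at each position. The table contents and the function names `build_trie` and `convert` are assumptions made up for this sketch.

```python
def build_trie(table):
    """Build a trie (a simple automaton) from a fixed key -> value table."""
    root = {}
    for key, value in table.items():
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[""] = value  # the empty-string slot marks an accepting state
    return root


def convert(trie, text):
    """Translate text, preferring the longest matching key at each position."""
    out = []
    i = 0
    while i < len(text):
        node = trie
        match_end = None
        match_value = None
        j = i
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if "" in node:
                match_end, match_value = j, node[""]
        if match_end is None:
            out.append(text[i])  # no key matches here: copy the character
            i += 1
        else:
            out.append(match_value)
            i = match_end
    return "".join(out)


# Hypothetical table, loosely in the spirit of "ASCII IPA" to Unicode.
TABLE = {"S": "ʃ", "tS": "tʃ", "N": "ŋ"}
TRIE = build_trie(TABLE)
```

A generator that outputs C would presumably emit the same states as nested switch statements or a transition table, so that no hashing is needed at run time.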

benkasminbullock@github 0 comments

Code::Search Search code for strings

Search source code for strings.

A trigram index of a medium number (40,000 or so) of source code files.

I have already developed the code, but it is not in a fit state to release, and it is all in C except for a few scripts.
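For readers unfamiliar with trigram indexing, here is a minimal sketch of the general technique in Python. It is not the existing C code: it keeps everything in memory, and the names `trigrams`, `build_index` and `search` are assumptions for illustration.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all three-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(files):
    """Map each trigram to the set of file names whose contents contain it."""
    index = defaultdict(set)
    for name, contents in files.items():
        for gram in trigrams(contents):
            index[gram].add(name)
    return index


def search(index, files, query):
    """Find files containing query, using the index to narrow the candidates."""
    grams = trigrams(query)
    if not grams:
        # Queries shorter than three characters cannot use the index.
        candidates = set(files)
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Trigram hits are necessary but not sufficient, so confirm each candidate.
    return sorted(name for name in candidates if query in files[name])
```

The index only rules files out. Each surviving candidate is still searched for the full string, because a file can contain every trigram of the query without containing the query itself.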

benkasminbullock@github 4 comments

Web::HackerNews Scrape the HTML of Hacker News

Given a Hacker News page, scrape the HTML to extract the contents. For example, get the title and the "hide" URL, etc., so that one can automatically match the titles against a regular expression then "hide" stories about Elon Musk, James Damore, react.js, Google memos, or other tedious things and people.

This is an HTML scraper and not related to WebService::HackerNews by Neil Bowers. Note that Hacker News uses tables and "center" tags for layout, with no particular logical subdivision.
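The title-matching idea could be sketched as follows, in Python with the standard html.parser module. The sample markup is invented for the sketch: the "title" class, the "hide?" link form and the names `StoryParser` and `hide_urls` are all assumptions, and the real Hacker News HTML would need to be inspected before writing a working scraper.

```python
import re
from html.parser import HTMLParser


class StoryParser(HTMLParser):
    """Collect (title, hide URL) pairs from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.stories = []
        self._in_title = False
        self._title = ""

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if attrs.get("class") == "title":  # hypothetical class name
            self._in_title = True
            self._title = ""
        elif href.startswith("hide?"):     # hypothetical link form
            self.stories.append((self._title, href))

    def handle_data(self, data):
        if self._in_title:
            self._title += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False


def hide_urls(html, pattern):
    """Return the hide URLs of stories whose title matches pattern."""
    parser = StoryParser()
    parser.feed(html)
    parser.close()
    return [url for title, url in parser.stories if re.search(pattern, title)]


# Invented sample page, using tables and "center" for layout.
SAMPLE = """<table><tr><td><center>
<a class="title" href="https://example.com/1">Elon Musk does something</a>
<a href="hide?id=1">hide</a>
<a class="title" href="https://example.com/2">A trigram index in C</a>
<a href="hide?id=2">hide</a>
</center></td></tr></table>"""
```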

benkasminbullock@github 2 comments

XS::Check Check XS for errors, something like Perl::Critic for XS

Something like Perl::Critic for XS.

benkasminbullock@github 3 comments

WWW::CheckGzip Check of WWW gzipping

Check that:

  • URL is valid, regardless of compression
  • URL returns compressed content when requested
  • URL does not return compressed content when not requested
  • Compressed content is smaller than it would be if not compressed
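The four checks above can be sketched as a pure function over two responses that have already been fetched. This is a Python illustration under stated assumptions, not a proposed interface: the (status, headers, body) tuple shape, the lower-cased header names and the name `check_gzip` are all hypothetical.

```python
import gzip


def check_gzip(plain, compressed):
    """Return a list of problems found in a pair of responses for one URL.

    Each argument is a hypothetical (status, headers, body) tuple: `plain`
    is the response to a request sent without "Accept-Encoding: gzip",
    and `compressed` is the response to a request sent with it.
    """
    problems = []
    p_status, p_headers, p_body = plain
    c_status, c_headers, c_body = compressed
    if p_status != 200 or c_status != 200:
        problems.append("URL is not valid")
    if p_headers.get("content-encoding", "").lower() == "gzip":
        problems.append("compressed content returned when not requested")
    if c_headers.get("content-encoding", "").lower() != "gzip":
        problems.append("no compressed content returned when requested")
    elif len(c_body) >= len(p_body):
        problems.append("compressed content is not smaller")
    return problems


# Synthetic responses standing in for real HTTP traffic.
BODY = b"hello world " * 100
GOOD = check_gzip((200, {}, BODY),
                  (200, {"content-encoding": "gzip"}, gzip.compress(BODY)))
```

Keeping the checks separate from the fetching makes them testable without network access. The fetching itself would be done by whichever user agent the module ends up using.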

benkasminbullock@github 0 comments

C::Check, C::Critic, or C::Fuss static check of C programs

C compilers typically don't warn about daft mistakes like putting function arguments in the wrong order, using a variable of the wrong size, or not checking the return value of malloc. This module would check for typical errors in C programs, such as switch fallthroughs, use of = instead of == in an if statement, if statements without braces, and misleading if statement indentation, like

if (x) 
      printf ("reached");
      printf ("reached even if x is not true, despite this indentation");


At the moment I have a script which does something like the above, and I am wondering whether it would be worth working up into a module.
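To give a flavour of one such check, here is a minimal Python sketch that flags = used inside an if condition. It is not the script mentioned above. It works line by line with regular expressions, so it ignores comments, strings, nested parentheses and conditions spanning several lines, all of which a real checker would need a C tokenizer to handle. The name `assignments_in_if` is an assumption.

```python
import re

# Matches "if (...)" where the condition has no nested parentheses.
IF_CONDITION = re.compile(r"\bif\s*\(([^()]*)\)")
# A lone "=": one that is not part of "==", "!=", "<=" or ">=".
LONE_EQUALS = re.compile(r"(?<![=!<>])=(?!=)")


def assignments_in_if(source):
    """Return the 1-based numbers of lines with "=" inside an if condition."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in IF_CONDITION.finditer(line):
            if LONE_EQUALS.search(match.group(1)):
                hits.append(number)
                break  # report each line at most once
    return hits
```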

benkasminbullock@github 2 comments

Gram::Index Trigram index of files

I've been developing a C program which indexes files by making trigrams of the contents of files. It's working reasonably well now and I'm thinking of extending it to a Perl version which could be used to index files, database entries and other things.

benkasminbullock@github 0 comments

WWW::Fetch Fetch WWW pages with caching, last modified, compression

Get world-wide web pages (make HTTP requests) with caching to the local file system, correct use of If-Modified-Since, and correct handling of compressed responses.

There are a lot of web scrapers on CPAN including Scrappy, Web::Scraper, and some others. However, these don't do what I want, which is to get a web page only if necessary, use a local cache if possible, and always handle gzip compression. This would be built on top of LWP::UserAgent and friends, with the option to use another user agent module if necessary.

Unlike other modules, this would not handle HTML parsing but just get the page from the web.
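The intended behaviour could be sketched as two small steps around the actual HTTP request: build conditional request headers from the cache, then choose between the cached copy and the new body. The sketch is in Python and makes no network calls. The cache entry layout (a dict with "last_modified" and "content" keys), the lower-cased response header names and the function names are assumptions for illustration only.

```python
import gzip


def request_headers(cache_entry):
    """Build request headers for a URL, given its cache entry or None."""
    headers = {"Accept-Encoding": "gzip"}
    if cache_entry and cache_entry.get("last_modified"):
        # Ask the server to send the page only if it has changed.
        headers["If-Modified-Since"] = cache_entry["last_modified"]
    return headers


def page_content(status, headers, body, cache_entry):
    """Return the page bytes, taking them from the cache on a 304 reply."""
    if status == 304 and cache_entry:
        return cache_entry["content"]
    if headers.get("content-encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body
```

A full version would also store the new Last-Modified value and content back into the cache after a 200 response.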

I'm hoping I don't have to write this module but will suddenly find that a solution already exists on CPAN which I've somehow overlooked.

benkasminbullock@github 3 comments

Compress::Huffman::Binary Binary-only Huffman

  • Form a binary Huffman code from symbol table and frequencies or probabilities
  • Encode a sequence of symbols (integers) to the Huffman code
  • Decode the above encoded sequence again
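The three operations above can be sketched in Python as follows. This is a minimal illustration of the standard Huffman construction, not the proposed module's interface: it represents the encoded output as a string of "0" and "1" characters rather than packed bits, and the function names are assumptions.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """Build a binary Huffman code: a dict from symbol to bit string."""
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        # A lone symbol still needs one bit so that it can be decoded.
        return {symbol: "0" for symbol in frequencies}
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), {symbol: ""})
            for symbol, weight in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def encode(code, symbols):
    """Encode a sequence of symbols as a string of bits."""
    return "".join(code[s] for s in symbols)


def decode(code, bits):
    """Decode a bit string produced by encode back into a list of symbols."""
    reverse = {c: s for s, c in code.items()}
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in reverse:  # safe because Huffman codes are prefix-free
            symbols.append(reverse[current])
            current = ""
    return symbols
```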

benkasminbullock@github 0 comments