
Bio::DB::Big XS bindings to libBigWig for accessing the UCSC/kent big file formats



use Bio::DB::Big;
use Bio::DB::Big::AutoSQL;

# Setup CURL buffers
Bio::DB::Big->init();

my $bw = Bio::DB::Big->open('path/to/');
# Generic: get the type
if($bw->is_big_wig()) {
  print "We have a bigwig file\n";
}

# Generic: Get headers
my $header = $bw->header();
printf("Working with %d zoom levels\n", $header->{nLevels});
# Generic: Get chromosomes (comes back as a hash {chrom => length})
my $chroms = $bw->chroms();

# Get stats, values and intervals
if($bw->has_chrom('chr1')) {
  my $bins = 10;

  # Uses the zoom levels and returns an array of 10 bins over chromosome positions 1-100
  my $stats = $bw->get_stats('chr1', 0, 100, $bins, 'mean');
  foreach my $s (@{$stats}) {
    printf("%f\n", $s);
  }

  # Go directly to the raw level and calculate on that, but ask for the maximum value per bin this time
  my $full_stats = $bw->get_stats('chr1', 0, 100, $bins, 'max', 1);

  # Get a value for each base over chromosome positions 1 - 100. Values can be undef if not set
  my $values = $bw->get_values('chr1', 0, 100);

  # Get the real intervals where a value was assigned
  my $intervals = $bw->get_intervals('chr1', 0, 100);
  foreach my $i (@{$intervals}) {
    printf("%d - %d: %f\n", $i->{start}, $i->{end}, $i->{value});
  }

  # Or iterate, which lets you move through a file without loading everything into memory
  my $blocks_per_iter = 10;
  my $iter = $bw->get_intervals_iterator('chr1', 0, 100, $blocks_per_iter);
  while(my $intervals = $iter->next()) {
    foreach my $i (@{$intervals}) {
      printf("%d - %d: %f\n", $i->{start}, $i->{end}, $i->{value});
    }
  }
}

my $bb = Bio::DB::Big->open('');
if($bb->is_big_bed()) {
  my $with_string = 1;
  # Optionally you can skip retrieving the "string", potentially saving memory
  my $entries = $bb->get_entries('chr21', 9000000, 10000000, $with_string);
  foreach my $e (@{$entries}) {
    printf("%d - %d: %s\n", $e->{start}, $e->{end}, $e->{string});
  }

  # Or you can use an iterator
  my $blocks_per_iter = 10;
  my $iter = $bb->get_entries_iterator('chr21', 0, $bb->chrom_length('chr21'), $with_string, $blocks_per_iter);
  while(my $entries = $iter->next()) {
    foreach my $e (@{$entries}) {
      printf("%d - %d: %s\n", $e->{start}, $e->{end}, $e->{string});
    }
  }

  # Finally you can request AutoSQL and parse if available
  if(my $autosql = $bb->get_autosql()) {
    my $as = Bio::DB::Big::AutoSQL->new($autosql);
    if($as->has_field('name')) {
      printf("%s: The field 'name' is in position %d\n", $as->name(), $as->get_field('name')->position());
    }
    # Or just get all fields as an arrayref
    my $fields = $as->fields();
  }
}


This library provides access to the BigWig and BigBed file formats designed by UCSC. However, rather than using the kent libraries, it uses libBigWig, as that implementation avoids exiting the process when errors happen. libBigWig provides access to BigWig summaries, values and intervals, alongside access to BigBed entries.


Just a few comments on your XS code:

h = (HV *)sv_2mortal((SV *)newHV());

To fix the refcount problem, I would use the T_HVREF_REFCOUNT_FIXED typemap entry, rather than the sv_2mortal black magic. See perlxstypemap.
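A sketch of what the suggested change could look like on the XS side, assuming `HV*` is mapped to T_HVREF_REFCOUNT_FIXED in the module's typemap (the function and type names below are hypothetical, not taken from the module's actual XS code):

```
HV*
header(big)
    Bio::DB::Big::File big
  CODE:
    /* No sv_2mortal cast needed: the T_HVREF_REFCOUNT_FIXED OUTPUT map
     * wraps RETVAL with newRV_noinc(), keeping the reference count correct. */
    RETVAL = newHV();
    hv_stores(RETVAL, "version", newSVuv(big->hdr->version));
  OUTPUT:
    RETVAL
```

With the plain T_HVREF map, the returned hash would instead be wrapped with newRV(), leaving an extra reference that either leaks or has to be worked around with sv_2mortal as above.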

Same for the arrays.

hv_store(h, "version", 7, newSVuv(big->hdr->version), 0);

I would use the more maintainable hv_stores(h, "version", newSVuv(...)).

Thanks for the comment and the tip. I was not aware of the T_HVREF_REFCOUNT_FIXED entry. However I would like this code to run on 5.14+ at the very least. Reading through perlxstypemap, T_HVREF_REFCOUNT_FIXED was introduced in 5.15.4, which is a shame.

The hv_store calls have been converted to hv_stores().

Hi Andrew,

You can add the T_HVREF_REFCOUNT_FIXED input and output typemaps to your own typemap file. This way it works for perls older than 5.15.4 (at least back to perl 5.8 in my tests). I've seen this in other modules and now use it for my Lab::Zhinst.
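For illustration, the entries copied into the distribution's own typemap file would look roughly like this. This is a simplified sketch; the exact definitions should be copied verbatim from a recent perl's ExtUtils/typemap, as perlxstypemap suggests:

```
INPUT

T_HVREF_REFCOUNT_FIXED
	if (SvROK($arg) && SvTYPE(SvRV($arg)) == SVt_PVHV)
	    $var = (HV*)SvRV($arg);
	else
	    Perl_croak(aTHX_ \"$var is not a HASH reference\");

OUTPUT

T_HVREF_REFCOUNT_FIXED
	$arg = newRV_noinc((SV*)$var);
```

Entries in a distribution's typemap file override (or, as here, supplement) the ones shipped with perl, so the module builds the same way on old and new perls.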
