PrePAN

Sign in to PrePAN

Bio::DB::Big XS bindings to libBigWig for access the UCSC/kent big formats

Good

Synopsis

use Bio::DB::Big;
use Bio::DB::Big::AutoSQL;

# Setup CURL buffers
Bio::DB::Big->init();

my $bw = Bio::DB::Big->open('path/to/file.bw');
# Generic: get the type
if($bw->is_big_wig()) {
  print "We have a bigwig file\n";
}

# Generic: Get headers
my $header = $bw->header();
printf("Working with %d zoom levels", $header->{nLevels});
# Generic: Get chromosomes (comes back as a hash {chrom => length})
my $chroms = $bw->chroms();

#Get stats, values and intervals
if($bw->has_chrom('chr1')) {
  my $bins = 10;

  # uses the zoom levels and returns an array of 10 bins over chromsome positions 1-100
  my $stats = $bw->get_stats('chr1', 0, 100, $bins, 'mean');
  foreach my $s (@{$stats}) {
    printf("%f\n", $s);
  }

  # Go directly to the raw level and calc on that but ask for maximum value per bin this time
  my $full_stats = $bw->get_stats('chr1', 0, 100, $bins, 'max', 1);

  # Get a value for each base over chromsome positions 1 - 100. Values can be undef if not set
  my $values = $bw->get_values('chr1', 0, 100);

  # Get the real intervals where a value was assigned
  my $intervals = $bw->get_intervals('chr1', 0, 100);
  foreach my $i (@{$intervals}) {
    printf("%d - %d: %f\n", $i->{start}, $i->{end}, $i->{value})
  }

  # Or iterate which allows you to move through a file without loading everything into memory
  my $blocks_per_iter = 10;
  my $iter = $bw->get_intervals_iterator('chr1', 0, 100, $blocks_per_iter);
  while(my $intervals = $iter->next()) {
    foreach my $i (@{$intervals}) {
      printf("%d - %d: %f\n", $i->{start}, $i->{end}, $i->{value})
    }
  }
}

my $bb = Bio::DB::Big->open('http://genome.ucsc.edu/goldenPath/help/examples/bigBedExample.bb');
if($bb->is_big_bed) {
  my $with_string = 1;
  # Optionally you do not retrieve the "string" if you don't want to potenitally saving memory
  my $entries = $bb->get_entries('chr21', 9000000, 10000000, $with_string);
  foreach my $e (@{$entries}) {
    printf("%d - %d: %s\n", $e->{start}, $e->{end}, $e->{string});
  }

  # Or you can use an iterator
  my $blocks_per_iter = 10;
  my $iter = $bb->get_entries_iterator('chr21', 0, $bb->chrom_length('chr21'), $with_string, $blocks_per_iter);
  while(my $entries = $iter->next()) {
    foreach my $e (@{$entries}) {
      printf("%d - %d: %s\n", $e->{start}, $e->{end}, $e->{string});
    }
  }

  # Finally you can request AutoSQL and parse if available
  if($bb->get_autosql()) {
    my $autosql = $bb->get_autosql();
    my $as = Bio::DB::Big::AutoSQL->new($autosql);
    if($as->has_field('name')) {
      printf("%s: The field 'name' is in position %d\n", $as->name(), $as->get_field('name')->position());
    }
    # Or just get all fields as an arrayref
    my $fields = $as->fields();
  }
}

Description

This library provides access to the BigWig and BigBed file formats designed by UCSC. However rather than use kent libraries this uses libBigWig from https://github.com/dpryan79/libBigWig as it provides an implementation that avoids exiting when errors happen. libBigWig provides access to BigWig summaries, values and intervals alongside providing access to BigBed entries.

Comments

Please sign up to post a review.