Missing links: using SwissProt IDtracker in your code

The BioPerl Bio::DB::SwissProt module lets you fetch sequences from SwissProt by ID or AC and store them as sequence objects:

use Bio::DB::SwissProt;
my $sp = Bio::DB::SwissProt->new('-servertype' => 'expasy', '-hostlocation' => 'australia');
my $seq = $sp->get_Seq_by_id('myod1_pig');

If you obtained SwissProt identifiers from a database that hasn’t been updated for some time, you may find that the ID or AC has since changed. For example, NLSdb still lists the ID from the example above as “myod_pig”. If you try to fetch that old ID, BioPerl will throw an exception like this:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: id does not exist
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id /usr/local/share/perl/5.8.8/Bio/DB/WebDBSeqI.pm:154
STACK: test.pl:3
-----------------------------------------------------------

SwissProt provides a web page named IDtracker to help you find new identifiers using old ones. Here’s how we can integrate the service into Perl.

use strict;
use LWP::Simple;
use Bio::DB::SwissProt;

my $sp = Bio::DB::SwissProt->new('-servertype' => 'expasy', '-hostlocation' => 'australia');
my $idtracker = "http://ca.expasy.org/cgi-bin/idtracker?id=";

# get our ID from somewhere
my $tryID = "myod_pig";
my $seq;

eval {
    $seq = $sp->get_Seq_by_id($tryID);
};

if ($@) {
    # get() returns undef on failure, so check before matching
    my $tracker = get($idtracker.$tryID);
    if (defined $tracker && $tracker =~ /was renamed to <b>(.*?)<\/b>/) {
        my $newID = $1;
        # you should really eval this as well
        $seq = $sp->get_Seq_by_id($newID);
    }
    else {
        # IDtracker gave us no new ID; do something else
    }
}
else {
    # all is well; do something with $seq from $tryID
}

How it works
Line numbers below count from the first line of the script. The key here is wrapping get_Seq_by_id() in an eval block and then testing whether it threw an error (lines 12-16). If so, we append the offending ID to the IDtracker URL and fetch the resulting web page using get() from LWP::Simple (line 18). Next, we use a regex to extract the new ID (if there is one) from the HTML output (lines 19-20) and go on our merry way to a sequence object using the correct ID (line 22).
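
The regex step can be pulled out into a small subroutine, which makes it easy to try without a network connection. This is only a sketch: the name extract_new_id and the sample page fragments are made up for illustration, and the only thing assumed about the IDtracker page is the “was renamed to <b>...</b>” phrase matched in the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the new ID found in IDtracker HTML,
# or undef if the page is missing or reports no rename.
sub extract_new_id {
    my ($html) = @_;
    return undef unless defined $html;   # LWP::Simple get() returns undef on failure
    return $html =~ /was renamed to <b>(.*?)<\/b>/ ? $1 : undef;
}

# Made-up page fragments, for illustration only
my $renamed   = 'Entry name myod_pig was renamed to <b>MYOD1_PIG</b>';
my $unchanged = '<p>No matching entry.</p>';

for my $page ($renamed, $unchanged, undef) {
    my $id = extract_new_id($page);
    print defined $id ? "new ID: $id\n" : "no new ID\n";
}
```

In the main script you would call it as extract_new_id(get($idtracker.$tryID)) and only retry the fetch when the result is defined.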

A couple of quick notes: $@ is the special Perl variable that holds the error from the most recent eval, so if it is empty, all is well (lines 28-30). Don’t forget the semicolon after the closing brace of your eval block, and declare variables ($seq in this case) before the block so that they are still in scope afterwards.
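
One caveat with testing $@ directly: as far as I know, in older Perls (before 5.14) it can be reset before you get to read it, for example by a destructor that runs its own eval. A common defensive idiom is to make the eval block return a true value and test that instead. Here is a minimal sketch; fetch_or_die and try_fetch are hypothetical stand-ins so that the example runs without BioPerl or a network connection.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $sp->get_Seq_by_id(): dies for unknown IDs
my %known = (MYOD1_PIG => 'sequence object for MYOD1_PIG');

sub fetch_or_die {
    my ($id) = @_;
    die "id does not exist\n" unless exists $known{ uc $id };
    return $known{ uc $id };
}

# Returns the fetched value, or undef if the fetch died
sub try_fetch {
    my ($id) = @_;
    my $seq;                    # declared outside eval so it survives the block
    my $ok = eval {
        $seq = fetch_or_die($id);
        1;                      # reached only if nothing died
    };                          # the semicolon ends the statement
    warn "fetch failed for $id: $@" unless $ok;
    return $ok ? $seq : undef;
}

print defined(try_fetch('myod1_pig')) ? "found\n" : "not found\n";
print defined(try_fetch('myod_pig'))  ? "found\n" : "not found\n";
```

The same shape works for the retry with the new ID, which is the call the script comments say you should really eval as well.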

Question: how many other databases on the web provide something similar to IDtracker?