My first “AJAX for bioinformatics” page

So you’ve heard about this wondrous thing called AJAX. You’re dimly aware that it can generate interactive, user-friendly and dynamic websites without the usual cycle of reloading the page after submitting a request to the server. You know that Google Maps and Flickr are two excellent examples. You’re keen to explore the possibilities for your bioinformatics web applications. What you need is a “minimal example”. Where do you start?

That’s the situation that I was in last weekend and here’s what I did.

I’ll start by making it clear that much of what follows is lifted from the W3Schools AJAX tutorial, with minimal adaptation to make it relevant for bioinformaticians. Please go there and read their excellent work.

When I figured out how AJAX works my response was: “Oh. Is that all it is?” AJAX, you see, is nothing new. In fact if you’re familiar with web programming and know a little about HTML, server-side scripting, javascript and XML – well, that’s all it is. It’s just combined in a clever way to produce a pleasing result.

Here’s what we’re going to do. We’re going to construct a simple form with a drop-down list of options. The options will be UIDs from the NCBI Gene database. When we select an option, our form will display the protein name associated with the UID – without the need to reload the page. It’s the AJAX equivalent of “Hello World” for bioinformatics, but it should give you an idea.

1. Getting set up
First, I went to the place where I do my web testing (e.g. /var/www/testing/) and created 3 directories: php for the PHP, js for the javascript and xml for – you guessed right. In fact no XML files were saved in this example but I like to be organised. The HTML files just go in the /testing root, right above these 3 directories.

2. The HTML form
There’s nothing special about the form. I named the file form.html and it goes like this:

1.  <html>
2.  <head>
3.  <script src="js/ncbi.js"></script>
4.  </head>
5.  <body>
6.  <h3>Get protein name from NCBI Gene DB ID</h3>
7.  <form>
8.  <b>Select a Gene ID:</b>
9.  <select name="geneID" onchange="showName(this.value)">
10. <option value="none" selected="selected">-----</option>
11. <option value="54123">54123</option>
12. <option value="21354">21354</option>
13. <option value="11988">11988</option>
14. </select>
15. </form>
16. <p>
17. <div id="geneName"><b>Gene info will be listed here.</b></div>
18. </p>
19. </body>
20. </html>

Nothing complicated about that. Our select list has 3 values which correspond to NCBI Gene UIDs. When we choose one (onchange, line 9), we fire the showName() function in js/ncbi.js. At the bottom of the form is a DIV element with the id geneName. Initially it displays “Gene info will be listed here.”; later on we’ll see the javascript alter it to something different.

OK, how about that javascript?

3. The javascript
Once again, nothing to be scared of. The file ncbi.js reads like this:

1.  var xmlHttp

2.  function showName(str)
3.  { 
4.  xmlHttp=GetXmlHttpObject()
5.  if (xmlHttp==null)
6.   {
7.   alert ("Browser does not support HTTP Request")
8.   return
9.   } 
10. var url="php/ncbi.php"
11. url=url+"?q="+str
12. url=url+"&sid="+Math.random()
13. xmlHttp.onreadystatechange=stateChanged 
14. xmlHttp.open("GET",url,true)
15. xmlHttp.send(null)
16. }

17. function stateChanged() 
18. { 
19.  document.getElementById("geneName").innerHTML = "Fetching XML file..."
20.  if (xmlHttp.readyState==4 || xmlHttp.readyState=="complete")
21.  { 
22.  var response = xmlHttp.responseText
23.  if (!response)      {
24.    document.getElementById("geneName").innerHTML="No data returned!"
25.                      }
26.  else {
27.  document.getElementById("geneName").innerHTML=response 
28.       }
29.  }
30. }

31. function GetXmlHttpObject()
32. {
33. var xmlHttp=null;
34. try
35.  {
36.  // Firefox, Opera 8.0+, Safari
37.  xmlHttp=new XMLHttpRequest();
38.  }
39. catch (e)
40.  {
41.  // Internet Explorer
42.  try
43.   {
44.   xmlHttp=new ActiveXObject("Msxml2.XMLHTTP");
45.   }
46.  catch (e)
47.   {
48.   xmlHttp=new ActiveXObject("Microsoft.XMLHTTP");
49.   }
50.  }
51. return xmlHttp;
52. }

I’m not a strong javascript programmer – truth be told, I don’t like the language much, but even I can follow this one. We’ve got 3 functions: showName() on lines 2-16, stateChanged(), lines 17-30 and GetXmlHttpObject(), lines 31-52. showName() first calls GetXmlHttpObject(), assigning the returned value to the variable xmlHttp. All you need to know about lines 31-52 is that they test whether your browser supports AJAX and if so, return a special object, the xmlHttp request object. This object is what “does” AJAX. As you can see from the code it has a number of methods that send, listen to and act on HTTP requests.
In fact, the main reason why many of us are only now hearing about AJAX is – you guessed it – browser standards. See if you can guess which browser is being difficult from the code.

Assuming that all is well, we move to lines 10-15. Here, the javascript builds the URL of a server-side PHP script named php/ncbi.php and appends a couple of things to the query string (lines 10-12). The first, “q”, is the value that we get from the select list in our form. The second is a random number, which W3Schools assures us is there to stop a cached copy of the response being returned. The PHP script is going to get the value of “q”, use it to make a request to the NCBI and return some data. Lines 13-15 register stateChanged() as the function to call whenever the state of the request changes, open an asynchronous GET request and send it. The javascript then grabs whatever comes back and displays it.
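As a rough sketch of the URL-building step, here is the same idea wrapped in a small function. The name buildRequestUrl is made up for illustration and is not part of the W3Schools code; the one real change is encodeURIComponent, which escapes characters such as & or = so that they cannot be mistaken for part of the query string.

```javascript
// Hypothetical helper (not in the original ncbi.js): builds the URL
// that showName() requests.
function buildRequestUrl(baseUrl, geneId) {
  // Escape the value so characters like & or = stay inside "q".
  var query = "q=" + encodeURIComponent(geneId);
  // A different "sid" on every call makes each URL unique,
  // so a previously cached response is not reused.
  var cacheBuster = "sid=" + Math.random();
  return baseUrl + "?" + query + "&" + cacheBuster;
}

console.log(buildRequestUrl("php/ncbi.php", "54123"));
```

For the three numeric UIDs in our form the escaping changes nothing, but it costs little and guards against odd values if the list is ever generated from user input.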

We know when our data comes back thanks to the function stateChanged(), which is called each time the state of the request changes. While the request is in progress, the text of the “geneName” element (formerly, you recall, “Gene info will be listed here.”) is altered to “Fetching XML file...”, line 19. When the request is complete (readyState 4, line 20), we check the variable named response to see what came back. If nothing, we display “No data returned!”, line 24. Otherwise, we set “geneName” to the value of response, line 27.
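The decisions made in stateChanged() can be pulled out into a function that touches neither the page nor the request object, which makes them easier to follow and to test. This is only a sketch: messageFor is a hypothetical name, and it checks for readyState 4 alone rather than also accepting "complete" as the original does.

```javascript
// Hypothetical helper mirroring the logic of stateChanged().
// readyState 4 means the request has finished.
function messageFor(readyState, responseText) {
  if (readyState !== 4) {
    // Still waiting on the server.
    return "Fetching XML file...";
  }
  if (!responseText) {
    // Finished, but the PHP script sent nothing back.
    return "No data returned!";
  }
  // Finished with data: show it as-is.
  return responseText;
}

console.log(messageFor(1, ""));  // request still in progress
console.log(messageFor(4, ""));  // finished, empty response
```

Inside stateChanged() the result would simply be assigned to the innerHTML of the “geneName” element.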

For me, the javascript is the trickiest part of the whole thing. If you’re like me, read the code through a few times and you’ll soon have it. OK – the last part is the server-side PHP script, ncbi.php.

4. The PHP
The PHP isn’t much more complex than the HTML:

<?php
1.  $val = $_GET['q'];
2.  $baseURL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&id=";
3.  $urlSUFF = "&retmode=xml";
4.  $url     = $baseURL.$val.$urlSUFF;

5.  $xmlDoc = new DOMDocument();
6.  $xmlDoc->load($url);
7.  $titles = $xmlDoc->getElementsByTagName("Prot-ref_name");

8.  foreach($titles as $node) {
9.     echo  "<b>" . $node->nodeName . ": </b>".
10.                   $node->textContent . "<br />";
11.                            }
?>

First, you should be aware that this code is PHP 5. PHP 5 has some new functions to make handling XML files quite easy, even for people like me who don’t much care for XML. There’s a good introduction to XML in PHP 4 and 5 at this Zend page.

Off we go. In lines 1-4 we grab the value “q” which, you recall, is sent from ncbi.js and corresponds to a gene UID from form.html. We then construct a URL to the EFetch component of NCBI EUtils, to return our record as XML.
We can read the XML stream straight into the variable $xmlDoc and parse the XML for the element “Prot-ref_name” (lines 5-7). This contains the official protein name for the gene UID. We then loop through the matching nodes (lines 8-11), retrieving the node name (“Prot-ref_name”) and its value ($node->textContent). Purists will frown at the use of textContent, by the way. These values are what the script returns to ncbi.js and are displayed as the value of the “geneName” element in form.html.
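One thing the script does not do is check “q” before pasting it into the EFetch URL. Selecting the “-----” option, for instance, sends the value “none” to the NCBI. Below is a minimal sketch of the check, written in javascript to match the other examples here; buildEfetchUrl and EFETCH_BASE are hypothetical names, and a real fix would put the equivalent test in ncbi.php itself.

```javascript
// Same base URL as in ncbi.php.
var EFETCH_BASE =
  "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&id=";

// Hypothetical helper: returns the EFetch URL for a gene UID,
// or null if the value is not made up purely of digits.
function buildEfetchUrl(geneId) {
  if (!/^[0-9]+$/.test(String(geneId))) {
    return null; // e.g. "none", an empty string, or anything odd
  }
  return EFETCH_BASE + geneId + "&retmode=xml";
}

console.log(buildEfetchUrl("54123"));
console.log(buildEfetchUrl("none")); // null
```

When the check fails, the script can return a short message instead of making a pointless request to the NCBI.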

To recap then:

We select a gene UID from a drop-down list in a normal HTML form
Javascript and PHP interact to perform an EUtils query at the NCBI and return an XML file
The XML is parsed and appropriate values retrieved
Using asynchronous requests to the server (that’s the first ‘A’ in AJAX), javascript updates the page with progress and displays the result
All without reloading the page

That’s it. That’s AJAX. It’s a particularly stupid example – fetching a huge XML file to parse out one element, but hopefully you get the idea. You can imagine all sorts of uses for this in bioinformatics applications: fetching the most recent data rather than local storage, XML-SQL interconversions, real-time BLAST results and so on. As ever, the only limits are your creativity and requirements.

10 thoughts on “My first “AJAX for bioinformatics” page”

alf

February 20, 2007 at 20:13

I did something similar here: http://hublog.hubmed.org/archives/001261.html but never took it any further…

For the javascript you should try using jQuery – it’ll cut all that XMLHttpRequest stuff down to one line, and for the PHP side use SimpleXML, which also makes things a lot easier.
nsaunders

February 20, 2007 at 21:18

Thanks for the SimpleXML tip – that’s my kind of “dealing with XML” code for sure.

Everyone has their favourite js library, don’t they. Prototype and script.aculo.us have both been recommended to me.
alf

February 20, 2007 at 22:27

jQuery’s nice and small, and well documented. If you’re not doing any particularly complex AJAX, it should be fine.

$.get(url, { sid: “whatever” }, function (data){ $(“#geneName”).html(data); }) is all you need for the above.
alf

February 20, 2007 at 22:28

…without the smart quotes, obviously.
nsaunders

February 20, 2007 at 22:46

Looks good, thanks.

I find the possibilities very exciting. I think in bioinformatics we tend to focus on pulling files from servers, parsing them and storing locally. However, in a world where files and annotations are in constant flux, perhaps it’s more sensible to pull down the latest versions in XML as and when you need them, parse and display dynamically. Of course there’s the network overhead if your pages get a lot of use.
Pierre

February 21, 2007 at 03:30

AJAX can be used to build great things. Just look at what Google has done with reader.google.com, mail.google.com, docs.google.com … but I still cannot explain the success of ajax over java: a rather complete application can be created using an applet or java WebStart.
nsaunders

February 21, 2007 at 08:40

the success of ajax over java

Perhaps AJAX has wider appeal because for people just starting out, it’s easier to learn scripting languages. Or maybe Java is perceived as overkill in many cases – it seems to be more popular for “industrial strength” corporate projects. Or maybe developers feel that memory usage and CPU loads are better dealt with by the server, rather than the user’s machine.

I’m not a Java fan – numerous applets have crashed my browser, numerous standalone applications have had enormous memory leaks and CPU load. I’ve never figured out if this is something inherent or if it’s just that a lot of Java stuff is poorly written.
Brian Gilman

February 23, 2007 at 23:56

Great post!

My company, Panther Informatics, has started porting a lot of interesting tools for bioinformatics and beyond. Take a look at our promoter and hapmap AJAX database query and extraction applications at: http://www.pantherinformatics.com:8082/vpd/index.jsf and
http://www.pantherinformatics.com:8082/hapmap/index.jsf

You may also be interested in the DAS and DAS2 protocols. These are XML specifications that, coupled with Ajax techniques make writing applications like the one you describe above a breeze to work with. In fact, I’m looking for someone to port the application Haploview with me as an open source (and super fun) project. Any takers?
Jonathan

May 19, 2007 at 16:33

I came across this cool Perl script illustrating Ajax/Perl for Bioinformatics. Have a look:
http://perlmonks.org/index.pl?next=15;node_id=1044