Interacting with bioinformatics webservers using R

In an ideal world, every bioinformatics tool would be available both as a web service with an API and as a standalone package to download for local use. This is rarely the case, and even where one or the other is available, factors such as cost can come into play. So we resort to web scraping: writing code that interacts with the application behind a web server to submit queries, then retrieve and parse the results.

Normally, I’d use something like Ruby’s Mechanize library for this purpose. However, where the goal is to retrieve delimited data for analysis in R, I figured it was time to try to do the whole thing within R. So here’s how I used the RCurl and XML packages to interact with the WHAT IF server, which provides tools for the analysis of protein structure.
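
As a rough illustration of the submit, retrieve and parse cycle, here is a minimal sketch using RCurl and XML. It is not the actual WHAT IF interface: the URL in `server_url`, the form field `id`, and the helper names `submit_query` and `extract_links` are hypothetical placeholders. The parsing steps run against inline sample data, so nothing here touches the network unless `submit_query()` is called.

```r
library(RCurl)  # HTTP requests: postForm()
library(XML)    # HTML parsing: htmlParse(), xpathSApply()

# Hypothetical endpoint; replace with the real form URL of the server.
server_url <- "https://example.org/cgi-bin/tool"

# Submit a query via HTTP POST and return the response page as a string.
# The form field name "id" is an assumption; inspect the real form to
# find the field names it expects. Requires network access when called.
submit_query <- function(pdb_id) {
  postForm(server_url, id = pdb_id, style = "POST")
}

# Extract the href attribute of every link in an HTML page.
extract_links <- function(html) {
  doc <- htmlParse(html, asText = TRUE)
  as.character(xpathSApply(doc, "//a", xmlGetAttr, "href"))
}

# Offline demonstration: a made-up results page containing one link.
sample_page <- paste0(
  "<html><body>",
  "<a href=\"/results/job1/output.txt\">Results</a>",
  "</body></html>"
)
result_links <- extract_links(sample_page)

# Once the delimited results have been downloaded as text, base R can
# read them straight into a data frame. The columns here are made up.
sample_text <- "residue\tscore\nALA\t1.5\nGLY\t2.0\n"
scores <- read.delim(text = sample_text)
```

In real use, the string returned by `submit_query()` would be passed to `extract_links()` to find the results file, which is then fetched and handed to `read.delim()`.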