Monthly Archives: April 2010

Experiments with igraph

Networks – social and biological – are all the rage, just now. Indeed, a recent entry at Duncan’s QOTD described the “hairball” network representation as the dominant cultural icon in molecular biology.

I’ve not had occasion to explore networks “professionally”, but have always been fascinated by both networks and the tools used to analyse them. My grasp of graph theory, the mathematics behind networks, is more or less summarised by this Wikipedia page. I’ve also been exploring the igraph library and thought I’d share a few of my “experiments with igraph”. As I say, I’m teaching myself as I go along, so none of this should be taken as professional advice.

Let’s start with my favourite network – FriendFeed of course – and ask a few questions about everyone’s favourite group, The Life Scientists (TLS).

Getting your web application and R(Apache) to talk to each other

Here’s the situation. Web applications built using a framework (e.g. Rails, Django) are great for fetching data from a database and rendering it. They’re not so great for crunching and charting the data. Conversely, R is great for crunching and charting, but doesn’t make for a great web application.

[Figure: Index view for values]

The idea, then, is to let each do what it does best and enable the passing of data between them. There isn’t a whole lot of literature on this topic, but there are a couple of guides:

  • In these seminar slides (PDF), Jeroen Ooms describes how data can be passed between a web browser and R. Briefly, he uses client-side JavaScript to format the data as a JSON string. Server-side R (RApache) then parses the POST variable using fromJSON() (in the rjson package), formats the results of an R function as JSON using toJSON() and sends them back to the browser.
  • Slide 46 of this presentation by Mike Driscoll of Dataspora illustrates a different approach, where a Django-based web application sends data to RApache in CSV format.
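
The JSON round trip in the first guide can be sketched in a few lines of R. This is only a rough illustration, not code from the slides: the `payload` string and the `values` field are made up, and stand in for whatever the browser would actually POST. Only fromJSON() and toJSON() from the rjson package are taken from the description above.

```r
library(rjson)

# Hypothetical JSON string, standing in for the POST variable sent by the browser
payload <- '{"values": [1, 2, 3, 4, 5]}'

# Parse the JSON string into an R list
input <- fromJSON(payload)

# Run an ordinary R computation on the parsed data
result <- list(n = length(input$values), mean = mean(input$values))

# Serialise the result as JSON, ready to send back to the browser
response <- toJSON(result)
print(response)
```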

As a first step in understanding all of this, we can build a small demo application using Rails (version 2.3.5), which serves both JSON and CSV. We’ll see if we can get that into R, then see if R can return results back to Rails, via RApache. Baby steps, so we’ll avoid the AJAX stuff for now and just use Rails rendering methods to serve JSON from a controller.
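
For the CSV side, a similarly small sketch using only base R might look like this. The `csv_body` string and its `id` and `value` columns are assumptions for illustration; they are not the output of the demo application.

```r
# Hypothetical CSV text, standing in for what the web application might serve
csv_body <- "id,value\n1,10\n2,12\n3,8"

# read.csv() can parse a string directly through its `text` argument
values <- read.csv(text = csv_body, header = TRUE)

# Summarise the data in R
summary_df <- data.frame(n = nrow(values), mean_value = mean(values$value))

# Serialise the summary back to CSV text that could be returned to the application
csv_out <- paste(capture.output(write.csv(summary_df, row.names = FALSE)),
                 collapse = "\n")
cat(csv_out, "\n")
```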

I’d be more than happy with the unlinked data web

Visit this URL and you’ll find a perfectly formatted CSV file containing information about recent earthquakes. A nice feature of R is the ability to slurp such a URL straight into a data frame:

quakes <- read.csv("http://neic.usgs.gov/neis/gis/qed.asc", header = TRUE)
colnames(quakes)
# [1] "Date"      "TimeUTC"   "Latitude"  "Longitude" "Magnitude" "Depth"
# number of recent quakes
nrow(quakes)
# [1] 3135
# biggest recent quake
subset(quakes, Magnitude == max(Magnitude, na.rm = TRUE))
#            Date    TimeUTC Latitude Longitude Magnitude Depth
# 2060 2010/02/27 06:34:14.0  -35.993   -72.828       8.8    35

I hear a lot about the “web of data” and the “linked data web” but honestly, I’ll be happy the day people start posting data as delimited, plain text instead of HTML and PDF files.