I’d be more than happy with the unlinked data web

Visit http://neic.usgs.gov/neis/gis/qed.asc and you’ll find a perfectly formatted CSV file containing information about recent earthquakes. A nice feature of R is the ability to slurp such a URL straight into a data frame:

quakes <- read.csv("http://neic.usgs.gov/neis/gis/qed.asc", header = TRUE)
colnames(quakes)
# [1] "Date"      "TimeUTC"   "Latitude"  "Longitude" "Magnitude" "Depth"
# number of recent quakes
nrow(quakes)
# [1] 3135
# biggest recent quake
subset(quakes, Magnitude == max(Magnitude, na.rm = TRUE))
#            Date    TimeUTC Latitude Longitude Magnitude Depth
# 2060 2010/02/27 06:34:14.0  -35.993   -72.828       8.8    35
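
The feed above may not stay online forever, so here is a minimal offline sketch of the same idea. It assumes the same six columns and reads from an inline string through the `text` argument of `read.csv`. Only the 2010/02/27 row is taken from the output above; the other two rows and the names `csv_text`, `quakes_demo` and `biggest` are made up for illustration.

```r
# Inline stand-in for the remote file. Only the first data row comes from the
# output shown in the post; the other two rows are invented for illustration.
csv_text <- "Date,TimeUTC,Latitude,Longitude,Magnitude,Depth
2010/02/27,06:34:14.0,-35.993,-72.828,8.8,35
2010/01/01,00:00:00.0,10.000,20.000,5.1,10
2010/01/02,00:00:00.0,11.000,21.000,,12"

# read.csv accepts literal text in place of a file name or URL
quakes_demo <- read.csv(text = csv_text, header = TRUE)

# The blank Magnitude field in the last row is read as NA;
# which.max() skips NA values when locating the largest magnitude
biggest <- quakes_demo[which.max(quakes_demo$Magnitude), ]
print(biggest)
```

Using `which.max` returns a single row even if several quakes tie for the largest magnitude, whereas the `subset` call above would return all of them.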

I hear a lot about the “web of data” and the “linked data web” but honestly, I’ll be happy the day people start posting data as delimited, plain text instead of HTML and PDF files.

8 thoughts on “I’d be more than happy with the unlinked data web”

  1. Pingback: “The next big thing”, R, and Statistics in the cloud | R-statistics blog

    • So, what is Linked Data about then, you might wonder… well, try doing the following with CSV tables:

      – find the number of people living within 10km of the location of the earthquake
      – link that to the average income per person
      – find which of your friends have published a paper together with someone within that 10km range

      Is it starting to make sense? Hints: unique identifiers, uniform API.

  2. I’m not criticising linked data (or saying anything at all about it) – and I’m quite aware of its uses. My point is, most providers haven’t even solved the simple problem of serving regular data.

  3. You get the data you work on as tables in HTML or PDF files? Oh, I dream of getting data as HTML or PDF tables. I get them as boxes full of paper that need to be scanned, OCRed, and text-mined before I have anything that begins to resemble a table!

    I know that it may sound like I am aspiring to become the fifth Yorkshireman, but I am involved in a project where this is how the data were provided.
