What You’re Doing Is Rather Desperate

Notes from the life of a bioinformatics researcher

DokuWiki, PubMed and Ruby

I recently built a wiki for a research group using DokuWiki, one of my favourite wiki packages. As with many other wikis, developers have extended its functionality by writing plugins. Some of these are excellent, allowing users to generate lots of content with a minimum of syntax. For example, using the PubMed plugin, you type this:

{{pubmed>long:15595725}}

and the result is this:
[image: the reference as rendered by the PubMed plugin]

Which got me thinking. Assuming that you’ve searched PubMed and retrieved a bunch of references in XML format, how might you generate text in DokuWiki syntax to paste into your wiki? Here’s the small parser that I wrote in Ruby:


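#!/usr/bin/ruby
require 'rubygems'
require 'hpricot'

# hash of arrays: DokuWiki headline (year) => list of PubMed plugin tags
h = {}
# parse the XML file saved from a PubMed search
d = Hpricot.XML(open('pubmed_result.xml'))

# one PubmedArticle element per reference
(d/:PubmedArticle).each do |a|
  # create the array for this year on first sight, then append the tag
  (h["=== #{a.at('DateCreated/Year').inner_html} ==="] ||= []) << "{{pubmed>long:#{a.at('PMID').inner_html}}}"
end

# sort by headline in descending order; puts flattens the nested arrays
puts h.sort {|a,b| b<=>a}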

Nine lines of code – how cool is that? It uses Hpricot to parse the XML and creates a hash of arrays. Each hash key is a year, taken from the record's DateCreated element and formatted as a level 4 headline in DokuWiki; each hash value is an array of PMIDs, formatted with the PubMed plugin syntax. At the end we just print it all out, sorted by year from newest to oldest.
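To see the hash-of-arrays idiom and the descending sort on their own, without needing Hpricot or an XML file, here is a rough sketch. The method name dokuwiki_lines and the (year, PMID) pairs are made up for illustration; only the first PMID comes from the example above, the other two are placeholders.

```ruby
# Hypothetical helper: takes [year, pmid] pairs and returns DokuWiki lines.
# The pairs stand in for what the parser pulls out of each PubmedArticle.
def dokuwiki_lines(records)
  by_year = {}
  records.each do |year, pmid|
    # ||= creates the array the first time a year is seen
    (by_year["=== #{year} ==="] ||= []) << "{{pubmed>long:#{pmid}}}"
  end
  # sort the [headline, tags] pairs in descending order, then flatten them
  by_year.sort { |a, b| b <=> a }.flatten
end

# Placeholder data: the second and third PMIDs are not real references.
records = [
  ['2004', '15595725'],
  ['2008', '00000001'],
  ['2004', '00000002']
]

puts dokuwiki_lines(records)
```

This should print the 2008 headline and its one tag first, followed by the 2004 headline and its two tags in the order they were read. Within a year nothing is sorted; the tags simply keep their input order.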

As Pierre would say – that’s it.

Written by nsaunders

November 6, 2008 at 6:54 pm

Posted in computing


3 Responses


  1. arg… this strong feeling …. I need to learn ruby… again….

    Pierre Lindenbaum

    November 6, 2008 at 7:05 pm

  2. Awesome. Do you just dabble in ruby or have you become converted?

    Jonathan Badger

    November 6, 2008 at 7:06 pm

  3. I dabble in the sense that I’ve not been using it for long and have that nagging sensation that I’m doing it all wrong. Converted – definitely; it’s a great language.

    nsaunders

    November 6, 2008 at 9:26 pm

