Tangled Bank #54

Tangled Bank #54 is online, over at Science & Politics.

Just in case you’re saying “Tangled what?”, Tangled Bank is a regular compendium (monthly or so) of articles from science blogs. There’s always a wide variety of submissions and it’s a great way to get a taster of the best blogs and learn something new and interesting. It’s a bit like reading a review article for the blogosphere. It has a nice “community feel” and any blogger can host it.

Perl and XML

Recently, I found myself having to deal with XML files – specifically, PSI MI XML version 2.5 as used by the MINT and IntAct databases. Being a relative novice at parsing this kind of XML, I found it pretty painful. Normally I’d look to BioPerl, but its Bio::Graph modules are rather far from “production” (for me they work only on a small range of PSI MI version 1 files).

I highly recommend the O’Reilly XML.com site. It has lots of tutorials and introductory material that should point you in the right direction. Of course, eventually you’ll have to settle on a module of choice. For me, XML::Twig was overkill and XML::Simple too simple, so I ended up with XML::SimpleObject, which works for me. I still find it hard to get my head around XML parsing – one Perl head argues that Perl mentality and XML mentality don’t sit well together and I’m inclined to agree. “Why would any freedom-loving Perl poet submit to this insanity?” he asks, in relation to the DOM.
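
For a flavour of what XML::SimpleObject code looks like, here is a minimal sketch rather than the script I actually used. It assumes XML::Parser and XML::SimpleObject are installed from CPAN. The XML is a made-up, heavily trimmed fragment in the general shape of a PSI MI 2.5 file (real files carry a namespace and far more elements), and the names interactor_labels, protein_a and protein_b are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use XML::SimpleObject;

# Hypothetical, heavily trimmed fragment shaped like a PSI MI 2.5 file.
# The labels are placeholders, not real database entries.
my $xml = <<'XML';
<entrySet level="2" version="5">
  <entry>
    <interactorList>
      <interactor id="1">
        <names><shortLabel>protein_a</shortLabel></names>
      </interactor>
      <interactor id="2">
        <names><shortLabel>protein_b</shortLabel></names>
      </interactor>
    </interactorList>
  </entry>
</entrySet>
XML

# Return a list of [id, shortLabel] pairs, one per interactor element.
sub interactor_labels {
    my ($text) = @_;

    # XML::SimpleObject wraps the tree that XML::Parser builds in "Tree" style.
    my $parser = XML::Parser->new(ErrorContext => 2, Style => 'Tree');
    my $doc    = XML::SimpleObject->new($parser->parse($text));

    my @labels;
    for my $entry ($doc->child('entrySet')->children('entry')) {
        for my $interactor ($entry->child('interactorList')->children('interactor')) {
            push @labels, [
                $interactor->attribute('id'),
                $interactor->child('names')->child('shortLabel')->value,
            ];
        }
    }
    return @labels;
}

# Print one tab-separated "id label" line per interactor.
printf "%s\t%s\n", @$_ for interactor_labels($xml);
```

The attraction is that you walk the document with child, children, attribute and value, and never touch the DOM. The catch is that child returns nothing when an element is missing, so a real script would need to check each step before chaining the next call.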


There are days when the information is just too much. You’ve been sat at the computer for weeks on end without a break. You have a minimum of 10 open tabs in Firefox and your terminal, half a dozen unrelated scripts on the go in emacs, your email client, tens of news feeds and two or three OpenOffice documents in your half-dozen desktop workspaces. You’re logged into multiple machines, including the box at home to monitor those torrents. And one day you wake up late, exhausted and unable to do anything productive.
