Our ISMB 2008 conference report

In a nutshell: we went to ISMB 2008, had a great time and live-blogged the meeting using this FriendFeed room. The nice people at ISCB were so impressed that they asked us to write up a report.

So here it is. FriendFeed life scientists might recognise some (or all) of the author names.

Saunders, N., Beltrão, P., Jensen, L.J., Jurczak, D., Krause, R., Kuhn, M. and Wu, S. (2009).
Microblogging the ISMB: A New Approach to Conference Reporting.
PLoS Comput Biol 5(1): e1000263

This is a highlight of my career to date. Thanks are due: to the authors (especially Roland who got it all started); to everyone who contributed at the ISMB 2008 room; and to BJ Morrison and Phil Bourne for their support in making this happen.

Big data: shoot first, ask questions later

The terms “big science” and “big data” have recently become quite prominent on the Web. For commentary, I point you to the man with the tag.

There are those who believe that big data means fundamental change in how science is done. We’ll take all this data, make it machine-readable, put it in the cloud and – poof! – science will emerge. Almost as if it were self-aware. At the other extreme are those who see no fundamental difference in how we go about our business – there’ll just be “more” of it.

One analysis, of course, is that they’re both right and they’re both wrong.

Wikification: thinking in public

Over the last 3 years, I’ve stored many small snippets of information in a set of Google Notebooks. Sample topics: notes for blog posts, programming skills that I’d like to learn and preliminary (or half-baked) ideas for research or software projects. I’ve learned that:

  • Whilst Google Notebook is great for scraping information from web pages, it leaves a lot to be desired in terms of editing and presentation
  • Ideas left in private notebooks quickly become dead ideas

Yes, you can publicise, tag and collaborate on a Google Notebook, but this doesn’t fit with my workflow – or that of many others, I suspect. So I’ve taken as much of the material as I want to make public and dumped it on a wiki at Wikidot.com. By the way, if you’re looking for a free hosted wiki with plenty of features, you could do a lot worse.

If anything there interests you enough to add material, let me know and I’ll invite you as an editor (you’ll need to create a wikidot account if you don’t have one).

Easy visualisation of database schemas using SQLFairy

BioSQL schema

Here’s a common problem solved: how to generate a pretty picture of your database schema. A Google search throws up all manner of home-brewed solutions using Graphviz, Perl scripts and so on. Or you can make life easier and simply install SQLFairy.

Under Ubuntu: as simple as “sudo apt-get install sqlfairy”.

Next, dump your database tables, e.g. for MySQL:

mysqldump -u username -p -d mydatabase > mydatabase.sql

Finally, for a PNG image of your schema:

sqlt-graph -f MySQL -o mydatabase.png -t png mydatabase.sql

Too easy. Example shown is the BioSQL schema.

Update: if your schema lacks explicit foreign keys, try the --natural-join options (man sqlt-graph, man sqlt-diagram).
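
The dump and render steps above can be wrapped into one shell function. This is only a sketch: the function name schema_png, its argument order and the output file names are my own assumptions, while the mysqldump and sqlt-graph flags are the ones already shown in this post. Any extra arguments are handed straight to sqlt-graph.

```shell
# Sketch: dump a MySQL schema (no row data) and render it as a PNG.
# schema_png, its arguments and the file names are hypothetical helpers;
# the mysqldump / sqlt-graph flags are the ones used in the post.
schema_png() {
    if [ "$#" -lt 2 ]; then
        echo "usage: schema_png DATABASE USER [sqlt-graph options]" >&2
        return 2
    fi
    db="$1"              # database name
    user="$2"            # MySQL user; -p makes mysqldump prompt for the password
    shift 2              # whatever remains is passed on to sqlt-graph

    sql="${db}.sql"
    png="${db}.png"

    # -d dumps the table definitions only, not the rows
    mysqldump -u "$user" -p -d "$db" > "$sql" || return 1

    # Render the dumped schema as a PNG image
    sqlt-graph -f MySQL -o "$png" -t png "$@" "$sql" || return 1

    # Report the name of the image that was written
    echo "$png"
}
```

Calling "schema_png mydatabase username" should leave mydatabase.sql and mydatabase.png in the current directory; "schema_png mydatabase username --natural-join" passes the extra option through for schemas without explicit foreign keys.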

Words fail me

As I’m a biologist, rather than an inorganic chemist or a mineralogist, I don’t have much (well, any) need to look at crystal structures of simple inorganic compounds. Just as well…

…our story begins at Twitter, where David Bradley asks:

Anyone know where to find crystal structures of sodium hypochlorite and sodium bisulfate (cif files or similar) ? #science #crystal

Never thought about it, you say, but surely it can’t be very difficult. So you head to Google and try searches such as “inorganic crystal structure database”. There you unearth two main players: the Inorganic Crystal Structure Database (ICSD) and the Cambridge Structural Database (CSD). Both are private, requiring registration, login and, in one case, installation of an X-client.

Coming from bioinformatics, where comparable resources such as the PDB are freely available via web interfaces, I find this utterly perplexing. Why do these research communities stand for it? Is anyone developing free, open alternatives?

Darwin 2009: multimedia and more

Good to see that the BBC are getting into the Darwin anniversary celebrations. Here’s their informative website with TV/radio shows and special features.

BBC Radio 4 also have a Darwin website. You could do a lot worse than start by listening to Melvyn Bragg’s 4-part Darwin series from the show “In Our Time”. It’s available via the iPlayer or as a podcast.

Elsewhere in the UK there are Darwin 200 events organised by the Natural History Museum and the Wellcome Trust.

Text to fasta and other delights of the shell

One thing I’ve learned in my current job is that some familiarity with the Linux tools for processing text files (awk, sed, grep, head/tail, cut/paste and so on) often provides a speedier solution than writing a script in (insert scripting language of choice here). I know this stuff is trivial to shell gurus, but I still get a little buzz out of it. A couple of real-life examples.
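
To give a flavour of the “text to fasta” idea in the title, here is a minimal sketch of my own, not one of the examples from the full post. It assumes a hypothetical tab-delimited file, seqs.txt, with an identifier in the first column and a sequence in the second; the file names and contents are made up for illustration.

```shell
# Hypothetical input: one record per line, ID<TAB>sequence
printf 'seq1\tACGTACGT\nseq2\tTTGACA\n' > seqs.txt

# For each line, print a FASTA header (">" plus the ID), then the sequence
awk -F'\t' '{ print ">" $1; print $2 }' seqs.txt > seqs.fa

# Quick sanity check: count the FASTA records that were written
grep -c '^>' seqs.fa
```

The awk one-liner does the whole job: -F'\t' splits each line on tabs, and the two print statements emit the header and the sequence on separate lines.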