Two related stories

  1. Bosco wonders whether “read the code and you’ll get it” is really an adequate description of a file format
  2. In the much-neglected journal Source Code for Biology and Medicine: Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments

I like the concept of workflows – really, I do – and I understand that they are used widely in industry: biotech, pharma, drug design and so on. But I predict that they will never find wide application in academic biological sciences research. Why? Because in my experience it’s essentially impossible to convince biologists that things like standards, file formats, appropriate software tools, clean code and logical organisation of computational data are important. Let me give you a typical example of a “bioinformatics problem” in academia:

Dear Neil,
Here are the sequences that you asked for. They are in fasta format, except that I’ve marked the acetylation sites with a “*” and after that, a score in square brackets.

Gee thanks – oh, it’s a Word file too, better and better. Taking my cue from Rosie, I give you the Saunders principle:

The first step in any collaboration is to reformat the data sent by your collaborators.
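
For the example above, that reformatting step might look something like the Python sketch below. It rests on several assumptions the email does not confirm: that the text has already been copied out of Word as plain text, that each "*" comes directly after the acetylated residue, and that the bracketed score is a plain number, as in "K*[0.91]". The function name clean_record and the sample sequence are made up for illustration.

```python
import re

# Assumed annotation shape: "*" followed by a numeric score in square
# brackets, placed directly after the modified residue, e.g. "K*[0.91]".
SITE = re.compile(r"\*\[([^\]]*)\]")


def clean_record(annotated):
    """Split an annotated sequence into a plain sequence and a site list.

    Returns (plain_sequence, sites), where sites is a list of
    (position, score) pairs and position is the 1-based index of the
    residue immediately before each "*".
    """
    # Drop line breaks and spaces so wrapped sequence lines join up.
    annotated = "".join(annotated.split())

    parts = []   # stretches of sequence between annotations
    sites = []   # (1-based position, score) for each marked residue
    length = 0   # number of residues seen so far
    last = 0     # index just past the previous annotation

    for match in SITE.finditer(annotated):
        chunk = annotated[last:match.start()]
        parts.append(chunk)
        length += len(chunk)
        sites.append((length, float(match.group(1))))
        last = match.end()

    parts.append(annotated[last:])
    return "".join(parts), sites


# Hypothetical input, invented for this example.
sequence, sites = clean_record("MK*[0.85]TAG\nK*[0.91]L")
print(">example_protein")
print(sequence)
for position, score in sites:
    print(position, score)
```

The plain sequence can then go into a real FASTA file and the sites into a separate table, which is roughly what the collaborator should have sent in the first place.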

Does bioinformatics have protocols?

Continuing with my “get involved at Nature Network” theme: Bronwen Dekker has started a thread in the bioinformatics forum. She is interested in compiling protocols for bioinformatics and is looking for a way into the seemingly-infinite mass of available software tools.

I’ve posted a few thoughts there concerning workflows, categories of data and whether bioinformatics even has protocols in the traditional sense. I think that this is an important topic, so go there and contribute to the discussion if you can.