What You’re Doing Is Rather Desperate

Notes from the life of a bioinformatics researcher

Two related stories

  1. Bosco wonders whether “read the code and you’ll get it” is really an adequate description of a file format
  2. The much-neglected Source Code for Biology and Medicine has published Bioinformatics Computational Journal (BCJ) – a framework for conducting and managing computational experiments

I like the concept of workflows – really, I do – and I understand that they are used widely in industry: biotech, pharma, drug design and so on. But I predict that they will never find wide application in academic biological sciences research. Why? Because in my experience it’s essentially impossible to convince biologists that things like standards, file formats, appropriate software tools, clean code and logical organisation of computational data are important. Let me give you a typical example of a “bioinformatics problem” in academia:

Dear Neil,
Here are the sequences that you asked for. They are in fasta format, except that I’ve marked the acetylation sites with a “*” and after that, a score in square brackets.

Gee thanks – oh, it’s a Word file too, better and better. Taking my cue from Rosie, I give you the Saunders principle:

The first step in any collaboration is to reformat the data sent by your collaborators.
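
For the curious, here is a minimal sketch of what that reformatting step might look like in Python. It rests on guesses about the format described in the email: that the Word file has already been saved as plain text, that each "*" directly follows the acetylated residue, and that the score in square brackets is a number. The names `strip_annotations` and `reformat` are made up for illustration.

```python
import re

# Assumed convention (a guess at the collaborator's format): the marked
# residue is followed by "*" and then a numeric score in square brackets,
# e.g. "MK*[0.91]TAYK*[0.42]A".
SITE = re.compile(r"\*\[([^\]]+)\]")


def strip_annotations(annotated):
    """Return (plain_sequence, sites); sites holds (1-based position, residue, score)."""
    plain = ""
    sites = []
    last = 0
    for m in SITE.finditer(annotated):
        plain += annotated[last:m.start()]
        last = m.end()
        if plain:
            # The marked residue is the one immediately before the "*".
            sites.append((len(plain), plain[-1], float(m.group(1))))
    plain += annotated[last:]
    return plain, sites


def reformat(text):
    """Parse fasta-like text into a list of (header, sequence, sites) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header,) + strip_annotations("".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header,) + strip_annotations("".join(chunks)))
    return records


if __name__ == "__main__":
    example = ">seq1\nMK*[0.91]TAY\nK*[0.42]A\n"
    for name, seq, sites in reformat(example):
        print(">" + name)
        print(seq)
        for position, residue, score in sites:
            print(name, position, residue, score, sep="\t")
```

The sequence comes out as ordinary fasta and the sites go into a separate tab-delimited table, which is where they belonged in the first place. A score that is not a number raises a `ValueError` rather than being silently accepted.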

Written by nsaunders

December 4, 2007 at 10:19 am

6 Responses


  1. Hi Neil

    I think part of the burden is on us (bioinformaticians). Even among people working in the field these things are common. It is an uphill battle trying to convince other people to use SVN, standardize inputs/outputs, create a design before coding, etc.

    I share your pain. And Bosco’s.

    Paulo

    December 4, 2007 at 12:12 pm

  2. [...] The Saunders principle reads thusly: The first step in any collaboration is to reformat the data sent by your collaborators. [...]

  3. [...] easily founder on an inappropriately structured dataset (this is actually just a rephrasing of the Saunders Principle). It will be by enabling easy conversion between different formats that we might approach a [...]

  4. [...] : “The Saunders Principle”, “Comment on the saunders principle from Chris a Miller” [...]

  5. [...] the Saunders principle wherein Bioinformaticians struggle with inconsistent formats, prompting code-itch (Hari) to [...]

  6. [...] the latest podcast earlier today. Heavy discussion around the Saunders principle and the tension between bioinformaticians and bench [...]

