- Bosco wonders whether “read the code and you’ll get it” is really an adequate description of a file format
- In the much-neglected Source Code for Biology and Medicine, Bioinformatics Computational Journal (BCJ) – a framework for conducting and managing computational experiments
I like the concept of workflows – really, I do – and I understand that they are used widely in industry: biotech, pharma, drug design and so on. But I predict that they will never find wide application in academic biological sciences research. Why? Because in my experience it’s essentially impossible to convince biologists that things like standards, file formats, appropriate software tools, clean code and logical organisation of computational data are important. Let me give you a typical example of a “bioinformatics problem” in academia:
Here are the sequences that you asked for. They are in fasta format, except that I’ve marked the acetylation sites with a “*” and after that, a score in square brackets.
Gee thanks – oh, it’s a Word file too, better and better. Taking my cue from Rosie, I give you the Saunders principle:
The first step in any collaboration is to reformat the data sent by your collaborators.
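For what it's worth, here is a rough sketch of what that reformatting step might look like for the example above. It assumes the Word file has already been saved as plain text, that the "*" comes directly after the marked residue, and that the bracketed score is a plain number, as in `MK*[0.87]TAY`. None of that is guaranteed by the description, and the function names are my own invention.

```python
import re

# Assumed annotation syntax: "*" right after the marked residue, followed by
# a numeric score in square brackets, e.g. "MK*[0.87]TAY". Real files may differ.
ANNOTATION = re.compile(r"\*\[([^\]]*)\]")


def strip_annotations(annotated):
    """Return (plain_sequence, [(position, score), ...]) for one sequence.

    Positions are 1-based and refer to the residue just before the "*".
    """
    chunks = []
    sites = []
    cursor = 0
    for match in ANNOTATION.finditer(annotated):
        chunks.append(annotated[cursor:match.start()])
        # Length of the clean sequence so far = 1-based index of the marked residue.
        position = sum(len(chunk) for chunk in chunks)
        sites.append((position, float(match.group(1))))
        cursor = match.end()
    chunks.append(annotated[cursor:])
    return "".join(chunks), sites


def read_annotated_fasta(text):
    """Parse annotated FASTA text into {header: (plain_sequence, sites)}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Join wrapped lines first so an annotation split across lines still matches.
            records[header] += line
    return {name: strip_annotations(seq) for name, seq in records.items()}


if __name__ == "__main__":
    example = ">protein1\nMK*[0.87]TAY\nK*[0.5]L\n"
    for name, (sequence, sites) in read_annotated_fasta(example).items():
        print(f">{name}\n{sequence}")
        print(sites)
```

The clean sequences can then go into standard FASTA and the site positions and scores into a separate table, which is roughly what the collaborator could have sent in the first place.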
6 thoughts on “Two related stories”
I think part of the burden is on us (bioinformaticians). Even among people working in the field these things are common. It is an uphill battle trying to convince other people to use SVN, standardize inputs/outputs, create a design before coding, etc.
I share your pain. And Bosco’s.