I’ve had a half-formed, but not very interesting blog post in my head for some months now. It’s about a conversation I had with a PhD student, around 10 years ago, after she went to a bioinformatics talk titled “Excel is not a database” and how she laughed as I’d been telling her that “for years already”. That’s basically the post so as I say, not that interesting, except as an illustration that we’ve been talking about this stuff for a long time (and little has changed).Anyway, we have something better. I was exploring PubMed Commons, which is becoming a very good resource. The top-featured comment looks very interesting (see image, right).
Intrigued, I went to investigate the Database of Cross-contaminated or Misidentified Cell Lines, hovered over the download link and saw that it’s – wait for it – a PDF. I’ll say that again. The “database” is a PDF.
The sad thing is that this looks like very useful, interesting information which I’m sure would be used widely if presented in an appropriate (open) format and better-publicised. Please, biological science, stop embarrassing yourself. If you don’t know how to do data properly, talk to someone who does.