Just when I was beginning to despair at the state of publicly available microarray data, someone sent me an article which…increased my despair.
The article is:
Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology (2009)
Keith A. Baggerly and Kevin R. Coombes
Ann. Appl. Stat. 3(4): 1309-1334
It escaped my attention last year, in part because “Annals of Applied Statistics” is not high on my journal radar. However, other bloggers did pick it up: see posts at Reproducible Research Ideas and The Endeavour.
In this article, the authors examine several papers, in their words, “purporting to use microarray-based signatures of drug sensitivity derived from cell lines to predict patient response.” They find that the results are not just difficult to reproduce: in several cases, they cannot be reproduced at all because of simple, avoidable errors. In the introduction, they note that:
…a recent survey [Ioannidis et al. (2009)] of 18 quantitative papers published in Nature Genetics in the past two years found reproducibility was not achievable even in principle for 10.
You can get an idea of how bad things are by skimming through the sub-headings in the article. Here’s a selection of them:
Read the rest…