I’m constantly amazed, bemused and troubled by how little published scientific research is genuinely reproducible, in that you or I (or even the original authors) could go back and check the results. Three examples from around the Web converged in my mind this week.
A BioStar user asks: where is the software for the method described in a Science article, “A Composite of Multiple Signals Distinguishes Causal Variants in Regions of Positive Selection.”
No one can find it on the Web; the best anyone can do is a press release from 2010 stating that the software “should soon be available.” Why do the so-called flagship journals and their reviewers have so little interest in the methods used to generate the data shown in their papers? It’s beyond my comprehension.
Retractions on retractions
I was reviewing PubMed retraction data for 2011 (the “year of retractions”) using my PMRetract application, when I noticed a PLoS ONE retraction notice.
It caught my eye for a couple of reasons. First, the authors retracted because they had based their analyses on methods described in another retracted article. Second, although the retraction notice doesn’t mention it, that earlier retraction was itself the result of a reproducibility study in one of my all-time favourite articles: “Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology.”
I note that the PLoS ONE retraction occurred about 8 months after the Nat. Med. retraction, which in turn occurred over 4 years after the original publication and just over a year after the Ann. Appl. Stat. re-analysis. This suggests to me that the PLoS ONE authors were alerted by the Nat. Med. retraction, not by the excellent Baggerly et al. article. If only the latter had been published in Nat. Med., rather than in a less widely-read applied statistics journal.
Regardless, credit to the PLoS ONE authors for a courageous retraction based on the flawed work of others.
I was somewhat surprised (and pleased) to see that org-mode, a tool for Emacs aimed at generating reproducible code, data analysis and documentation, has been written up and published in the Journal of Statistical Software.
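For readers unfamiliar with it: an org-mode document interleaves prose with executable source blocks, and evaluating a block inserts its results directly into the document, so text, code and output stay together. A minimal sketch (the heading, file content and code here are illustrative, not taken from the paper):

```org
* Sample counts
A trivial in-document analysis; evaluating the block with C-c C-c
inserts a #+RESULTS: line below it.

#+BEGIN_SRC python :exports both :results output
counts = [12, 7, 23]          # toy data, for illustration only
print(sum(counts))
#+END_SRC
```

The `:exports both` header argument tells org-mode to include both the code and its output when the document is exported to HTML, LaTeX or PDF, which is what makes the analysis reproducible alongside the write-up.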
And then cynicism set in. Is software that life scientists don’t use, published in a journal that they don’t read, really going to address the problem?
Perhaps one day, all journals will demand appropriate standards for reproducible research. Until then I’ll just have to continue feeling amazed, bemused and troubled.