May as well begin 2014 where we left off: complaining about the attitude of scientific publishers regarding reproducible computational research.
I had a “Twitter blurt”. That’s when you read, react and tweet. Happens to the best of us. With hindsight, it was perhaps a little harsh:
The link is to an editorial in Nature Genetics, “Credit for code.” It points out, quite rightly, that “review, replication, reuse and recognition are all incentives to provide code” in research publications. After that promising start, though, things get a little strange.
The article is written in a rather awkward, unconvincing style which suggests the editor(s) are neither familiar nor comfortable with the subject. Phrases like “instantiated in software written for computers and other laboratory machines” sound, well, just weird. As for “it is also useful to offer the code actually used nonexclusively to the journal in a supplementary text or archived file” – first, that barely makes sense and, second, it’s the legalese of people more accustomed to coaxing authors into giving up copyright. It’s unlikely to sit well with many scientific programmers.
The article uses CRAN and Bioconductor as examples of good practice in scientific software development, but again the tone is a little odd.
The journal has sufficient experience with these resources to endorse their use by authors. We do not yet provide any endorsement for the suitability or usefulness of other solutions but will work with our authors and readers, as well as with other journals, to arrive at a set of principles and recommendations.
What are they trying to say? “Our authors seem to use R a lot, so we’re guessing it’s good and besides, we don’t know about anything else”? There’s a substantial and active online community which has already developed principles and recommendations for publishing computational research. I’d suggest the editors get started by visiting Software Carpentry, searching Titus’s blog and reading this Ten Simple Rules article.
The last paragraph is the reason for my “Twitter blurt”. It begins:
If these best practices are not possible, there are ways not to make the current situation worse.
I’d rather we – especially the journals – strive for best practice, rather than adopt an air of resignation. It gets worse, though:
If none of these solutions are feasible, please do declare when there is code involved in the work, even if it is proprietary or unavailable, and provide equations or algorithms that enable a reader to understand and replicate analytical decisions made with the research software.
Few things are more frustrating, or more likely to result in irreproducibility and error, than trying to reconstruct a computational analysis based on a prose description of an algorithm in a research article. Yet this is a very typical part of the working day in my field (bioinformatics) and, I imagine, in many others.
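To make the point concrete, here’s a small, hypothetical illustration (not from any real paper) of why a prose description alone is not enough. A methods section that says only “counts were log-normalized” admits several perfectly reasonable interpretations, and they diverge numerically:

```python
# Hypothetical example: three defensible readings of "counts were log-normalized".
# None of these conventions is implied by the original post; they are common
# choices in bioinformatics, shown here only to illustrate the ambiguity.
import math

counts = [0, 10, 100, 1000]

# Reading 1: log2 of (count + 1), a common pseudocount convention
interp1 = [math.log2(c + 1) for c in counts]

# Reading 2: natural log of (count + 1)
interp2 = [math.log(c + 1) for c in counts]

# Reading 3: scale to counts-per-million by library size, then log2(x + 1)
total = sum(counts)
interp3 = [math.log2(c / total * 1e6 + 1) for c in counts]

for a, b, c in zip(interp1, interp2, interp3):
    print(f"{a:8.3f} {b:8.3f} {c:8.3f}")
```

Without the actual code, a reader re-implementing the analysis has to guess which of these (or some fourth variant) was used, and every downstream number changes with the guess.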
I may have blurted, but 12 retweets and 10 favourites suggest I struck a chord with a few people. As I suggested in a reply, I’d rather see journals leading the way by mandating standards for publishing computational research, rather than making weak suggestions.