Retraction Watch reports a study of microarray data sharing. The article, published in Clinical Chemistry, is itself behind a paywall despite trumpeting the virtues of open data. So straight to the Open Access Irony Award group at CiteULike it goes.
I was not surprised to learn that the rate of public deposition of data is low, nor that most deposited data ignores standards and much of it is of low quality. What did catch my eye, though, was a retraction notice for one of the articles from the study, in which the authors explain the reason for retraction.
Two phrases in particular stand out:
we discovered an error in the data fed into the software
This decision was based on the instructions from the software during the initial data feed process
The language used strongly suggests a process whereby data was blindly “fed” into software, with little or no understanding of either how the software worked or the statistical methods employed. To quote Bill in our Twitter discussion:
If you are in this situation, seek help. Talk to a friendly local statistician. Or, if there isn’t one, do your research on the Web before publishing. At the very least, try to ensure that what you’re doing corresponds broadly with what most other people in the field would consider “best practice” – even if figuring out what that is, from the literature, is not always easy.