PubMed retractions report has moved

May 23, 2018 / nsaunders

A brief message for anyone who uses my PubMed retractions report. It’s no longer available at RPubs; instead, you will find it here at GitHub. GitHub Pages hosting is great, once you figure out that docs/ corresponds to your web root :)

Now I really must update the code and try to make it more interesting than a bunch of bar charts.

PubMed retraction reporting update

March 24, 2015 / nsaunders / 1 Comment

Just a quick update to the previous post. At the helpful suggestion of Steve Royle, I’ve added a new section to the report which attempts to normalise retractions by journal. So for example, J. Biol. Chem. has (as of now) 94 retracted articles and in total 170 842 publications indexed in PubMed. That becomes (100 000 / 170 842) * 94 = 55.022 retractions per 100 000 articles.
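As a quick check on the arithmetic, here is a minimal Python sketch of the normalisation. The function name `retractions_per_100k` is just an illustrative choice, not something taken from the report code; the two counts are the J. Biol. Chem. figures quoted above.

```python
def retractions_per_100k(retracted: int, total: int) -> float:
    """Scale a journal's retraction count to a rate per 100 000 indexed articles."""
    if total <= 0:
        raise ValueError("total publications must be positive")
    return retracted * 100_000 / total


# Figures quoted in the post for J. Biol. Chem.
rate = retractions_per_100k(94, 170_842)
print(round(rate, 3))  # 55.022
```

Dividing by the journal's total output is what reshuffles the "top 20": a large journal with many retractions can have a lower rate than a small journal with only a handful.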

Top 20 journals, retracted articles per 100 000 publications

This leads to some startling changes to the journals “top 20” list. If you’re wondering what’s going on in the world of anaesthesiology, look no further (thanks again to Steve for the reminder).

PMRetract: PubMed retraction reporting rewritten as an interactive RMarkdown document

March 23, 2015 (updated March 24, 2015) / nsaunders / 4 Comments

Back in 2010, I wrote a web application called PMRetract to monitor retraction notices in the PubMed database. It was written primarily as a way for me to explore some technologies: the Ruby web framework Sinatra, MongoDB (hosted at MongoHQ, now Compose) and Heroku, where the app was hosted.

I automated the update process using Rake and the whole thing ran pretty smoothly, in a “set and forget” kind of way for four years or so. However, the first era of PMRetract is over. Heroku have shut down git pushes to their “Bamboo Stack” – which runs applications using Ruby version 1.8.7 – and will shut down the stack on June 16 2015. Currently, I don’t have the time either to update my code for a newer Ruby version or to figure out the (frankly, near-unintelligible) instructions for migration to the newer Cedar stack.

So I figured now was a good time to learn some new skills, deal with a few issues and relaunch PMRetract as something easier to maintain and more portable. Here it is. As all the code is “out there” for viewing, I’ll just add a few notes here regarding this latest incarnation.

Just how many retracted articles are there in PubMed anyway?

March 20, 2015 (updated March 22, 2015) / nsaunders / 3 Comments

I am forever returning to PubMed data, downloaded as XML, trying to extract information from it and becoming deeply confused in the process.

Take the seemingly-simple question “how many retracted articles are there in PubMed?”

It’s #overlyhonestmethods come to life!

January 31, 2013 / nsaunders / 1 Comment

Retraction Watch reports a study of microarray data sharing. The article, published in Clinical Chemistry, is itself behind a paywall despite trumpeting the virtues of open data. So straight to the Open Access Irony Award group at CiteULike it goes.

I was not surprised to learn that the rate of public deposition of data is low, nor that most deposited data ignores standards and much of it is low quality. What did catch my eye though, was a retraction notice for one of the articles from the study, in which the authors explain the reason for retraction.

Reproducibility: releasing code is just part of the solution

August 21, 2012 / nsaunders / 2 Comments

This week in Retraction Watch: Hypertension retracts paper over data glitch.

The retraction notice describes the “data glitch” in question (bold emphasis added by me):

…the authors discovered an error in the code for analyzing the data. The National Health and Nutrition Examination Survey (NHANES) medication data file had multiple observations per participant and was merged incorrectly with the demographic and other data files. Consequently, the sample size was twice as large as it should have been (24989 instead of 10198). Therefore, the corrected estimates of the total number of US adults with hypertension, uncontrolled hypertension, and so on, are significantly different and the percentages are slightly different.

Let’s leave aside the observation that 24989 is not 2 x 10198. I tweeted:

"an error in the code for analyzing the data" - http://t.co/ZlWeK26B. Entirely avoidable if methods were published in full.
— Neil Saunders (@neilfws) August 17, 2012

Not that simple though, is it? Read on for the Twitter discussion.
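To make the kind of error described in the notice concrete, here is a toy Python sketch. It is not the authors' code, which was never published; the data and the helper names (`naive_merge`, `collapse`) are entirely hypothetical. It only shows how joining a one-row-per-participant table to a table with several rows per participant inflates the row count, and one way to avoid that.

```python
# Hypothetical toy data: one demographic row per participant,
# but possibly several medication rows per participant.
demographics = [
    {"id": 1, "age": 54},
    {"id": 2, "age": 61},
    {"id": 3, "age": 47},
]
medications = [
    {"id": 1, "drug": "A"},
    {"id": 1, "drug": "B"},
    {"id": 2, "drug": "A"},
    {"id": 3, "drug": "C"},
    {"id": 3, "drug": "D"},
    {"id": 3, "drug": "E"},
]


def naive_merge(left, right, key="id"):
    """Inner join that emits one row per matching pair of rows."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def collapse(rows, key="id", field="drug"):
    """Reduce to one row per participant, gathering the repeated field into a list."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[key], []).append(row[field])
    return [{key: k, field + "s": values} for k, values in grouped.items()]


merged = naive_merge(demographics, medications)          # participants duplicated
fixed = naive_merge(demographics, collapse(medications))  # one row per participant

print(len(demographics), len(merged), len(fixed))  # 3 6 3

# A cheap sanity check that would have flagged the inflated sample size.
assert len(fixed) == len({row["id"] for row in fixed})
```

Comparing the row count after a merge with the number of unique participants is a one-line check, which is part of why publishing the analysis code makes this sort of mistake easier for someone else to spot.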

PMRetract: now with rake tasks

July 5, 2012 / nsaunders

Bioinformaticians (and anyone else who programs) love effective automation of mundane tasks. So it may amuse you to learn that I used to update PMRetract, my PubMed retraction notice monitoring application, by manually running the following steps in order:

1. Run query at PubMed website with term “Retraction of Publication[Publication Type]”
2. Send results to XML file
3. Run script to update database with retraction and total publication counts for years 1977 – present
4. Run script to update database with retraction notices
5. Run script to update database with retraction timeline
6. Commit changes to git
7. Push changes to GitHub
8. Dump local database to file
9. Restore remote database from file
10. Restart Heroku application

I’ve been meaning to wrap all of that up in a Rakefile for some time. Finally, I have. Along the way, I learned something about using efetch from BioRuby and re-read one of my all-time favourite tutorials, on how to write rake tasks. So now, when I receive an update via RSS, updating should be as simple as:

rake pmretract
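For anyone curious what the query part of that routine looks like without the website, here is a rough Python sketch that builds NCBI E-utilities `esearch` URLs for the retraction term, one per publication year. This is not the PMRetract code (that was Ruby, using BioRuby); the function names are made up for illustration, and the sketch only constructs URLs rather than fetching anything.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
RETRACTION_TERM = "Retraction of Publication[Publication Type]"


def esearch_count_url(term, year=None):
    """Build an esearch URL that asks PubMed only for the number of hits."""
    params = {"db": "pubmed", "term": term, "rettype": "count", "retmode": "json"}
    if year is not None:
        # Restrict the search to a single publication year.
        params.update({"datetype": "pdat", "mindate": str(year), "maxdate": str(year)})
    return f"{ESEARCH}?{urlencode(params)}"


def yearly_urls(first_year, last_year, term=RETRACTION_TERM):
    """Map each year in the inclusive range to its count-query URL."""
    return {year: esearch_count_url(term, year) for year in range(first_year, last_year + 1)}


for year, url in yearly_urls(1977, 1979).items():
    print(year, url)
```

Fetching each URL and reading the count from the response would replace the first manual steps; NCBI asks that automated clients limit their request rate, so a real script should pause between calls.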

In other news: it’s been quiet here, hasn’t it? I recently returned from 4 weeks overseas, packed up my office and moved to a new building. Hope to get back to semi-regular posts before too long.

Reproducible research: three links that made me think

January 27, 2012 / nsaunders / 3 Comments

I’m constantly amazed, bemused and troubled by how little published scientific research is genuinely reproducible, in that you or I (or even the original authors) could go back and check the results. Three examples from around the Web converged in my mind this week.

Monitoring PubMed retractions: updates

August 16, 2011 / nsaunders / 6 Comments

PubMed cumulative retractions 1977-present

There’s been a recent flurry of interest in retractions. See for example: Scientific Retractions: A Growth Industry?; summarised also by GenomeWeb in Take That Back; articles in the WSJ and the Pharmalot blog; and academic articles in the Journal of Medical Ethics and Infection & Immunity.

Several of these sources cite data from my humble web application, PMRetract. So now seems like a good time to mention that:

- The application is still going strong and is updated regularly
- I’ve added a few enhancements to the UI; you can follow development at GitHub
- I’ve also added a long-overdue about page with some extra information, including the fact that I wrote it :)

Now I just need to fix up my Git repositories. Currently there’s one which pushes to GitHub and a second, with a copy of the Sinatra code for pushing to Heroku, which isn’t too smart.

Monitoring PubMed retractions: a Heroku-hosted Sinatra application

December 22, 2010 / nsaunders / 1 Comment

In a previous post analysing retractions from PubMed, I wrote:

It strikes me that it would be relatively easy to build a web application (Rails, Heroku), which constantly monitors retraction data at PubMed and generates a variety of statistics and charts.

“Relatively easy” it was. Let me introduce you to PMRetract, my first publicly-available web application.

What You're Doing Is Rather Desperate

Notes from the life of a [data] scientist

retraction