Searching for duplicate resource names in PMC article titles

I enjoyed this article by Keith Bradnam, and the associated tweets, on the problem of duplicated names for bioinformatics software.

I figured that to some degree at least, we should be able to search for such instances, since the titles of published articles that describe software often follow a particular pattern. There may even be a grammatical term for it, but I’ll call it the announcement colon:

eDuS: Segmental Duplication Simulator
Reveel: large-scale population genotyping using low-coverage sequencing data
RNF: a general framework to evaluate NGS read mappers
Hammock: A Hidden Markov model-based peptide clustering algorithm to identify protein-interaction consensus motifs in large datasets

You get the idea. “XXX COLON a [METHOD] to [DO SOMETHING] using [SOME DATA].”

Let’s go in search of announcement colons, using titles from the PubMed Central dataset. You can find this mini-project on GitHub.
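To make the idea concrete, here is a minimal sketch of the search in Python. It is not the code from the mini-project, just an illustration under some assumptions: that a resource name is a single short token sitting directly before the colon, and that names differing only in case count as duplicates. The names ANNOUNCEMENT, extract_name and duplicate_names are made up for this example, and the second "rnf" title is a hypothetical entry added only to trigger a match.

```python
import re
from collections import defaultdict

# Assumed shape of an "announcement colon" title: one short token with no
# whitespace, then a colon, then whitespace and a description.
ANNOUNCEMENT = re.compile(r"^\s*(?P<name>[^\s:]{2,30}):\s+(?P<desc>\S.*)$")


def extract_name(title):
    """Return the candidate resource name from a title, or None if no match."""
    match = ANNOUNCEMENT.match(title)
    return match.group("name") if match else None


def duplicate_names(titles):
    """Group titles by lower-cased name and keep names seen more than once."""
    by_name = defaultdict(list)
    for title in titles:
        name = extract_name(title)
        if name is not None:
            by_name[name.lower()].append(title)
    return {name: found for name, found in by_name.items() if len(found) > 1}


if __name__ == "__main__":
    titles = [
        "eDuS: Segmental Duplication Simulator",
        "RNF: a general framework to evaluate NGS read mappers",
        "rnf: a second tool with the same name",  # hypothetical title
        "Open Access Day",  # no announcement colon, so it is ignored
    ]
    for name, found in duplicate_names(titles).items():
        print(name, len(found))
```

A pattern this simple will also pick up titles that are not software announcements at all, so anything it reports is a candidate for manual checking and not a confirmed name clash.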

“Open”: motivation versus definition

Tweet length: 140 characters. Quote + URL that I wanted to tweet: 160 characters. Solution: brief blog post.

the probability that people who can help each other can be connected has risen to the point that for many types of problem that they actually are

Please read the rest of Cameron’s thoughts on motivations for openness in research: Open is a state of mind.

Lots of “open goodness” in the AU/NZ region

January/February are exciting months for open [data|research|science|access] proponents in our region – by which I mean Australia and New Zealand.

First, we’ve enjoyed a speaking tour by Sir Tim Berners-Lee, during which he discussed the benefits of open data several times. I was able to attend two events in Sydney in person and a third by video stream. The events were the work of many people but in particular, Pia Waugh. Go follow her on Twitter, now.

Next – I wish I had been able to get to this one – the Open Research Conference on February 6-7, University of Auckland. I’m enjoying the high-quality live stream right now. Flying the flag for Sydney are Mat and Alex.

Not strictly under the “open” umbrella but worth a mention anyway: software carpentry is in town, February 7-8, just up the road from me at Macquarie University. Looking forward to hearing some reports from that.

Open Access: sometimes all it takes is the right person

We can debate the economics, complexities, details, implementation… of open access publishing for as long as we like. However, the basic principle, that publicly-funded research should be publicly-accessible, seems to me at least very obviously correct and “the right thing to do”.

So this, from April 2012, was very depressing.

Open access not as simple as it sounds: outgoing ARC boss

For those outside Australia, the ARC is the Australian Research Council. Much debate ensued in which one contributor to the comment thread wrote:

…it is particularly galling that Sheil is projecting her own simplistic understanding of open access onto its advocates. Hopefully she will be replaced at the Australian Research Council by someone who understands and supports open access.

Voilà.

The ARC has introduced a new open access policy for ARC funded research which takes effect from 1 January 2013. According to this new policy the ARC requires that any publications arising from an ARC supported research project must be deposited into an open access institutional repository within a twelve (12) month period from the date of publication.

I did giggle at the assumption that the author’s version of their article is by default a Word document, but then I guess that’s true for > 90% of authors.

Outcomes like this come dangerously close to restoring hope.

Can a journal make a difference? Let’s find out.

Academic journals. Frankly, I’m not a big fan of any of them. There are too many. They cost too much. Much of what they publish is inconsequential, read by practically no-one or just downright incorrect. Much of the rest is badly-written and boring. The people who publish them have an over-inflated sense of their own importance. They’re hidden behind paywalls. And governed by ludicrous metrics. The system by which articles are accepted or rejected is arcane and ridiculous. I mean, I could go on…

No, what really troubles me about journals is that they only tell a very small part of the story – the flashy, attention-grabbing part called “results”. We learn from high school onwards that a methods section should be sufficient for anyone to reproduce the results. This is one of the great lies of science. Go read any journal in your field and give it a try. It’s even the case in computation, an area which you might think less prone to the problems in reproducing wet-lab science (“the Milli-Q must have been off”).

We have this wonderful thing called the Web now. The Web doesn’t have a page limit, so you can describe things in as much detail as you wish. Better still, you can just post your methods and data there in full, for all to see, download and reproduce to their hearts’ content. You’d like some credit for doing that though, right?

So if you do research – any kind of research – that involves computation, your code is open-source, reusable, well-documented and robust (think: tests) and you want to share it with the world, head over to a new journal called BMC Open Research Computation, which is now open for submissions. Your friendly team of enlightened editors awaits.

More information at Science in the Open and Saaien Tist. Full disclosure: I’m on the editorial board of this journal and was invited to write a launch post.

Open Access Day

It’s Open Access Day. Mission: to broaden awareness and understanding of Open Access. Their approach: “synchro-blogging” – an attempt to get as many folks as possible to blog on the given topic at the same time.

So, to answer their questions:

  1. Why does Open Access matter to you?
     OA is important for many reasons: go and read this by Jonathan Eisen instead of my rambling. One that stands out for me: it signals a fundamental change in the way that information is conveyed from writers to readers and an admission that the traditional publishing process is obsolete in the internet age. We live in a world where people expect instant, relevant information in the top 20 hits from a Google search and that expectation is transferring to science too. I don’t care how prestigious you think your journal is, or whether you see yourself as some kind of “guardian of knowledge”. I want information, I want it now and if you can’t deliver, I’m going somewhere else (*).

  2. How did you first become aware of it?
     I honestly don’t remember, but it was some years ago. I suspect it was around the time that journals such as Nucleic Acids Research and Bioinformatics introduced an OA option for authors. I also remember quite vividly the appearance of BMC on the scene and thinking “now, this is different and exciting”.

  3. Why should scientific and medical research be an open-access resource for the world?
     Lots of reasons. (i) The world pays for the research and shouldn’t have to pay again to view the results. (ii) Scientists should be accountable – exactly what are we doing with your tax dollars? (iii) When information is free, many eyes can look at it and many eyes = more ideas than fewer eyes.

  4. What do you do to support Open Access, and what can others do?
     Hey, I’m just a postdoc – I don’t get to make influential decisions! That said, five of my last six publications are OA. Where possible, I try to submit to OA journals and I review papers for OA journals. Once in a while, I blog about OA and other “open science” issues.
     What can others do? The same and more. Read blogs that cover OA – starting with Jonathan and Bora. Understand its philosophy. Promote it in public (blogs, wikis, FriendFeed). Make it the norm, not a novelty.

(*) OK, I work in a large, relatively well-funded university which subscribes to most journals – so I won’t deny myself a non-OA article on principle. Others are less fortunate.

Vindication for video

Announced first via FriendFeed (of course), Moshe from JoVE is circulating an email with exciting news. I can’t do better than to quote it here:

JoVE, the video-publication for biological research, was accepted for indexing in PubMed and MEDLINE.

JoVE is the first and only video-publication to be included in these databases maintained by the National Library of Medicine (NLM). The decision was made by the NLM advisory committee, Literature Selection Technical Review Committee, which is composed of the authorities in the field of biomedicine, such as researchers, physicians, editors, health science librarians and historians. This committee evaluates the scientific quality of publications and typically approves only 20-25% of the applications.

Inclusion in PubMed/MEDLINE is a big milestone for JoVE, and for the scientific publishing in general. It demonstrates the official acceptance of new approaches to science communication, such as video online, by the scientific community. Overall, it will increase the interest of the scientists to communicate their findings in video, making biological sciences more transparent and efficient.

Well done to the JoVE team.

Rewards, output and academia

Academia takes a very narrow view of what constitutes “output”. Rewards (such as funding or tenure) are given out on the basis of (1) publications, preferably first-author, preferably in so-called high-impact journals; (2) citations, in the same journals and (3) previous rewards – “demonstrated ability in securing funding”. I always find that last catch-22 clause particularly amusing.

I started to think about this when I read What is principal component analysis? [DOI 10.1038/nbt0308-303], in the current issue of Nature Biotechnology (subscription only). Now, I’m not criticising the article or its publication: it’s well-written, educational and a good basic overview of PCA for biologists who have not previously encountered the method. However, my first reaction was to recall a number of excellent blog posts on the same topic that I’ve read recently. For example:

The Nature Biotechnology article is recognised by academia and qualifies for academic rewards. The blog posts – which are longer, more detailed, written by enthusiastic communicators and in theory, accessible to a much wider audience (as opposed to people with a subscription to Nature Biotechnology) – are not.

It doesn’t seem right to me. How does your institution evaluate and reward “non-traditional” output?