Monthly Archives: July 2011

I can’t resist a word cloud: now using R!


Top 1000 words in FriendFeed comments, ISMB 2008-2011

The wordcloud package offers word clouds for R with a difference: they look great.

Of course, having just analysed online coverage of the ISMB conference, I had to run all 6 906 comments from the 2008-2011 meetings through some code. If you followed along via the Sweave code, I went as far as generating the data frame of comments, ismb.comments, then pulled the comment text into a new data frame using:

data.frame(ismb.comments$body)

It was then simply a case of following along with the excellent example code from the post Word Cloud in R, over at One R Tip A Day, limiting myself to the 1000 most-used words. Watch out: the TermDocumentMatrix() function from the tm package uses quite a lot of memory.
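If memory is tight, the word counts can also be sketched in base R, without building a term-document matrix. This is not the code from the One R Tip A Day post: the helper name top_words and the toy comments below are assumptions for illustration, and the cloud is only drawn if the wordcloud package happens to be installed.

```r
# Hypothetical helper: count the n most frequent words in a character vector.
top_words <- function(text, n = 1000) {
  # Lower-case, then split on anything that is not a letter or apostrophe
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  # Drop empty strings and very short tokens (1-2 characters)
  words <- words[nchar(words) > 2]
  freq <- sort(table(words), decreasing = TRUE)
  # Keep at most n entries
  freq[seq_len(min(n, length(freq)))]
}

# Toy stand-in for ismb.comments$body (made-up text, not the real comments)
comments <- c("Great talk on genome assembly",
              "The genome browser talk was great",
              "Assembly of the genome is hard")
freq <- top_words(comments, n = 1000)

# Draw the cloud only if the wordcloud package is available
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(freq), as.vector(freq), min.freq = 1)
}
```

Unlike the tm route, this simple tokeniser does no stop-word removal or stemming, so common words such as "the" will show up in the counts unless filtered out separately.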

Result shown at right: click image for full-size version. I think that word in the centre says it all.

Analysis of ISMB coverage at FriendFeed: 2008 – 2011

ISMB/ECCB 2011 was held on July 15-19 this year and, as in previous years, FriendFeed was used to cover the meeting.

Last year, I wrote a post about how to use R to analyse the coverage. I was planning something similar for 2011 when I thought: we have 4 years of ISMB at FriendFeed now – why not look at all of them?

So I did. Read on for the details.

I give up

It’s what – 10 years or more? – since we began to wonder when web technologies such as RSS, wikis and social bookmarking sites would be widely adopted by most working scientists, to further their productivity.

The email that I received today which began “I’ve read 3 interesting papers” and included 1 .doc, 3 .docx and 4 .pdf files as attachments is indicative of the answer to this question, which is “not any time soon.”

I’ve given up trying to educate colleagues in best practices. Clearly, I’m the one with the problem, since this is completely normal, acceptable behaviour for practically everyone that I’ve ever worked with. Instead, I’m just waiting for them to retire (or die). I reckon most senior scientists (and they’re the ones running the show) are currently aged 45-55. So it’s going to be 10-20 years before things improve.

Until then, I’ll just have to keep deleting your emails. Sorry.

R: calculations involving months

Ask anyone how much time has elapsed since September last year and they’ll probably start counting on their fingers: “October, November…” and tell you “just over 9 months.”

So, when faced as I was today with a data frame (named dates) like this:

pmid1       year1    month1     pmid2      year2    month2
21355427    2010     Dec        21542215   2011     Mar
21323727    2011     Feb        21521365   2011     Jun
21297532    2011     Feb        21336080   2011     Mar
21291296    2011     Apr        21591868   2011     Jun
...

How to add a 7th column, with the number of months between “year1/month1” and “year2/month2”?
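One possible approach, which may well differ from the one in the full post: turn each year/month pair into a running count of months and subtract. The sketch below assumes the month columns hold three-letter English abbreviations, as in the rows shown; the helper name month_index and the new column name months are my own choices, and only the four visible rows are reproduced.

```r
# Rebuild the four rows shown above
dates <- data.frame(
  pmid1  = c(21355427, 21323727, 21297532, 21291296),
  year1  = c(2010, 2011, 2011, 2011),
  month1 = c("Dec", "Feb", "Feb", "Apr"),
  pmid2  = c(21542215, 21521365, 21336080, 21591868),
  year2  = c(2011, 2011, 2011, 2011),
  month2 = c("Mar", "Jun", "Mar", "Jun"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: months elapsed since year 0.
# month.abb is R's built-in vector "Jan".."Dec", so match() returns 1..12.
month_index <- function(year, month) {
  year * 12 + match(month, month.abb)
}

# 7th column: difference between the two month counts
dates$months <- month_index(dates$year2, dates$month2) -
                month_index(dates$year1, dates$month1)
dates$months
```

Working in whole months sidesteps day-of-month questions entirely; if the data carried full dates, a Date-based calculation would be the more natural choice.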