August 28, 2009
Since I haven’t posted for 14 days, what better (and lazier) way to post something than to surf over to a 14-day summary from the Life Scientists Group and link to the top ten items!
- Review process files in the EMBO Journal – but why only for “the majority of papers”?
- How XML threatens Big Data. Or not. How JSON might be an alternative – or not.
- Solve any computer problem – with this classic XKCD flowchart.
- Science reviews the revolution in ‘strategic scientific reading’ – are they way behind the curve, or providing a useful summary for the uninitiated?
- Best practice in microbial genome annotation – spirited discussion on the nature of best bioinformatics practice.
- FriendFeed Life Scientists user survey – no further word on whether this will happen.
- 50 Years of Structure – link to a JMB review on the early days of structural biology.
- Reflections on Science Online London 2009
- Workflow tools that speak SOAP?
- Advice on cleaning up a protein sample – a nice example of useful discussion from the group.
Who knows, this could become a semi-regular feature.
August 14, 2009
I use Google Reader to subscribe to the RSS feeds from journals that interest me (see my public page). I’m also a big fan of CiteULike as a reference management system.
For a long time I’ve thought: it would be great if GReader handled journal articles more efficiently. Rather than going from link in GReader -> article at journal -> CiteULike bookmark -> back to GReader, how about “post directly from GReader?”
With Google Reader’s new send-to feature, you can do just that. See this forum post for the details. Also, take a look at this how-to for a quick way to post to CiteULike by entering a PubMed PMID, DOI or ISBN identifier in the address bar.
August 6, 2009
R is terrific, of course, for all your statistical needs. But those data structures! “Everything is a list.” This leads to such wondrous ways to access variables as “p <- Meta(gds)$platform” or “last <- mylist[][length(mylist[])]”.
Sometimes, you want something more familiar. An array, a hash, a hash of arrays. Or, you may need to access R data in the language of your choice – e.g. as part of a Rails project.
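As a rough illustration of what "a hash of arrays" means in plain Ruby, here is a minimal sketch. The covariate names mirror the kind of metadata a GEO dataset carries, but the sample IDs and values are invented for the example and do not come from a real dataset; no R or RSRuby is needed to run it.

```ruby
# Hypothetical covariates for a four-sample dataset (illustrative values only).
columns = {
  "sample"        => ["GSM1", "GSM2", "GSM3", "GSM4"],
  "disease.state" => ["control", "control", "tumor", "tumor"]
}

# Look up one covariate by name, then index into its array.
states = columns["disease.state"]
third_state = states[2]

# Count how many samples fall into each disease state.
state_counts = Hash.new(0)
states.each { |state| state_counts[state] += 1 }

puts third_state
puts state_counts.inspect
```

The appeal over a nested R list is that each lookup is a single, named step: a key gives you an array, and an index gives you a value.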
In Ruby, IRB is your friend. On the right, an IRB session in which we invoke RSRuby, load the GEOquery library from Bioconductor, fetch a dataset from the GEO database and examine the metadata that describes the experiment. Result: a Ruby hash of arrays, where the keys are covariate types (“sample”, “disease.state”, “description”) and the values are the covariate names for each column (sample) in the dataset. This is now easy to access using:
columns.each_pair do |key, val|
  # do something with keys
  val.each do |element|
    # do something with values