Monthly Archives: May 2007

Nature snippets

  • Academics strike back at spurious rankings
    “Thomson Scientific’s ISI citation data are notoriously poor for use in rankings; names of institutions are spelled differently from one article to the next, and university affiliations are sometimes omitted altogether. After cleaning up ISI data on all UK papers for such effects, the Leeds-based consultancy Evidence Ltd, found the true number of papers from the University of Oxford, for example, to be 40% higher than listed by ISI, says director Jonathan Adams.”

    Someone explain to me: why do scientists willingly submit to assessment using the rubbish from ISI?

  • Complex set of RNAs found in simple green algae
    “A class of RNA molecule, called a microRNA, has been found in a unicellular green alga. The discovery, made independently by two labs, dismantles the popular theory that the regulatory role of microRNAs in gene expression is tied to the evolution of multicellularity.”

    Like I keep saying – it’s the biological process that’s important, not the organism. Why this constant surprise based on ill-founded notions of complexity?

  • Algae bloom again
    “A handful of pioneers are trying to bring algae-based biofuels back from a near-death experience.”

Google Gears

Google Gears, according to a post on every productivity blog today (here’s one), is the latest cool Google tool. There’s a new Gears blog too.

So far as I understand, the only current application lets you read feeds offline. Perhaps I’m missing something but – what exactly is the point of that?

  • I use feeds to alert me to interesting website content – which I then want to visit and read. Can’t really do that offline. . .
  • When I’m offline it’s for a good reason – to relax, stop working and have a break from feeds, email and all the rest of it. I don’t need to carry a reminder of what I’m not reading everywhere I go – especially when I can’t do anything about it. If you’re offline – well, be offline, not semi-online. Smell some flowers or something.

Gears today covers what we think is the minimal set of primitives required for offline apps. It is still a bit rough and in need of polish, but we are releasing it early because we think the best way to make Gears really useful is to evolve it into an open standard

To be honest, I’m starting to tire of very-beta early release Google apps. They’re great when they work well, but Google seem very slow to improve those that don’t. I’m thinking Groups (still no integration with other apps), Reader (still no search), iGoogle (seems to break a little more every day at the moment). Are they overstretching themselves? Or struggling to prioritise? Time will tell.

Perhaps Google, by releasing their APIs, are just relying on the open source community to do the work. There are certainly lots of great enhancements around (Greasemonkey scripts and so on), but I’d still like to see a little more effort by Google to polish certain products before release.

Make your own NCBI handbook

My previous post reminded me of an Australian company that used to sell the NCBI Handbook on a CD for AUD 35. Yes, this NCBI handbook – available for free at the NCBI website. The only drawback is that if you want to download a copy, it’s distributed as 24 separate PDF files.

Well you could be stupid and pay 35 dollars plus postage for a free resource – or you could create a single PDF using some freely-available software and a small shell script. Specifically you’ll need:

  • wget – to fetch files over HTTP
  • PDFjam – to concatenate PDF files into one file
  • xargs – to submit the PDF filenames to pdfjoin, part of the PDFjam package

All of these are either available or easy to install on any Linux machine. And possibly other platforms, for all I know.

Here’s a shell script, ncbihbk.sh, to fetch the PDFs and stitch them together. Notice how the sneaky NCBI have named 3 of the files using a different convention to the other 21. I’m sure that it wasn’t deliberate.

#!/bin/sh
# ncbihbk.sh
# fetch NCBI handbook chapters 1-24 and concatenate

base=http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/handbook

# start with an empty list so that re-running the script doesn't append duplicates
: > filelist

for i in $(seq 1 24)
do
    if [ "$i" -eq 5 ] || [ "$i" -eq 13 ] || [ "$i" -eq 18 ]; then
        # chapters 5, 13, 18
        file="ch$i.pdf"
    else
        # all other chapters
        file="ch${i}d1.pdf"
    fi
    echo "Fetching $file..."
    wget -q "$base/$file"
    echo "$file" >> filelist
    # don't bash the servers!
    sleep 3
done

# concatenate PDFs from list
echo "Concatenating PDF files..."
xargs pdfjoin --outfile ncbi.pdf < filelist

echo "Output in ncbi.pdf"
exit 0

Type “sh ncbihbk.sh”, sit back and relax. Voilà, the NCBI handbook in all its 407-page glory. Another triumph for free software. To concatenate any collection of PDF files, just run “pdfjoin --outfile mypdf.pdf file1.pdf file2.pdf file3.pdf. . .”, with two plain hyphens before “outfile”.
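If you want to check the filename logic before hitting the NCBI servers, here is a minimal sketch that only prints the 24 expected filenames and downloads nothing. The helper name chapter_file is my own invention for illustration; it assumes the same naming quirk as above (chapters 5, 13 and 18 lack the “d1” suffix).

```shell
#!/bin/sh
# Sketch only: prints the expected chapter filenames, no network access.
# chapter_file is a hypothetical helper, not part of any NCBI or PDFjam tool.
chapter_file() {
    case "$1" in
        5|13|18) echo "ch$1.pdf" ;;       # the three oddly named chapters
        *)       echo "ch${1}d1.pdf" ;;   # all other chapters
    esac
}

# List the filenames for chapters 1-24, one per line, in chapter order.
i=1
while [ "$i" -le 24 ]; do
    chapter_file "$i"
    i=$((i + 1))
done
```

A case statement keeps the exceptions in one place, so if the naming ever changes there is a single line to edit.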

To be honest, it’s probably as easy to browse the handbook online.

Willing to learn

Via Nodalpoint: Paulo Nuin (Genedrift.org, Blind.Scientist) is running a series of interviews with scientists. I enjoyed Brian Golding’s answer to a question about companies who package and sell freely-available open source bioinformatics software:

If it is important to people to have a package that they need not worry about piecing together then they can pay for it. I am however, surprised to find any people that have paid big bucks for packages that I piece together for free often with a more powerful interface. Difference is, I am willing to spend some time to try to learn.

Selling free software is common practice for the very few biotech startups in Australia and, to me, it smacks of cashing in on ignorance to make a quick buck, rather than contributing to bioinformatics education. It’s always annoyed me immensely.

A welcome surprise

I attended a departmental seminar today, entitled “drug discovery in tropical diseases” and given by Matthew Todd.

Much of the talk focused on the synthesis of praziquantel, a drug used to treat schistosomiasis. Interesting, though organic chemistry isn’t my strong point. However, it turns out that Matthew is one of the faces behind The Synaptic Leap and he devoted a few slides to TSL, Jean-Claude Bradley’s Useful Chemistry blog and the concept of open science.

This might be the first time in 7 years that I’ve heard anyone in Australia talk about open science in a university seminar. Note this day and may there be many more.

Science snippets from last week

There are weeks when you skim through the TOCs of your favourite journals and nothing really grabs your attention. And then there are weeks like last week, when there’s almost too much to read. This is where a system to grab a web page comes into its own – you just mark the page using e.g. Zotero, Google Notebook or del.icio.us and come back to it later.

Stuff that I grabbed for later from last week includes:

And from PLoS Computational Biology:

As an aside – does anyone else find that PLoS journal websites take ages to load in Firefox and use a lot of CPU?

From Science:

From Nature:

Quiet passing of pioneer

I almost missed the news that Stanley Miller, of the famous Miller-Urey experiment, recently passed away.

The experiment has been criticised in recent years for making incorrect assumptions and encouraging an over-simplified view of how life originated. However, I think it’s important to place this work in the context of its time. It showed that the basic “building block” molecules of life could be synthesised via known, understood and relatively simple chemical processes. This means that given the right environment and ingredients, biochemistry goes from being unlikely to almost inevitable, with profound implications for the likelihood of life emerging both on the earth and elsewhere.

Mathematics and biology

There’s a blog meme going around concerning the mathematical education of biologists – interesting discussion here (Keith), here (Deepak) and here (RPM).

In my experience, you pick up the maths that you need during the course of your research. That makes the question “what maths should undergraduate biologists learn?” rather difficult to answer – but I think we’re all agreed that one response is “more than they do now”. Anyway, on to those meme questions and I’m with Deepak – it’s maths, not math:

Hardware folly

This is just a dull Linux hardware post so I’ll cut to the chase: if you use Linux, never buy an ATI graphics card. I don’t know what I was thinking – normally I research this stuff pretty thoroughly, but it seems that 95% of Linux users have suffered the same fate. Having most stuff “just work” under Ubuntu has lulled me into a false sense of security.