Archive for January 16th, 2007
The scriptome
Can’t recall if I’ve mentioned the Scriptome project before. I know we’ve covered it at Nodalpoint, where I might have been unintentionally rude about it.
There’s a nice write-up of the project over at Perl.com, very much in the Perl.com style. The author points out that Scriptome is not just a collection of data-munging tools; it’s an attempt to make biologists think about how much time they waste doing repetitive tasks that could be easily automated if they bothered to learn some simple skills. From the introduction:
Have you ever renamed 768 files? Merged the content from 96 files into a spreadsheet? Filtered 100 lines out of a 20,000-line file?
Have you ever done these things by hand?
Disciples of laziness–one of the three Perl programmer’s virtues–know that you should never repeat anything five times, let alone 768. It dismayed me to learn that biologists do this kind of thing all the time.
Experimental biologists increasingly face large sets of large files in often-incompatible formats, which they need to filter, reformat, merge, and otherwise munge. Biologists who can’t write Perl (most of them) often end up editing large files by hand. When they have the same problem a week later, they do the same thing again–or they just give up.
Don’t even get me started on people who use Word for sequence analysis.
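For a sense of how little code two of the chores in that quote need, here is a minimal sketch. It is in Python rather than Perl, which is my choice and not Scriptome's, and it is not taken from the Scriptome site. The function names, file names and pattern are all made up for illustration.

```python
import re
from pathlib import Path


def filter_lines(src, dst, pattern):
    """Copy only the lines of src that match pattern into dst; return how many were kept."""
    regex = re.compile(pattern)
    kept = 0
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if regex.search(line):
                fout.write(line)
                kept += 1
    return kept


def rename_all(directory, old_suffix, new_suffix):
    """Rename every file in directory that ends in old_suffix so it ends in new_suffix."""
    if not old_suffix:
        raise ValueError("old_suffix must not be empty")
    renamed = []
    # sorted() reads the whole listing first, so renaming cannot disturb the loop
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.endswith(old_suffix):
            target = path.with_name(path.name[: -len(old_suffix)] + new_suffix)
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

Something like `filter_lines("results.txt", "hits.txt", r"^gene")` followed by `rename_all("plates", ".dat", ".txt")` would cover the filtering and renaming cases, whether there are 7 files or 768. Both file names and the `plates` directory are hypothetical.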


