I’m going to be lazy and point you to some interesting discussion over at Cameron’s blog on the use of structured data to describe experiments: part 1; part 2; part 3.
My experience of discussing electronic lab notebooks, which is mostly from biochemistry/molecular biology labs, is that many biologists are quite resistant to the idea of structured data. I think one reason that the paper notebook persists is that people like free-form notes. You may believe that a lab notebook is a highly-ordered record of experiments but trust me, it’s not uncommon to see notes such as “Bollocks! Failed again! I’m so sick of this purification…” scrawled in the margins.
My take on the problem is that biologists spend a lot of time generating, analysing and presenting data, but they don’t spend much time thinking about the nature of their data. When people bring me data for analysis I ask questions such as: what kind of data is this? ASCII text? Binary images? Is it delimited? Can we use primary keys? Not surprisingly this is usually met with blank stares, followed by “well…I ran a gel…”.
I do believe that any experiment can be described in a structured fashion, if researchers can be convinced to think generically about their work, rather than about the specifics of their own experiments. All experiments share common features such as: (1) a date/time when they were performed; (2) an aim (“generate PCR product”, “run crystal screen for protein X”); (3) the use of protocols and instruments; (4) a result (correct size band on a gel, crystals in well plate A2). The only free-form part is the interpretation. Is the result good, bad, expected? What to do next? My simplistic view is that an XML element named “notes” of data type “string” covers anything free-form that somebody might want to say about their experiment. Now we just have to design the schema, build a nice forms-based web interface and force everyone in the lab to use it :)
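To make that concrete, here is a minimal sketch of how those common features might look as a record, written in Python. Everything in it is my own assumption for illustration: the `Experiment` class, its field names, the XML element names and the example values are all hypothetical, not an existing schema or standard.

```python
from dataclasses import dataclass, field
from datetime import datetime
import xml.etree.ElementTree as ET


@dataclass
class Experiment:
    """Hypothetical record holding the features every experiment shares."""
    performed_at: datetime                                 # (1) date/time
    aim: str                                               # (2) aim
    protocols: list[str] = field(default_factory=list)     # (3) protocols used
    instruments: list[str] = field(default_factory=list)   # (3) instruments used
    result: str = ""                                       # (4) result
    notes: str = ""                                        # free-form interpretation

    def to_xml(self) -> ET.Element:
        """Serialise the record to an XML element (element names are made up)."""
        root = ET.Element("experiment", performed_at=self.performed_at.isoformat())
        ET.SubElement(root, "aim").text = self.aim
        for name in self.protocols:
            ET.SubElement(root, "protocol").text = name
        for name in self.instruments:
            ET.SubElement(root, "instrument").text = name
        ET.SubElement(root, "result").text = self.result
        ET.SubElement(root, "notes").text = self.notes
        return root


# Illustrative values only; the date and names are invented for the example.
pcr = Experiment(
    performed_at=datetime(2008, 1, 15, 10, 30),
    aim="generate PCR product",
    protocols=["standard PCR"],
    instruments=["thermal cycler"],
    result="correct size band on a gel",
    notes="Worked first time, for once.",
)
xml_text = ET.tostring(pcr.to_xml(), encoding="unicode")
print(xml_text)
```

The structured fields are the ones you could query or validate later; the `notes` string is where the margin scribbles go.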
One more point: we need to teach students that every activity leading to a result is an experiment. From my time as a Ph.D. student in the wet lab, I remember feeling as though my day-to-day activities (PCR reactions, purifications, cloning) weren’t really experiments – they were just means to an end. Experiments were clever, one-shot procedures performed by brilliant postdocs to answer big questions. When I started to view each step (obtaining the right size PCR product, sequencing it, ligation, transformation, plasmid purification, etc.) as an experiment in its own right, with a defined goal, I felt a lot better about myself. Break your activities into steps, and ways to describe them as structured data should suggest themselves.
I thought you were just pointing to Cameron’s posts :).
But seriously, this one and Cameron’s posts are very very good. Can’t wait to spend some time taking them in.
In my day job we face these challenges all the time, with data of all types being generated by the gigabyte. By and large things are structured and usually process driven, but there is a lot of variation, leading to all kinds of challenges in software design.
A bit of a sidenote, but the New Yorker recently had a fascinating article titled The Checklist, which describes how hospitals have slowly structured the thousands of steps in the hundreds of routine procedures that must occur for even the simplest medical condition into pragmatic checklists of tasks. By codifying this seemingly trivial activity, enormous operational improvements were obtained. Draw lessons about structured steps in experiments if you will, but be glad that someone will not be dying if you make a mistake.
Pingback: Unilever Centre for Molecular Informatics, Cambridge - petermr’s blog » Blog Archive » Structured Experiments and OR08
I don’t see the point of electronic lab notebooks either, and I am a computer scientist. My bench is miles away from my computer, and while pipetting I need to look at my notes. It’s easier to work with a pipette in one hand and a pen in the other than with a whole big keyboard and screen around my bench, where there really isn’t enough space for all of this anyway.
Just FYI, the suggestions made are similar to work being done by NIAID’s http://www.immport.org and by the MIBBI community (http://mibbi.sourceforge.net/) with respect to the checklists.
@max – I thought you were a computer scientist; what are you doing with a pipette? ;)
I take your point – practically, it’s much easier to record at the bench in a paper notebook. Have a look at Cameron’s blog for some ideas about how in the future, a lot of this recording could be done by the machines in the lab.
A lot of this discussion is based on open notebook science, or at least shared notebook science. If you’re interested in sharing raw experimental data with the group, colleagues or the outside world, electronic is the only option.
Pingback: A data model for life-science experiments; FuGE « peanutbutter
Pingback: Science in the open » Data models for capturing and describing experiments - the discussion continues
True, but it’s much easier to define a model after you have all the data/have finished the experiment (which is the point at which it’s getting passed to the bioinformatician, or you’re repeating the experiment several times). To define all the fields beforehand is really difficult.