Can every workflow be automated?

Some random thoughts for a Friday afternoon.

Many excellent posts by Deepak on the topic of workflows have got me thinking about the subject. I very much like the notion that all analysis in computational biology should be automated and repeatable, so far as is practicable. However, I’ve not yet experienced a “workflow epiphany”. There are some impressive and interesting projects around, notably Taverna and myExperiment, but I see these as prototypes and testbeds for how the future might look, rather than polished solutions usable by the “average researcher”.

I also can never quite escape the feeling that this type of workflow doesn’t describe how many researchers go about their business, at least in academia. Wrong directions, dead ends, trial and error, bad decisions. To me a workflow is rather like a scientific paper: an artificial summary of your work that you put together at the end, describing an imaginary path from starting point to destination that you couldn’t know you were going to follow when you set out. Useful for others who want to follow the same path, less so for the person blazing the trail. Is this in fact the primary purpose of a workflow? To allow others to follow the same path, rather than to plan your own?

I wonder in particular about operations where manual intervention and decision making are required. In structural biology, for instance, I often see my coworkers doing something like this:

  • Open experimental data (e.g. electron density) in a GUI-based application
  • “Fiddle” with it until it “looks right”
  • Save output

How do you automate that middle step? It may be that the operation is described using parameters which can be saved and run again later, but a lot of science seems to rely on a human decision as to whether something is “sensible”.
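As a rough illustration of the "parameters which can be saved and run again later" idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `apply_adjustments` function is a hypothetical stand-in for whatever the GUI application does, the `scale` and `cutoff` settings are invented, and the numbers are made up. It shows only the record-and-replay pattern, not any real structural biology tool.

```python
import json
import os
import tempfile


def apply_adjustments(density, params):
    """Hypothetical stand-in for the manual step: scale values, then drop those below a cutoff."""
    scaled = [value * params["scale"] for value in density]
    return [value for value in scaled if value >= params["cutoff"]]


def save_params(params, path):
    """Record the settings a person settled on so the step can be rerun later."""
    with open(path, "w") as handle:
        json.dump(params, handle, indent=2)


def replay(density, path):
    """Rerun the step from the saved settings, with no human in the loop."""
    with open(path) as handle:
        params = json.load(handle)
    return apply_adjustments(density, params)


# Made-up numbers, for illustration only.
density = [0.2, 0.9, 1.4, 0.5]
chosen = {"scale": 2.0, "cutoff": 1.0}  # the values that "looked right" to a person

path = os.path.join(tempfile.mkdtemp(), "fiddle_params.json")
save_params(chosen, path)
result = replay(density, path)
```

Note what the saved file does and does not capture: it records the outcome of the human decision (the chosen settings), so the step is repeatable, but it says nothing about why those settings looked "sensible" in the first place.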

I don’t know if we can capture everything that we do in a form that a machine can run. Perhaps workflows highlight the difference between research and analysis: a creative thought process versus a set of algorithms.

9 thoughts on “Can every workflow be automated?”

  1. Neil, I kinda agree. Workflows are great for prototyping and for locking down processes and then distributing them. I don’t think today’s workflow tools necessarily help the creative process, although one could hope that will change in the future.

  2. I agree Neil. I’ve tried to use workflows in the past, but the investment is rather high for trial-and-error work. I’d like to see Taverna as easy to use and share as Yahoo Pipes; then there would be a lot less overhead.

  3. I agree that workflows can’t really capture the creative process of science, particularly the ‘fiddle till you get it right’ bit. But I don’t think that’s what they’re for. To me workflows are one of two things. Either they describe a very small part of the overall experiment, the kind of thing that we often capture in a single post in our notebook: how do you do a plasmid prep, how do you label a protein. These ‘real world’ workflows may be quite a good way of describing specific protocols to people.

    Conversely they are for automating repetitive tasks. At our recent workshop we talked about a few examples that I do think will work well, and I hope to write the blog post soon. Processing of multiple similar datasets is a good example here. The processing of diffraction datasets to structures is a good, if extreme, example of this kind of thing.

    But for this to be useful I certainly agree that Taverna needs to be as easily usable as Pipes. It’s no good having to go to a specialist to do an everyday task. And for e.g. Taverna to get traction as a widely used tool, it has to be used for everyday tasks.

  4. My concern lies more with the availability, persistence and general promotion of the webservices that things like Taverna require. I agree, I think Taverna is great (and I’m sure Taverna 2.0 even greater), but repeatability of workflows is going to be hard when the webservices they run on are up and down like yo-yos.

    Discovery of appropriate services, and their description, are I think also issues that need addressing further before widespread adoption. MyExperiment is definitely a step in the right direction.

  5. Hi Neil, I agree with you that Taverna is currently too hard to use for some people, but it’s getting easier all the time. myExperiment now allows users to run workflows from the browser (like Yahoo! Pipes!). Hopefully this is a step in the right direction, because it allows people to execute complex workflows that other people have built.

  6. Pingback: Mailund on the Internet » On workflows in science

  7. Pingback: BioBlogs 19: Bioengineering « O’Really? at

  8. Pingback: Semi-automated workflows - Taverna Interaction Service « Freelancing science

  9. Workflows presently are like live CDs of popular distros: they are fun to play with, but nothing useful emerges. Ultimately one has to install on the hard drive and ‘ “Fiddle” with it until it “looks right” ’ to get out something useful!
    But all said, aren’t workflows meant to automate things already created using ‘traditional’ [pen-paper?] approaches?
