How to (wilfully?) misunderstand -omics science

A recent biography of crystallographer Max Perutz sounds well worth reading (Nature review, subscription only – sorry). Unfortunately, the reviewer (infamous for their Genome Biology editorials) just can’t resist slipping in this ill-informed dig:

What would he have made of the recent International Structural Genomics Initiative, I wonder, which aims to turn out massive numbers of protein crystal structures without regard to biological or biochemical function?

Some people just don’t get the biological informatics revolution, do they? Let me explain. In the past, researchers worked for years on narrow problems – often a single biological process, a single organism, or even a single molecule. And the factor most likely to influence their choice of system? The interests of their first supervisor.
Now we have a wonderful tool called bioinformatics. Bioinformatics allows us to take large datasets – all sequences, all structures, all of whatever you like – and ask: “what’s interesting?” By applying appropriate computational filters (the devising of which is a creative research process), we can say: show me the putative metalloproteins, the putative DNA-binding proteins, the putative phosphorylation sites. Having reduced “everything” down to “interesting stuff”, we can then do some focused experiments.
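To make that concrete, here is a minimal sketch of what one such filter might look like – a toy regular-expression scan for a C2H2 zinc-finger-like motif as a stand-in for “show me the putative metalloproteins”. The pattern is deliberately simplified (real motif definitions, such as those in PROSITE, are far more constrained), and the sequences are invented for illustration:

```python
import re

# Simplified C2H2 zinc-finger-like spacing: C-x(2,4)-C-x(12)-H-x(3,5)-H.
# Illustrative only; real motif patterns are stricter than this.
ZF_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def putative_metalloproteins(sequences):
    """Keep only the (name, sequence) pairs matching the motif."""
    return [(name, seq) for name, seq in sequences
            if ZF_PATTERN.search(seq)]

# Two made-up sequences: one with the motif spacing, one without.
toy = [
    ("hit",  "MCAACGHKLVSTANQRPHLDEH"),
    ("miss", "MKLVSTANQRPLDEFGILAKQW"),
]
print([name for name, _ in putative_metalloproteins(toy)])  # ['hit']
```

The creative work, of course, is not the three lines of code – it is deciding what pattern counts as “interesting” and how many false positives you are prepared to follow up at the bench.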

In other words – deciding what to work on is no longer an arbitrary decision. The data tell us where to go.

The notion that the initial dataset itself is the be-all and end-all is simplistic and wrong. The notion that genomics researchers have no interest in understanding biological function is insulting.

7 thoughts on “How to (wilfully?) misunderstand -omics science”

  1. I totally didn’t notice that it was the same person from Genome Biology — I noticed the dig, but I just assumed that the reviewer was yet another curmudgeonly biologist of a previous generation saying the scientific equivalent of “things were so much better back in my day; I just don’t know what’s wrong with the kids of today.”

  2. Guys, you are a little tough on Gregory Petsko here. It would be really interesting to hear how Perutz (and others of his generation) would view the hypothesis-free research that we do. Was the money spent on the structural genomics initiatives really worth it? Many crystallographers are very disappointed by their results.
    I am all for all-out descriptions of whatever biological data comes within reach, but if you are driven by the right question, the chances are much higher that you come up with something worthwhile.

  3. How ironic that the comment is in the context of Perutz, whose lab waded through reams of diffraction patterns in the late 50s [haemoglobin labelled at each residue with mercury] to solve a single structure. He, of all people, would have appreciated the breadth made possible by automation.

  4. Roland, I would like to see an independent assessment of the value of structural genomics programs. Like many, I suspect that a lot of them have failed to live up to the hype, but I have no data either way.

    However, I don’t believe that “-omics-scale” science is so different to so-called hypothesis-driven science. It’s just that you get a bunch of data first, then generate the hypotheses. Or as Jonathan once commented elsewhere on this blog: “my hypothesis is that genomes are interesting and informative”. Personally, I find that kind of data-driven discovery science far more exciting than the old-fashioned “I think gene X does this, let’s knock it out, hey I was right” type of science. Not to knock the efforts of past generations though, who did after all supply us with much of our data – albeit incredibly slowly, a molecule at a time.

  5. The structural genomics initiatives have only failed in comparison to the original, somewhat overhyped, expectations. In general, the technologies developed and the wealth of structures now available are a goldmine of information.

    I am in the middle – I see the virtue in the top down approach, where we are given a blueprint and we try and find out what various parts do. On the flip side, a bottom up approach has its virtues, but the wonderful thing is that today, it is always in the context of the bigger picture, i.e. in the presence of a ton of genomic and/or structural information. In the end, our goal should be to make the two meet, so that we can choose which direction to go depending on the problem in front of us.

  6. Big sigh. Would you please tell this to the editors that have recently rejected a half-wet, half-dry paper out of hand, because the majority of what is interesting in it (aside from analysis of a new cell type) has been found through elementary bioinformatics, and we have not yet tackled the thousands of hypotheses we just generated but want to share the wealth in the meantime (whilst getting an impact factor-dependent job for my postdoc in the interim)?

    “By applying appropriate computational filters (the devising of which is a creative research process), we can say: show me the putative metalloproteins, the putative DNA-binding proteins, the putative phosphorylation sites. Having reduced ‘everything’ down to ‘interesting stuff’, we can then do some focused experiments.”

    And why hasn’t anyone much outside of programming understood just how creative you have to be to think of and devise those filters?

  7. Sorry Alethea, it’s a well-known fact that anything creative which falls into the cracks between disciplines is not allowed to be published. You have the classic problem that from each side of the crack the reviewers only see the bits that look conventional to them. On the protein side it’s “well, they’ve just automated something previously done by hand” and on the computational side it’s “well, they’re just filters, aren’t they?”.

    We had a similar problem with a paper in which an existing statistical physics model had been applied to develop a new way of calculating DNA melting temperatures. The interesting bit is not that you can calculate melting temperatures, nor that the model can be tweaked to give you that result, but that by combining the two you have an accurate way of predicting Tms that is actually based on a proper physical model. The first journal we sent it to said we should send it to a journal focussing on DNA melting temperatures or statistical physics. Now, to be fair, I shouldn’t complain, because it ended up in Nature Physics – but I despair of anyone who is actually looking for methods to calculate Tms ever finding it there.
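    For readers unfamiliar with Tm calculation, here is the simplest empirical version of the problem – the classic Wallace rule for short oligonucleotides (2 °C per A/T pair, 4 °C per G/C pair). This is nothing like the statistical-physics model the comment describes; it is just a toy illustration of what a Tm calculator does:

    ```python
    def wallace_tm(seq):
        """Wallace-rule Tm estimate (short oligos only), in degrees Celsius."""
        seq = seq.upper()
        at = seq.count("A") + seq.count("T")
        gc = seq.count("G") + seq.count("C")
        return 2 * at + 4 * gc

    print(wallace_tm("ATGCATGC"))  # 4 A/T + 4 G/C -> 24
    ```

    The appeal of a physics-based model over rules of thumb like this is precisely that it predicts Tm from first principles rather than from fitted per-base increments.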
