Box plots. Like box plots, only…box plots.

On a rare, brief holiday (here and here, if you’re interested; both highly recommended), I make the mistake of checking my Twitter feed:

This points me to BoxPlotR. It draws box plots. Using Shiny Server. That’s the “innovation”, presumably.

With “quilt plots” and now this, I’m starting to think that I’ve been doing science wrong all these years. If I’d been told to submit the trivial computational work I do every single day to journals, I could have thousands of publications by now.

I’m still pretty relaxed post-holiday, so let’s just leave it there.

BLATting the internet: the most frequent gene?

I enjoyed this story from the OpenHelix blog today, describing a Microsoft Research project to mine DNA sequences from web pages and map them to UCSC genome builds.

Laura DeMare asks: what was the most-hit gene?

Quilt plots. Like heat maps, only…heat maps

Stephen tweets:

A “quilt plot”

Quilt plots. Sounds interesting. The link points to a short article in PLoS ONE, containing a table and a figure. Here is Figure 1.

If you looked at that and thought “Hey, that’s a heat map!”, you are correct. That is a heat map. Let’s be quite clear about that. It’s a heat map.

So, how do the authors justify publishing a method for drawing heat maps and then calling them “quilt plots”?

This blog in 2013

In something of an end-of-year tradition, WordPress provides users with an effort-free blog post in the form of an annual report. Here is mine.

My ambitious plan at the start of 2013 was to aim for 4 posts a month. I managed 28 and I’m happy with that; about one every two weeks.

Looking forward to a new year of blogging. All the best to you and yours for 2014.

Career advice: switching to computational research

Laboratory work, of the “wet” kind, not working out for you? Or perhaps you just need new challenges. Think you have some aptitude with data analysis, computers, mathematics, statistics? Maybe a switch to computational biology is what you need.

That’s the topic of the Nature Careers feature “Computing: Out of the hood”. With thoughts and advice from (on Twitter) @caseybergman, @sarahmhird, @kcranstn, @PavelTomancak, @ctitusbrown and myself.

I enjoyed talking with Roberta and she did a good job of capturing our thoughts for the article. One of these days, I might even write here about my own journey in more detail.

R: how not to use savehistory() and source()

Admitting to stupidity is part of the learning process. So in the interests of public education, here’s something stupid that I did today.

You’re working in the R console. Happy with your exploratory code, you decide to save it to a file.

savehistory(file = "myCode.R")

Then, you type something else, for example:

# more lines here

And then, decide that you should save again:

savehistory(file = "myCode.R")

You quit the console. Returning to it later, you recall that you saved your code and so can simply run source() to get back to the same point:

source("myCode.R")

Unfortunately, you forget that the sourced file now contains the savehistory() command. Result: since your new history contains only the single-line source() command, that is what gets saved back to the file, overwriting all of your lovely code.

Possible solutions include:

  • Remember to edit the saved file, removing or commenting out any savehistory() lines
  • Generate a file name for savehistory() based on a timestamp so as not to overwrite each time
  • Suggested by Scott: include a prompt in the code before savehistory()
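
The second and third suggestions can be sketched roughly as follows. This is a minimal sketch, not a tested recipe: the helper name timestamped_history_file() and the "myCode" prefix are made up for illustration, and savehistory() is only attempted in an interactive session, since it is not available in every R front end.

```r
# Hypothetical helper: build a file name containing the current date and time,
# so that repeated saves go to new files instead of overwriting an old one.
timestamped_history_file <- function(prefix = "myCode", time = Sys.time()) {
  paste0(prefix, "_", format(time, "%Y%m%d_%H%M%S"), ".R")
}

# Only try to save when running interactively, and ask before doing so.
if (interactive()) {
  history_file <- timestamped_history_file()
  answer <- readline(paste0("Save history to ", history_file, "? [y/n] "))
  if (tolower(answer) == "y") {
    savehistory(file = history_file)
  }
}
```

Even if a savehistory() line ends up in a file that is later sourced, the timestamp means it writes to a fresh file and leaves the original alone.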

We’re only 10% human. According to…who?

Reading an interesting post at Genomes Unzipped, “Human genetics is microbial genomics”, which states:

Only 10% of cells on your “human” body are human anyway, the rest are microbial.

Have you read a sentence like that before? So have I. So has a reader who left a comment:

I was wondering if you have a source for “Only 10% of cells on your “human” body are human anyway, the rest are microbial”

It’s a good question. Everyone quotes this figure; almost no-one provides a reference. Let’s go in search of one.

Bacteria and Alzheimer’s disease: I just need to know if ten patients are enough

You can guarantee that when scientists publish a study titled:

Determining the Presence of Periodontopathic Virulence Factors in Short-Term Postmortem Alzheimer’s Disease Brain Tissue

a newspaper will publish a story titled:

Poor dental health and gum disease may cause Alzheimer’s

Without access to the paper, it’s difficult to assess the evidence. I suggest you read Jonathan Eisen’s analysis of the abstract. Essentially, it makes two claims:

  • that cultured astrocytes (a type of brain cell) can adsorb and internalize lipopolysaccharide (LPS) from Porphyromonas gingivalis, a bacterium found in the mouth
  • that LPS was also detected in brain tissue from 4/10 Alzheimer’s disease (AD) cases, but not in tissue from 10 matched normal brains

Regardless of the biochemistry – which does not sound especially convincing to me[1] – how about the statistics?
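
One way to put a number on the second claim is Fisher’s exact test on the 2 x 2 table of counts. What follows is a minimal sketch in R, assuming only the counts quoted above (LPS detected in 4 of 10 AD brains and 0 of 10 normal brains). The object names are mine, and this is not necessarily the analysis used in the paper or elsewhere.

```r
# 2 x 2 table of counts taken from the abstract as quoted above.
# matrix() fills column by column: AD first, then normal.
lps_counts <- matrix(
  c(4, 6,    # AD brains: LPS detected, not detected
    0, 10),  # normal brains: LPS detected, not detected
  nrow = 2,
  dimnames = list(
    LPS   = c("detected", "not detected"),
    group = c("AD", "normal")
  )
)

# Fisher's exact test (two-sided by default) suits counts this small.
lps_test <- fisher.test(lps_counts)
print(lps_test$p.value)
```

With these counts the two-sided p-value is about 0.087, which is above the conventional 0.05 threshold.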