Published #2 (2008)

It’s turned out to be a pretty good week. This one has been in press forever, but it has finally hit the web:

Frith, M.C., Saunders, N.F.W., Kobe, B. and Bailey, T.L. (2008).
Discovering Sequence Motifs with Arbitrary Insertions and Deletions.
PLoS Computational Biology 4(4):e1000071. [Open Access] | [PubMed]

This paper describes GLAM2, a Gibbs sampler that finds and refines variable-width motifs, allowing insertions and deletions, in sets of related but dissimilar sequences. The work is Martin’s baby; my very minor contribution was to try it out on some test datasets. The paper is open access and the software is open source, so you can all go and read it, then grab the software and try it out.

Two more (unrelated) in press to tell you about soon. See, I do have a day job outside of this blog.

3 thoughts on “Published #2 (2008)”

  1. Cool … I was getting excited reading this abstract, even before I noticed you were an author. We’ve used (and published) motifs discovered with MEME in the past. Because MEME motifs can’t contain gaps, even a single-residue indel in a few members of a family of sequences would split an otherwise continuous motif into two. Looks like GLAM2 handles this nicely, among other things. In fact, I might use it instead of MEME for some stuff I was planning on trying this weekend, since I will be looking for short motifs, which GLAM2 also seems best at :)

  2. @Andrew – that’s great, let me know how it goes. GLAM2 runs can take a while; it’s quite CPU-intensive. There are some notes in the documentation about how to interpret the scores using statistical tests; the EMBOSS shuffleseq method works well for me.
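
For anyone curious what the shuffling approach amounts to, here is a rough Python sketch of the general idea. It is not GLAM2's or EMBOSS's own code. Each sequence is shuffled so that its composition is kept but the residue order is destroyed, the motif finder is rerun on the shuffled sets, and the real score is compared with the shuffled-control scores. The function names are hypothetical, and the scores have to come from your own runs; nothing here calls GLAM2 or shuffleseq.

```python
import random


def shuffle_sequence(seq, rng):
    """Return seq with its residues in random order (composition preserved)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def shuffle_set(seqs, rng):
    """Shuffle every sequence in a set independently."""
    return [shuffle_sequence(s, rng) for s in seqs]


def empirical_p_value(real_score, shuffled_scores):
    """Fraction of shuffled-control scores at least as high as the real score.

    The +1 in numerator and denominator keeps the estimate from being
    exactly zero when only a finite number of controls was run.
    """
    at_least = sum(1 for s in shuffled_scores if s >= real_score)
    return (at_least + 1) / (len(shuffled_scores) + 1)
```

In use, you would write each shuffled set to a file, run the motif finder on it, collect the best score from every control run, and pass those scores to `empirical_p_value` along with the score from the real sequences. More control runs give a finer-grained estimate, at the cost of more CPU time.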

