Published #2 (2008)
May 9, 2008 - nsaunders
It’s turned out to be a pretty good week. This one has been in press for ever, but finally hit the web:
Frith, M.C., Saunders, N.F.W., Kobe, B. and Bailey, T.L. (2008).
Discovering Sequence Motifs with Arbitrary Insertions and Deletions.
PLoS Computational Biology 4(4):e1000071. [Open Access] | [PubMed]
This paper describes GLAM2, a Gibbs sampler that finds and refines variable-width motifs, allowing insertions and deletions, in sets of related but dissimilar sequences. The work is Martin’s baby; my very minor contribution was to try it out on some test datasets. It’s open access and open source, so you can all go and enjoy the paper, then grab the software and try it.
Two more (unrelated) in press to tell you about soon. See, I do have a day job outside of this blog.


May 9, 2008 at 11:06 pm
Congrats Neil!
May 10, 2008 at 11:00 am
Cool … I was getting excited reading this abstract, even before I noticed you were an author. We’ve used (and published) motifs discovered with MEME in the past - the lack of gaps in MEME motifs meant that even a single-residue indel in a few members of a sequence family would split an otherwise continuous motif into two. Looks like GLAM2 handles this nicely, among other things. In fact, I might use it instead of MEME for some stuff I was planning on trying this weekend, since I will be looking for short motifs, which GLAM2 also seems best at.
May 10, 2008 at 3:40 pm
@Andrew - that’s great, let me know how it goes. GLAM2 runs can take a while; it’s quite CPU-intensive. There are some notes in the documentation about how to interpret the scores using statistical tests; the EMBOSS shuffleseq method works well for me.