Other people’s research problems

I’m a big fan of the Redfield lab. They might be unique in that each lab member has a blog where they describe their daily research problems. I’m sure that this is a great way for them to interact, keep track of progress and organise their thoughts. Often, the blogs throw up interesting problems too – like this one.

I’m no expert on the statistics of mutation rates but from Heather’s description, it struck me that the problem might lie with the way her code generates mutations. If a site is “A” and the code chooses “A” as a replacement – that’s not really a mutation, is it? In real-life DNA, we don’t see “A changing to A”. We might see “A” being removed and then replaced, or “A” changing to “C” and then back to “A”, but “A changing to A” doesn’t make sense, biologically.
So I suspect that Rosie’s 0.75x correction is correct – but perhaps a better solution would be to change the way in which mutations are introduced. For instance, instead of “the program finds a site in the DNA and randomly decides whether it will insert A, T, G, or C”, how about “the program finds a site in the DNA and randomly decides whether it will insert (anything other than what the current site is)”? Or is there in fact a good statistical reason and an associated probability of “A changing to A”?
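
A minimal Python sketch of the two approaches, assuming four equally likely bases. The function names are hypothetical and are not taken from Heather's program:

```python
import random

BASES = "ACGT"

def mutate_allowing_same(base, rng):
    """Pick any of the four bases, so the 'replacement' may equal the original."""
    return rng.choice(BASES)

def mutate_excluding_current(base, rng):
    """Pick uniformly among the three bases that differ from the current one."""
    return rng.choice([b for b in BASES if b != base])

rng = random.Random(42)
trials = 100_000

# Fraction of "mutations" at an A site that actually change the base
# when the replacement is drawn from all four bases.
changed = sum(mutate_allowing_same("A", rng) != "A" for _ in range(trials))
observed_fraction = changed / trials
print(observed_fraction)  # expected to be close to 0.75
```

With the first function, only about three quarters of the draws change the site, which is where a 0.75x correction would come from. With the second, every draw is a real change, so no correction is needed.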

3 thoughts on “Other people’s research problems”

  1. I also think A to A is not a mutation. The mutation rate should set the likelihood per unit time that a nucleotide changes. Also, not all mutations are the same: A to C, G or T are different events. Transitions (A to G) are more common than transversions (A to C/T). I would decide whether each nucleotide mutates based on the mutation rate provided by the user, and then decide what it mutates to according to the transition/transversion rates of the species (or rates set by the user).

  2. Sort of a side point, but people often conflate substitution rate with mutation rate. I wonder if that’s an issue here.

  3. Pingback: Things I noticed #10
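
The two-step scheme suggested in the first comment could be sketched roughly as follows. This is only an illustration: `mutation_rate` and `transition_prob` are hypothetical parameters standing in for whatever the user or species data would supply, and the two possible transversions are assumed to be equally likely.

```python
import random

# A transition swaps purine for purine (A <-> G) or pyrimidine for pyrimidine (C <-> T).
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}
# A transversion swaps a purine for a pyrimidine or vice versa.
TRANSVERSIONS = {"A": "CT", "G": "CT", "C": "AG", "T": "AG"}

def mutate_site(base, mutation_rate, transition_prob, rng):
    """Return the base at one site after one time step.

    mutation_rate: probability that the site changes at all (assumed parameter).
    transition_prob: probability that a change is a transition (assumed parameter).
    """
    # Step 1: decide whether this site mutates.
    if rng.random() >= mutation_rate:
        return base
    # Step 2: decide what it mutates to; the result is never the original base.
    if rng.random() < transition_prob:
        return TRANSITION[base]
    return rng.choice(TRANSVERSIONS[base])

def mutate_sequence(seq, mutation_rate, transition_prob, rng):
    """Apply mutate_site independently to every base in the sequence."""
    return "".join(mutate_site(b, mutation_rate, transition_prob, rng) for b in seq)

rng = random.Random(0)
print(mutate_sequence("ACGTACGTACGT", 0.1, 0.67, rng))
```

Because the decision to mutate is separate from the choice of replacement, the user-supplied rate here is the rate of actual changes and "A changing to A" cannot occur.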
