<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Genome annotation:  who&#8217;s responsible?</title>
	<atom:link href="http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/feed/" rel="self" type="application/rss+xml" />
	<link>http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/</link>
	<description>Notes from the life of a bioinformatics researcher</description>
	<lastBuildDate>Sat, 26 Dec 2009 01:41:40 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Chris Fields</title>
		<link>http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/#comment-1167</link>
		<dc:creator>Chris Fields</dc:creator>
		<pubDate>Tue, 05 Dec 2006 20:35:30 +0000</pubDate>
		<guid isPermaLink="false">http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/#comment-1167</guid>
		<description>I had a similar situation with a Mycobacterium tuberculosis gene (Rv1379) when extracting intergenic regions.  It is close to a gene (Rv1378c) on the opposite strand; both genes are divergently expressed.  

This time, the other gene was misannotated to incorporate the largest ORF, which happened to overlap with my gene (and hence gobbled up all the intergenic region).  If they had compared the predicted ORF to its closest homologues in other bacterial chromosomes they would have found the (likely) correct start codon.  The error was reported but hasn&#039;t been corrected yet.</description>
		<content:encoded><![CDATA[<p>I had a similar situation with a Mycobacterium tuberculosis gene (Rv1379) when extracting intergenic regions.  It is close to a gene (Rv1378c) on the opposite strand; both genes are divergently expressed.  </p>
<p>This time, the other gene was misannotated to incorporate the largest ORF, which happened to overlap with my gene (and hence gobbled up all the intergenic region).  If they had compared the predicted ORF to its closest homologues in other bacterial chromosomes they would have found the (likely) correct start codon.  The error was reported but hasn&#8217;t been corrected yet.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nsaunders</title>
		<link>http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/#comment-410</link>
		<dc:creator>nsaunders</dc:creator>
		<pubDate>Tue, 03 Oct 2006 21:46:49 +0000</pubDate>
		<guid isPermaLink="false">http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/#comment-410</guid>
		<description>&lt;i&gt;Do you come across these often?&lt;/i&gt;
Oh yes.  On the whole though, I&#039;d say large centres with annotation pipelines (NCBI, JGI) do a pretty good job.  And really they have to, because there&#039;s too much data for smaller groups to handle.  That&#039;s why we&#039;ve come to rely on the large data centres.  There are 2 main points.  First, always be critical of third-party data.  Second, there&#039;d be less of a problem if there were agreed standards for annotation pipelines.  I could easily write my own annotation pipeline, but I don&#039;t have the hardware required to deal with multiple genomes, plus what would be the point of yet another non-standard pipeline to add to the mix?  So we rely on what&#039;s out there.

Couldn&#039;t give you an exact figure for frame-call problems.  Pyrrolysine is quite rare; some genomes in RefSeq are annotated correctly for selenocysteine.  GenBank has adopted the characters &quot;O&quot; and &quot;U&quot; for these amino acids.  The problem is that a lot of commonly used software won&#039;t yet recognise these non-standard characters and most ORF-calling software doesn&#039;t account for coding stop codons.  Again, there&#039;s no reason why coding stop codons couldn&#039;t be handled by an annotation pipeline if someone put their mind to it and a standard was adopted.</description>
		<content:encoded><![CDATA[<p><i>Do you come across these often?</i><br />
Oh yes.  On the whole though, I&#8217;d say large centres with annotation pipelines (NCBI, JGI) do a pretty good job.  And really they have to, because there&#8217;s too much data for smaller groups to handle.  That&#8217;s why we&#8217;ve come to rely on the large data centres.  There are 2 main points.  First, always be critical of third-party data.  Second, there&#8217;d be less of a problem if there were agreed standards for annotation pipelines.  I could easily write my own annotation pipeline, but I don&#8217;t have the hardware required to deal with multiple genomes, plus what would be the point of yet another non-standard pipeline to add to the mix?  So we rely on what&#8217;s out there.</p>
<p>Couldn&#8217;t give you an exact figure for frame-call problems.  Pyrrolysine is quite rare; some genomes in RefSeq are annotated correctly for selenocysteine.  GenBank has adopted the characters &#8220;O&#8221; and &#8220;U&#8221; for these amino acids.  The problem is that a lot of commonly used software won&#8217;t yet recognise these non-standard characters and most ORF-calling software doesn&#8217;t account for coding stop codons.  Again, there&#8217;s no reason why coding stop codons couldn&#8217;t be handled by an annotation pipeline if someone put their mind to it and a standard was adopted.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Carolina</title>
		<link>http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/#comment-409</link>
		<dc:creator>Carolina</dc:creator>
		<pubDate>Tue, 03 Oct 2006 16:32:18 +0000</pubDate>
		<guid isPermaLink="false">http://nsaunders.wordpress.com/2006/09/22/genome-annotation-whos-responsible/#comment-409</guid>
		<description>Hey Neil, your examples are very good illustrations of annotation errors... Do you come across these often?
Biologists tend to use annotations as gold standards to guide their research... these major annotation groups need to pay closer attention to their annotation. People say one genome group hired undergraduates for a summer project and used their BLAST skills to annotate genomes and now we are propagating that annotation using yet more BLAST...
Any insights on the percentage use of unusual amino acids that could account for frame-call errors?</description>
		<content:encoded><![CDATA[<p>Hey Neil, your examples are very good illustrations of annotation errors&#8230; Do you come across these often?<br />
Biologists tend to use annotations as gold standards to guide their research&#8230; these major annotation groups need to pay closer attention to their annotation. People say one genome group hired undergraduates for a summer project and used their BLAST skills to annotate genomes and now we are propagating that annotation using yet more BLAST&#8230;<br />
Any insights on the percentage use of unusual amino acids that could account for frame-call errors?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
