A couple of interesting articles in the current issue of Bioinformatics (vol. 21, no. 24).
Steven Salzberg and James Yorke point out that all
current assemblers generate misassemblies, particularly when dealing with repeat regions.
Pernille Nielsen and Anders Krogh analyse the annotation
of 143 prokaryotic genomes and find large numbers of errors – in some cases they claim 60% of genes have the wrong start codon.
I think these articles are not so much a criticism of software or methods (the limitations are well known) as a warning to biologists that they should not uncritically accept the contents of databases.