How many monotypic genera?

During all the recent discussion around Neandertals and modern humans, it’s often pointed out that Homo sapiens is the sole extant representative of the genus Homo. I began to wonder “how unusual is this?” in a FriendFeed comment thread. What resources exist that could help us to answer this question?

Genera that contain only one species are termed monotypic. Wikipedia even has a category page for this topic, but its lists are limited, since Wikipedia is not a comprehensive taxonomy resource.

Taxonomy is not my specialty but once in a while, I enjoy challenging myself with unfamiliar resources and data types. I figured initially that we could get some way towards an answer using BioSQL and the NCBI taxonomy database. As it turned out I was completely wrong, but it was an interesting educational exercise. I turned instead to a “real” taxonomy resource, the Integrated Taxonomic Information System, or ITIS.

First, I set up the ITIS database:

# fetch and unpack
wget http://www.itis.gov/downloads/itisMySQL012710_v3.TAR.gz
tar zxvf itisMySQL012710_v3.TAR.gz
# Problem - 2 versions of the SQL setup file
cp dropcreateloaditis.sql itisMySQL020210/
cd itisMySQL020210
# and load into MySQL
mysql -u root -p --enable-local-infile < dropcreateloaditis.sql

A couple of minor issues here. First, a note to ITIS: if your tarball name contains TAR in upper case, Linux tab-completion doesn’t work. Second, and confusingly, unpacking the tarball generates two files named dropcreateloaditis.sql: one inside the directory itisMySQL020210 and another one directory level up. The former does not work properly; the latter does, which is why I copy it into the directory before loading.
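
For anyone who hits the same two-file confusion, here is a minimal sketch for spotting copies of a file that differ. The helper name list_copies is my own invention for illustration; it only uses find and cksum, and it assumes you run it from the directory where the tarball was unpacked.

```shell
# list_copies is a hypothetical helper, not part of the ITIS download.
# It finds every file with the given name under a directory (default: .)
# and prints a checksum, size and path for each, so differing copies stand out.
list_copies() {
    find "${2:-.}" -type f -name "$1" -exec cksum {} +
}

# Usage (from the unpack directory):
#   list_copies dropcreateloaditis.sql .
```

If the checksums in the first column differ, running diff on the two paths shows what changed between the copies.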

OK, a brand new database with an unfamiliar schema. Some poking around in the MySQL console shows 24 tables. To make a long story short, the table taxon_unit_types contains a field named rank_id, which shows that “species” have a rank_id value of 220. The table taxonomic_units contains lots of fields, including rank_id and a field called unit_name1 which, for species records, appears to hold the genus name. There’s also a field in taxonomic_units called name_usage, which takes the values “invalid”, “valid”, “accepted” or “not accepted”. I assume that it’s best to stick with “valid” or “accepted”.
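
The poking around went roughly like the sketch below. Treat the details as assumptions rather than a record of what I typed: the database name ITIS, the column rank_name in taxon_unit_types and the wrapper name itis_peek are not taken from the text above, so check them against your own load.

```shell
# itis_peek is a hypothetical wrapper; the database name (default ITIS) and
# the rank_name column are assumptions about the loaded schema.
itis_peek() {
    mysql -u root -p "${1:-ITIS}" <<'SQL'
-- list the tables in the database
SHOW TABLES;
-- which rank_id corresponds to which rank?
SELECT DISTINCT rank_id, rank_name FROM taxon_unit_types ORDER BY rank_id;
-- which name_usage values occur, and how often?
SELECT name_usage, COUNT(*) AS n FROM taxonomic_units GROUP BY name_usage;
SQL
}

# Usage:
#   itis_peek          # uses the assumed database name ITIS
#   itis_peek mydb     # or pass another database name
```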

So, to count species per genus, we can try something like this:

SELECT unit_name1, COUNT(*) AS species
FROM taxonomic_units
WHERE rank_id = 220
  AND name_usage IN ('valid', 'accepted')
GROUP BY unit_name1
ORDER BY species DESC
INTO OUTFILE '/tmp/itis.txt';

Here are the first few lines of the resulting output file:

head /tmp/itis.txt
Lasioglossum    1740
Megachile       1522
Andrena         1495
Camponotus       965
Hylaeus          709
Nomada           701
Rhyacophila      647
Perdita          631
Pheidole         549
Chimarra         537

A quick cross-check using a few genus names at the ITIS website seems to confirm that we are counting species per genus correctly. So, how many did we retrieve and how many have only one species?

# total records
wc -l /tmp/itis.txt
41723 /tmp/itis.txt
# records with number 1 in second column
grep -P "\t1$" /tmp/itis.txt | wc -l
16786
# one of those is Homo, right?
grep -P "^Homo\t" /tmp/itis.txt
Homo	    1
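
The same two counts, plus the resulting percentage, can be had in a single pass with awk. This is only a sketch: monotypic_share is a made-up helper name, and it assumes the tab-separated genus/count layout shown in the output above.

```shell
# monotypic_share is a hypothetical helper name.
# Input: a file of genus<TAB>species_count lines, as written by INTO OUTFILE.
monotypic_share() {
    awk -F'\t' '
        # count every record, and separately those with exactly one species
        NF >= 2 { total++; if ($2 + 0 == 1) mono++ }
        END {
            if (total == 0) { print "no records"; exit 1 }
            printf "%d of %d genera (%.1f%%) have one species\n",
                   mono, total, 100 * mono / total
        }
    ' "$1"
}

# Usage:
#   monotypic_share /tmp/itis.txt
```

With the counts reported above (16786 of 41723), the percentage works out to 40.2%.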

It seems then that around 40% (16,786 of 41,723) of the genera retrieved from ITIS, meaning those with at least one valid or accepted species, contain exactly one species, assuming that I have not made an error in my SQL query. This raises some questions. Does this mean that humans are not particularly unusual in being the sole extant representative of Homo? How complete a resource is ITIS? 40% seems high: are there really so many monotypic genera, or is it more likely that many genera contain as-yet undescribed species?
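
One way to put the one-species figure in context is to look at how many genera have two, three or more species. The sketch below tabulates genus sizes from the same output file; genus_size_histogram is a hypothetical helper name, and it again assumes the tab-separated genus/count layout.

```shell
# genus_size_histogram is a hypothetical helper name.
# Prints species_count<TAB>number_of_genera, smallest genus sizes first.
# Second argument: how many rows to show (default 10).
genus_size_histogram() {
    cut -f2 "$1" |                        # keep only the species counts
        sort -n |                         # numeric order, so equal counts are adjacent
        uniq -c |                         # tally genera per species count
        awk '{ printf "%s\t%s\n", $2, $1 }' |
        head -n "${2:-10}"
}

# Usage:
#   genus_size_histogram /tmp/itis.txt 20
```

A steep drop from the first row to the second would suggest that single-species genera really are the most common kind in ITIS, rather than an artifact of the query.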

I venture back onto safe ground and leave these questions to the experts.

3 thoughts on “How many monotypic genera?”

  1. This analysis of course raises the question of how biologically relevant such a finding would be, if it is true. Considering that the definition of ‘genus’ is pretty arbitrary anyway (I’m sure taxonomists disagree), the status of ‘sole survivor’ is a tricky one at best. Evolutionary relationships on earth form a continuum and will inherently clash with our current taxonomic nomenclature.
    And for all we know, maybe an alien taxonomist would put chimps and us in the same genus, right?

  2. Yeah. As a microbial evolutionist/ecologist, I tend to think that multicellular life is over-divided taxonomically anyway. There are sub-species strains of E. coli that differ more from each other on the genomic scale than chimps do from humans. We’ve put humans in their own genus rather arbitrarily.
