-omics in 2013

Just how many (bad) -omics are there anyway? Let’s find out.

Update: code and data now at GitHub

1. Get the raw data

It would be nice if we could search PubMed for titles containing all -omics:


However, we cannot since leading wildcards don’t work in PubMed search. So let’s just grab all articles from 2013:


and save them in a format which includes titles. I went with “Send to…File”, “Format…CSV”, which returns 575 068 records in pubmed_result.csv, around 227 MB in size.

2. Extract the -omics
Titles are in column 1 and we only want the -omics, so:

cut -f1 -d "," pubmed_result.csv | grep -ioP "(\w+omics)" > omics.txt
wc -l omics.txt
# 1763 omics.txt

Note: the grep command was changed so that it now works. Note also that this approach will miss the very few cases where omics is preceded by a hyphen, including the classic stain-omics.
It also ignores the standalone term “-omics”, which is used quite often.
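
One way to pick up the hyphenated cases would be to allow an optional hyphen before "omics". This is only a sketch: the three titles are invented stand-ins for column 1 of pubmed_result.csv, and the pattern uses POSIX character classes with `-E` rather than the `\w` and `-P` of the command above, so it is not the pattern behind the 1763 count.

```shell
# Three invented titles standing in for column 1 of pubmed_result.csv.
titles='Advances in proteomics of yeast
A stain-omics approach to histology
The economics of sequencing'

# [[:alnum:]]+ matches the stem, -? allows one optional hyphen before "omics".
# -i ignores case, -o prints only the matched text, -E enables extended regex.
hits=$(printf '%s\n' "$titles" | grep -ioE '[[:alnum:]]+-?omics')

printf '%s\n' "$hits"
```

The standalone term “-omics” is still not matched, because the pattern requires at least one letter or digit before the hyphen.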

Of course, this results in some “false positives” (e.g. economics). Curation would be required to detect the “true -omics”.
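
A first pass at that curation could be a hand-made stoplist of known non-omics words. The sketch below assumes such a list exists; both the sample terms and the stoplist are made up for illustration, and a real list would have to be built by reading through omics.txt.

```shell
# Invented sample of extracted terms, one per line, as in omics.txt.
terms='proteomics
economics
Genomics
macroeconomics
metabolomics'

# Hypothetical hand-curated stoplist, written to a temporary file.
stoplist=$(mktemp)
printf '%s\n' economics macroeconomics microeconomics > "$stoplist"

# -v inverts the match, -i ignores case, -x matches whole lines only,
# -F treats patterns as fixed strings, -f reads the patterns from a file.
curated=$(printf '%s\n' "$terms" | grep -vixF -f "$stoplist")
rm -f "$stoplist"

printf '%s\n' "$curated"
```

Whole-line matching (`-x`) matters here: without it, "economics" in the stoplist would also remove any longer term that merely contains it.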

3. Visualize
Time to break out the R. The top 20 -omics in 2013 and the less popular:

library(ggplot2)

# read the extracted terms and normalise case
omics <- readLines("omics.txt")
omics <- tolower(omics)
# tabulate, then sort by frequency, most common first
omics.freq <- as.data.frame(table(omics))
omics.freq <- omics.freq[order(omics.freq$Freq, decreasing = TRUE), ]
# fix the factor levels in sorted order so the plot keeps that order
omics.freq$omics <- factor(omics.freq$omics, levels = omics.freq$omics)
# horizontal bar chart of the top 20
ggplot(head(omics.freq, 20)) + geom_bar(aes(omics, Freq), stat = "identity", fill = "darkblue") +
       theme_bw() + coord_flip()
# and the less popular
subset(omics.freq, Freq == 1)

The figure shows the top 20. Top of the list so far for 2013 is proteomics, followed by genomics and metabolomics.
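
The counts can also be cross-checked in the shell without R. A rough sketch, run here on a handful of invented terms rather than the real omics.txt:

```shell
# Invented sample standing in for omics.txt.
sample='Proteomics
genomics
proteomics
Genomics
proteomics
museomics'

# Lower-case everything, count identical lines, then sort by count, descending.
freq=$(printf '%s\n' "$sample" | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn)

# Most frequent terms first (at most 20 lines, as for the plot).
printf '%s\n' "$freq" | head -n 20

# Terms seen exactly once (uniq -c puts the count in field 1, the term in field 2).
singles=$(printf '%s\n' "$freq" | awk '$1 == 1 { print $2 }')
printf '%s\n' "$singles"
```

On the real data, replacing the `printf '%s\n' "$sample"` stage with `cat omics.txt` should give the same tallies as `table()` in the R code.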

Listed below, those -omics found only once in titles from 2013. Some shockers, I think you’ll agree (paging Jonathan Eisen).

          aquaphotomics    1
       biointeractomics    1
             calciomics    1
            cholanomics    1
           cytogenomics    1
           cytokinomics    1
          econogenomics    1
            glcnacomics    1
 glycosaminoglycanomics    1
          interactomics    1
               ionomics    1
         macroeconomics    1
            materiomics    1
      metalloproteomics    1
     metaproteogenomics    1
           microbiomics    1
         microeconomics    1
          microgenomics    1
        microproteomics    1
               miromics    1
         mitoproteomics    1
             mobilomics    1
               modomics    1
             morphomics    1
              museomics    1
              neuromics    1
       neuropeptidomics    1
        nitroproteomics    1
      nutrimetabonomics    1
           oncogenomics    1
        orthoproteomics    1
            pangenomics    1
           petroleomics    1
   pharmacometabolomics    1
     pharmacoproteomics    1
   phylotranscriptomics    1
              phytomics    1
           postgenomics    1
              pyteomics    1
          radiogenomics    1
           rehabilomics    1
     retrophylogenomics    1
                 romics    1
            secretomics    1
              sensomics    1
         speleogenomics    1
           surfaceomics    1
              surfomics    1
     toxicometabolomics    1
            vaccinomics    1
              variomics    1
Top 20 -omics PubMed titles 2013


Never heard of romics? That’s OK. It’s a surname.

11 thoughts on “-omics in 2013”

    • Phylogenomics is there in the top 20 with 16 occurrences – see the figure or the full data at GitHub. The list is just those that occur only once.

      If you can figure out how to retrieve only titles from all of PubMed, I’d like to hear about it!

        • Nice work. I should point out that your code does download data (as XML), just not to a file. Also that parsing the XML for titles is not the same as retrieving only titles from NCBI (which is not possible with their current API).

          You should also check out the rentrez package; it’s a nice wrapper to NCBI web services.
