Virus hosts from NCBI taxonomy: now at Github

After my previous post on extracting virus hosts from NCBI Taxonomy web pages, Pierre wrote:

An excellent idea and here’s my first attempt.

Here’s a count of hosts. By the way NCBI, it’s environment.

cut -f4 virus_host.tsv | sort | uniq -c

   1301 
    283 algae
    114 archaea
   4509 bacteria
      8 diatom
     51 enviroment
    267 fungi
      1 fungi| plants| invertebrates
      4 human
    761 invertebrates
    181 invertebrates| plants
      7 invertebrates| vertebrates
   3979 plants
    102 protozoa
   6834 vertebrates
 115052 vertebrates| human
     43 vertebrates| human  stool
    225 vertebrates| invertebrates
    656 vertebrates| invertebrates| human