Archive for November 5th, 2010

November 5, 2010

What the world needs is: lists of Entrez database fields

You know the problem. You want to qualify your NCBI/Entrez database search term using a field. For example: “autism[TIAB]“, to search PubMed for the word autism in either Title or Abstract. Problem – you can’t find a list of fields specific to that database.

Now you can. Follow the links in this public Dropbox file, to see a CSV file containing name, full name and description of the fields for each Entrez database.

Code to generate the files is listed below. This may or may not be the first in an occasional, irregular “what the world needs” series.

#!/usr/bin/ruby
require 'rubygems'
require 'bio'
require 'hpricot'
require 'open-uri'

Bio::NCBI.default_email = "me@me.com"
ncbi = Bio::NCBI::REST.new

ncbi.einfo.each do |db|
  puts "Processing #{db}..."
  File.open("#{db}.txt", "w") do |f|
    doc = Hpricot(open("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=#{db}"))
    (doc/'//fieldlist/field').each do |field|
      name = (field/'/name').inner_html
      fullname = (field/'/fullname').inner_html
      description = (field/'description').inner_html
      f.write("#{name},#{fullname},#{description}\n")
    end
  end
end

Follow

Get every new post delivered to your Inbox.

Join 1,464 other followers