Here’s a problem. You’d like to construct a complex query at NCBI Entrez using various fields. Example:
"9606"[Taxonomy ID]
to limit your search to Homo sapiens. Except – you don’t know which fields are available for the database that you want to query.
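As a rough sketch of where such a term ends up, here is one way to build an ESearch URL from a field-qualified term. The helper name `esearch_url` and its parameters are made up for illustration, and whether a given database accepts a given field is an assumption you would still need to check.

```ruby
require 'uri'

# Hypothetical helper (not from the original script): wraps a value and a
# field name into an Entrez term such as "9606"[Taxonomy ID] and returns
# the URL-encoded ESearch request for the given database.
def esearch_url(db, value, field)
  term = %("#{value}"[#{field}])
  query = URI.encode_www_form(db: db, term: term)
  "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?#{query}"
end

puts esearch_url('gds', '9606', 'Taxonomy ID')
```

The quotes, brackets and space in the term are percent-encoded by `URI.encode_www_form`, so the term can be written exactly as it would be typed into the Entrez search box.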
EInfo can return an XML file with this information. Ruby + Hpricot eats XML for breakfast. Here’s an example using the GEO Datasets (gds) database.
#!/usr/bin/ruby
require 'rubygems'
require 'hpricot'
require 'open-uri'
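# Fetch the EInfo XML for the gds database (NCBI now serves E-utilities over HTTPS).
# On Ruby 3.0 or later, open-uri needs URI.open(...) in place of open(...).
doc = Hpricot(open("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=gds"))
# Print name, full name and description for each field, comma-separated.
(doc/'//fieldlist/field').each do |f|
  puts "#{(f/'/name').inner_html},#{(f/'/fullname').inner_html},#{(f/'description').inner_html}"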
end
And the first few lines of output:
ALL,All Fields,All terms from all searchable fields
UID,UID,Unique number assigned to publication
FILT,Filter,Limits the records
ORGN,Organism,exploded organism names
....24 more lines....
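Hpricot is no longer maintained, so here is a hedged sketch of the same extraction using REXML, which ships with most Ruby installations. It parses a trimmed, hand-written sample rather than a live response, so it runs without network access. The sample assumes EInfo's capitalised element names (`FieldList`, `Field`, `Name`, `FullName`, `Description`), and the method name `einfo_fields` is made up for illustration.

```ruby
require 'rexml/document'

# Hand-written sample shaped like an EInfo response (an assumption for this
# sketch); a real response has more elements per field and many more fields.
SAMPLE_EINFO = <<~XML
  <eInfoResult>
    <DbInfo>
      <DbName>gds</DbName>
      <FieldList>
        <Field>
          <Name>ALL</Name>
          <FullName>All Fields</FullName>
          <Description>All terms from all searchable fields</Description>
        </Field>
        <Field>
          <Name>ORGN</Name>
          <FullName>Organism</FullName>
          <Description>exploded organism names</Description>
        </Field>
      </FieldList>
    </DbInfo>
  </eInfoResult>
XML

# Return one [name, full name, description] triple per <Field> element.
# A missing child element yields an empty string instead of raising.
def einfo_fields(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//FieldList/Field').map do |field|
    %w[Name FullName Description].map { |tag| field.elements[tag]&.text.to_s }
  end
end

einfo_fields(SAMPLE_EINFO).each { |row| puts row.join(',') }
```

To run it against the live service, pass the body of the EInfo response to `einfo_fields` in place of `SAMPLE_EINFO`. REXML matches element names case-sensitively, which is why the XPath here is capitalised while the Hpricot paths are lowercase.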


