Just a brief technical note.
I figured that for a given compound in PubChem, it would be interesting to know whether that compound had been used in a high-throughput experiment of the kind you might find in GEO. This is very easy using the E-utilities, as implemented in the R package rentrez:
library(rentrez)
links <- entrez_link(dbfrom = "pccompound", db = "gds", id = "62857")
length(links$pccompound_gds)
# [1] 741
Browsing the rentrez documentation, I note that db can take the value “all”. Sounds useful!
links <- entrez_link(dbfrom = "pccompound", db = "all", id = "62857")
length(links$pccompound_gds)
# [1] 0
That’s odd. In fact, the result of this query contains no pccompound_gds element at all:
length(names(links))
# [1] 39
which(names(links) == "pccompound_gds")
# integer(0)
It’s not a rentrez issue, since the same result occurs using the E-utilities URL.
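For anyone who wants to repeat that check without rentrez, here is a minimal sketch that builds the two ELink request URLs by hand. The helper name elink_url is hypothetical (it is not part of rentrez or any package), and the commented-out lines that fetch the XML assume a working network connection.

```r
# Hypothetical helper: assemble an E-utilities ELink URL from its three
# query parameters. No request is made here; it only builds the string.
elink_url <- function(dbfrom, db, id) {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"
  paste0(base, "?dbfrom=", dbfrom, "&db=", db, "&id=", id)
}

# Same compound as above, once with db = "gds" and once with db = "all"
url_gds <- elink_url("pccompound", "gds", "62857")
url_all <- elink_url("pccompound", "all", "62857")

print(url_gds)
print(url_all)

# To inspect the raw XML, open either URL in a browser, or uncomment:
# xml <- readLines(url_all, warn = FALSE)
# any(grepl("pccompound_gds", xml, fixed = TRUE))
```

Comparing the link names in the two responses should show whether pccompound_gds is present in one and missing from the other, independently of how rentrez parses the result.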
The good people at rOpenSci have opened an issue, contacting NCBI for clarification. We’ll keep you posted.