Rapid command-line access to the PDB
January 14, 2008 by nsaunders
This is hardly earth-shattering stuff, but just for reference.
There are multiple ways to grab PDB files from the RCSB PDB servers. If you know the accession code of a structure, the simplest way is wget (or similar) straight off the FTP or HTTP server:
FTP:
wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdbXXXX.ent.gz
HTTP:
wget http://www.rcsb.org/pdb/files/XXXX.pdb.gz
where XXXX is the 4-character PDB accession code.
Note the recent change of URL for the PDB archive: ftp://ftp.wwpdb.org. Note also that the hostname has two “w”s, not three (wwpdb, not wwwpdb), which is easy to mistype.
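As a rough sketch, the two URL patterns above can be wrapped in small shell functions so the accession code is the only thing you type. The function names pdb_ftp_url and pdb_http_url are made up for illustration, and lowercasing the code for the FTP path is an assumption about how files in the archive are named, not something stated above.

```shell
#!/bin/bash
# Build download URLs for a PDB accession code from the two patterns in the post.
# pdb_ftp_url and pdb_http_url are hypothetical helper names.

pdb_ftp_url() {
    # Assumption: FTP archive file names use the lowercase accession code.
    printf 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb%s.ent.gz\n' \
        "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

pdb_http_url() {
    # The code is passed through unchanged for the HTTP pattern.
    printf 'http://www.rcsb.org/pdb/files/%s.pdb.gz\n' "$1"
}
```

With these defined, something like wget "$(pdb_ftp_url 1hio)" should fetch the same file as the FTP command above.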


January 16, 2008 at 6:29 am
Thanks! That actually inspired me to put this command into a bash script I called getpdb:
#!/bin/bash
# Download the PDB entry whose accession code is given as the first argument.
wget "ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb$1.ent.gz"
so I can type “getpdb 1hio” and have the file downloaded to the current directory. Again, nothing groundbreaking, although I had never thought about simplifying the process of getting PDB files.
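One possible extension of the getpdb idea, offered only as a sketch: accept several codes at once and skip anything that is not four characters long before calling wget. The helper name is_pdb_code is hypothetical, and treating a valid code as exactly four letters or digits is an assumption based on the "4-character" description in the post.

```shell
#!/bin/bash
# getpdb sketch: download one or more PDB entries, skipping malformed codes.

is_pdb_code() {
    # Assumption: a valid code is exactly four alphanumeric characters.
    case "$1" in
        [0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z]) return 0 ;;
        *) return 1 ;;
    esac
}

getpdb() {
    for code in "$@"; do
        if is_pdb_code "$code"; then
            # Same FTP URL pattern as in the post and the script above.
            wget "ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb${code}.ent.gz"
        else
            echo "skipping '$code': not a 4-character code" >&2
        fi
    done
}
```

After sourcing this in a shell, getpdb 1hio behaves like the one-line script, and a mistyped code such as 1hi is reported on standard error without contacting the server.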
January 16, 2008 at 10:04 am
This is what I love about using the command line in Linux. There are all these small, self-contained unassuming tools which come together to create an ultra-efficient way of working. I still get a buzz from it after all these years.
By the way, this list of HTML character codes will help you enter things like the dollar sign in the comments.
March 5, 2008 at 10:17 pm
[...] my research I’ve been trying to programmatically get EMBL files from the EBI database. I saw Neil’s post on accessing PDB files directly over HTTP, and there is also a similar method for the EBI. Building on this, I thought it would be [...]
March 7, 2008 at 5:39 am
[...] my research I’ve been trying to programmatically get EMBL files from the EBI database. I saw Neil’s post on accessing PDB files directly over HTTP, and there is also a similar method for the EBI. I thought it would be interesting to combine this [...]