Rapid command-line access to the PDB

This is hardly earth-shattering stuff, but just for reference.

There are multiple ways to grab PDB files from the RCSB PDB servers. If you know the accession code of a structure, the simplest way is to use wget (or similar) to fetch it straight off the FTP or HTTP server:

FTP
wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdbXXXX.ent.gz

HTTP
wget http://www.rcsb.org/pdb/files/XXXX.pdb.gz

where XXXX is the 4-character PDB accession code.

Note the recent change of URL for the PDB archive: ftp://ftp.wwpdb.org. Note also that the hostname, confusingly, has two “w”s (wwpdb), not three as in “www”.
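
If you fetch structures often, the two URL patterns above can be wrapped in a small helper. The following is only a sketch: the function names pdb_url and fetch_pdbs are made up for illustration, and lowercasing the accession code is an assumption based on the lowercase file names in the FTP archive.

```shell
#!/bin/bash
# Sketch: build archive URLs for PDB accession codes and fetch several at once.
# pdb_url and fetch_pdbs are hypothetical helper names, not part of any tool.

# Print the download URL for one accession code.
# $1 = accession code, $2 = "ftp" (default) or "http"
pdb_url() {
    local code
    # Assumption: file names on the servers use lowercase codes.
    code=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    # Reject anything that is not exactly 4 alphanumeric characters.
    case "$code" in
        [[:alnum:]][[:alnum:]][[:alnum:]][[:alnum:]]) ;;
        *) echo "invalid accession code: $1" >&2; return 1 ;;
    esac
    if [ "${2:-ftp}" = "http" ]; then
        echo "http://www.rcsb.org/pdb/files/${code}.pdb.gz"
    else
        echo "ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb${code}.ent.gz"
    fi
}

# Download every code given on the command line into the current directory,
# skipping any code that fails validation.
fetch_pdbs() {
    local code url
    for code in "$@"; do
        url=$(pdb_url "$code") || continue
        wget "$url"
    done
}
```

With these defined (for example in your .bashrc), something like "fetch_pdbs 1hio 4mba" should download both files, provided the servers are reachable.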

5 thoughts on “Rapid command-line access to the PDB”

  1. Thanks! That actually inspired me to put this command into a bash script I called getpdb:


    #!/bin/bash
    # Usage: getpdb XXXX  (XXXX is the 4-character PDB accession code)
    wget "ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb${1}.ent.gz"

    so I can type “getpdb 1hio” and have the file downloaded to the current directory. Again, nothing groundbreaking, although I had never thought about simplifying the process of getting PDB files.

  2. This is what I love about using the command line in Linux. There are all these small, self-contained unassuming tools which come together to create an ultra-efficient way of working. I still get a buzz from it after all these years.

    By the way, this list of HTML character codes will help you enter things like the dollar sign in the comments.

  3. Pingback: Bioinformatics Zen » Using helper scripts to make bioinformatics analysis easier to maintain

  4. Pingback: Bioinformatics Zen » BioRuby and Ruby on Rails: Active BioRecords

  5. If you like the Unix style of command-line work, you may want to try BioShell:
    http://bioshell.chem.uw.edu.pl/

    which helps you manipulate various files, for example:
    java Strc -ip=4mba.pdb -selection=A.34:108 -op=out.pdb
    to get residues 34 to 108 from chain A.

    The package is written in Java, so formally you have to start each command with “java”. This can be avoided with an alias (e.g. in your .bashrc file).
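
The alias suggested in the last comment might look something like the sketch below. It assumes that "java Strc" already works in your shell (i.e. the BioShell classes are on the Java classpath); the names Strc and strc are only illustrative choices.

```shell
# Possible ~/.bashrc entry (assumes `java Strc` already works in your shell).
alias Strc='java Strc'

# Bash does not expand aliases in non-interactive scripts by default,
# so a shell function is a more portable alternative there.
# "$@" forwards all arguments unchanged.
strc() {
    java Strc "$@"
}
```

After reloading your .bashrc, the earlier example could then be shortened to "Strc -ip=4mba.pdb -selection=A.34:108 -op=out.pdb".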
