February 2, 2012
A question at BioStar: how to “return all pdb ids to a given one that differ only by one amino acid”?
My answer began: “I think it is not too much work to craft a solution using a few tools”, followed by some incomplete ideas. Let’s see if I was right.
Posted in bioinformatics, perl, programming
September 18, 2008
In the midst of preparing a talk for next Monday. It occurred to me that perhaps we don’t see more protein structure-based prediction in bioinformatics because – there aren’t enough structures.

[Image: pdbstats]
Sure, the PDB has grown a lot in the past 5 years or so and 53 103 structures (as of now) looks impressive. However, if you’re interested in protein-protein interaction, you want at least 2 chains, which more or less halves the dataset. If you want two different protein chains, you lose almost another 75%. Specify a reasonable minimum resolution for X-ray diffraction data and there go ~ 3 000 entries. We probably don’t want multiple, similar proteins, so let’s remove redundancy at 90% sequence identity. We’re left with about 2% of the original PDB, which might be usable for looking at interactions.
No wonder that most bioinformatics focuses on sequences and high-throughput interaction data.
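As a rough back-of-envelope check, the filtering steps above can be written out as arithmetic. This is only a sketch in Python: the fractions (one half, 75%, ~ 3 000 entries, 2%) are the approximate figures quoted in this post, not values recomputed from the PDB, and the variable names are made up for illustration.

```python
# Back-of-envelope version of the filtering described above.
# All fractions are the approximate figures quoted in the post,
# not values recomputed from the PDB itself.

total = 53103                       # PDB entries at the time of writing

multi_chain = total * 0.5           # at least 2 chains: roughly half remain
hetero = multi_chain * 0.25         # two different chains: lose another ~75%
good_resolution = hetero - 3000     # resolution cutoff: ~3 000 fewer entries
final = total * 0.02                # after the 90% identity filter: ~2% of total

steps = [
    ("all entries", total),
    ("at least 2 chains", multi_chain),
    ("2 different chains", hetero),
    ("resolution cutoff", good_resolution),
    ("non-redundant at 90%", final),
]

# Print each stage with its approximate entry count.
for label, count in steps:
    print(f"{label:<22}{round(count):>7}")
```

On these figures, each step removes a large share of what was left after the one before it, which is how a five-digit archive shrinks to something closer to a thousand entries.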
Posted in bioinformatics, research diary, uncategorized
January 14, 2008
This is hardly earth-shattering stuff, but just for reference.
There are multiple ways to grab PDB files from the RCSB PDB servers. If you know the accession code of a structure, the simplest way is wget (or similar) straight off the FTP or HTTP server:
FTP
wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdbXXXX.ent.gz
HTTP
wget http://www.rcsb.org/pdb/files/XXXX.pdb.gz
where XXXX is the 4-character PDB accession code.
Note the recent change of URL for the PDB archive: ftp://ftp.wwpdb.org. Note also the confusing two, not three, “w”s in the hostname.
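For more than a handful of structures, it may be easier to generate the URLs in a script. Here is a minimal Python sketch. It assumes the two URL patterns quoted above are still valid and that the FTP archive names its files with lowercase codes. The name `pdb_urls` is a made-up helper, not part of any library.

```python
def pdb_urls(code):
    """Return the FTP and HTTP download URLs for a 4-character PDB code."""
    code = code.strip()
    if len(code) != 4 or not code.isalnum():
        raise ValueError(f"not a 4-character PDB code: {code!r}")
    return {
        # Assumption: files on the FTP archive use lowercase codes.
        "ftp": "ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/"
               f"pdb{code.lower()}.ent.gz",
        # HTTP pattern exactly as quoted in the post.
        "http": f"http://www.rcsb.org/pdb/files/{code.upper()}.pdb.gz",
    }


if __name__ == "__main__":
    # Placeholder codes for illustration only; substitute real accession codes.
    for code in ["1abc", "2XYZ"]:
        for url in pdb_urls(code).values():
            print(url)
```

The script only prints URLs, one per line, so the list can be checked by eye before being handed to wget (for example via `wget -i`).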
Posted in bioinformatics, computing, linux, research diary, web resources
December 13, 2007
- People are finding many outlets for their work. Pierre maintains a repository of tools where you can find IBDStatus, his latest software for genetic analysis.
- Spotted in Nature this week:

Makes perfect sense, doesn’t it: if you publish an article on a structure, include a link to the PDB resource. Yet so far as I can tell this is a new feature, since it jumped out at me. Given that the WWW is such a rich publishing platform, simply because of the hyperlinks that connect data, how long before paper copies of all journals are considered quaint and obsolete?
Posted in bioinformatics, publications, web resources