Make your own NCBI handbook

My previous post reminded me of an Australian company that used to sell the NCBI Handbook on a CD for AUD 35. Yes, this NCBI handbook – available for free at the NCBI website. The only drawback is that if you want to download a copy, it’s distributed as 24 separate PDF files.

Well, you could be stupid and pay 35 dollars plus postage for a free resource – or you could create a single PDF using some freely available software and a small shell script. Specifically, you’ll need:

  • wget – to fetch files over HTTP
  • PDFjam – to concatenate PDF files into one file
  • xargs – to submit the PDF filenames to pdfjoin, part of the PDFjam package

All of these are either available or easy to install on any Linux machine. And possibly other platforms, for all I know.
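
Before running anything, it may help to confirm that all three tools are on your PATH. Here is a minimal sketch, assuming a POSIX shell; missing_tools is a made-up helper name, not part of any of these packages:

```shell
#!/bin/sh
# missing_tools is a hypothetical helper: it prints the name of every
# command given as an argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools wget pdfjoin xargs)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing tools:" $missing
else
    echo "All tools found"
fi
```

Anything it reports as missing should be installable through your distribution's package manager.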

Here’s a shell script, ncbihbk.sh, to fetch the PDFs and stitch them together. Notice how the sneaky NCBI have named 3 of the files using a different convention to the other 21. I’m sure that it wasn’t deliberate.

#!/bin/sh
# ncbihbk.sh
# fetch NCBI handbook chapters 1-24 and concatenate

base=http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/handbook

# start with an empty list so that a re-run does not duplicate entries
: > filelist

for i in $(seq 1 24)
do
    if [ "$i" -eq 5 ] || [ "$i" -eq 13 ] || [ "$i" -eq 18 ]; then
        # chapters 5, 13, 18
        file=ch$i.pdf
    else
        # all other chapters
        file=ch${i}d1.pdf
    fi
    echo "Fetching $file..."
    # only list the file if the download succeeded
    if wget -q "$base/$file"; then
        echo "$file" >> filelist
    else
        echo "Failed to fetch $file" >&2
    fi
    # don't bash the servers!
    sleep 3
done

# concatenate PDFs from list
echo "Concatenating PDF files..."
xargs pdfjoin --outfile ncbi.pdf < filelist

echo "Output in ncbi.pdf"
exit 0
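
The naming exception is the only fiddly part of the script above. As a sketch, the rule can be isolated in a small function and checked without downloading anything; chapter_file is a made-up name for illustration:

```shell
#!/bin/sh
# chapter_file is a hypothetical helper: it maps a chapter number to the
# PDF filename the script expects (chapters 5, 13 and 18 lack the "d1" suffix).
chapter_file() {
    case "$1" in
        5|13|18) echo "ch$1.pdf" ;;
        *)       echo "ch${1}d1.pdf" ;;
    esac
}

# Print the 24 filenames in chapter order; nothing is fetched here.
for i in $(seq 1 24); do
    chapter_file "$i"
done
```

If the naming convention changes, only the case statement needs editing.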

Type “sh ncbihbk.sh”, sit back and relax. Voilà, the NCBI handbook in all its 407-page glory. Another triumph for free software. To concatenate any collection of PDF files, just run “pdfjoin --outfile mypdf.pdf file1.pdf file2.pdf file3.pdf ...”
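
One caveat when joining an arbitrary collection: a plain list piped through xargs is split on whitespace, so a filename containing a space breaks apart. Here is a hedged sketch that joins every PDF in a directory using null-delimited names. It assumes GNU or BSD versions of find, sort and xargs (the -maxdepth, -print0, -z and -0 options are not guaranteed by POSIX). join_pdfs and the JOINER variable are made-up names; JOINER defaults to pdfjoin.

```shell
#!/bin/sh
# join_pdfs DIR OUTFILE: concatenate every *.pdf directly inside DIR,
# sorted by name, into OUTFILE.
# JOINER is a hypothetical override so the pipeline can be dry-run with
# echo; it is left unquoted so it may also carry extra options.
join_pdfs() {
    dir=$1
    out=$2
    find "$dir" -maxdepth 1 -name '*.pdf' -print0 |
        sort -z |
        xargs -0 ${JOINER:-pdfjoin} --outfile "$out"
}

# Example calls (commented out so that sourcing this file does nothing):
# JOINER=echo join_pdfs . all.pdf    # dry run: print the command's arguments
# join_pdfs . all.pdf                # real run, needs pdfjoin
```

Write the output somewhere other than the input directory, or a second run will pick up the previous result as one of its inputs.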

To be honest, it’s probably as easy to browse the handbook online.

4 thoughts on “Make your own NCBI handbook”

  1. nsaunders Post author

    If you want to blow a bunch of your bandwidth

    The final PDF weighs in at about 36 MB. Not too outrageous. Perhaps the Nodalpoint wiki would be the place.

  2. Personomics

    Hi,

    Thanks for your script. However, I noticed that several lines of the code are out of date:

    1) The chapter format is ch$i.pdf for *all* chapters. The ch${i}d1.pdf form doesn’t seem to work anymore.

    2) The -outfile argument needs a second “-” to be valid: --outfile

    3) I added a cleanup step.

    With those modifications, I get this much shorter code (note that it is a bash script, not an sh one):
    #!/bin/bash
    # ncbihbk.sh
    # fetch NCBI handbook chapters 1-24 and concatenate

    for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
    do
        echo "Fetching ch$i.pdf..."
        wget -q http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/handbook/ch$i.pdf
        echo ch$i.pdf >> filelist
        sleep 3
    done

    # concatenate PDFs from list
    echo "Concatenating PDF files..."
    cat filelist | xargs pdfjoin --outfile ncbi.pdf

    # clean up the individual chapters and the list
    for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
    do
        rm ch$i.pdf
    done
    rm filelist

    echo "Output in ncbi.pdf"
    exit 0

    This said, thanks again!

    http://personomics.wordpress.com
