Zen of LaTeX

Zen moments are common if you use a lot of open-source software. Sometimes you download software, work your way through some tutorials and how-tos and scan mailing lists, but you don’t quite see what all the fuss is about. Then one day you have your zen moment – “Aaah!” – when you get it.

It’s taken way too many years, but this week I finally had my LaTeX zen moment.

Pre-preamble
Hardcore users of LaTeX will find nothing here. If you’re considering the LaTeX + BibTeX solution to writing but are unsure of how to go about it, you may find these ramblings useful – especially on Ubuntu. I’ve also put together a basic, public Google Notebook with a few useful links.

Preamble
First, I have a terrible confession to make. I’ve written a few scientific papers in my time and I have never used a reference manager or bibliography software. Normally, I copy and paste references on the fly from a PubMed search at NCBI and format them later.

Isn’t that dreadful?

So what’s my excuse? I don’t use Windows or Mac, so the Word + Endnote option is not available (OK, perhaps with wine, but I don’t want to go there). However, I generally work with Word users who ask me for Word files. OpenOffice can provide these, no problem, but bibliography software options for OpenOffice are woeful – practically non-existent. So it’s OpenOffice, manual handling of the bibliography and save to Word.

For years of course, I’ve been dimly aware of alternatives and I’ve installed and played around with a lot of them. I’ve tried many reference managers – wikindx, refbase, refdb. I’ve tried pybliographer. I’ve tried ODBC/JDBC drivers to interface OpenOffice to databases. I’ve installed every LaTeX-related package in the Ubuntu repository and completed several LaTeX “my first document” tutorials (several times). I’ve looked at tex using emacs. Until now though, I never put it all together in a way that satisfied me.

My problem has been that I’m on an eternal quest for the perfect reference/bibliography system under Linux. There is no such thing of course – what you need to do is figure out your needs, investigate the options and implement what works for you.

Breaking point
I don’t enjoy writing papers – I’d much rather be writing Perl. Each time that I have to write, I find myself procrastinating for days with my “perfect system” quest. Then I get stressed because I’m procrastinating. Eventually I figured out why I was procrastinating – it’s because I’m unhappy with my work setup. I was unable to slip easily into “paper writing mode” because subconsciously I knew that things were not right and it was going to be hassle. It’s at this point that you realise that something needs to be done. It’s not procrastinating – it’s taking action to fix a problem.

Figure out what you need to do
Here’s what I need to do regularly and with minimal hassle:

  • Fetch references from PubMed, either individually or in a batch
  • Store them, somewhere, somehow
  • Incorporate them into a document easily as I write and cite them in a journal style
  • Optionally, share both references and documents in a variety of formats with other people
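For the first item, batch fetching, here is a minimal sketch of one way it could work without any reference manager at all. It assumes NCBI’s public E-utilities "efetch" endpoint; the function name efetch_url and the PMIDs in the usage note are made up for illustration, and the sketch only builds the request URL rather than downloading anything.

```python
from urllib.parse import urlencode

# NCBI E-utilities endpoint for fetching full records by identifier.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(pmids, retmode="xml"):
    """Build an E-utilities URL that requests the PubMed records for pmids.

    efetch_url is a hypothetical helper name, not part of any library.
    """
    ids = [str(p).strip() for p in pmids]
    if not ids or not all(i.isdigit() for i in ids):
        raise ValueError("PMIDs must be non-empty and numeric")
    # A batch is requested by joining the PMIDs with commas in the id parameter.
    query = urlencode({"db": "pubmed", "id": ",".join(ids), "retmode": retmode})
    return EFETCH + "?" + query
```

Calling efetch_url([123, 456]) gives a single URL covering both (placeholder) PMIDs. What comes back is PubMed XML, not BibTeX, so it would still need converting before it is any use to LaTeX.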

The reference format
BibTeX appeals to me. It’s a simple, plain text readable format. It’s easy to obtain, convert and store. It’s also the basis of bibliographies in LaTeX. The simplest reference database is just a directory of BibTeX files and that may well be all you need.
Bibutils is an indispensable toolkit for converting between BibTeX and other formats. It’s easy to compile, or else it’s available as an Ubuntu package. Bibtool is rather more cryptic but handy for fixing broken BibTeX – removing duplicate entries and so on.
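To make the “directory of BibTeX files” idea concrete, here is a rough sketch that lists the citation keys in such a directory and flags any key used in more than one file. It is not how Bibtool works and it is not a real BibTeX parser; it only looks for the opening of each entry with a regular expression. The names citation_keys and duplicate_keys are invented for this example.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches the start of an entry such as "@article{smith2007," and captures
# the entry type and the citation key. Deliberately simple: not a full parser.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")

# BibTeX blocks that start with "@" but are not references.
NON_ENTRIES = {"comment", "string", "preamble"}


def citation_keys(bib_text):
    """Return (entry_type, key) pairs found in a string of BibTeX."""
    return [
        (kind.lower(), key)
        for kind, key in ENTRY_START.findall(bib_text)
        if kind.lower() not in NON_ENTRIES
    ]


def duplicate_keys(bib_dir):
    """Map each citation key used more than once to the files that use it."""
    seen = defaultdict(list)
    for path in sorted(Path(bib_dir).glob("*.bib")):
        for _kind, key in citation_keys(path.read_text(encoding="utf-8")):
            seen[key].append(path.name)
    return {key: files for key, files in seen.items() if len(files) > 1}
```

An empty result from duplicate_keys means every key in the directory is unique under this simple matching, which is the main thing LaTeX needs from a pile of .bib files.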

The reference database
I searched Sourceforge for “bibtex database” and found quite a few projects – most of which were yet to release files or else not updated for many years. There were issues with all of the BibTeX-specific database software that I tried.
I’m relatively happy with both wikindx and refbase. I tend towards the latter because of its excellent import facilities – it can import using a list of PubMed PMIDs, for instance. It also offers the best current solution for OpenOffice integration, in the form of export to ODF XML which you can then use as an OO bibliography database.

Journal style files
Enlightened publishers such as BioMed Central provide you with style files for LaTeX. You could install them system-wide, but if your Linux distro is package-based, it’s probably best to put them in a special location under your home directory. This guide explains how to do just that. Put your .cls in ~/texmf/tex/latex, your .bst in ~/texmf/bibtex/bst and don’t forget to run “texhash” afterwards. Voilà, BMC (or whatever) style is now available to you.
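The copying step can be scripted. This is a small sketch, assuming the two destination directories named above; install_style_files is a made-up helper, and it does not run texhash for you.

```python
import shutil
from pathlib import Path

# Destination sub-directories from the text, relative to the personal texmf tree.
DESTINATIONS = {".cls": Path("tex/latex"), ".bst": Path("bibtex/bst")}


def install_style_files(source_dir, texmf_root):
    """Copy .cls and .bst files from source_dir into texmf_root.

    Returns the list of newly written paths. Hypothetical helper name.
    """
    installed = []
    for src in sorted(Path(source_dir).iterdir()):
        subdir = DESTINATIONS.get(src.suffix)
        if subdir is None or not src.is_file():
            continue  # ignore templates, READMEs and anything else
        target_dir = Path(texmf_root) / subdir
        target_dir.mkdir(parents=True, exist_ok=True)
        installed.append(Path(shutil.copy2(src, target_dir)))
    return installed
```

With the journal’s files unpacked into a directory (called bmc_template here purely as an example), install_style_files("bmc_template", Path.home() / "texmf") puts them in place; texhash still has to be run afterwards.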

Writing and citing
It’s crunch time – let’s write. I’ve tried a few solutions for editing LaTeX documents. At one extreme we have the GUI, at the other a plain text editor environment. For the former I’ve tried LyX and didn’t like it much. Purely personal – it’s a good piece of software, but I found the graphical environment distracting. It also distances you from the tex via its own intermediate format, which I feel is a disadvantage. For instance, it may not play nice with your existing tex files (I found this to be the case for the BMC tex template). On the plus side, citation directly from pybliographer via lyxpipe is a nice feature.

The plain-text only approach? You can’t get past emacs for my money, but be prepared for quite a steep learning curve.

What I was really looking for was a half-way house which allowed me to see the source without distracting me from the document (or vice-versa) and removed some of the pain associated with compiling to DVI etc. In other words, a LaTeX IDE. I can report that I have found my LaTeX saviour and her name is Kile.

In the Gnome v. KDE wars I’m a Gnome man – I just prefer the look – but of course KDE apps run fine under Gnome and I use many of them. Kile is just brilliant. You just open up your LaTeX plus your BibTeX and start editing. At any point hit the Quick Build button and your DVI pops up. All your other tools are built in (latex2html, PDF, PS etc.), syntax highlighting is great, there are consoles down at the bottom for messages and a host of other features. This really is software that I can just fire up and use without stress. Kile also uses the same ~/.lyx/lyxpipe as LyX so if you like you can open up your references in pybliographer, find the ones you need and cite straight into Kile. It really is a joy to click a button and see your paper pop up in the correct journal style, without manually going through that “latex, bibtex, latex, latex” cycle.
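For anyone curious what that cycle looks like when done by hand, here is a sketch of the usual order: latex once, then bibtex, then latex twice more. This is only an illustration of the manual steps, not a description of what Kile’s Quick Build does internally. The helper names build_commands and build are invented, and it assumes the latex and bibtex programs are on your PATH.

```python
import subprocess


def build_commands(stem):
    """Return the usual command sequence for a document that cites via BibTeX.

    stem is the file name without extension, e.g. "paper" for paper.tex.
    """
    return [
        ["latex", stem],   # first pass: writes citation keys to stem.aux
        ["bibtex", stem],  # reads stem.aux, writes the formatted list to stem.bbl
        ["latex", stem],   # second pass: pulls in the bibliography
        ["latex", stem],   # third pass: resolves citation labels and cross-references
    ]


def build(stem, workdir="."):
    """Run each step in turn, stopping at the first one that fails."""
    for command in build_commands(stem):
        subprocess.run(command, cwd=workdir, check=True)
```

Running build("paper") in the directory holding paper.tex and its .bib file should leave paper.dvi behind, which is the same thing the button gives you with rather less typing.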

(Completely unrelated – for those occasions when people send you Word files and you want to convert to something useful – “sudo apt-get install wv”. The wv package lets you convert Word to all sorts of formats including tex and worked nicely for me using the Word output of OpenOffice).

Toolkit and workflow
At long last then, I have a system:

  • Grab references from PubMed as required, import to refbase
  • Export from refbase as BibTeX bibliographies
  • Write my documents using Kile, optionally citing via pybliographer
  • Harness the power of LaTeX to export to multiple formats

I’ll never get to reference/bibliography nirvana, but at least I feel a few steps closer.

10 thoughts on “Zen of LaTeX”

  1. Welcome. I know you occasionally use R, so I’ll recommend Sweave, which I’ve mentioned before. It’s basically latex with embedded chunks of R code, which the sweave R library recognises and executes, returning tables, figures etc to build a mature latex document which one then processes as usual.

    If you’re of a mind, a refdb tutorial would be most useful.

  2. I think that I know more scientists who don’t use any kind of reference management system than those who bother using them. For most people, it’s just too much hassle, and getting everything set up correctly doesn’t seem to be worth it, so they simply copy the references from their last paper. I must admit that this may actually mean less work initially. But if you need to work with different citation styles (i.e. you’d need to reformat your references), can’t remember your vast amount of literature, or if you want to share your work between the members of your research group, then nothing beats a reference manager.

    Your post also reminds me of some recent discussion among refbase devs, where we were thinking about a simple way to maintain an offline BibTeX file that could be kept in sync with an online refbase database. Such a bib document synchronization feature might also be useful for other bibliographic export formats (such as an ODF spreadsheet or Word 2007 bibliographic XML file). Guess there’s always room for improving one’s toolkit and workflow… (and I must admit that this is more fun than writing the actual paper :-/)

  3. I found JabRef useful when I was writing my thesis in LaTeX. Martin Jambon had a post about whether or not CiteULike could be a good online and collaborative reference manager.

    In terms of practical uses, I did submit one scientific manuscript to a journal in TeX a long time ago, but then had to convert it to RTF in the publishing phase. I find it hard when collaborating with people who don’t use it. I’m currently making tutorial slides with the foiltex LaTeX template so I still find it useful outside of the monolithic. A while ago we tried making BioPerl documents in LaTeX, then switched to DocBook, which people found even more difficult than TeX (mostly because of the downstream processing).

  4. I remember the days when BioPerl docs came in every imaginable format! We had some interest in DocBook at Nodalpoint a few years ago and we also felt that it was more effort than we were able to make.

    I didn’t mention CiteULike or Connotea in this post, but did spend several days playing with both of them. Frankly, I think they are both awful. Which is a shame as I like the idea of online and collaborative reference management, but as they stand now, neither of them work for me. I know people who use them as online bookmark collections but not for serious reference management.

    Chris – I haven’t touched refdb for a couple of years, ever since it moved to using bleeding-edge libdbi libs that no one can install. A shame as I used to think it had promise. For me refbase is the natural replacement.

  5. I thought you may mean refbase.

    I like it. The web interface is good and there are Perl CLI clients for batch import/export that work very well. For instance you can supply a list of PubMed PMIDs for import. There are also plenty of import/export options: bibtex, ris, endnote, MODS, ODF and so on. It’s not hard to get it up, running and most importantly, useable. There’s also a reasonable solution to word processor integration, using ODF.

    What I like most is the developer community – Matthias (lead developer) always responds quickly and courteously to queries, is keen to implement new features and really knows his stuff when it comes to code and dealing with reference formats.

  6. Hi,

    Don’t forget, wikindx has a PubMed plug-in for doing all this and then the word processor keeps it all in one package instead of having to fiddle around importing/exporting between a variety of software.

    Mark.

  7. It’s been a long time since this post, but I’d like to suggest Zotero anyways:
    http://www.zotero.org/
    Although it requires integration with a web browser like Firefox, Flock or Netscape, it can capture web pages, images, links and papers, plus associated bibliographical information, from sites like arxiv.org, ads.harvard.edu, etc. These can be exported in a lot of formats, including BibTeX. You can also add notes and tags to every record, import data and export the library.
    Of course, they provide plugins to work with OpenOffice and MS Word as well. :D

  8. @David – I’m a fan of Zotero too and have blogged about it here. I don’t use it much just now as it lacks networking functionality – but I hear that is about to change.
