Organise your bioinformatics projects using Subversion and Trac: part 1

Like most people with computer-related jobs, I work on numerous machines. At work, there’s the all-purpose server, the personal desktop machine, a second server for testing/backup and, should I need it, a Linux cluster. At home there’s the desktop/server and laptop. In general, I use the work server as the “working machine”, with an NFS mount to the desktop and backups using rsync to the second work server and the home server. The laptop is, as much as possible, a dumb terminal for SSH to other places.

I imagine that I’m not alone in having a directory under my home directory named “projects/”, with a hierarchy of project directories below that and directories below them named “perl”, “fasta”, “gnuplot” and so on. This is all very well, until I’m sat at one or the other of the machines and decide to do some work. Pretty soon, confusion reigns as I try to recall in which direction I should rsync the altered files. What I really need is a master repository from which I make working copies that I can synchronise with the master from any location. In other words – revision control.

I’ve been using CVS for a few years just for code and found it to be very beneficial and quite easy to use, especially from within emacs. However, people kept telling me that Subversion (SVN), the CVS alternative, was far superior. Then I came across a project management system called Trac, which is used by the bioinformatics guys at the IMB on our campus and interfaces with SVN. So I thought – how about using SVN + Trac to handle all of my bioinformatics project files – not just the code?

It turns out that this is a great way to organise and maintain files and share them with other people. This post explains how I set up SVN + Trac; the next post looks at some ways to use them for bioinformatics projects.

1. Preamble

I use Ubuntu 7.04 (Feisty), which currently ships with SVN 1.4.3 and Trac 0.10.3. These notes are specific to that setup, but you may be able to adapt them. There are numerous, high-quality guides on the Web that I’ve cannibalised and glued together. See my svn tags. In particular, I’d like to acknowledge Ariejan de Vroom’s website, which has excellent notes on SVN and Trac and these notes by Volodymyr Orlenko.

I’ll assume that your server is set up with Apache 2 (as easy as “sudo apt-get install apache2” on Ubuntu).

2. Install required packages

Just “sudo apt-get install ” the following if you don’t have them already: ssl-cert, subversion, subversion-tools, trac, buildbot, libapache2-svn, libapache2-mod-python, libapache2-mod-python-doc, python-setuptools.
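If you're not sure which of those packages are already present, a small check can tell you before you install anything. This is only a sketch: the helper names is_installed and missing_packages are my own, and it assumes a Debian-style system where "dpkg -s" reports package status.

```shell
#!/bin/sh
# Packages named in this step.
PACKAGES="ssl-cert subversion subversion-tools trac buildbot libapache2-svn libapache2-mod-python libapache2-mod-python-doc python-setuptools"

# is_installed PACKAGE: succeeds if dpkg reports the package as installed.
# (Hypothetical helper name, not part of any package.)
is_installed() {
    dpkg -s "$1" 2>/dev/null | grep -q '^Status: install ok installed'
}

# missing_packages PACKAGE...: print the names that are not installed, one per line.
missing_packages() {
    for pkg in "$@"; do
        is_installed "$pkg" || echo "$pkg"
    done
}

missing=$(missing_packages $PACKAGES)
if [ -n "$missing" ]; then
    echo "Not installed yet:"
    echo "$missing"
    # To install them: sudo apt-get install $missing
else
    echo "All packages are already installed."
fi
```

The script only reports; the apt-get line is left as a comment so that nothing is installed without you asking for it.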

3. Configure Apache

I decided to use SSL, which means that passwords are transmitted encrypted rather than as plain text, and that URLs begin with “https://”. Enable the Apache SSL module and open up port 443 like so:

sudo a2enmod ssl
sudo sh -c "echo 'Listen 443' >> /etc/apache2/ports.conf"
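One thing to watch with the echo command above: run it twice and you get two “Listen 443” lines. A hedged sketch of an append that only happens when the line is missing follows. The function name ensure_line is my own, and the demonstration works on a scratch file rather than the real ports.conf.

```shell
#!/bin/sh
# ensure_line LINE FILE: append LINE to FILE unless an identical line is already there.
# (Hypothetical helper; grep -x matches whole lines, -F treats LINE as a fixed string.)
ensure_line() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a scratch file standing in for /etc/apache2/ports.conf.
ports_conf=$(mktemp)
echo 'Listen 80' > "$ports_conf"
ensure_line 'Listen 443' "$ports_conf"
ensure_line 'Listen 443' "$ports_conf"   # second call changes nothing
cat "$ports_conf"
```

For the real file, the same grep-or-append pair can go inside the sudo sh -c "..." wrapper used above.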

Next, generate an SSL certificate. Some documentation mentions an Ubuntu script named “apache2-ssl-certificate”, but it seems to be absent in Feisty so do this instead:

sudo mkdir /etc/apache2/ssl
sudo make-ssl-cert /usr/share/ssl-cert/ssleay.cnf /etc/apache2/ssl/apache.pem

When creating the certificate, you’ll be asked for details about your location and organisation.

Now, create a Virtual Host from the default Apache configuration:

sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/001-ssl
sudo nano -w /etc/apache2/sites-available/001-ssl

(or your favourite editor instead of nano)

I called my file 001-ssl; call yours whatever you like. It should be pretty minimal – mine looks like this:

NameVirtualHost *:443
<VirtualHost *:443>
    ServerAdmin you@your.mail
    ErrorLog /var/log/apache2/error.log
    LogLevel warn
    CustomLog /var/log/apache2/access.log combined
    ServerSignature On
## ssl stuff
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl/apache.pem
    SSLProtocol all
    SSLCipherSuite HIGH:MEDIUM
</VirtualHost>

We’ll be adding more to that file later on when we install Trac. Enable your new virtual host:

sudo a2ensite 001-ssl
sudo /etc/init.d/apache2 restart
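A typo in the virtual host file (an unclosed tag, say) will stop Apache from coming back up, so it can be worth checking the syntax before restarting. Here is a minimal sketch, assuming “apache2ctl configtest” is available on your system. The function name restart_if_valid is my own, and both commands can be swapped for stand-ins, which is how the dry run at the end avoids touching anything.

```shell
#!/bin/sh
# restart_if_valid [CHECK_CMD] [RESTART_CMD]
# Runs the check first and only restarts when the check succeeds.
# (Hypothetical helper; the defaults assume an Ubuntu-style Apache 2 install.)
restart_if_valid() {
    check_cmd=${1:-"apache2ctl configtest"}
    restart_cmd=${2:-"/etc/init.d/apache2 restart"}
    if $check_cmd; then
        $restart_cmd
    else
        echo "Configuration check failed; Apache was not restarted." >&2
        return 1
    fi
}

# Dry run with stand-in commands, so nothing is touched:
restart_if_valid "true" "echo restart would run now"
```

Called with no arguments from a root shell, it runs the real check and restart.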

Now you need to enable basic authentication for the web server. Create a password for yourself:

sudo htpasswd -c -m /etc/apache2/dav_svn.passwd yourname

Some documentation refers to “htpasswd2” – if you don’t have it, just use htpasswd. Subsequent users can be added by omitting the -c switch.
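The -c switch deserves some care: used on an existing password file, it rewrites the file and drops the users already in it. A hedged sketch of a wrapper that adds -c only when the file doesn't exist yet is below. The name add_svn_user is my own, and the default path is simply the one used above.

```shell
#!/bin/sh
# add_svn_user NAME [PASSWD_FILE]
# Passes -c to htpasswd only when the password file does not exist yet,
# because -c on an existing file would overwrite it and drop earlier users.
# (Hypothetical helper name; htpasswd itself prompts for the password.)
add_svn_user() {
    user=$1
    passwd_file=${2:-/etc/apache2/dav_svn.passwd}
    if [ -e "$passwd_file" ]; then
        htpasswd -m "$passwd_file" "$user"
    else
        htpasswd -c -m "$passwd_file" "$user"
    fi
}

# Example (run as root, since the file lives under /etc/apache2):
#   add_svn_user yourname
#   add_svn_user colleague
```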

Finally, we enable WebDAV and SVN in Apache. Edit /etc/apache2/mods-available/dav_svn.conf – it’s well annotated so just follow the notes in the file. Mine looks like this:

<Location /svn>
  DAV svn
  SVNParentPath /home/svn
  AuthType Basic
  AuthName "Subversion Repository"
  AuthUserFile /etc/apache2/dav_svn.passwd
  Require valid-user
</Location>

A couple of notes. Use SVNParentPath if you want multiple repositories under your SVN root. Common locations for SVN repositories are /srv/svn, /var/lib/svn or /home/svn. I went with the last one; put yours where you like.

So much for Apache. Let’s get on with SVN.

4. Configure SVN

Setting up Subversion is easy. Create a directory for your repositories, make a repository and give it access permissions for the Apache user (www-data on Ubuntu):

sudo mkdir /home/svn
sudo svnadmin create /home/svn/projects
sudo chown -R www-data:www-data /home/svn/projects
sudo chmod -R g+ws /home/svn/projects

Here, I’ve called the repository “projects”. You may prefer a separate repository for each of your projects, or one repository with a hierarchy for each project. The former means more administration; the latter has the disadvantage that revision numbers will apply across files from different projects. We’ll look at that more closely in part 2.
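To see what “revision numbers apply across projects” means in practice, here is a throwaway demonstration that never touches /home/svn. It is a sketch under assumptions: the project names alpha and beta are made up, the repository lives in a temporary directory, and the whole thing is skipped if the svn tools are not installed.

```shell
#!/bin/sh
# Throwaway demo: one repository, two project directories, each added by its own commit.
youngest=""
if command -v svnadmin >/dev/null 2>&1 && command -v svn >/dev/null 2>&1; then
    demo=$(mktemp -d)
    svnadmin create "$demo/repo"
    # "svn mkdir" on a repository URL commits straight away, so each call is one revision.
    svn mkdir -q -m "Add project alpha" "file://$demo/repo/alpha"
    svn mkdir -q -m "Add project beta" "file://$demo/repo/beta"
    # svnlook youngest prints the latest revision number of the whole repository.
    youngest=$(svnlook youngest "$demo/repo")
    echo "Repository revision after two commits: $youngest"
else
    echo "svn tools not installed; skipping demo"
fi
```

The second commit only touched beta, yet the repository as a whole, alpha included, has moved on a revision. With one repository per project, each would keep its own count.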

At this point, you should be able to fire up a web browser on your server and navigate to “https://localhost/svn/projects”. It won’t look too exciting though; you should see a largely blank page with “Revision 0: /” at the top.

5. Configure Trac

Under Ubuntu, Trac installs itself in /var/lib/trac. The first step is to give it the appropriate permissions:

sudo chown -R www-data:www-data /var/lib/trac

Next, ensure that Apache mod_python is enabled:

sudo a2enmod mod_python

Trac comes with a command-line administration tool named trac-admin. Use it now to set up your Trac project directory and create an administrator user. Note that the name of the project directory is the same as the corresponding SVN repository (projects in my case):

cd /var/lib/trac
sudo trac-admin projects initenv
sudo trac-admin projects permission add yourname TRAC_ADMIN

We’re almost finished. The penultimate step is to go back to the Apache virtual host configuration file and tell Apache how to access Trac. Put the following somewhere between the <VirtualHost></VirtualHost> tags:

<Location /trac>
   SetHandler mod_python
   PythonHandler trac.web.modpython_frontend
   PythonOption TracEnvParentDir /var/lib/trac
   PythonOption TracUriRoot /trac
</Location>
<LocationMatch "/trac/[^/]+/login">
   AuthType Basic
   AuthName "Trac Authentication"
   AuthUserFile /etc/apache2/dav_svn.passwd
   Require valid-user
</LocationMatch>

A few notes about all of that. TracEnvParentDir indicates that we want multiple projects in our Trac environment. TracUriRoot can be named whatever you like, so long as it matches the path in <Location>. In the above configuration, anyone can view the Trac website, but logging in requires the same username and password that you set up in step 3. You could make access more restrictive by moving the authentication directives out of the <LocationMatch> block and into the <Location> block, so that login is required to see any Trac page. However, you can also set permissions for Trac later on, using either trac-admin or the webadmin tool.

Ah yes, the webadmin tool. Newer versions of Trac allow for administration through the web interface, but you need to install the module separately in Trac 0.10.3, the current Ubuntu Feisty version. You also require the correct version for your Trac. Ubuntu makes this easy: the python-setuptools package includes a tool called easy_install which will grab the webadmin plugin from the appropriate URL and install it:

sudo easy_install

All that’s left is to enable webadmin. Trac configuration uses a file named trac.ini, which you’ll find in /usr/share/trac/conf. Unfortunately, it only affects configuration system-wide if it lives in /etc/trac. So you have a few options: (1) copy trac.ini to /etc/trac, (2) symlink the file in /usr/share/trac/conf to /etc/trac or (3) edit the local trac.ini file for each Trac project (e.g. /var/lib/trac/projects/conf/trac.ini). Whatever you do, add these lines to trac.ini:

[components]
webadmin.* = enabled
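If you go with option (3) and have several Trac projects, editing each trac.ini by hand gets tedious. Below is a rough sketch of a function that puts the setting under a [components] section, creating the section if needed and doing nothing if the setting is already there. The name enable_webadmin is my own, it expects the exact spelling “webadmin.* = enabled”, and the demonstration runs on a scratch file rather than a real trac.ini.

```shell
#!/bin/sh
# enable_webadmin INI_FILE
# Ensures "webadmin.* = enabled" appears under a [components] section.
# (Hypothetical helper; matches the setting only in this exact spelling.)
enable_webadmin() {
    ini=$1
    # Nothing to do if the setting is already present.
    if grep -qxF 'webadmin.* = enabled' "$ini" 2>/dev/null; then
        return 0
    fi
    if grep -qxF '[components]' "$ini" 2>/dev/null; then
        # Section exists: print the setting directly after its header.
        tmp=$(mktemp)
        awk '{ print } $0 == "[components]" { print "webadmin.* = enabled" }' "$ini" > "$tmp" &&
            cat "$tmp" > "$ini"
        rm -f "$tmp"
    else
        # No such section yet: append a new one at the end of the file.
        printf '\n[components]\nwebadmin.* = enabled\n' >> "$ini"
    fi
}

# Demonstration on a scratch file standing in for a project's trac.ini.
scratch=$(mktemp)
printf '[trac]\nrepository_dir = /home/svn/projects\n' > "$scratch"
enable_webadmin "$scratch"
enable_webadmin "$scratch"   # second call is a no-op
cat "$scratch"
```

Against a real project you would call it as root with the path to that project's file, e.g. /var/lib/trac/projects/conf/trac.ini, ideally after making a backup copy.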

One last “sudo /etc/init.d/apache2 restart” for good luck and you’re there! You should now be able to navigate to “https://localhost/trac/projects”, login, see the Admin tab and view your (currently empty) SVN repository from the Browse Source tab.

Now you’re ready to start loading files into SVN and setting up your Trac website. That’s the subject of the next post.

11 thoughts on “Organise your bioinformatics projects using Subversion and Trac: part 1”

  1. Bill

    Would this be any good for files other than code — data with multiple annotations, manuscripts with multiple authors, experiment logs?

  2. nsaunders Post author

    It’s good for any type of file that you’d like to put under revision control – code, data, manuscripts. Wait for the next post :)

  3. Michael Barton

    Subversion is ideal for this type of project because it deals much better with binary files than does CVS, and often bioinformatics involves producing binary files, such as images.

    I have a similar approach to what you’re doing however I don’t use trac. I rarely use the revision feature either, but just knowing that all your files are backed up really sets your mind at ease. Plus, as you say, setting up a new computer just means checking out the subversion repository.

    I think WebSVN also has comparable functionality to trac, but is much much simpler.

    I also wrote a short script in Java, that I use with GeekTool, that just outputs to the lower left of my screen, telling me which files have been modified or are not yet under SVN control. Handy for keeping an eye on things.

  4. Andrew Perry

    Great how-to Neil. I followed the instructions almost blindly (see, I trust you enough to be r00t on my machine :)) and got it installed, with a few hiccups.

    First minor issue: After Step 4, I had to do an “/etc/init.d/apache2 restart” (or maybe reload), else I couldn’t see the SVN web interface. Frustrating just for a second.

    A few notes for those living on the bleeding edge: I’m running Gutsy on amd64 (current almost-released version, dist-upgraded) and there doesn’t seem to be any Trac stuff in /var/lib/trac. Also, once I make a trac project in that directory with appropriate permissions, the Ubuntu packaged version of Trac didn’t seem to work (apache logs indicate something wrong with the python-clearsilver bindings .. this may resolve once Gutsy is properly released). I decided to just “python ./ install” the latest dev version of Trac (0.11dev) since it no longer uses Clearsilver for templating. If I make the directory /var/trac/lib and continue with your instructions, ensuring correct ownership/permissions, it all works nicely.

    Cheers !

  5. nsaunders Post author

    Glad to be of service.

    On reflection, I don’t think /var/lib/trac is created automatically by the Ubuntu package. I’d like to try the cutting-edge Trac myself, but prefer to stick with the official repositories where I can. There’s a way to build deb packages from any source IIRC.

  6. Andrew Perry

    > There’s a way to build deb packages from any source IIRC.

    Yes, often I use checkinstall for this … it makes it much easier to uninstall / upgrade / cleanup self-compiled things using the apt package management system.

