Like most people with computer-related jobs, I work on numerous machines. At work, there’s the all-purpose server, the personal desktop machine, a second server for testing/backup and, should I need it, a Linux cluster. At home there’s the desktop/server and laptop. In general, I use the work server as the “working machine”, with an NFS mount to the desktop and backups using rsync to the second work server and the home server. The laptop is, as much as possible, a dumb terminal for SSH to other places.
I imagine that I’m not alone in having a directory under my home directory named “projects/”, with a hierarchy of project directories below that and directories below them named “perl”, “fasta”, “gnuplot” and so on. This is all very well, until I’m sat at one or the other of the machines and decide to do some work. Pretty soon, confusion reigns as I try to recall in which direction I should rsync the altered files. What I really need is a master repository from which I make working copies that I can synchronise with the master from any location. In other words – revision control.
I’ve been using CVS for a few years just for code and found it to be very beneficial and quite easy to use, especially from within emacs. However, people kept telling me that Subversion (SVN), the CVS alternative, was far superior. Then I came across a project management system called Trac, which is used by the bioinformatics guys at the IMB on our campus and interfaces with SVN. So I thought – how about using SVN + Trac to handle all of my bioinformatics project files – not just the code?
It turns out that this is a great way to organise and maintain files and share them with other people. This post explains how I set up SVN + Trac; the next post looks at some ways to use them for bioinformatics projects.
1. Preamble
I use Ubuntu 7.04 (Feisty), which currently ships with SVN 1.4.3 and Trac 0.10.3. These notes are specific to that setup, but you may be able to adapt them. There are numerous, high-quality guides on the Web that I’ve cannibalised and glued together. See my del.icio.us svn tags. In particular, I’d like to acknowledge Ariejan de Vroom’s website, which has excellent notes on SVN and Trac and these notes by Volodymyr Orlenko.
I’ll assume that your server is set up with Apache 2 (as easy as “sudo apt-get install apache2” on Ubuntu).
2. Install required packages
Just “sudo apt-get install ” the following if you don’t have them already: ssl-cert, subversion, subversion-tools, trac, buildbot, libapache2-svn, libapache2-mod-python, libapache2-mod-python-doc, python-setuptools.
3. Configure Apache
I decided to use SSL, which means that passwords are transmitted encrypted rather than as plain text, and that URLs begin with “https://”. Enable the Apache SSL module and open up port 443 like so:
sudo a2enmod ssl
sudo sh -c "echo 'Listen 443' >> /etc/apache2/ports.conf"
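If you run that second command more than once, you end up with duplicate “Listen 443” lines in ports.conf. As a minimal sketch, here is one way to append the line only when it is missing. The helper name ensure_line is my own invention, not something that ships with Apache or Ubuntu, and the demonstration writes to a scratch file rather than the real ports.conf.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE only if no identical line exists.
# (ensure_line is a hypothetical helper name, not part of Apache or Ubuntu.)
ensure_line() {
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demonstration on a scratch file instead of /etc/apache2/ports.conf
demo_conf=$(mktemp)
printf 'Listen 80\n' > "$demo_conf"
ensure_line "$demo_conf" 'Listen 443'
ensure_line "$demo_conf" 'Listen 443'   # second call changes nothing
cat "$demo_conf"
rm -f "$demo_conf"
```

To use it for real, you would call ensure_line on /etc/apache2/ports.conf from a script run as root.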
Next, generate an SSL certificate. Some documentation mentions an Ubuntu script named “apache2-ssl-certificate”, but it seems to be absent in Feisty so do this instead:
sudo mkdir /etc/apache2/ssl
sudo make-ssl-cert /usr/share/ssl-cert/ssleay.cnf /etc/apache2/ssl/apache.pem
When creating the certificate, you’ll be asked for details about your location and organisation.
Now, create a Virtual Host from the default Apache configuration:
sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/001-ssl
sudo nano -w /etc/apache2/sites-available/001-ssl
(Use your favourite editor instead of nano if you prefer.)
I called my file 001-ssl; call yours whatever you like. It should be pretty minimal – mine looks like this:
NameVirtualHost *:443
<VirtualHost *:443>
    ServerAdmin you@your.mail
    ServerName your.server.com
    ErrorLog /var/log/apache2/error.log
    LogLevel warn
    CustomLog /var/log/apache2/access.log combined
    ServerSignature On
    ## ssl stuff
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl/apache.pem
    SSLProtocol all
    SSLCipherSuite HIGH:MEDIUM
</VirtualHost>
We’ll be adding more to that file later on when we install Trac. Enable your new virtual host:
sudo a2ensite 001-ssl
sudo /etc/init.d/apache2 restart
Now you need to enable basic authentication for the web server. Create a password for yourself:
sudo htpasswd -c -m /etc/apache2/dav_svn.passwd yourname
Some documentation refers to “htpasswd2” – if you don’t have it, just use htpasswd. Subsequent users can be added by omitting the -c switch.
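The -c switch creates the password file and will overwrite an existing one, so it is worth being careful when adding later users. The following is only a sketch of one way to guard against that: a small function (add_user_cmd is a hypothetical name of mine) that prints the htpasswd command to run, including -c only when the file does not exist yet. It prints the command rather than running it.

```shell
#!/bin/sh
# add_user_cmd FILE USER: print the htpasswd command for adding USER to FILE.
# -c (create) is only included when FILE does not exist, because -c would
# overwrite an existing password file. -m selects MD5 hashing, as above.
# (add_user_cmd is a hypothetical helper, not part of Apache.)
add_user_cmd() {
    if [ -e "$1" ]; then
        echo "htpasswd -m $1 $2"
    else
        echo "htpasswd -c -m $1 $2"
    fi
}

# Show what would be run for a second user; nothing is executed here.
add_user_cmd /etc/apache2/dav_svn.passwd anotheruser
```

Prefix the printed command with sudo when you run it on the server.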
Finally, we enable WebDAV and SVN in Apache. Edit /etc/apache2/mods-available/dav_svn.conf – it’s well annotated so just follow the notes in the file. Mine looks like this:
<Location /svn>
    DAV svn
    SVNParentPath /home/svn
    AuthType Basic
    AuthName "Subversion Repository"
    AuthUserFile /etc/apache2/dav_svn.passwd
    Require valid-user
    SSLRequireSSL
</Location>
A couple of notes. Use SVNParentPath if you want multiple repositories under your SVN root; SVNPath, by contrast, points at a single repository. Common locations for SVN repositories are /srv/svn, /var/lib/svn or /home/svn. I went with the last one; put yours where you like.
So much for Apache. Let’s get on with SVN.
4. Configure SVN
Setting up Subversion is easy. Create a directory for your repositories, make a repository and give it access permissions for the Apache user (www-data on Ubuntu):
sudo mkdir /home/svn
sudo svnadmin create /home/svn/projects
sudo chown -R www-data.www-data /home/svn/projects
sudo chmod -R g+ws /home/svn/projects
Here, I’ve called the repository “projects”. You may prefer a separate repository for each of your projects, or one repository with a hierarchy for each project. The former means more administration; the latter has the disadvantage that revision numbers are shared across the whole repository, so a commit to one project bumps the revision number for all of them. We’ll look at that more closely in part 2.
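If you go for one repository per project, the setup steps are the same for each one, so they are easy to script. Here is a rough sketch, assuming the /home/svn parent directory used above. The function name repo_setup_cmds and the two project names are made up for illustration. It only prints the commands, so you can read them before running anything.

```shell
#!/bin/sh
# repo_setup_cmds PARENT NAME...: print the commands that would create one
# repository per NAME under PARENT, using the same three steps as above.
# (repo_setup_cmds and the project names below are hypothetical.)
repo_setup_cmds() {
    parent=$1
    shift
    for name in "$@"; do
        echo "svnadmin create $parent/$name"
        echo "chown -R www-data.www-data $parent/$name"
        echo "chmod -R g+ws $parent/$name"
    done
}

# Print the plan for two example projects; nothing is executed here.
repo_setup_cmds /home/svn kinases proteome
```

Once you are happy with the output, you could pipe it to “sudo sh”. With SVNParentPath set to /home/svn, each repository then appears under /svn/ followed by its name.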
At this point, you should be able to fire up a web browser on your server and navigate to “https://localhost/svn/projects”. It won’t look too exciting though; you should see a largely blank page with “Revision 0: /” at the top.
5. Configure Trac
Under Ubuntu, Trac installs itself in /var/lib/trac. The first step is to give it the appropriate permissions:
sudo chown -R www-data.www-data /var/lib/trac
Next, ensure that Apache mod_python is enabled:
sudo a2enmod mod_python
Trac comes with a command-line administration tool named trac-admin. Use it now to set up your Trac project directory and create an administrator user. Note that the name of the project directory is the same as the corresponding SVN repository (projects in my case):
cd /var/lib/trac
sudo trac-admin projects initenv
sudo trac-admin projects permission add yourname TRAC_ADMIN
We’re almost finished. The penultimate step is to go back to the Apache virtual host configuration file and tell Apache how to access Trac. Put the following somewhere between the <VirtualHost></VirtualHost> tags:
<Location /trac>
    SetHandler mod_python
    PythonHandler trac.web.modpython_frontend
    PythonOption TracEnvParentDir /var/lib/trac
    PythonOption TracUriRoot /trac
</Location>
<LocationMatch "/trac/[^/]+/login">
    AuthType Basic
    AuthName "Trac Authentication"
    AuthUserFile /etc/apache2/dav_svn.passwd
    Require valid-user
    SSLRequireSSL
</LocationMatch>
A few notes about all of that. TracEnvParentDir indicates that we want multiple projects in our Trac environment. TracUriRoot can be named whatever you like, so long as it matches the name in <Location>. In the above configuration, anyone can view the Trac website but login requires the same user/password that you set up in step 3. You could make access more restrictive by removing the <LocationMatch> tags, so that login is required to see any Trac page. However, you can also set permissions for Trac later on, using either trac-admin or the webadmin tool.
Ah yes, the webadmin tool. Newer versions of Trac allow for administration through the web interface, but you need to install the module separately in Trac 0.10.3, the current Ubuntu Feisty version. You also require the correct version for your Trac. Ubuntu makes this easy: the python-setuptools package includes a tool called easy_install which will grab the webadmin plugin from the appropriate URL and install it:
sudo easy_install http://svn.edgewall.com/repos/trac/sandbox/webadmin
All that’s left is to enable webadmin. Trac configuration uses a file named trac.ini, which you’ll find in /usr/share/trac/conf. Unfortunately, it only affects configuration system-wide if it lives in /etc/trac. So you have a few options: (1) copy trac.ini to /etc/trac, (2) symlink the file in /usr/share/trac/conf to /etc/trac or (3) edit the local trac.ini file for each Trac project (e.g. /var/lib/trac/projects/conf/trac.ini). Whatever you do, add these lines to trac.ini:
[components]
webadmin.* = enabled
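If you prefer to script that edit, here is a minimal sketch. It assumes you want the setting added exactly once, whether or not trac.ini already has a [components] section. The function name enable_webadmin is my own, and the demonstration works on a scratch file rather than a real trac.ini.

```shell
#!/bin/sh
# enable_webadmin INI: make sure INI contains "webadmin.* = enabled"
# under a [components] section, without adding it twice.
# (enable_webadmin is a hypothetical helper, not part of Trac.)
# The body runs in a subshell, so "exit" only leaves the function.
enable_webadmin() (
    ini=$1
    # Nothing to do if the setting is already present
    grep -q '^webadmin\.\* *= *enabled' "$ini" 2>/dev/null && exit 0
    if grep -q '^\[components\]' "$ini" 2>/dev/null; then
        # Section exists: insert the setting directly below its header
        tmp=$(mktemp)
        awk '{ print } /^\[components\]/ { print "webadmin.* = enabled" }' "$ini" > "$tmp"
        cat "$tmp" > "$ini"   # overwrite in place, keeping ownership and mode
        rm -f "$tmp"
    else
        # No such section yet: append it at the end of the file
        printf '\n[components]\nwebadmin.* = enabled\n' >> "$ini"
    fi
)

# Demonstration on a scratch file
demo_ini=$(mktemp)
printf '[trac]\n' > "$demo_ini"
enable_webadmin "$demo_ini"
enable_webadmin "$demo_ini"   # second call leaves the file unchanged
cat "$demo_ini"
rm -f "$demo_ini"
```

On the server you would point it at whichever trac.ini you chose above, running the script as root.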
One last “sudo /etc/init.d/apache2 restart” for good luck and you’re there! You should now be able to navigate to “https://localhost/trac/projects”, log in, see the Admin tab and view your (currently empty) SVN repository from the Browse Source tab.
Now you’re ready to start loading files into SVN and setting up your Trac website. That’s the subject of the next post.
Would this be any good for files other than code — data with multiple annotations, manuscripts with multiple authors, experiment logs?
It’s good for any type of file that you’d like to put under revision control – code, data, manuscripts. Wait for the next post :)
Hi Neil
I can help you with the post if you are interested. Drop me a note if you need a hand.
Paulo
Subversion is ideal for this type of project because it deals much better with binary files than does CVS, and often bioinformatics involves producing binary files, such as images.
I have a similar approach to what you’re doing however I don’t use trac. I rarely use the revision feature either, but just knowing that all your files are backed up really sets your mind at ease. Plus, as you say, setting up a new computer just means checking out the subversion repository.
I think WebSVN also has comparable functionality to trac, but is much much simpler.
I also wrote a short script in Java, that I use with GeekTool, that just outputs to the lower left of my screen, telling me which files have been modified or are not yet under SVN control. Handy for keeping an eye on things.
Great tutorial Neil, thanks.
With emacs I always used RCS but I was thinking of moving to GIT ( http://git.or.cz/ ) backed up by Linus. I am not sure if trac supports it.
GIT can be downloaded from http://www.kernel.org/pub/software/scm/git/ and tutorial is available at http://www.kernel.org/pub/software/scm/git/docs/tutorial.html .
Great how-to Neil. I followed the instructions almost blindly (see, I trust you enough to be r00t on my machine :)) and got it installed, with a few hiccups.
First minor issue: After Step 4, I had to do an “/etc/init.d/apache2 restart” (or maybe reload), else I couldn’t see the SVN web interface. Frustrating just for a second.
A few notes for those living on the bleeding edge: I’m running Gutsy on amd64 (current almost-released version, dist-upgraded) and there doesn’t seem to be any Trac stuff in /var/lib/trac. Also, once I make a trac project in that directory with appropriate permissions, the Ubuntu packaged version of Trac didn’t seem to work (apache logs indicate something wrong with the python-clearsilver bindings .. this may resolve once Gutsy is properly released). I decided to just “python ./setup.py install” the latest dev version of Trac (0.11dev) since it no longer uses Clearsilver for templating. If I make the directory /var/lib/trac and continue with your instructions, ensuring correct ownership/permissions, it all works nicely.
Cheers !
Glad to be of service.
On reflection, I don’t think /var/lib/trac is created automatically by the Ubuntu package. I’d like to try the cutting-edge Trac myself, but prefer to stick with the official repositories where I can. There’s a way to build deb packages from any source IIRC.
> There’s a way to build deb packages from any source IIRC.
Yes, often I use checkinstall for this … it makes it much easier to uninstall / upgrade / cleanup self-compiled things using the apt package management system.
thank you very much..
your tutorial really helps me