As Mike keeps reminding me, getting your data into database tables is A Good Thing. Like many people, my database of choice is MySQL – largely because it was the first one that I tried and it works for me.
However, I’m far from being an expert MySQL user. In fact, I’ve identified 3 stages in my use of MySQL over the years; see if you recognise yourself in any of them.
- The single table database
- …Discovery of the join
- The optimised database
Early in your MySQL career, you discover either “LOAD DATA INFILE” or the command-line tool “mysqlimport”. You realise that this is a quick and easy way to get all of those delimited text files straight into a database. Off you go, merrily importing everything on your hard drive into tables.
At some stage, you begin to wonder whether listing the same identifier (gene name, PDB ID, species…) many times in multiple columns is sensible. This eureka moment leads you to…
At some stage, you get your head around joining tables. At first, you can’t quite see the point – isn’t it easier to devise a query on a single table? Maybe so, but it’s inefficient and severely limits the combinations of data that your query can return. It’s a slow realisation but pretty soon, you’re chopping up your data into multiple tables and joining all over the place. Including examples where you probably don’t need to do so.
You begin to wonder why your MySQL queries are so slow. Sure, you have a table with 20 million rows, but your computer does everything else within seconds, so what’s up with MySQL?
You head to the MySQL documentation website, try “optimization” as a query and discover a whole new world. OK, the various “my.ini” example files made little or no difference, but that 200x speed increase when you indexed multiple keys was definitely worth it.
Just starting out in MySQL? My advice: head over to Chapter 7: Optimization right now, skim through and if it doesn’t make complete sense yet, bookmark it for later.


