Years as coloured bars

I keep seeing years represented by coloured bars. First it was that demographic tsunami chart. Then there are examples like the one on the right, which came up in a web search today. I even saw one (whispers) at work today.

I get what they are trying to do – illustrate trends within categories over time – but I don’t think years as coloured bars is the way to go. To me, progression over time suggests that time should be an axis, so that the eye moves along the data from one end to the other without interruption. What I want to see is categories over time, not time within categories.

So what is the way to go? Let’s ask “what would ggplot2 do?”

Chart golf: the “demographic tsunami”

“‘Demographic tsunami’ will keep Sydney, Melbourne property prices high” screams the headline.

While the census showed Australia overall is aging, there’s been a noticeable lift in the number of people aged between 25 to 32.
As the accompanying graph shows…

Whoa, that is one ugly chart. First thought: let’s not be too hard on Fairfax Media, they’ve sacked most of their real journalists and they took the chart from someone else. Second thought: if you want to visualise change over time, time as an axis rather than a coloured bar is generally a good idea.

Can we do better?

Nice graphic? Are they taking the p…

Yes, it started with a tweet:

By what measure is this a “nice graphic”? First, the JPEG itself is low-quality. Second, it contains spelling and numerical errors (more on that later). And third…do I have to spell this out…those are exploded 3D pie charts.

Can it be fixed?

Putting data on maps using R: easier than ever

New Zealand earthquake density 2010 – November 2016

Using R to add data to maps has been pretty straightforward for a few years now. That said, it seems easier than ever to do things like use map APIs (e.g. Google, OpenStreetMap), overlay quite complex data visualisations (e.g. “heatmap-style” densities) and even generate animations.

A couple of key R packages in this space: ggmap and gganimate. To illustrate, I’ve used data from the recent New Zealand earthquake to generate some static maps and an animation. Here’s the GitHub repository and a report. Thanks to Florian Teschner for a great ggmap tutorial which got me started.

My own work in bioinformatics to date has not (sadly!) required much analysis of geospatial data but I can see use cases in many areas – environmental microbiology, for example.

The y-axis: to zero or not to zero

I don’t “do politics” at this blog, but I’m always happy to do charts. Here’s one that’s been doing the rounds on Twitter recently:

What’s the first thing that comes into your mind on seeing that chart?

It seems that there are two main responses to the chart:

  1. Wow, what happened to all those Democrat voters between 2008 and 2016?
  2. Wow, that’s misleading, it makes it look like Democrat support almost halved between 2008 and 2016.

The question then is: when (if ever) is it acceptable to start a y-axis at a non-zero value?


Note to journals: “methodologically sound” applies to figures too

PeerJ, like PLoS ONE, aims to publish work on the basis of “soundness” (scientific and methodological) as opposed to subjective notions of impact, interest or significance. I’d argue that effective, appropriate data visualisation is a good measure of methodology. I’d also argue that on that basis, Evolution of a research field – a micro (RNA) example fails the soundness test.

Venn figures go wrong

6-way Venn banana

I thought nothing could top the classic “6-way Venn banana”, featured in The banana (Musa acuminata) genome and the evolution of monocotyledonous plants.

That is until I saw Figure 3 from Compact genome of the Antarctic midge is likely an adaptation to an extreme environment.

5-way Venn roadkill

What’s odd is that Figure 2 in the latter paper is a nice, clear R/ggplot2 creation, using facet_grid(), so someone knew what they were doing.

That aside, the Antarctic midge paper is an interesting read; go check it out.

This led to some amusing Twitter discussion which pointed me to *A New Rose : The First Simple Symmetric 11-Venn Diagram.


[*] +1 for referencing The Damned, if indeed that was the intention.

The Life Scientists at FriendFeed: 2009 summary

The Life Scientists 2009


It’s Christmas Eve tomorrow and so I declare the year over. My Christmas gift to you is a summary of activity in 2009 at the FriendFeed Life Scientists group. It’s crafted using R + Ruby, with raw data and some code snippets available. If you want to see the most popular items from the group this year, head down to the bottom of this post.

(Note: this post is a work in progress)

Easy visualisation of database schemas using SQLFairy

BioSQL schema


Here’s a common problem solved: how to generate a pretty picture of your database schema. A Google search throws up all manner of home-brewed solutions using graphviz, perl scripts and so on. Or you can make life easier and simply install SQLFairy.

Under Ubuntu: as simple as “sudo apt-get install sqlfairy”.

Next, dump your database tables, e.g. for MySQL:

mysqldump -u username -p -d mydatabase > mydatabase.sql

Finally, for a PNG image of your schema:

sqlt-graph -f MySQL -o mydatabase.png -t png mydatabase.sql

Too easy. Example shown is the BioSQL schema.

Update: if your schema lacks explicit foreign keys, try the --natural-join options (man sqlt-graph, man sqlt-diagram).
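
To try the natural-join option without touching a real database, here is a rough sketch. The file name demo_schema.sql and the author/post tables are made up for illustration, and I'm assuming the natural join links tables on the shared author_id column name; check man sqlt-graph for the exact behaviour on your version.

```shell
#!/bin/sh
# Hypothetical two-table schema with no explicit FOREIGN KEY constraints.
# The only link between the tables is the shared column name author_id.
schema_file="demo_schema.sql"

cat > "$schema_file" <<'SQL'
CREATE TABLE author (
  author_id INT NOT NULL PRIMARY KEY,
  name VARCHAR(100)
);
CREATE TABLE post (
  post_id INT NOT NULL PRIMARY KEY,
  author_id INT NOT NULL,
  title VARCHAR(200)
);
SQL

# Only attempt the drawing step if SQLFairy is installed.
if command -v sqlt-graph >/dev/null 2>&1; then
  # Same command as above, plus --natural-join to infer links between tables.
  sqlt-graph -f MySQL -o demo_schema.png -t png --natural-join "$schema_file"
else
  echo "sqlt-graph not found; install sqlfairy first" >&2
fi
```

If it works, demo_schema.png should show the two tables joined by an edge; drop --natural-join and run it again to compare.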