Years as coloured bars

I keep seeing years represented by coloured bars. First it was that demographic tsunami chart. Then there are examples like the one on the right, which came up in a web search today. I even saw one (whispers) at work today.

I get what they are trying to do – illustrate trends within categories over time – but I don’t think years as coloured bars is the way to go. To me, progression over time suggests that time should be an axis, so that the eye moves along the data from one end to the other without interruption. What I want to see is categories over time, not time within categories.

So what is the way to go? Let’s ask “what would ggplot2 do?”

The following charts illustrate different ways to visualise the same data using ggplot2. My motivation here is to show you that if you don’t immediately know or see a “right way” to do something, tools such as ggplot2 make it easy to “feel your way” to a solution, through exploration.

The charts and their accompanying code are available at GitHub. Click each image at right for a full-size version.

To start, we can create our own “years as coloured bars” chart, using some toy data.

It looks better already just for being generated using ggplot2. But can we go better?
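As a rough sketch of that starting point, something like the following would do. The data frame `toy`, its column names, the seed and the value range are all invented here for illustration and are not the data behind the original charts.

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# Years as coloured bars: categories on the x-axis, one dodged bar per year
p <- ggplot(toy, aes(x = category, y = value, fill = factor(year))) +
  geom_col(position = "dodge") +
  labs(fill = "year")
p
```

Wrapping `year` in `factor()` gives each year its own discrete colour rather than a continuous gradient.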

Your first thought might be “why not just swap the years and categories around?” And sure, that gives us time along an axis. Now though, it’s a little difficult to follow each category, as the eye has to skip all the others when moving to the next time point.
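A minimal sketch of the swap, assuming an invented data frame `toy` with columns `category`, `year` and `value` (illustrative names and values only):

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# Swap the roles: years on the x-axis, one dodged bar per category
p <- ggplot(toy, aes(x = factor(year), y = value, fill = category)) +
  geom_col(position = "dodge") +
  labs(x = "year")
p
```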

OK you say, I can get all the categories at the same time point by stacking. A couple of problems now: first, abrupt changes in value can make a category shrink dramatically or move around vertically in a distracting fashion. And second, stacking makes it difficult to determine the absolute values for anything other than the lowest row of the bars, since only that row starts from zero.
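Stacking could be sketched like this, again on made-up data (the `toy` data frame and its columns are assumptions for the example):

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# geom_col() stacks by default, so each bar's height is the yearly total
p <- ggplot(toy, aes(x = factor(year), y = value, fill = category)) +
  geom_col() +
  labs(x = "year")
p
```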

What if we stack and fill so all the bars add up to 100%? Better, but still has some of the issues with the previous plot.
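One way to get the 100% version is `position = "fill"`. The sketch below uses an invented `toy` data frame (names and values are illustrative, not the original data):

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# position = "fill" rescales each stacked bar so it sums to 1
p <- ggplot(toy, aes(x = factor(year), y = value, fill = category)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "year", y = "share of yearly total")
p
```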

How about filling, but using an area plot instead of bars?

I think this works much better: the continuous connection of categories makes it easier to follow each one through time. However, there is still the issue of relative versus absolute values. And to my eye, downward lines can be interpreted as decreases even when the width of the area for a category indicates that the value has increased.
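A filled area plot along these lines might look like the sketch below. The `toy` data frame is invented for the example, and `year` is kept numeric so the areas can connect across time points.

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# Filled areas: each category is a continuous band, scaled to sum to 1
p <- ggplot(toy, aes(x = year, y = value, fill = category)) +
  geom_area(position = "fill") +
  labs(y = "share of yearly total")
p
```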

Coloured lines? They can work – in parallel coordinates for example – but in this case, the overlaps make a mess.
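For reference, a coloured-lines version could be sketched as follows, using a made-up `toy` data frame (an assumption for illustration only):

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# One line per category, all sharing a single panel
p <- ggplot(toy, aes(x = year, y = value, colour = category)) +
  geom_line()
p
```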

Let’s start thinking about facets: separate regions within the chart for each category. First attempt.

Now we’re getting somewhere – much easier to follow each category over time. One issue with this particular arrangement is that it’s a little difficult to compare categories and the eye finds it difficult to isolate a facet from surrounding facets.
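A facet grid can be laid out in several ways. The sketch below assumes one row per category with bars for years, which may differ from the arrangement in the chart, and it uses an invented `toy` data frame.

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# facet_grid(category ~ .) puts each category in its own row
p <- ggplot(toy, aes(x = factor(year), y = value)) +
  geom_col() +
  facet_grid(category ~ .) +
  labs(x = "year")
p
```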

How about wrapping facets, instead of a grid? Let’s try first with facets by year and bars for categories.

Not bad at all – this makes categories within a year very clear. But wait, we wanted a good view of each category over time. So how about…
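Wrapped facets by year could be sketched like this, with a made-up `toy` data frame standing in for the real data:

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# One wrapped panel per year, one bar per category within each panel
p <- ggplot(toy, aes(x = category, y = value, fill = category)) +
  geom_col() +
  facet_wrap(~ year)
p
```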

…facet by category using lines for years.

I think we have a winner. This clearly illustrates change per category over time and the layout and common scales even allow for comparison between categories.
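A minimal sketch of this layout, assuming an invented `toy` data frame with columns `category`, `year` and `value`:

```r
library(ggplot2)

# Hypothetical toy data: 5 categories x 5 years, random values
set.seed(1)
toy <- expand.grid(category = LETTERS[1:5], year = 2008:2012)
toy$value <- sample(10:100, nrow(toy), replace = TRUE)

# One wrapped panel per category, a single line through the years in each
p <- ggplot(toy, aes(x = year, y = value)) +
  geom_line() +
  facet_wrap(~ category)
p
```

`facet_wrap()` uses fixed scales by default, so every panel shares the same axes and the categories stay comparable.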

We started out complaining about time within categories, but in fact that is what we wanted after all: just arranged in a better way than years as coloured bars.

4 thoughts on “Years as coloured bars”

  1. I think for the airline travel example, the point is probably the seasonal trend (with years as cases) rather than the interannual trend, so facet by year with bars for categories (your second last plot) works better than facet by category (last plot) in that case. Or else two plots: a boxplot or similar with month as category, plus a second plot to show the longer-term trend over time.

    • I just grabbed that image from the web at random for a “years by bars” example, so have not given it much thought. But you make a good point: the key thing is to isolate the aspect of the data that you want to highlight and go from there.

  2. I like the stacked bars the best. I don’t think the final result is actually very intuitive for comparing categories with one another.
