R: calculations involving months

Ask anyone how much time has elapsed since September last year and they’ll probably start counting on their fingers: “October, November…” and tell you “just over 9 months.”

So, when faced as I was today with a data frame (named dates) like this:

pmid1       year1    month1     pmid2      year2    month2
21355427    2010     Dec        21542215   2011     Mar
21323727    2011     Feb        21521365   2011     Jun
21297532    2011     Feb        21336080   2011     Mar
21291296    2011     Apr        21591868   2011     Jun
...

how do you add a 7th column containing the number of months between “year1/month1” and “year2/month2”?

R has lots of methods for date/time calculations, many of which are notoriously difficult to get your head around. One answer to my problem came, as so often, from the R-help mailing list. I liked the title of the question too: “difference between 2 dates: IN MONTHS the way Mothers compute it”. It uses the zoo package and goes like this:

library(zoo)
dates$months <- 12 * as.numeric(as.yearmon(paste(dates$year2, dates$month2, sep = "-"), "%Y-%b") - as.yearmon(paste(dates$year1, dates$month1, sep = "-"), "%Y-%b"))

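Rather ungainly and difficult to read, but it works. The trick is to paste year and month together to generate, e.g., “2011-Jun”, then tell as.yearmon() the date format: %Y = four-digit year (e.g. 2011), %b = abbreviated month name (e.g. Jun). A yearmon value is stored as a year plus a fraction of a year, so the difference comes out in years; multiplying by 12 converts it to months. Result: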

  pmid1     year1  month1  pmid2     year2  month2  months
1 21355427  2010   Dec     21542215  2011   Mar     3
2 21323727  2011   Feb     21521365  2011   Jun     4
3 21297532  2011   Feb     21336080  2011   Mar     1
4 21291296  2011   Apr     21591868  2011   Jun     2
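
As a cross-check that does not depend on zoo, the same count can be sketched in base R by turning each month abbreviation into its number with the built-in month.abb constant. This is an alternative to the R-help answer, not a restatement of it, and it assumes English three-letter abbreviations. The helper name months_between is made up for illustration, and the small data frame simply re-enters the four rows shown above.

```r
# Hypothetical helper: whole months from (y1, m1) to (y2, m2),
# where m1 and m2 are English abbreviations such as "Dec" or "Mar".
months_between <- function(y1, m1, y2, m2) {
  # match() gives each abbreviation's position in month.abb (Jan = 1, ..., Dec = 12)
  12 * (y2 - y1) + (match(m2, month.abb) - match(m1, month.abb))
}

# The four example rows from the post (pmid columns omitted)
dates <- data.frame(
  year1  = c(2010, 2011, 2011, 2011),
  month1 = c("Dec", "Feb", "Feb", "Apr"),
  year2  = c(2011, 2011, 2011, 2011),
  month2 = c("Mar", "Jun", "Mar", "Jun")
)

dates$months <- with(dates, months_between(year1, month1, year2, month2))
dates$months
```

For these rows the new column holds 3, 4, 1 and 2, the same values as in the table above. Because the arithmetic stays in whole numbers, there is no fractional-year rounding to worry about.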

Note to self: improve understanding of R dates and times.

4 thoughts on “R: calculations involving months”

  1. This uses the same basic idea but improves on the code so it's not so ungainly:

    library(zoo)
    to.ym <- function(y, m) as.yearmon(paste(y, m), "%Y %b")
    12 * with(dates, to.ym(year2, month2) - to.ym(year1, month1))

    Also, it's almost never a good idea to represent dates using multiple columns in a data frame, and that is why the code is ungainly, not because of yearmon. Suppose we had represented them as in DF below. Then we could have reduced it to just this:

    library(zoo)
    DF <- data.frame(d1 = "2010-01", d2 = "2011-02")
    12 * with(DF, as.yearmon(d2) - as.yearmon(d1))

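
Following the commenter's advice about single-column dates, here is a minimal base R sketch of how separate year and month columns might be collapsed into “YYYY-MM” strings of the kind used in DF. It assumes English month abbreviations, and the helper name to_ym_string is hypothetical.

```r
# Hypothetical helper: build "YYYY-MM" strings from a year and a month abbreviation
to_ym_string <- function(y, m) {
  # match() looks up the month number in the built-in month.abb vector;
  # %02d pads it to two digits, so "Feb" becomes "02"
  sprintf("%d-%02d", as.integer(y), match(m, month.abb))
}

# Two of the example rows, stored as one column per date
DF <- data.frame(
  d1 = to_ym_string(c(2010, 2011), c("Dec", "Feb")),
  d2 = to_ym_string(c(2011, 2011), c("Mar", "Jun"))
)
DF
```

Strings in this form have the same shape as the d1 and d2 values in the comment's example, where they are passed to as.yearmon() without a format argument.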