The recent ABC News article “Australia’s pollution mapped by postcode reveals nation’s dirty truth” is interesting. It contains a searchable table, which is useful if you want to look up your own suburb. However, I was left wanting more: specifically, the raw data and some nice maps.
So here’s how I got them, using R.
You can file this one under “I may have the very specific solution if you’re having exactly the same problem.”
So: if you’re running some R code and you see a warning like this:
In checkMatrixPackageVersion() : Package version inconsistency detected.
TMB was built with Matrix version 1.2.14
Current Matrix version is 1.2.15
Please re-install 'TMB' from source using
install.packages('TMB', type = 'source') or ask CRAN for a binary
version of 'TMB' matching CRAN's 'Matrix' package
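The warning already names the fix, so the only thing worth sketching is the version comparison behind it. The helper name `matrix_mismatch()` is made up for this example and is not part of TMB or Matrix; the version strings are the ones from the warning above.

```r
# Hypothetical helper (not part of TMB or Matrix): compare the Matrix
# version a package was built against with the version now installed.
matrix_mismatch <- function(built, current) {
  package_version(built) != package_version(current)
}

# Versions taken from the warning message
if (matrix_mismatch("1.2.14", "1.2.15")) {
  message("Mismatch: try install.packages('TMB', type = 'source')")
}

# To see the Matrix version on your own machine:
# packageVersion("Matrix")
```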
“Sydney’s congestion at ‘tipping point’” blares the headline and to illustrate, an interactive chart with bars for city population densities, points for commute times and of course, dual-axes.
[Figure: Dual-axes at tipping-point]
Yuck. OK, I guess it does show that Sydney is one of three cities that are low density, but have comparable average commute times to higher-density cities. But if you’re plotting commute time versus population density…doesn’t a different kind of chart come to mind first? y versus x. C’mon.
I love it when researchers take the time to share their knowledge of the computational tools that they use. So first, let me point you at Environmental Computing, a site run by environmental scientists at the University of New South Wales, which has a good selection of R programming tutorials.
One of these is Making maps of your study sites. It was written with the specific purpose of generating simple, clean figures for publications and presentations, which it achieves very nicely.
I’ll be honest: the sole motivator for this post is that I thought it would be fun to generate the map using Leaflet for R as an alternative. You might use Leaflet if you want:
- An interactive map that you can drag, zoom and click for popup information
- A “fancier” static map with geographical features of interest
- Concise, clean code that uses pipes and doesn’t require you to process shapefiles
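To give a flavour of that last point, here is a minimal sketch of a piped Leaflet map. It assumes the leaflet package is installed; the coordinates are approximate ones for UNSW Sydney, picked for illustration and not taken from the original tutorial.

```r
library(leaflet)

# Illustrative site only: approximate coordinates for UNSW Sydney,
# chosen for this sketch and not taken from the original tutorial.
site_map <- leaflet() %>%
  addTiles() %>%                                   # default OpenStreetMap tiles
  setView(lng = 151.23, lat = -33.92, zoom = 14) %>%
  addMarkers(lng = 151.23, lat = -33.92, popup = "Example study site")

# Typing site_map at the console opens the interactive map
```

No shapefiles are involved: the base map comes from the tile layer and the marker is added directly from longitude and latitude.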
The code that generated the report (which I’ve used heavily and written about before) is at Github too. A few changes were required compared with previous reports, due to changes in the rtweet package and a weird issue with kable tables breaking markdown headers.
I love that the most popular media attachment is a screenshot of a Github repo.
“Some R functions have an awful lot of arguments”, you think to yourself. “I wonder which has the most?”
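One rough way to start answering that, using only base R. This is a sketch rather than a definitive method: `count_args()` is a name invented here, it only looks at one attached package at a time, and primitive functions are counted as zero because `formals()` returns NULL for them.

```r
# Hypothetical helper: count the formal arguments of every function
# exported by an attached package, sorted from most to fewest.
count_args <- function(pkg) {
  env <- as.environment(paste0("package:", pkg))
  objs <- mget(ls(env), envir = env)
  funs <- Filter(is.function, objs)
  # formals() is NULL for primitives, so they are counted as 0
  counts <- vapply(funs, function(f) length(formals(f)), integer(1))
  sort(counts, decreasing = TRUE)
}

# Top five functions by argument count in the stats package
top <- head(count_args("stats"), 5)
print(top)
```

Running the same helper over every installed package would be the obvious next step, at the cost of loading each one.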
A brief message for anyone who uses my PubMed retractions report. It’s no longer available at RPubs; instead, you will find it here at Github. Github pages hosting is great, once you figure out that docs/ corresponds to your web root :)
Now I really must update the code and try to make it more interesting than a bunch of bar charts.
If you still follow my Twitter feed – I pity you, as it’s been rather boring of late, consisting largely of Github commit messages, many including the words “knit to github document”.
Here’s why. RPubs, an early offering from RStudio, has been a great platform for easy and free publishing of HTML documents generated from RMarkdown and written in RStudio. That said, it’s always been very basic (e.g. no way to organise documents by content, tags). There’s been no real development of the platform for several years and of late, I’ve noticed it’s become less reliable. There are bugs, such as one document overwriting another when published from RStudio.
I think it’s unlikely that issues will be addressed, given that RStudio are now focused on RStudio Connect. So I’ve removed as many documents as I can and rewritten them as Github documents. These render as HTML when pushed to Github, generating attractive reports. Here’s an example.
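For anyone wanting to do the same, the conversion can be sketched as a single call, assuming the rmarkdown package and pandoc are installed. The wrapper name and the file name are placeholders. The `github_document` format writes a Markdown file, which Github then displays as a formatted page.

```r
# Hypothetical wrapper; "report.Rmd" below is a placeholder file name.
knit_to_github <- function(rmd_file) {
  # github_document writes a Markdown (.md) file that Github displays formatted
  rmarkdown::render(rmd_file, output_format = "github_document")
}

# Usage (requires the rmarkdown package and pandoc):
# knit_to_github("report.Rmd")
```

The same result comes from setting `output: github_document` in the YAML header of the document and knitting from RStudio.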
I’ve done my best to update all blog posts here with links to the new reports. If you do come across old broken links to RPubs reports, just remember that the content is probably now at Github.
PubMed Commons, the NCBI’s experiment in comments for PubMed articles, has been discontinued. Thoroughly too, with all traces of it expunged from the NCBI website.
Last time I wrote about the service, I concluded “all it needs now is more active users, more comments per user and a real API.” None of those things happened. Result: “NIH has decided that the low level of participation does not warrant continued investment in the project, particularly given the availability of other commenting venues.”
NLM also write that “all comments are archived on our FTP site.” A CSV file is available at this location. So is it good for anything?
You know the drill by now. Grab the tweets. Generate the report using RMarkdown. Push to Github. Publish the report.
This time it’s the Australian Bioinformatics & Computational Biology Society Conference 2017, including the COMBINE symposium. Looks like a good time was had by all in Adelaide.
A couple of quirks this time around. First, the rtweet package went through a brief phase of returning lists instead of nice data frames. I hope that’s been discarded as a bad idea :) There also seem to be additional columns, new column names and list-columns in the output from the latest search_tweets(), so there goes my previous code…
Second, given that most Twitter users have had 280 characters since about November 7, is this reflected in the conference tweets?
With thanks to Andrew Lonsdale for clearing up my confusion and pointing me to Twitter extended mode, the answer is “yes, somewhat”. Plenty of tweets are still hitting the 140 limit though: time to update those clients?
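The check itself is simple once you have the tweet text as a character vector. A minimal sketch in base R follows; the helper name is invented, and the strings are made-up stand-ins rather than real tweets. With rtweet you would pass whichever column holds the tweet text in your version of the package.

```r
# Hypothetical helper: share of tweets longer than the old 140-character limit.
share_over_limit <- function(text, limit = 140) {
  mean(nchar(text) > limit)
}

# Made-up strings standing in for tweet text, purely to show the call
example_text <- c(strrep("a", 100), strrep("b", 140),
                  strrep("c", 200), strrep("d", 280))
share_over_limit(example_text)
```

A histogram of `nchar(text)` would show the same thing more vividly, with a spike at 140 for clients that still truncate.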