Ebola, Wikipedia and data janitors

Sometimes, several strands of thought come together in one place. For me right now, it’s the Wikipedia page “Ebola virus epidemic in West Africa”, which got me thinking about the perennial topic of “data wrangling”, how best to provide public data and why I can’t shake my irritation with the term “data science”. Not to mention Ebola, of course.

I imagine that a lot of people with an interest in biological data are following this story and thinking “how can I visualise the numbers for myself?” Maybe you’d like to reproduce the plots in the Timeline section of that Wikipedia entry. Surprise: the raw numbers are not that easy to obtain.

2014-09-26 note: when Wikipedia pages change, as this one has, code breaks, as this code has; updates are maintained at GitHub.

Reasons to love the Web #999

Every day, I’m amazed by the information ecosystem that we call the WWW and how it has changed forever the way we educate ourselves.

Today’s illustration. I spent part of last weekend strolling through the beautiful rainforest of Brisbane Forest Park, a mere hour’s drive from the city. On the track at Maiala I heard a very bizarre noise, high in the misty canopy. The sound was a blend of fighting cats and crying children, yet strangely musical. It was a new sound to me but the cat-like aspect was a give-away, since I was aware of a species called the green catbird.

Back at home, I consulted the trusty Simpson and Day’s Birds of Australia. It described a sound similar to what I had heard but of course, bird sounds don’t translate to written English very well. So I headed off to the appropriate Wikipedia entry. It’s not one of the more comprehensive pages, but its external links include:

Green Catbird audio recording at Freesound

I played the sound and it was exactly what I had heard. What’s more, the page is tagged, geotagged and part of a wonderful resource called the Freesound project: a collaborative database of Creative Commons licensed sounds.

So in the space of a few hours I lifted my spirits in the great outdoors, heard something new, tracked it down on the Web and discovered a bunch of new, interesting related information. That’s the Web at its best: integrating seamlessly with your daily life to enhance what you see around you. When it works, it’s an almost Zen-like experience.

More wikis in biology

Hot on the heels of WikiProteins comes:

Huss, J.W. III et al. (2008)
A Gene Wiki for Community Annotation of Gene Function
PLoS Biol 6(7): e175 | Open Access

Which anyone can read, because it’s open access. It’s a realistic assessment of community annotation, focusing on the creation of gene stubs for editing within Wikipedia. Early reaction at the OpenHelix blog and a thread at FriendFeed.

Thanks to Andrew Su, who was kind enough to send me a preprint.

Update: more FriendFeed threads via this search