ISMB coverage on Twitter? It’s possible there was…

Peter writes:

I wonder if part of the drop off is live bloggers moving to platforms like Twitter? I can tell you it seemed like there were almost as many tweets for one SIG (#bosc2011) as for the whole of #ISMB / #ECCB2011, and I personally didn’t post anything to FriendFeed but posted lots on Twitter.

Well, there’s a problem with using Twitter for analysis of conference coverage. Let’s try searching for ISMB-related tweets using the twitteR package:

library(twitteR)
# ask for up to 1000 tweets matching "ismb"
ismb <- searchTwitter("ismb", n = 1000)
# how many actually came back?
length(ismb)
# [1] 30

[Image: oldertweets. Caption: If we can't archive, how can anyone else?]

30? Are we using twitteR properly? Running the same search at the Twitter website gives roughly the same results, plus this unhelpful message.

I like Twitter – as a real-time communication tool. As a data archive? Forget it.
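
If you do want an archive, you have to build it yourself, before the search window closes. Here is a minimal sketch of the bookkeeping in base R: merge each new batch of search results into a local table and keep one row per tweet id. The helper name `merge_tweets`, the file name and the sample rows are all made up for illustration. In a real run the new batch would come from something like `twListToDF(searchTwitter("ismb", n = 1000))`, which I'm assuming returns an `id` column.

```r
# Hypothetical helper: merge freshly fetched tweets into a local archive,
# keeping the first row seen for each tweet id.
merge_tweets <- function(archive, fresh) {
  combined <- rbind(archive, fresh)
  combined[!duplicated(combined$id), , drop = FALSE]
}

# Illustrative data only (not real tweets). Ids are kept as character strings
# because tweet ids are too large to store exactly as doubles.
day1 <- data.frame(id = c("101", "102"), text = c("first", "second"),
                   stringsAsFactors = FALSE)
day2 <- data.frame(id = c("102", "103"), text = c("second", "third"),
                   stringsAsFactors = FALSE)

archive <- merge_tweets(day1, day2)
nrow(archive)

# Persist between runs. Read it back with colClasses = "character" so the
# ids are not converted to numbers.
archive_file <- file.path(tempdir(), "ismb_tweets.csv")
write.csv(archive, archive_file, row.names = FALSE)
restored <- read.csv(archive_file, colClasses = "character")
```

Run on a schedule (read the file, fetch, merge, write), this should keep whatever the search returned on each day, even after Twitter itself stops serving it.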

4 thoughts on “ISMB coverage on Twitter? It’s possible there was…”

  1. Epi_Junkie

    Twitter shouldn’t be used as a data archive – it’s not for that, and relying on that is…problematic. If you’re looking to do Twitter analysis, I’m pretty sure the best approach is still to be constantly scraping and storing Tweets for later analysis on your own systems.

    1. nsaunders Post author

      Yup, the “daily scrape” could work well. What amuses me is talk of things like archiving every tweet at the Library of Congress. I’m pretty sure that not even Twitter have a complete archive – they’ve certainly lost plenty of mine without explanation.

  2. Peter

    If you set it up early, a daily scrape service seems to work pretty well, e.g. http://twapperkeeper.com/hashtag/bosc2011

    However, if it is set up too late, it misses the earlier tweets: http://twapperkeeper.com/hashtag/ISMB

    This boils down to the Twitter search API not going back very far (and the time period it covers seems to have got shorter and shorter as Twitter has grown). Same issue Iddo had here http://bytesizebio.net/index.php/2011/07/22/ismb-2011-tweets/

    In this situation, the best you can do is mine the time lines of known or likely users of the search terms – that lets you go back further than the main search API. In this case using the posters in Iddo’s list would be a good start.

    However, clearly the further back you’re trying to go in Twitter, the harder it will be. And for #ISMB / #ECCB2011, which were only about 3 weeks ago, it may already be too late :(

  3. Larry (IEOR Tools)

    I’ve run into this problem of trying to find archived Twitter feeds as well. So what is this telling us? Are Twitter messages irrelevant? Does Twitter think that the cost of storage outweighs the knowledge and discovery to be gained from text mining old tweets? I have an idea for a Twitter app, but I don’t know where to find archived information. It looks like I might have to scrape the data for a year in order to develop it.
