Tag Archives: friendfeed

Analysis of ISMB coverage at FriendFeed: 2008 – 2011

ISMB/ECCB 2011 was held July 15-19 this year and, as in previous years, FriendFeed was used to cover the meeting.

Last year, I wrote a post about how to use R to analyse the coverage. I was planning something similar for 2011 when I thought: we have 4 years of ISMB at FriendFeed now – why not look at all of them?

So I did. Read on for the details.
Read the rest…

Farewell FriendFeed. It’s been fun.

I’ve been a strong proponent of FriendFeed since its launch. Its technology, clean interface and “data first, then conversations” approach have made it a highly successful experiment in social networking for scientists (and other groups). So you may be surprised to hear that from today, I will no longer be importing items into FriendFeed or participating in the conversations at other feeds.

Here’s a brief explanation and some thoughts on my online activity in the coming months.
Read the rest…

APIs have let me down part 2/2: FriendFeed

In part 1, I described some frustrations arising from a work project using the ArrayExpress API. I find that one way to deal mentally with these situations is to spend some time on a fun project, using similar programming techniques. A potential downside of this approach is that if your fun project goes bad, you’re really frustrated. That’s when it’s time to abandon the digital world, go outside and enjoy nature.

Here, then, is why I decided to build another small project around FriendFeed, how its failure has led me to question the value of FriendFeed for the first time, and why my time as a FriendFeed user might be up.
Read the rest…

APIs have let me down part 1/2: ArrayExpress

The API – Application Programming Interface – is, in principle, a wonderful thing. You make a request to a server using a URL and back come lovely, structured data, ready to parse and analyse. We’ve begun to demand that all online data sources offer an API and lament the fact that so few online biological databases do so.
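
In Ruby, the basic request-and-parse pattern takes only a few lines with open-uri and the json gem. A minimal sketch; the URL here is a made-up placeholder, not a real ArrayExpress endpoint:

#!/usr/bin/ruby
# Minimal sketch of the request-and-parse pattern.
# The URL is a placeholder, not a real ArrayExpress endpoint.
require "rubygems"
require "json/pure"
require "open-uri"

url  = "http://example.org/api/experiments?species=homo+sapiens" # hypothetical
data = JSON.parse(open(url).read)
puts data.keys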

Is it better, though, to have no API at all than one which is poorly implemented and leads to frustration? I’m beginning to think so, after recent experiences with both a work project and one of my “fun side projects”. Let’s start with the work project, an attempt to mine a subset of the ArrayExpress microarray database.
Read the rest…

Lazy post: a Life Scientists best-of

When stuck for ideas or pressed for time, a blogger can always fall back on a round-up of activity from elsewhere on the web. Yes, it’s time for a “best of the past 14 days” from the FriendFeed Life Scientists group.

Just a slight twist to make it more exciting (?) – we’ll automate the process using the API and a little Ruby.
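
To give a taste of the approach, here’s a rough sketch (not the code from the full post) that pulls recent entries from the FriendFeed API v2 feed and ranks them by number of likes; it assumes each entry carries a ‘likes’ array:

#!/usr/bin/ruby
# Rough sketch: rank recent Life Scientists entries by number of likes.
# Assumes the feed response includes a 'likes' array for each entry.
require "rubygems"
require "json/pure"
require "open-uri"

feed = JSON.parse(open("http://friendfeed-api.com/v2/feed/the-life-scientists?num=100").read)
best = feed['entries'].sort_by {|e| -(e['likes'] || []).size }.first(10)
best.each {|e| puts "#{(e['likes'] || []).size} likes: #{e['body']}" }
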
Read the rest…

ISMB/ECCB 2009 reports

Great to see more reports describing the use of online tools to cover scientific meetings. Here are the publications, from PLoS Computational Biology:

Live Coverage of Scientific Conferences Using Web Technologies.
doi:10.1371/journal.pcbi.1000563

Live Coverage of Intelligent Systems for Molecular Biology/European Conference on Computational Biology (ISMB/ECCB) 2009.
doi:10.1371/journal.pcbi.1000640

And here’s Ally, a.k.a. the robo-blogger, on Social Networking and Guidelines for Life Science Conferences.

Looks like we’ve started a trend; long may it continue at future meetings.

How to: archive data via an API using Ruby and MongoDB

I was going to title this post “How to: archive a FriendFeed feed in MongoDB”. The example code does just that, but (a) I fear that this blog suggests a near-obsession with FriendFeed (see the tag cloud in the right sidebar) and (b) the principles apply to any API that returns JSON. There are rare examples of biological data with JSON output in the wild, e.g. the ArrayExpress Gene Expression Atlas. So I’m still writing a bioinformatics blog ;-)

Let’s go straight to the code:

#!/usr/bin/ruby

require "rubygems"
require "mongo"
require "json/pure"
require "open-uri"

# db config
db  = Mongo::Connection.new.db('friendfeed')
col = db.collection('lifesci')

# fetch json
0.step(9900, 100) {|n|
  f = open("http://friendfeed-api.com/v2/feed/the-life-scientists?start=#{n}&num=100").read
  j = JSON.parse(f)
  break if j['entries'].count == 0
  j['entries'].each do |entry|
    if col.find({:_id => entry['id']}).count == 0
      entry[:_id] = entry['id']
      entry.delete('id')
      col.save(entry)
    end
  end
  puts "Processed entries #{n} - #{n + 99}", "Database contains #{col.count} documents."
}

puts "No more entries to process. Database contains #{col.count} documents."

Also available as a gist. Fork away.

A quick run-through. Lines 4-6 load the required libraries: mongo (the MongoDB Ruby driver), json and open-uri. If you don’t have the first two, simply “gem install mongo json_pure”. Of course, you’ll need to download MongoDB and have the mongod server daemon running on your system.

Lines 9-10 connect to the database (assuming a standard database installation). Rename the database and collection as you see fit. Both will be created if they don’t exist.

The guts are lines 12-25. A loop fetches JSON from the FriendFeed API, 100 entries at a time (0-99, 100-199…) up to 9999. That’s an arbitrarily high number, chosen to ensure that all entries are retrieved. Change “the-life-scientists” in line 14 to the feed of your choice. The JSON is then parsed into a hash structure.

In lines 17-23 we loop through each entry and extract the “id” key, a unique identifier for the entry. This is used to create the “_id” field, a unique identifier for the MongoDB document. If a document with _id == id does not exist, we create an _id key in the hash, delete the (now superfluous) ‘id’ key and save the document. Otherwise, the entry is skipped.

At some point the API will return no more entries: { “entries” : [] }. When this happens, we exit the block (line 16) and print a summary.

That’s it, more or less. Obviously, the script would benefit from some error checking and more options (such as supplying a feed URL as a command line option). For entries with attached files, the file URL but not the attachment will be saved. A nice improvement would be to fetch the attachment and save it to the database, using GridFS.
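
Something along these lines might do the trick, using the Ruby driver’s Mongo::Grid class. This is an untested sketch; it assumes attachments appear under a ‘files’ key with ‘url’ and ‘name’ fields, so check the API output before relying on it:

#!/usr/bin/ruby
# Untested sketch: fetch entry attachments and store them in GridFS.
# Assumes attachments appear under a 'files' key with 'url' and 'name' fields.
require "rubygems"
require "mongo"
require "open-uri"

db   = Mongo::Connection.new.db('friendfeed')
col  = db.collection('lifesci')
grid = Mongo::Grid.new(db)

col.find('files' => {'$exists' => true}).each do |entry|
  entry['files'].each do |file|
    data    = open(file['url']).read
    file_id = grid.put(data, :filename => file['name'])
    puts "Saved #{file['name']} as #{file_id}"
  end
end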

Possible uses: a simple archive, a backend for a web application to analyse the feed.
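
For example, once the entries are archived, simple questions take only a few lines of Ruby. Here’s a quick sketch that tallies entries per author to find the most prolific posters, assuming each entry stores its author under from/name as the FriendFeed API returns it:

#!/usr/bin/ruby
# Quick sketch: count archived entries per author.
require "rubygems"
require "mongo"

col   = Mongo::Connection.new.db('friendfeed').collection('lifesci')
posts = Hash.new(0)

col.find({}, :fields => ['from']).each do |entry|
  next unless entry['from']
  posts[entry['from']['name']] += 1
end

posts.sort_by {|name, n| -n }.first(10).each {|name, n| puts "#{n}\t#{name}" }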
