Automated bioinformatics discovery through social networking?

Bear with me – this is going to become a bioinformatics post in a few paragraphs.

I’m a slow adopter when it comes to social networking sites. There are an awful lot of them and not enough hours in the day. I don’t go near a site unless someone that I know and trust tells me it’s a good thing. Once there, I only stay if I find it useful and/or enjoyable. So what makes a useful social networking site?

Simply this:

It helps me find new information using my existing information

A couple of examples. I enjoy posting my photos to Flickr, but not so as I can look at them. I have them on my machine already. Flickr is great because it uses information about me (my photos) to find other photos of interest to me, in a wide variety of user-friendly ways: groups, tags, geographical location and so on.

Another example – Facebook. The key feature of Facebook is its applications (and API, should you wish to develop your own). Now, I could list my CD collection on my profile, but why? Not as a reference for me – I can look at them on the shelf. Not for your interest either, although you may conceivably say “hey, cool CDs!” No, the only reason to include music in my profile is so as I can discover new music in similar collections to mine. Incidentally, a lot of the CD/music apps on Facebook are a bit rubbish in this regard. Music fans, do yourself a favour by creating a account, install the app and let things scrobble away. Whilst I’m babbling about apps, Euan’s Bookshare is a great example of how a Facebook app should work (i.e. it connects people with books and people with people). Read about his initial experiences of the API.

So I got thinking – wouldn’t it be cool if bioinformatics web servers worked this way? Imagine going to the NCBI, EBI or wherever and seeing something like:

People who like Escherichia coli also like:

Salmonella typhimurium

Yersinia pestis

Haemophilus influenzae


You might also be interested in these proteins (MW 40 000 – 50 000, pI 4 – 5):

402149140; protein of unknown function UPF0118

402179280; putative exported protein

402185600; secretion protein HlyD

Or even:

These people also searched for GO accession GO:0050421 (nitrite reductase (cytochrome) activity):
Neil Saunders | tickle neil | see neil’s publications | request neil’s data |

How would this work? Like other networks, it would have to be fed with information from users. Dare we imagine a “ for bioinformatics” app, scrobbling our hard drives for fasta files, R scripts and other biological data?
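
As a very rough sketch of the "people who like X also like Y" idea, here is the simplest thing that could possibly work: count how often other items turn up in the same user profiles as the one you're looking at. Everything in it is made up for illustration (the `also_liked` function, the user names, the profiles); it isn't how NCBI, EBI or any real service does things, and a real system would want something smarter than raw co-occurrence counts.

```python
from collections import Counter


def also_liked(profiles, item, top_n=3):
    """Rank the items that most often share a profile with `item`.

    `profiles` maps a (hypothetical) user name to the set of items,
    e.g. organisms, found in that user's data.
    """
    counts = Counter()
    for items in profiles.values():
        if item in items:
            # Count every other item that appears alongside `item`.
            counts.update(other for other in items if other != item)
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


# Invented example profiles: user -> organisms they work with.
profiles = {
    "user_a": {"Escherichia coli", "Salmonella typhimurium", "Yersinia pestis"},
    "user_b": {"Escherichia coli", "Salmonella typhimurium"},
    "user_c": {"Escherichia coli", "Salmonella typhimurium", "Haemophilus influenzae"},
    "user_d": {"Bacillus subtilis"},
}

print(also_liked(profiles, "Escherichia coli"))
```

The same counting would work for anything users accumulate: GO accessions searched, proteins viewed, papers bookmarked. The hard part, of course, is getting the profiles in the first place.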

9 thoughts on “Automated bioinformatics discovery through social networking?”

  1. I was actually just thinking about something like this the other day… I think that a social network for science (maybe starting with bioinformatics) would be quite useful. I wrote a little blog post about it, basically saying that the profiles of scientists would show their skills/experience as well as publications and maybe even links to data. Instead of “friends” it would be collaborators. Maybe something like this would increase collaboration?

    Personally, I think that bioinformaticians are some of the only people interested in actively developing and using new software (because we are computer geeks anyways, for the most part). So, I would say it is pretty much useless right now to try and cater to the scientific community at large with tools like this.

  2. it’s closed off from the general web
    Well, that’s a user privacy issue. Facebook isn’t meant to be open; it’s a place to exchange information with people that you know and trust. If you want open discussion and information, you do it somewhere else.

  3. Hmmm … sounds like Nature Network needs an API so anyone can develop Nature Network ‘applications’ in a similar way to Facebook.

    I wonder how ‘scrobbling’ of PubMed searches could be implemented at the client side? Maybe an indexing plugin for Google Desktop could check the web history and browser cache, and lodge that data with a central server?

    I think the general ‘culture of secrecy’ amongst many scientists regarding unpublished work would make the hard disk scraping extremely unpopular. Flagging particular items or parts of the filesystem as private/shared could solve this, but would probably be altogether too much risk and too much work for the large majority of users. Sharing could be restricted to trusted groups of collaborators, but then this begins to defeat the purpose of finding other scientists that you don’t know with similar interests. Scrobbling works since the consequences of someone finding out your musical taste are unlikely to be earth-shattering. The consequences of being scooped through sharing too much, for a scientist, are a bigger deal. So while I really like the idea, I think on many levels it is trickier to implement in a way that would be widely adopted in the current climate.

    I could say more, but Die Hard 4.0 beckons :)

  4. That is a funny idea :). Nature Desktop Search (NDS), scraping your data from your PC and networked drives and making it available on Nature Network ;). “NDS has identified your SBML/MIAME/etc compliant files and they are available at the site for common use”.

    The most immediate application of this could be papers. People who read/bookmarked/blogged about this paper were also interested in these others. Your contacts in the social network recently showed interest in these papers (options, show only if X people flag paper or in my listed topics of interest).

  5. I’d be more interested in:

    We noticed you have a mutation in YPL462. You may be interested in facial dysmorphia.

    or how about

    Your middle finger is in the top 1% of the population. You may be interested in Ehlers-Danlos or books about Ron Jeremy.

