The end of Google Reader: a scientist’s perspective

Since 2005, I have started almost every working day by using one Web application – an application that occupies a permanent browser tab on my work and home desktop machines. That application is Google Reader.

If you’re reading this, you’re probably aware that Google Reader will cease to exist from July 1 2013. Others have ranted, railed against the corporate machine and expressed their sadness. I thought I’d try to explain why, for this working scientist at least, RSS and feed readers are incredibly useful tools which I think should be valued highly.


Friday fun with: Google Trends

Some years ago, Google discovered that when people are concerned about influenza, they search for flu-related information and that to some extent, search traffic is an indicator of flu activity. Google Flu Trends was born.

Google Trends: bronchitis

Illness is sweeping through our department this week and I have succumbed. It’s not flu but at one point, I did wonder if my symptoms were those of bronchitis. Remembering Google Flu Trends, I thought I’d try my query for “bronchitis” at Google Trends, where I saw the chart shown at right.

Interesting. Clearly seasonal, peaking around the last and first months of each year. Winter, for those of you in the northern hemisphere.

Next:

  • select USA and Australia as regions
  • download the data in CSV format (I chose fixed scaling) and rename the files “us.csv” and “aus.csv”
  • edit the files a little to retain only the “Week, bronchitis, bronchitis (std error)” section

Fire up your R console and try this:

library(ggplot2)
# read the edited CSV files, keeping text columns as character
us <- read.csv("us.csv", header = TRUE, stringsAsFactors = FALSE)
aus <- read.csv("aus.csv", header = TRUE, stringsAsFactors = FALSE)
# add a region column
us$region <- "usa"
aus$region <- "aus"
# combine data
alldata <- rbind(us, aus)
# add a date column (Date class, which stores cleanly in a data frame)
alldata$week <- as.Date(alldata$Week, format = "%b %d %Y")
# and plot the non-zero values
ggplot(alldata[alldata$bronchitis > 0, ], aes(week, bronchitis)) +
  geom_line(aes(color = region)) +
  xlab("Date")

Google Trends: bronchitis, USA + Australia

Result shown at right.

That’s not unexpected, but it’s rather nice. In the USA, peak searches for “bronchitis” coincide with troughs in Australia and vice-versa. The reason, of course, is that searches peak in both regions during winter, but winter in the USA (northern hemisphere) occurs during the southern summer (and again, vice-versa).
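If you’d like a number to go with the picture, here is a minimal sketch that correlates the two regional series week by week. The helper name region_correlation is my own invention for illustration, and it assumes that both data frames have the “Week” and “bronchitis” columns described in the steps above. A value close to -1 would suggest that the two regions move in opposite phase; I make no claim about what the real data return.

```r
# Hypothetical helper: Pearson correlation between two regional series.
# Assumes each data frame has columns "Week" and "bronchitis",
# as in the edited CSV files described above.
region_correlation <- function(a, b) {
  # keep only the weeks that appear in both regions
  both <- merge(a[, c("Week", "bronchitis")],
                b[, c("Week", "bronchitis")],
                by = "Week", suffixes = c(".a", ".b"))
  # drop weeks where either region has zero search volume
  both <- both[both$bronchitis.a > 0 & both$bronchitis.b > 0, ]
  cor(both$bronchitis.a, both$bronchitis.b)
}

# usage, once us.csv and aus.csv have been read in:
# region_correlation(us, aus)
```

Matching on the original “Week” strings rather than on row position means that a week missing from one file is simply left out of the comparison.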

There must be all sorts of interesting and potentially useful information buried away in web usage data. I guess that’s why so many companies are investing in it. However, for those of us more interested in analysing data than marketing – what else is “out there”? Can we “do science” with it? How many papers are published using data gathered only from the Web?

Create your own Google Scholar RSS feed

Google Scholar is a useful tool and now has its own blog. The first post is devoted to email alerts.

It’s unimaginable, in 2010, that an alert service would not provide an RSS feed, so I can only assume that this feature will appear “in due course”. In the meantime, a quick Google search for “create rss feed from website” led me to 7 Tools To Make An RSS Feed Of Any Website. I quickly tested them all and I agree with the author of the article: Feed43 is the winner.

The process for creating a Google Scholar feed is a little complex. Here’s my first attempt.

Update: interesting FriendFeed thread, where people point out that (a) scraping Google Scholar is quite likely to fail and (b) this is not the same as an alert, since results are not ordered by date.

Dear Google

I think you’re a pretty good company. I like many of your products and use them daily, for work and at home. I admire many of your innovations and technical solutions.

But this Buzz thing. You’ve really messed up. Two points:

  1. Social networks should always be opt-in. Never, never opt-out. I choose whether to join in the first place. If I do join, I choose who to connect with, what to share and who can see it. And I expect complete control over the entire process, from the outset.
  2. My list of email contacts is not a social network. It’s a list of people with whom I’ve corresponded by email at least once. That’s all they have in common. Furthermore, there’s a big difference between them exposing their public profiles and me exposing their presence in my address book.

I am normally an enthusiastic early adopter of new web tools and a pretty “tech-savvy” individual. Yet Buzz has me confused, annoyed and eager to disable it as fast as I can. It’s not me, it’s you.

I hope that you put more thought into how your next release might impact your users.

It’s true: you can’t believe everything that you read on the Web

An oft-repeated cliché is that “you can’t believe what you read on the Web.” Of course, you can’t believe what you read anywhere: it’s up to individuals to assess the quality and reliability of information, regardless of the source. That said, it can be alarming to sit back and watch the speed with which errors propagate in cyberspace. Yesterday, I watched this unfold in a few short hours:

  1. I (and others) bookmark a link to a project called gpeerreview, hosted at Google Code
  2. A blog post (since corrected) states that the search giant has been working on a peer review tool
  3. My bookmark appears at FriendFeed, where we discuss the incorrect attribution of the project to Google
  4. Another blog post on Google Peer Review appears
  5. Links and comments about gpeerreview start popping up all over FriendFeed, some of which suggest it is a Google project

The great thing about the Web, though, is that it corrects itself just as rapidly. With a few well-placed comments, some discussion at FriendFeed and, best of all, an email enquiry to the project developer (well done Richard!), the phrase “Google Peer Review” was consigned to the error basket.

I’m not pointing the finger or criticising anyone here. Unless you develop software, you’re unlikely to be aware of Google Code, and the URL and site design do make it look like a “content owned by Google” website. Just be aware: when writing, be sure of your facts, and when reading, assess critically rather than accept blindly.

Google Health Live

The much-vaunted Google Health is online.

There’s a good early review at TechCrunch. Expect further coverage from bloggers who cover personalised medicine issues; you know who they are.

Questions that occur to me are: (1) how much personal information do you need to enter for the service to be useful; (2) how much will users be willing to enter? This will be a real test of the degree to which people trust Google with personal information.

Google’s appengine

Been following the Google appengine release via FriendFeed and Twitter (with thanks to @mndoci, @rvidal). A few resources:

In brief: appengine provides infrastructure for developers to host web applications using the same tools that Google uses (notably GFS and a scalable data store termed BigTable). What’s more – there is a free service and anyone with a Google account can sign up for the beta test. Any downsides? It’s Python-only for now, but that’s expected to change.

This is exciting news. I would really like to see some bioinformatics developers try it out and report back with their experiences.

Evolution of an idea

It’s great to sit back and watch ideas and software unfold.

Just over a year ago, Euan asked whether anyone was employing AJAX in graphical genome browsers. The old-style “reload on refresh” browsers (UCSC, Gbrowse, Ensembl) were starting to look a bit Web 1.0.

This sparked plenty of discussion, including a pointer to X:Map: a very nice alternative view of Ensembl data using the Google Maps API (update: and of course ajax-ification of Gbrowse).

Jump forward to today and thanks to Euan’s del.icio.us feed via FriendFeed, I discover Genome Projector, which takes the zoom-able Google Maps idea to a new level.

And that’s how social networks let you discover stuff. Brilliant.