Archive for ‘uncategorized’

July 13, 2012

Comments at journal websites: just turn them off

A couple of years ago, I noted that some journals were not making the process of commenting on articles especially easy. My latest experience suggests that little has changed.

February 27, 2010

Posts that never made it

Three blog posts have been sitting in my drafts folder for a year. Inspired by Andrew’s post on posts that never made it, I’d like to describe them briefly, before I hit “delete” and move on.

January 9, 2009

Words fail me

As I’m a biologist, rather than an inorganic chemist or a mineralogist, I don’t have much (well, any) need to look at crystal structures of simple inorganic compounds. Just as well…

…our story begins at Twitter, where David Bradley asks:

Anyone know where to find crystal structures of sodium hypochlorite and sodium bisulfate (cif files or similar) ? #science #crystal

Never thought about it, you say, but surely it can’t be very difficult. So you head to Google and try searches such as “inorganic crystal structure database”, where you unearth two main players: the Inorganic Crystal Structure Database (ICSD) and the Cambridge Structural Database (CSD). Both are private, requiring registration, login and, in one case, installation of an X-client.

Coming from bioinformatics, where comparable resources such as the PDB are freely available via web interfaces, I find this utterly perplexing. Why do these research communities stand for it? Is anyone developing free, open alternatives?

October 30, 2008

When information retrieval goes…weird

Bar-tailed godwit

This is a little odd – the tale of the publication that isn’t.

Update: the “missing article” surfaced in my RSS reader on Nov 1; here’s the link


October 17, 2008

Giant panda genome: mapped or sequenced?

I’m with Ogden Nash who said:

I love the baby giant panda,
I’d welcome one to my veranda

This week, I learned via Keith that Chinese scientists announced the completion of the giant panda genome. An impressive achievement, given that the project was announced in March this year, but what exactly has been completed? Has the genome been sequenced (that is, there are strings of A, C, G and T covering most chromosomes) or mapped (that is, the approximate chromosomal location of most genes has been determined)? The media seem unsure.

And so on. Here’s a Google News search with more hits.

So what has been achieved – sequencing or mapping? If the former, is it really complete (I doubt this) or draft – and if draft, what kind of quality? And where are the data? Nothing in the genome project section of NCBI as yet.

September 18, 2008

Not as many structures as you might think

I’m in the midst of preparing a talk for next Monday, and it occurred to me that perhaps we don’t see more protein structure-based prediction in bioinformatics because there aren’t enough structures.

pdbstats

Sure, the PDB has grown a lot in the past 5 years or so and 53 103 structures (as of now) looks impressive. However, if you’re interested in protein-protein interaction, you want at least 2 chains, which more or less halves the dataset. If you want two different protein chains, you lose almost another 75% of what remains. Let’s specify a reasonable minimum resolution for X-ray diffraction data and there go ~ 3 000 entries. We probably don’t want multiple, similar proteins, so let’s filter out redundancy at 90% sequence identity. We’re left with about 2% of the original PDB, which might be usable for looking at interactions.
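For anyone who wants to follow the arithmetic, here is a rough back-of-the-envelope sketch in Python. It only replays the approximate figures quoted above; it does not query the PDB. It assumes the 75% loss applies to the entries left after the first filter, and it backs out the size of the redundancy step from the "about 2%" end point, since that number isn't stated directly. All variable names are mine.

```python
# Back-of-the-envelope funnel using the approximate figures from the post.
# Nothing here comes from a live PDB query.
total = 53_103                               # PDB entries quoted in the post

two_chains = total * 0.5                     # "more or less halves the dataset"
different_chains = two_chains * 0.25         # lose ~75% of what remains (assumption)
good_resolution = different_chains - 3_000   # "there go ~ 3 000 entries"
final = total * 0.02                         # "about 2% of the original PDB"

# The post does not say how many entries the 90% identity filter removes;
# this is simply the gap between the previous step and the 2% end point.
removed_by_redundancy = good_resolution - final

steps = [
    ("All entries", total),
    ("At least 2 chains", two_chains),
    ("2 different protein chains", different_chains),
    ("After resolution cutoff", good_resolution),
    ("After 90% identity filter", final),
]

for label, count in steps:
    print(f"{label:<28}{round(count):>7}")
```

On these rough numbers the resolution cutoff and the redundancy filter each remove a few thousand entries, so by the last step only around a thousand structures remain.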

No wonder that most bioinformatics focuses on sequences and high-throughput interaction data.
