A lot of questions at BioStar begin along these lines:
Where can I find…?
I am looking for a resource…?
Is there some database…?
I tweeted some concerns about this:
Many #biostar questions begin “I am looking for a resource..”. The answer is often that you need to code a solution using the data you have.
Chris tweeted back:
@neilfws Lit. or Google search is first step, asking around is the next logical step. (Re-)inventing wheels is last. Worth asking, IMHO.
We had a little chat and I realised that 140 characters or less was not getting my point across (not for the first time). What I was trying to say was something like this.
Chris is quite right; searching, then asking for existing solutions, is the correct first approach. However, the tone of certain questions makes it sound as though some people believe that there must be a ready-made resource for any given situation, or for their exact circumstances. For example, a moment's thought would make it clear that you are unlikely to find, just lying around on the Web:
- A list of DNA sequence accessions for a gene from your list of organisms
- A set of secondary structure predictions for your list of proteins
What you will find on the Web are larger datasets from which you can extract your subset of interest – and the tools to do it. In the examples above, this entails:
- Retrieving identifiers for your organisms from a taxonomy database, linking them to identifiers of DNA sequences and filtering for your gene
- Retrieving the sequence of your proteins, performing secondary structure prediction either locally or remotely and parsing the results
In other words: know the data sources, know the right tools and you can always sculpt a solution for your own situation.
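To make the first example concrete, here is a minimal sketch in Python. It assumes you have already downloaded a larger tab-delimited table that maps sequence accessions to taxonomy identifiers and gene names. The table contents, the column names and the function name `accessions_for` are all invented for illustration; a real data source will have its own format and identifiers.

```python
import csv
import io

# Hypothetical stand-in for a large downloaded table. The column names,
# taxon identifiers and accessions are made up for this example.
BULK_TABLE = """accession\ttaxon_id\tgene
ACC0001\t1001\tgeneA
ACC0002\t1001\tgeneB
ACC0003\t1002\tgeneA
ACC0004\t1003\tgeneA
"""


def accessions_for(table_text, taxon_ids, gene):
    """Return accessions whose taxon is in taxon_ids and whose gene matches."""
    wanted = set(taxon_ids)
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return [
        row["accession"]
        for row in reader
        if row["taxon_id"] in wanted and row["gene"] == gene
    ]


# Your organisms of interest (as taxon identifiers) and your gene.
my_accessions = accessions_for(BULK_TABLE, ["1001", "1003"], "geneA")
print(my_accessions)
```

The filter itself is a few lines; most of the real work is knowing which large dataset to start from and how its identifiers link together.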
Good web search skills are an essential part of the bioinformatics toolkit, but they don’t define the job. Real bioinformaticians write code.


