A lot of questions at BioStar begin along these lines:
Where can I find…?
I am looking for a resource…?
Is there some database…?
I tweeted some concerns about this:
Many #biostar questions begin “I am looking for a resource..”. The answer is often that you need to code a solution using the data you have.
Chris tweeted back:
@neilfws Lit. or Google search is first step, asking around is the next logical step. (Re-)inventing wheels is last. Worth asking, IMHO.
We had a little chat and I realised that 140 characters or less was not getting my point across (not for the first time). What I was trying to say was something like this.
Chris is quite right; searching, then asking for existing solutions is the correct first approach. However, the tone of certain questions makes it sound as though some people believe that there must be a ready-made resource for any given situation, or for their exact circumstances. For example, a moment’s thought would make it clear that you are unlikely to find, just lying around on the Web:
- A list of DNA sequence accessions for a gene from your list of organisms
- A set of secondary structure predictions for your list of proteins
What you will find on the Web are larger datasets from which you can extract your subset of interest – and the tools to do it. In the examples above, this entails:
- Retrieving identifiers for your organisms from a taxonomy database, linking them to identifiers of DNA sequences and filtering for your gene
- Retrieving the sequence of your proteins, performing secondary structure prediction either locally or remotely and parsing the results
In other words: know the data sources, know the right tools and you can always sculpt a solution for your own situation.
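To make the first example concrete, here is a minimal sketch of the "map organisms to taxonomy identifiers, then filter sequence records by gene" pattern. Everything in it is an assumption made for illustration: the `SequenceRecord` type, the `accessions_for_gene` function, the gene names and the accessions are invented, and the two small in-memory tables stand in for whatever a real taxonomy and sequence database would return. Only the three taxonomy IDs are real NCBI values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    """One row of a hypothetical sequence table."""
    accession: str
    taxon_id: int
    gene: str


def accessions_for_gene(organisms, gene, taxonomy, records):
    """Return {organism: sorted accessions} for one gene.

    Organisms missing from the taxonomy table map to an empty list.
    """
    result = {}
    for name in organisms:
        taxon_id = taxonomy.get(name)
        if taxon_id is None:
            result[name] = []
            continue
        result[name] = sorted(
            r.accession
            for r in records
            if r.taxon_id == taxon_id and r.gene == gene
        )
    return result


# Toy stand-in for a taxonomy lookup (the IDs are NCBI Taxonomy IDs).
TAXONOMY = {
    "Homo sapiens": 9606,
    "Mus musculus": 10090,
    "Drosophila melanogaster": 7227,
}

# Toy stand-in for a sequence database; accessions and genes are made up.
RECORDS = [
    SequenceRecord("ACC0001", 9606, "geneA"),
    SequenceRecord("ACC0002", 9606, "geneB"),
    SequenceRecord("ACC0003", 10090, "geneA"),
    SequenceRecord("ACC0004", 10090, "geneA"),
    SequenceRecord("ACC0005", 7227, "geneA"),
]

my_organisms = ["Homo sapiens", "Mus musculus", "Unknown species"]
print(accessions_for_gene(my_organisms, "geneA", TAXONOMY, RECORDS))
```

In practice the two tables would be replaced by queries against real resources, but the shape of the solution (look up, join, filter) stays much the same.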
Good web search skills are an essential part of the bioinformatics toolkit, but they don’t define the job. Real bioinformaticians write code.
I have to say that whilst I find the Q&A sites an extremely useful resource, the huge number of questions posed which can be answered easily with a Google search and a little common sense is becoming distressing.
I have found I am less willing to answer questions on Seqanswers.com because of the recurrence of the same types of problems (“how do I get coordinates of my SNPs from BAM file etc. etc.”)
“Teach a man to fish” etc.
This can be frustrating. I think the BioStar community has recently become a lot better at dealing with the trivial or irrelevant questions – basically, by closing them if questioners fail to respond to requests for improving the question. It takes a lot of effort though.
It’s like anything else – half the battle is in knowing the right questions to ask.
The other half is knowing what to do with the answer.
I think Q&A sites, especially the friendly folks on BioStar, are a good way to get started.
And usually the questions about software errors or code are relegated to a lower degree. Good that I left Biostar some time ago, and only sporadically check the main page.
These sorts of questions were exactly why we wrote BioMart a few years back. I think that we satisfied a lot of users, but far from 100%.
I hope that BioStar has helped to promote BioMart; a lot of answers point users to that resource. It’s one of the best tools for those “given a list of X, return a list of Y” queries that are so common.
I have to say I’m in total agreement with pretty much everything you wrote, though I might have a couple of caveats :).
Yes, real bioinformaticians write code (thank goodness), but the bulk of research biologists probably have to rely on existing tools and databases (thanks to bioinformaticians).
Though I think Google is an excellent place to start, from my personal and professional experience, it can only answer the question a minority of the time, or finding the answer can be quite frustrating.
So, I agree with this statement wholeheartedly: “know the data sources, know the right tools and you can always sculpt a solution for your own situation.”
Amen.
BioMart, UCSC Genome Browser, Galaxy, etc. are excellent tools and data sources and could probably answer about 80% of the questions posed :). But my caveat would be that knowing the data sources and the right tools can be a bit of a daunting task.
Learning what you need to know in bioinformatics can certainly be daunting. But then, science isn’t for the easily daunted :-)
Not like you’re being provocative or anything :)
I guess you could also say that real scientists do experiments; this is really what is at issue here, the unwillingness to explore a problem deeply. Everything already has an answer, right?
As for reinventing the wheel: in general it is a bad idea and asking around first is smart. But I would argue that you can’t really contribute until you understand an existing solution, and to do that often you have to re-implement. It is a learning/discovery process.