In this post: a brief summary of what I got up to, work-wise, in 2016 and my plans for a rather different 2017. The short version: it’s goodbye bioinformatics and hello educational data science!
I guess I’ve been around bioinformatics for the best part of 15 years. In that time, I’ve seen almost no improvement in the way biologists handle and use data. If anything I’ve seen a decline, perhaps because the data have become larger and more complex with no improvement in the skills base.
It strikes me when I read questions at Biostars that the problem faced by many students and researchers is deeper than “not knowing what to do.” It’s having no idea how to figure out what they need to know in order to do what they want to do. In essence, this is about how to get people into a problem-solving mindset so that they’re aware, for example, that:
- it’s extremely unlikely that you are the first person to encounter this problem
- it’s likely that the solution is documented somewhere
- effective search will lead you to a solution even if you don’t fully understand it at first
- the tool(s) that you know are not necessarily the right ones for the job (and Excel is never the right tool for the job)
- implementing the solution may require that you (shudder) learn new skills
- time spent on those skills now is almost certainly time saved later because…
- …with a very little self-education in programming, tasks that took hours or days can be automated and take seconds or minutes
It’s good (and bad) to know that these issues are not confined to Australian researchers: here is “It’s time to reboot bioinformatics education” by Todd Harris. It is excellent and you should go and read it as soon as possible.
September 2, 2002
So what new skills will postdocs need to ensure that they don’t become science relics? The answer is math, statistics, and knowledge of a scripting language for computers.
— The Scientist, “Bioinformatics Knowledge Vital to Careers.” 16(17): 53.
February 8, 2012
But two other skills are increasingly necessary: expertise in computer-programming languages designed to aid manipulation of large data sets, such as R, Perl or Python, and the ability to use these languages to analyse large amounts of data quickly.
— Nature, “Biostatistics: Revealing analysis.” 482: 263–265.
A lot of questions at BioStar begin along these lines:
Where can I find…?
I am looking for a resource…?
Is there some database…?
Many #biostar questions begin “I am looking for a resource..”. The answer is often that you need to code a solution using the data you have.
Chris tweeted back:
@neilfws Lit. or Google search is first step, asking around is the next logical step. (Re-)inventing wheels is last. Worth asking, IMHO.
We had a little chat and I realised that 140 characters or less was not getting my point across (not for the first time). What I was trying to say was something like this.
Read the rest…
An oft-repeated cliché is that “you can’t believe what you read on the Web.” Of course, you can’t believe what you read anywhere: it’s up to individuals to assess the quality and reliability of information, regardless of the source. That said, it can be alarming to sit back and watch the speed with which errors propagate in cyberspace. Yesterday, I watched this unfold in a few short hours:
- I (and others) bookmark a link to a project called gpeerreview, hosted at Google Code
- A blog post (since corrected) states that the search giant has been working on a peer review tool
- My bookmark appears at FriendFeed, where we discuss the incorrect attribution of the project to Google
- Another blog post on Google Peer Review appears
- Links and comments about gpeerreview start popping up all over FriendFeed, some of which suggest it is a Google project
The great thing about the Web, though, is that it corrects itself just as rapidly. With a few well-placed comments, some discussion at FriendFeed and the best solution – an email enquiry to the project developer (well done Richard!) – the phrase “Google Peer Review” was consigned to the error basket.
I’m not pointing the finger or criticising anyone here. Unless you develop software, you’re unlikely to be aware of Google Code, and the URL and site design do make it look like a “content owned by Google” website. Just be aware: when writing, be sure of your facts, and when reading, critically assess rather than blindly accept.
Given my passion for online science networking, it’s surprising that I’ve never given a talk on the subject. So a big thank you to William, who invited me over to his institute for an informal chat about the topic with a small group of staff.
I learned that:
- A good quote from an internet guru goes down well
- Everyone loves an xkcd cartoon
- Many biologists still don’t know what an RSS feed is
My slides are embedded below, or visit Slideshare – best viewed full screen.
1. Oh wait, I work in a university
See the slides
Announced first via FriendFeed (of course), Moshe from JoVE is circulating an email with exciting news. I can’t do better than to quote it here:
JoVE is the first and only video-publication to be included in these databases maintained by the National Library of Medicine (NLM). The decision was made by the NLM advisory committee, Literature Selection Technical Review Committee, which is composed of the authorities in the field of biomedicine, such as researchers, physicians, editors, health science librarians and historians. This committee evaluates the scientific quality of publications and typically approves only 20-25% of the applications.
Inclusion in PubMed/MEDLINE is a big milestone for JoVE, and for the scientific publishing in general. It demonstrates the official acceptance of new approaches to science communication, such as video online, by the scientific community. Overall, it will increase the interest of the scientists to communicate their findings in video, making biological sciences more transparent and efficient.
Well done to the JoVE team.
An interview with the head of a bioinformatics software company got Tiago thinking about how much time biologists should devote to computing. Deepak also has a few ideas on the topic. This one is always a favourite in the “biologists versus bioinformaticians” debate and here’s my $0.02.
Read the rest…