Monday, November 11, 2013

So You Want To Be a Computational Biologist, Or A Bioinformatician?

Spend a few minutes reading this very nice piece in Nature Biotechnology by Nick Loman (@pathogenomenick) and Mick Watson (@BioMickWatson) (whom I had the pleasure of hearing a few weeks ago in Toronto).  This is probably the most salient point:
You're a scientist, not a programmer.  The perfect is the enemy of the good. Remember you are a scientist and the quality of your research is what is important, not how pretty your source code looks. Perfectly written, extensively documented, elegant code that gets the answer wrong is not as useful as a basic script that gets it right.
Having said that, once you're sure your core algorithm works, spend time making it elegant and documenting how to use it. Use your biological knowledge as much as possible—that's what makes you a computational biologist.
I've often heard students and researchers debating the merits of one programming language versus another, or whether object-oriented code is 'better' than functional programming, but in biology these kinds of debates usually miss the point.  Biology is about biology, not about technology.

Let me make a sweeping generalization about you, the computational biologist.

One of the main things you're going to try to do is to use your computational abilities to identify what the next steps are in a wet lab to validate what you're observing in the data.  It doesn't matter how you arrive at the conclusion to do a certain experiment, as long as you identify that next step in the line of thinking. 

If you throw in a mix of statistical skills, you, as the budding computational biologist, will be able to rapidly identify the most suspicious and/or interesting pieces in a huge mish-mash of data.  If your fellow bench biologists take up enough of your experimental pitches, and they work out from time to time, you're probably doing very well.

That's not to say you should write horrible code.  Well-written code is probably essential if you're ever going to ask others for help debugging your scripts. But since the end point is to inform experiments, it doesn't matter whether you use double spaces or tabs to indent your code.

All that aside, I've spent about a decade trying to decide on the differences between 'computational biologists' and 'bioinformaticians'.  It's much harder than you think.

The difficulty is that the same person often fills both roles; some days you'll be a computational biologist, other days a bioinformatician.  It doesn't matter which hat you wear, but it's important to understand that the two roles require two different ways of thinking.

But don't take my word for it, take Russ Altman's:
Computational biology = the study of biology using computational techniques.  The goal is to learn new biology, knowledge about living systems.  It is about science.
Bioinformatics = the creation of tools (algorithms, databases) that solve problems.  The goal is to build useful tools that work on biological data.  It is about engineering.
Much like the case of computational and experimental biologists, these two roles are different but complementary, and they contribute to each other in many research programs.  Just realize which one you're supposed to be playing!