What is a bioinformatician

I’ve been participating in an interesting conversation on LinkedIn that has re-opened the age-old question of what a bioinformatician is. It was inspired by a conversation on Twitter that was later blogged.  Hopefully I’ve gotten that chain down correctly.

In any case, it appears that there are two competing schools of thought.  One is that a bioinformatician is a distinct entity, and the other is that it’s a vague term that embraces anyone and anything that has to do with either biology or computer science.  Frankly, I feel the second definition is a waste of a perfectly good word, despite being the commonly accepted usage.

That leads me to the following two illustrations.

How bioinformatics is often used, though I would argue that it’s being used incorrectly:

bioinformatics_chart2

And how it should be used, according to me:

bioinformatics_chart1

I think the second clearly describes a specific skill set that just isn’t captured by any other term.

In fact, I have often argued that bioinformatician is really a position along a gradient from computer science to biology, where your skills in computer science determine whether you’re a computational biologist (someone who applies computer programs to solve biology problems) or a bioinformatician (someone who designs computer programs to solve biology problems). Those, to me, are entirely different skill sets. Bioinformaticians often end up implementing the computer programs as well, but implementation is yet another skill, and one that can be done by a programmer who doesn’t understand the biology.

bioinformatics_chart3

That, effectively, makes bioinformatician an accurate description of a useful skill set – and further divides the murky field of “people who understand biology and use computers” – which is vague enough to include people who use an Excel spreadsheet to curate bacterial strain collections.

I suppose the next step is to get those who do taxonomy into the computational side of things and have them sort us all out.

Handy little command for upgrading python libraries…

About three weeks ago I googled for a quick tutorial on how to upgrade all of the libraries being used by Python – and came up completely empty-handed. Absolutely nothing useful turned up, which I found rather frustrating. The Python installer (pip) should certainly have an “upgrade all” function – but if it does, I couldn’t find it. If anyone comes across such a thing, I’d love to hear about it.

This morning, on my bike in to work, I realized I could hack a very quick command line together to make it work:

sudo pip freeze | awk -F'==' '{print $1}' | xargs -I {} sudo pip install {} --upgrade

Nothing to it! It iterates through the installed packages one by one and upgrades each of them. When a package is up to date, it’s clearly indicated, and when it’s not, pip tries to upgrade it, rolling back if it’s unsuccessful. I’ve noticed that many of the upgrades failed because of an out-of-date numpy package, so you may want to upgrade that first. Also, Eclipse isn’t too happy with the process, as it will detect the changes and freak out a bit – you might want to exit anything using or depending on the Python libraries (such as the Django web server) first.

Of course, beware that this may involve re-compiling a fair amount of code, which means it’s not necessarily going to be fast. (Took about 15 minutes on my computer, with quite a few out of date libraries)
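For anyone who would rather not chain awk and xargs together, here is a rough Python sketch of the same idea. It is only an illustration, not a built-in pip feature: the function names `package_names` and `upgrade_all` are made up for this example, and it assumes Python 3.7 or later with pip available as a module for the current interpreter.

```python
import subprocess
import sys


def package_names(freeze_output):
    """Extract bare package names from the text printed by `pip freeze`."""
    names = []
    for line in freeze_output.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and editable installs ("-e ...").
        if not line or line.startswith(("#", "-")):
            continue
        # "name==1.2.3" is the usual form; "name @ location" appears for
        # packages installed directly from a file or URL.
        name = line.split("==")[0].split(" @ ")[0].strip()
        names.append(name)
    return names


def upgrade_all():
    """Upgrade every installed package in turn; return the names that failed."""
    pip = [sys.executable, "-m", "pip"]
    frozen = subprocess.run(
        pip + ["freeze"], capture_output=True, text=True, check=True
    ).stdout
    failed = []
    for name in package_names(frozen):
        # One package per call, so a single failure doesn't stop the rest.
        result = subprocess.run(pip + ["install", "--upgrade", name])
        if result.returncode != 0:
            failed.append(name)
    return failed
```

Calling `upgrade_all()` does the actual work and hands back a list of the packages that didn't upgrade cleanly, which is handy for spotting the ones waiting on something like numpy. Like the one-liner, it may need elevated permissions for a system-wide Python.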

An Open Post-Doc Position

From time to time, I hear of an open position, which I’m happy to post on my blog.  If I were hunting for a post-doc position, I’d be tempted to check out this one in the Ramsey Lab at Oregon State University in Corvallis, Oregon. A quick excerpt:

You will have a key role in the lab’s research in gene regulatory networks in innate immune cells, developing integrative algorithms and applying them to analyze genomic, epigenomic, and transcriptomic data. The job is an exciting opportunity to combine state-of-the-art methods in machine learning and statistical network inference to improve our molecular network understanding of the innate immune system and its roles in diseases. More broadly, our research program aims to develop new methods for integrating “omics” datasets with an emphasis on high-impact applications in biomedicine.

If you are interested, you can find out more on the lab’s web page: http://lab.saramsey.org/#Join

 

Great primer on the why and how of genome sequencing

I’m often asked to explain the Human Genome Project, or sequencing in general, when discussing what I do with those outside of the field.  I’d like to think I’m not bad at explaining it in lay terms, either.

On the other hand, there’s now a video that does a VERY good job of this, written by Mark J. Kiel, from the University of Michigan.  The illustrations are a great mix of simplicity and detail, capturing the essence of the process without omitting the actual science.  It’s pretty impressive and well worth the 5 minutes it takes to watch. You can also catch the full thing on YouTube: