A stab at the future of bioinformatics

I had a conversation the other day about where bioinformatics is headed, and I’ve been thinking about it ever since.  Generally, the question was whether bioinformatics (and biotechs) are at the start of something big, or whether this is all a fad.  Unfortunately, I can’t tell the future, but that doesn’t mean I shouldn’t take a wild stab in the dark.

Some things are clear because some things never change.  Unless armageddon is upon us or aliens land, we can be sure that sequencing will continue to get cheaper until it hits bottom – by which I mean about the same cost as any other medical test. (At which point, profit margins go up while sequencing costs go down, of course!)  But, that means that for the foreseeable future, we should expect the volume of human sequencing data to continue to rise.

That, naturally, translates pretty directly to an increase in the amount of data that needs to be processed.  Bioinformatics, unlike many other fields, is all about automation and discovery – and in this case, automation is really the big deal.  (I’ll get back to discovery later.)  Pipelines that take care of the human data are going to be more and more valuable, particularly when they add value to the automation and interpretation.  (Obviously, I should disclose that I work for a company that does this.)  I can’t say that I see this need going away any time soon.  However, doing it well requires significant investment and (I’d like to think) skill.  (As an aside, sorry for all of the asides.)

Still, automation will probably be a big employer of bioinformaticians going forward.  A great pipeline is one that is entirely invisible to the people using it, and keeping a pipeline for the automation of bioinformatics data current isn’t an easy task.  Anyone who has ever said “Great! We’re done building this pipeline!” isn’t on the cutting edge.  Or even on the leading edge.  Or any edge at all.  If you finish a pipeline, it’ll be obsolete before you can commit it to your git repository.

But, the state of the art in any field, bioinformatics included, is all about discovery.  For the most part, I suspect that it means big data.  Sometimes big databases, but definitely big data sets.  (Are you old enough to remember when big data in bioinformatics came in a fasta file, and people thought perl was going to take over the world?)  There are seven billion people on earth, and they all have genomes to be sequenced.  We have so much to discover that every bioinformatician on the planet could work on that full time, and we could keep going for years.

So yes, I’m pretty bullish on the prospects of bioinformaticians in the future.  As long as we perceive that knowledge about ourselves is useful, and as long as our own health preoccupies us – for insurance purposes or diagnostics – there will be bioinformatics jobs out there.  (Whether there are too many bioinformaticians is a different story for another post.)  Discovery and re-discovery will really come sharply into focus over the next few decades.

We can figure out some of the more obvious points:

  • Cancer will be a huge driver of sequencing because it changes over time, and so we’ll constantly be driven to sequence again and again looking for markers or subpopulations. It’s a genetic disease and sequencing will give us a window into what it’s doing where nothing else can.  Like physicists and the hunt for subatomic particles, bioinformaticians are going to spend the next hundred years analyzing cancer data sets over and over and over.  There are 3 billion bases in the human genome, and probably as many unique variations that make a cell oncogenic. (Big time discovery)
  • Rare disease diagnostics should become commonplace.  Can you imagine catching every single childhood disease within two weeks of the birth of a child?  How much suffering would that prevent?   Bioinformaticians will be at the core of that, automating systems to take genetic counsellors out of the picture. (discovery turning to automation)
  • Single cell sequencing will eventually become a thing… and then we’ll have to spend the next decade figuring out how the heck we should interpret it.  That’ll be a whole new field of tools. (discovery!)
  • Integration with medical records will probably happen.  Currently, it’s far from ideal, but mostly because (as far as I can tell) electronic medical records are built for doctors. Bioinformaticians will have to step in and have an impact.  Not that we haven’t seen great strides, but I have yet to hear of an EMR system that handles whole genome sequencing.  (automation.)
  • LIMS.  ugh. It’ll happen and drain the lives from countless bioinformaticians.  No further comment necessary. (automation)

At some point, however, it’s going to become glaringly obvious that the bioinformatics component is the most expensive part of all of the above processes.  Each will drive massive cost savings in healthcare and efficiency, but the actual process of building the tools doesn’t scale the same way as the data generation.

Where does that leave us?  I’d like to think that it’s a bright future for those who are in the field.  Interesting times ahead.

4 thoughts on “A stab at the future of bioinformatics”

  1. IMHO the key point is, and has always been, statistics. That is by nature the science behind data, and that is what we have in small, moderate or huge volumes. The automation of processing, crunching and even analyzing and interpreting things up to some level is possible, but understanding what is done, why, how, and pursuing the ability to generate valid answers to the right biological questions, and ensuring that those right questions are asked, signal the point between what we could call ‘technical bioinformatics’, and something else which is a delightful mix of ‘exact’ and ‘life’ sciences, and again in my opinion, what we should be aiming for.

    • Totally agree. A bioinformatician who doesn’t have a basic understanding of statistics is like a carpenter without a saw. You can build, but it’s not going to be pretty.

      However, I don’t think that will dictate the future of the field. Automating processing, to me at least, means automating the statistics as well. I don’t see that you can put any dividing line between processing and statistics – they are both integral parts of the same thing.

      • Absolutely, seamless automation is highly desired, and will become mandatory sooner rather than later, but in our experience the ‘human expert’ eye and mind remain crucial for identifying the potential caveats associated with many aspects, given the potentially disastrous consequences of making the wrong statistical assumptions when dealing with such complex data.

        But, ain’t that a fun challenge to address ? :)

        • Hah – I’ve made the same point as well, but you’re right that I could be clearer.

          I should have refined the point to say that we should be automating everything except for the decision making up until we understand the decision well enough to automate that as well. The bioinformatician should know enough biology to be the human expert, and then to understand the reasons behind the decision so that that feature too can be automated. That way humans only have to deal with the unknown as long as it remains unknown. (That transition can take years, but it does eventually happen…)

          And yes, that IS the moving goal posts of bioinformatics!
