Dear Affymetrix: You Suck.

[EDIT: In case it's not obvious, I've written the title to get the attention of Affymetrix, as I don't think they've read the contents of my blog posts, despite their picking me as a poster boy for their propaganda - not because I think they actually suck...]

It has come to my attention that the people at Affymetrix have taken to using a screen shot of my blog to argue that bloggers have prematurely called for the death of microarrays – and their proof to the contrary is that they’re still around.

Affymetrix, you may have noticed that my title for this blog is strongly worded – and that’s because I don’t think you’ve bothered to read the actual posts.  In fact, there are two of them:

My entire argument revolves around the fact that sequencing is getting cheaper – and thus, the concept of doing targeted sequencing via a platform for which sequencing costs are NOT getting cheaper is a ridiculous concept in the long run.  Furthermore, I didn’t say that arrays should disappear entirely, but rather that they should just be dropped for cutting edge research.

There are a few exceptions for areas where you willfully want to blind yourself: Diagnostics is probably the most prominent, and is likely a growing sector into which Affymetrix will be able to expand and grow.

Thus, Affymetrix, your continued existence is not a counter-argument.

In fact, your continued success is also not a counter-argument.

There will always be a place for microarrays, but that place is not going to be cutting edge research.  Really, how long do you think exon capture experiments are going to last, when the cost of 30X WGS hits $500?

Seriously, Affymetrix, if you’re going to ridicule me for my opinions, take the time to read what my opinions are.

And, by the way, Affymetrix, have you seen this picture? (Thanks Daniel!)

By the way, if anyone knows the cost per base of sequencing for an Affymetrix chip, obtaining the same dynamic range and error profile as an Illumina platform run, I would love to know that number.

20 thoughts on “Dear Affymetrix: You Suck.”

  1. In defense of Affymetrix, I wish they were a sequencing company! I have to say, I love next-generation sequencing data, and I periodically go into the lab and stroke our HiSeqs. I think they are beautiful and I love working with the data. However, to play Devil’s advocate….

    I speak as head of a lab that operates both the Illumina HiSeq and Affymetrix systems. So let’s start by getting some numbers on the board. If you look at the ENCODE RNA-Seq standards (http://www.genome.ucsc.edu/ENCODE/protocols/dataStandards/RNA_standards_v1_2011_May.pdf) they suggest 30 million reads per sample if you just want to count gene expression and 200 million reads per sample if you want to know your sample inside out (alternate splicing, lowly expressed transcripts etc). So to get 30 million reads you can run 6 samples in a lane of HiSeq; to get 200 million reads you can run one sample per lane. So RNA-Seq costs (in our lab) roughly £400 per sample for gene counts and roughly £2000 per sample for in depth analysis. These costs do not include the bioinformatics, which for arrays is simple and for RNA-Seq is not. Comparatively, running an Affy array costs about £150. So depending on what kind of analysis you want to do, current Affy arrays cost between a third and a sixteenth of what it costs to do RNA-Seq (ignoring bioinformatics costs, which we shouldn’t).
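    As a rough check on the figures quoted above, the arithmetic can be sketched in Python. The per-sample prices and read targets are taken from the comment; the lane yield of 200 million reads is an assumption inferred from "one sample per lane", and the function and constant names are hypothetical.

```python
# Per-sample figures quoted in the comment (GBP, bioinformatics excluded).
ARRAY_COST = 150
RNASEQ_COST = {"gene_counts_30M": 400, "in_depth_200M": 2000}

# ASSUMPTION: one HiSeq lane yields about 200 million reads, inferred from
# "one sample per lane" at the 200M-read depth; real yields vary by run.
LANE_YIELD_READS = 200_000_000


def samples_per_lane(reads_per_sample, lane_yield=LANE_YIELD_READS):
    """Whole samples that fit in one lane at the requested read depth."""
    return lane_yield // reads_per_sample


def cost_ratio(rnaseq_cost, array_cost=ARRAY_COST):
    """How many times more one RNA-Seq sample costs than one array."""
    return rnaseq_cost / array_cost


if __name__ == "__main__":
    print(samples_per_lane(30_000_000))   # multiplexed gene counting
    print(samples_per_lane(200_000_000))  # in-depth analysis
    for label, cost in RNASEQ_COST.items():
        print(label, round(cost_ratio(cost), 1))
```

    Under these assumptions the sketch gives 6 and 1 samples per lane, matching the comment, and cost ratios of about 2.7 and 13.3, so the array comes out at a little over a third and roughly a thirteenth of the RNA-Seq price.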

    Bear in mind that doing an Affymetrix experiment with 16 biological replicates is far more relevant than doing an RNA-Seq experiment with one replicate, in my opinion. NGS seems to have forgotten about biological replication.

    Now, you may or may not know about the Affymetrix GeneTitan instrument. This instrument takes two 96-well plates. In each well, you can have an array of 700,000 spots. That’s 192 * 700,000 assays in a few hours. I have a feeling that performing gene expression by array is about to get cheaper by an order of magnitude, which changes the numbers above even more.

    Moving on, I have to say Affy gene expression data is beautiful – highly reproducible, tight standard deviations etc. There is also 15 years’ worth of statistical research behind it, meaning that we understand gene expression data very well. There are a number of very well characterised analysis techniques, easy to run and well understood.

    Contrast that with RNA-Seq data, where DESeq, edgeR and Cufflinks all come up with completely different p-values, and you realise that, as yet, we do not understand RNA-Seq counts data anywhere near as well as we understand array data. You can then add in the emerging evidence that when one multiplexes Illumina RNA sequencing, one introduces bias (http://genome.cshlp.org/content/21/9/1506.full; http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203936/). I have also seen fairly convincing evidence from UEA, where they sequenced random oligos, that showed that Illumina sequencing itself introduces biases. You have to start to think that actually, RNA-Seq, wonderful, beautiful, amazing technology that it is, is not quite the answer to everything.

    When people come to me with array experiments, I ask them why they don’t want to do RNA-Seq. I love RNA-Seq. But arrays definitely have their place. I’m not a luddite; I can see the writing on the wall for arrays. But I do fear we are throwing away something we know works, for something we don’t yet understand as well as we need to.

    Disclaimer: these thoughts are my own, and do not represent the views of Roslin, or ARK-Genomics

    • Generally, I’m not going to argue with most of your points, as I love playing devil’s advocate as well, but I do want to respond strongly to one of your points:

      “You may not know about the Affymetrix GeneTitan instrument [...]”

      You’re right, I didn’t know about it – but your concluding point is that it provides a 1 order of magnitude improvement… which isn’t a small thing, really…

      But look at the chart above: How many orders of magnitude does the illustration cover? And how often do they occur? The only way Affy can stay competitive is to offer a competitive number of them at a competitive frequency.

      I just don’t see that happening.

  2. Thanks for an excellent response, Mick. Our (USA) costs do not show as dramatic a difference as your costs but, nevertheless, Affy is cheaper than a 30M read RNAseq and much less than a full-blown RNAseq. Our first-pass bioinformatics is “free” (in that we do not charge for it) although it should be counted into the total cost and here too I will agree that the RNAseq analysis is harder and thus should cost more.

    The point about reliability, reproducible results and good statistics is an excellent one.

    The only reason my facility is tending towards RNAseq is that we do a lot of de-novo plant and animal work. For anything more characterized Affy seems to be a better choice.

    BTW: I am *so* tired of that Moore’s law vs. sequencing cost chart. It is comparing apples to oranges. Moore’s law is simply about the size of transistors. Nothing to do with cost. If you want to say that sequencing costs have outperformed an exponential then do that. Don’t invoke Moore. As per Wikipedia:
    ———-

    Moore himself, who never intended his eponymous law to be interpreted so broadly, has quipped:

    “Moore’s law has been the name given to everything that changes exponentially. I say, if Gore invented the Internet, I invented the exponential.”

    • For the record, I also hate the Moore’s law vs sequencing comparison – but it illustrates a very key point: Other industries have a constant, gentle slope with which their basic unit of information increases. Sequencing has moved by fits and starts for a few years, but has completely outstripped that “gentle” exponential change.

      I think the problem is that Affy’s price per base is going to be undercut by next gen sequencing a heck of a lot sooner than they are prepared for – particularly if they’re running around claiming that their continued existence is proof that I’m wrong.

      When the train is coming right at you, the only solution is to move off the tracks – not to yell at the top of your lungs that you’re still standing still.

  3. I suppose I should really reply to the original post instead of replying to Mick’s reply and then bringing up a personal rant. :-)

    The original post said:
    “… Furthermore, I didn’t say that arrays should disappear entirely, but rather that they should just be dropped for cutting edge research. …”

    The above implies that there is a single “cutting edge”. I am not sure if I agree with this. If “cutting edge” means finding new features that aren’t on chips, or that cannot be discovered by chips, then, sure, NGS is good. However, as Mick implies, the ability to run multiple biological samples with good statistics behind it can lead to interesting and new discoveries. That could also be considered “cutting edge”. Or simply diagnostics.

    The original post continues: “By the way, if anyone knows the cost per base of sequencing for an Affymetrix chip, obtaining the same dynamic range and error profile as an Illumina platform run, I would love to know that number.”

    I think that we need to focus on “cost per sample” and not per base. If NGS needs 30M reads for a half-way useful sample while Affy gets by with lots less, then “cost per base” is not very relevant. As for the actual number you might think that being part of a sequencing facility I could come up with some numbers. However they are proving to be difficult to retrieve and compare. But I will keep looking.

    • Thanks for the comment, Rick,

      I totally agree that there isn’t just one cutting edge, but I would ask the follow-up question: which one are arrays on? Exon capture is great, but really, is that pushing the limits of NGS? I would argue not, as it’s likely to be just a stepping stone on the way to better, more complete methods.

      Unfortunately, I disagree with “Cost per sample”, as a good metric, unless your methods are sequencing the same number of bases. If I sequence 2 genes, versus sequencing 20,000 genes, the cost per sample might be identical, but the information content is dramatically lower.

      Perhaps a better metric would be price per variant, but then it should include Structural Variants, alternate splicing, and all of the information you get “for free” out of an RNA seq run.

  4. In agriculture, arrays are still fitting the need of cutting edge research but only in (as yet still) poorly characterized genomes. Bovine is an exception that I think fits the fejes postulate but many other genomes are not yet well known. The more these genome researchers gain access to low cost sequencing to hammer out the genome structure, the more the array work will move to a screening application in agriculture as well. Bovine is the new human (like orange is the “new black”–sorry maybe that was last year) but for wheat? Not so much.

    • heh… wheat will be the new arabidopsis soon, too, I’m sure. Won’t be long till there’s a reference scaffold for many different plants, even if the polyploidy is a complicating factor. Longer reads will probably help sort them out quickly.

  5. It’s always funny to see scientists be offended by marketing campaigns or web site citations. I don’t blame them; I hear there is a slew of scientists who would be happy to be wined and dined to repeat such useless mantras as “the instrument is the chip” or “The new Moore’s law” or “Do you want a free T-Shirt”. The reality is that most scientists rarely have a full view of the corporate vision of a company; they stick to one side and roll with it. Case in point, cost of sequencing. The fact is, running a human genome could be free for all I care, yet it does only one thing: create a new information jam that is 10,000 fold bigger than hiring cheap labor to change buffers and make e-pcr with a benchtop toy. That is, of course, when scientists do not view the other realities: a company such as the ones you mention, be it AFFY, ILMN or Life, needs to feed its CEO’s hunger for cash. For example, the Life CEO makes 35 million a year by pretty much firing people every month. So, he will likely engage in telling you that everyone and his grandmother needs an NGS because of the free cost, but he will also remind you, in some subversive ways, how he must continue to grow. Every time the cost of sequencing goes down, 100s of people get fired, 1000s of samples stack up and die an early death on computers and, frankly, little gets done other than constantly increasing that firepower to new standards. In Canada for example, because of the way our great CFI funding is set up, there are many PGMs on benches but only a handful actually running (yes, Canadians love toys but do not use them very much).

    It doesn’t really serve human health and absolutely is not a model for clinical adoption. Case in point, most of your PGMs will probably become obsolete in about 6 months but no rep will tell you that, for they need to squeeze every dollar and cent into their quarter in a very short-term vision. They are probably hounding you at this very second, knocking at your door, waiting to see your reaction to a new promo they have with a free something… So you play the game, for the bestest cheapestest lowestest costing machine, until a new one comes up. Unless of course, you stay patient and keep using standard proven technology at somewhat of a higher cost but with a bigger scientific and biological reference base. Just keep in mind that the rep who didn’t get that PO for the new shiny toy probably got destroyed and removed by his 35-million-dollar CEO… So scientists being pissed off at manufacturers, I dunno, a lot of them sure play the game…

    • I understand your comment, but I fail to see your point.

      Personally, I resent being misrepresented – it’s as simple as that. If any company wants to highlight something I’ve said, they’re welcome to do so, as long as they don’t violate my copyright, or misrepresent what I’ve said.

      Whether they fire sales people for doing their job poorly or not is immaterial to this particular argument – however, sales is a tough job, and I respect those who do it ethically.

  6. Seems to me, that you actually believe the chart on the costs of sequencing or the penny pinching philosophy… I remember back in the days, when oligos were made on site, super high quality, and now this worthless Walmart industry where reps bring almost nothing to the table but a battle of cents per nucleotide. This is somewhat the path that your argument takes, cheaper is better but the reality here is that the volumes expected with cheaper NGS have simply not evolved. True costs are not in the toys, they are what you make of the data. So I feel your chart falls right into the hands of NGS manufacturers, I have seen that chart being used by many of them, I’m not seeing you write that Life or ILMN suck because of the ridiculous predictions they make. Yet they certainly suck in their own way…
    Scientists are becoming bulimic and I don’t see a whole lot of good sound solid breakthroughs out there…
    I understand your point of view, but I feel scientists should start being wary of trends based on costs. This is a dangerous argument and one that is often created by the industry.

    • Hi Jonny,

      The chart is real, so I’m not sure that it matters whether I believe in it or not. This isn’t religion, it’s science – and the numbers tell you something.

      Really, if you want to claim that 4 orders of magnitude of cost reduction isn’t going to drive a significant change in any industry, I don’t think there’s much to discuss. Besides, the quality of the sequencing has also improved dramatically since 2007. To say otherwise is just denying reality.

      And, as for a lack of breakthroughs, I’m really not sure what journals you’re reading. I’ve seen plenty of them, and they’ve been fast and furious in the fields of cancer and Mendelian diseases.

      Clearly, this isn’t just a matter of cost-per-base, but even superficially, as a metric, it gives you a chance to illustrate some of the changes driven by NGS.

  7. Funny, then, that the NGS forecast is now considered saturated; what do you make of this if it is cutting edge? Aren’t we now into sequencing maggots and stuff like that? What will follow the table top sequencer? A watch sequencer, a pendant sequencer? A sequencer tong? I’m sure Pac Bio could fit a sequencer in a tong? No?

    • NGS is saturated? Where on earth did you hear that?

      Whatever source you have, they’ve clearly failed to see the bigger picture. As the price drops on sequencing, research institutes will continue to upgrade to access the better price/quality products that are being produced – they can’t afford not to if they want to continue publishing.

      Medical and Diagnostics uses of NGS are clearly on the horizon, which is still an untapped market.

      Your claim that NGS is saturated is highly similar to the early 1943 quote: “I think there is a world market for maybe five computers.” – Thomas Watson (1874-1956), Chairman of IBM.

      Watson changed his tune as new segments of the market opened up.

      • I have been genotyping the CFTR gene for 20 years… So I like to quote Francis Collins and all the amazing GWAS papers of the last decades: “I see a patient who, in 2010, is prescribed a prophylactic drug regimen based on the knowledge of [his or her] personal genetic data”.
        Not even on the radar screen.

          • Anthony, you have to be serious, even the Jet Propulsion Lab has had a better return than that? I am a scientist but we need to stop living in this parallel world where millions of dollars are used to keep ‘publishing’. Hence my point that we are guilty of corporate marketing in a way; scientists have created ridiculous expectations, Wall Street style.

        • Ok, I think I’m done with this conversation. You tell me there are no drugs on the horizon that make use of genomic data, and when I give you an example, you just say “not good enough”… which has pretty much been your response every time I refute one of your points.

          Now you want me to argue about the expense of publishing data? How do you even begin to think that’s relevant to this topic?

          Seriously, what is your point here? Are you just trying to troll me?

    • I don’t call people a troll unless they’re trolling – but I still don’t know that I’d take an array over NGS, as your choice of platform should be highly dependent on the question being asked. Do you want to look for something you already know about, or are you trying to find out something new? Suggesting you know what platform you’d prefer without discussing the question is near (but not quite) troll-like behaviour.
