>[I took these notes on a scrap of paper, when my laptop was starting to run low on batteries. They’re less complete than most of the other talks I’ve taken notes on, but should still give the gist of the talks. Besides, now that I’m at the airport, it’s nice to be able to lose a few pieces of scrap paper.]
Introducing the HiSeq 2000(tm)
– redefining the trajectory of sequencing
– Jared from Marketing
Overview of machine.
– real data of Genome and transcriptome
– more than 2 billion base pairs per run
– more than 25Gb per day
– uses line scanning (scan in rows, like a photocopier, instead of a whole picture at once, like a camera)
– now uses “dual surface engineering”: image both the top and bottom surface, which means you have twice as much area to form clusters
– Machine holds two individual flow cells
– flow cells are held in by a vacuum
– simple insertion – just toggle a switch through three positions – an LED lights up when you’ve turned it on.
– preconfigured reagents – bottles all stacked together: just push in the rack
– touch screen user interface
– “wizard” like set up for runs
– realtime metrics available on interface – even an iPod app (available for iPad too..)
– multimedia help will walk you through things you may not understand.
– major focus on ease of use
– it has the “simplest workflow” of any of the sequencing machines available
– tile size reduced [that’s what I wrote but I seem to recall him saying that the number of tiles is smaller, but the tiles themselves are larger?]
– 1 run can now do a 30x coverage for a cancer and a normal (one in each flow cell.)
– 2 methylomes can be done in a week
– you could do 20 RNA-Seq experiments in 4 days.
– error rates and feel of data are similar if not identical to the GAIIx.
– from a small sampling of experiments shown it looks like error rate is very slightly higher
– Demonstrated 300Gb/run, more than 25Gb per day at release
– PET 2×100 supported.
– Software is same for GAII [Although somewhere in the presentation, I heard that they are working on a new version of the pipeline (v 1.6?)… no details on it, tho.]
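[A quick sanity check of the numbers above, as a rough sketch: does a 30x tumour/normal pair fit in one run, and how long would a run take? The yield and coverage figures are the ones quoted in the talk; the ~3.1 Gb human genome size is my own assumption, and duplicates and unmapped reads are ignored.]

```python
# Figures quoted in the talk notes.
RUN_YIELD_GB = 300      # demonstrated yield per run
DAILY_YIELD_GB = 25     # "more than 25Gb per day"
COVERAGE = 30           # 30x per genome
GENOMES_PER_RUN = 2     # cancer + normal, one per flow cell

# Assumption (not from the talk): haploid human genome of ~3.1 Gb.
HUMAN_GENOME_GB = 3.1


def bases_needed_gb(coverage, genome_gb, genomes):
    """Raw bases needed, ignoring duplicates and unmapped reads."""
    return coverage * genome_gb * genomes


def max_run_days(run_yield_gb, daily_yield_gb):
    """Upper bound on run length, since the daily yield is a lower bound."""
    return run_yield_gb / daily_yield_gb


needed = bases_needed_gb(COVERAGE, HUMAN_GENOME_GB, GENOMES_PER_RUN)
days = max_run_days(RUN_YIELD_GB, DAILY_YIELD_GB)
print(f"needed: about {needed:.0f} Gb of the {RUN_YIELD_GB} Gb per run")
print(f"run length: at most {days:.0f} days")
```

[Under those assumptions the pair needs roughly 186 Gb, which fits inside the 300 Gb demonstrated yield, and a full run at 25 Gb per day or better would take 12 days or less.]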
Eliot Margulies, NHGRI/NIH Sequencing
– talking about projects today for the undiagnosed disease program
– basically same as in his earlier talk [notes are already posted.]
– use cross match to do realignment of reads that don’t map first time
– use MPG scores
[In a technology talk, I didn’t want to take notes on the experiment itself… my points are mainly on the HiSeq data.]
Data set: concordance with SNP Chips was in the range of 98% for each flow cell, 99% when both are combined (72x coverage)
– Speed: Increased throughput
– more focus on biology rather than on tweaking pipelines and bioinformatic processing. (eg, biological analysis takes front seat.)
Working on a project for Body Map 2.0 : Total human transcriptome
– 16 tissues, each PET 2x50bp, 1x75bp
– $8,900 for 1x50bp
– multiplexing will reduce cost further.
– if you only need 7M reads, you could multiplex 192 samples (on both cells, I assume), and the cost would be $46 per sample (including sequencing, not sample prep).
[which just makes the whole cost equation that much more vague in my mind… Wouldn’t it be nice to know how much it costs to do the whole process?]
[Many examples of how RNA-seq looks on HiSeq 2000 ™]
– output has 5 billion reads, 300Gb of data.
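[The multiplexing cost quoted above can be reproduced with simple division. This is only a sketch: I am assuming the $8,900 figure is the sequencing cost of the whole run shared by all 192 samples, which the talk did not spell out.]

```python
# Figures quoted in the talk notes.
RUN_COST_USD = 8900            # "$8,900 for 1x50bp" (assumed to be per run)
SAMPLES = 192                  # multiplexed across both flow cells
READS_PER_SAMPLE = 7_000_000   # "if you only need 7M reads"

# Sequencing cost only; sample prep is not included.
cost_per_sample = RUN_COST_USD / SAMPLES

# Total reads the run has to deliver for this design.
total_reads = SAMPLES * READS_PER_SAMPLE

print(f"${cost_per_sample:.2f} per sample")
print(f"{total_reads / 1e9:.2f} billion reads needed in total")
```

[That works out to about $46.35 per sample, which matches the $46 quoted, and about 1.34 billion reads in total.]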
Present a graph
– amount of sequence per run.
– looks like a “hockey stick graph”
[Shouldn’t it be sequence per machine per day? It’d still look good – and wouldn’t totally shortchange the work done on the human genome project. This is really a bad graph…. at least put it on a log scale.]
In the past 5 years:
– 10^4 scale up in throughput
– 10^7 scale up in parallelizations
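[To put those scale-ups in perspective, here is a small sketch converting them into a yearly growth factor and a doubling time. It assumes steady exponential growth over the 5 years, which is my simplification and not something claimed in the talk.]

```python
import math

YEARS = 5
THROUGHPUT_FOLD = 1e4   # 10^4 scale up in throughput
PARALLEL_FOLD = 1e7     # 10^7 scale up in parallelization


def per_year_factor(fold, years):
    """Constant yearly multiplier that compounds to `fold` over `years`."""
    return fold ** (1 / years)


def doubling_time_months(fold, years):
    """Months needed to double, assuming steady exponential growth."""
    return 12 * years * math.log(2) / math.log(fold)


for name, fold in [("throughput", THROUGHPUT_FOLD),
                   ("parallelization", PARALLEL_FOLD)]:
    print(f"{name}: x{per_year_factor(fold, YEARS):.1f} per year, "
          f"doubling every {doubling_time_months(fold, YEARS):.1f} months")
```

[Under that assumption, throughput grew about 6.3x per year (doubling roughly every 4.5 months) and parallelization about 25x per year (doubling roughly every 2.6 months).]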
Buzzwords about the future of the technology:
– “Democratizing sequencing”
– “putting it to work”