AGBT 2010 – Complete Genomics Workshop

Complete Genomics CEO:

Mission:
– sequence only human genomes – 1 Million genomes in the next 5 years
– build out tools to gain a good understanding of the human genome
– done 50 genomes last year
– Recent Science publication
– expect to do 500 genomes/month

Lots of Customers.
– Deep projects

Technology
– don’t waste pixels
– use ligases to read
– very high quality reads – low cost reagents
– provide all bioinformatics to customers

Business
– don’t sell technology, just results.
– just return all the processed calls (snps, snv, sv, etc)
– more efficient to outsource the “engineering” for groups who just want to do biology
– fedex sample, get back results.
– high throughput “on demand” sequencing
– 10 centres around the world
– Sequence 1 Million genomes to “break the back” of the research problem

Value add
– they do the bioinformatics

Waves:
– first wave: understand functional genomics
– second wave: pharmaceutical – patient stratification
– third wave: personal genomics – use that for treatment

Focus on research community

Two customers to present results:
First Customer:

Jared Roach, Senior Research Scientist, Institute for Systems Biology (Rare Genetic disease study)

Miller Syndrome
– studied coverage in four genomes
– 85-92% of genome
– 96% coverage in at least one individual
– Excellent coverage in unique regions.

Breakpoint resolution
– within 25bp, and some places down to 10bp
– identified 125 breakpoints
– 90/125 occur at hotspots
– can reconstruct breakpoints in the family

Since they have twins, they can do some nice tests
– infer error rate: 1×10^-5
– excluded regions with compression blocks (error goes up to 1.1×10^-5)
– Homozygous only: 8.0×10^-6 (greater than 90% of genome)
– Heterozygous only: 1.7×10^-4
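
[A minimal sketch of the twin comparison idea, not the speaker's actual pipeline: identical twins should carry the same genotype at nearly every position, so calls that disagree between them can be counted as probable errors. The function name, the dictionary layout and the toy genotypes are all hypothetical, and a real analysis would also have to account for either twin being the wrong one.]

```python
def discordance_rate(calls_a, calls_b):
    """Fraction of positions called in both twins where the genotypes differ.

    calls_a / calls_b map (chromosome, position) -> tuple of two alleles.
    This layout is an assumption made for illustration only.
    """
    compared = 0
    discordant = 0
    for site, genotype_a in calls_a.items():
        genotype_b = calls_b.get(site)
        if genotype_b is None:
            continue  # position not called in the second twin
        compared += 1
        # Allele order carries no meaning, so compare sorted alleles.
        if sorted(genotype_a) != sorted(genotype_b):
            discordant += 1
    if compared == 0:
        raise ValueError("no positions were called in both genomes")
    return discordant / compared


# Toy data: three shared positions, one of which disagrees.
twin_a = {
    ("chr1", 100): ("A", "A"),
    ("chr1", 200): ("A", "G"),
    ("chr2", 50): ("C", "C"),
    ("chr2", 75): ("T", "T"),  # not called in twin B, so ignored
}
twin_b = {
    ("chr1", 100): ("A", "A"),
    ("chr1", 200): ("G", "A"),  # same genotype, different allele order
    ("chr2", 50): ("C", "T"),   # discordant call
}

rate = discordance_rate(twin_a, twin_b)
print(f"discordance rate: {rate:.3f}")
```

[Splitting the same count by homozygous and heterozygous calls would give separate rates like the ones quoted above.]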

[Discussion of genes found – no names, so there’s no point in taking notes. They claim they get results that make sense.]

[Time’s up – on to next speaker.]

Second Customer:
Zemin Zhang, Senior Scientist, Genentech/Roche (Lung Cancer Study)

Cancer and Mutations
[Skipping overview of what cancer is…. I think that’s been well covered elsewhere.]

Objective:
– lung cancer is the leading cause of cancer related mortality worldwide…
– significant unmet need for treatment

Start with one patient
– non small cell lung adenocarcinoma.
– 25 cigarettes/day
– tumour: 95% cancer cells

Genomic characterization on Affy and Agilent arrays
– lots of CNV and LOH
– circos diagrams!

– 131Gb mapped sequence in normal, 171Gb mapped sequence in tumour
– 46x coverage normal, 60x tumour
[Skipping some info on coverage…]
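
[A quick sanity check on those numbers: mean coverage is just mapped bases divided by genome size. Reading both sequence totals as gigabases, the sketch below reproduces the quoted fold coverage. The 2.85 Gb genome size is my assumption, chosen because it is roughly the size of the mappable human reference; it was not stated in the talk.]

```python
def mean_coverage(mapped_bases, genome_size):
    """Average fold coverage: total mapped bases divided by genome size."""
    return mapped_bases / genome_size


GENOME_SIZE = 2.85e9  # assumption: approximate mappable human genome, in bases

normal = mean_coverage(131e9, GENOME_SIZE)  # 131 Gb mapped in normal
tumour = mean_coverage(171e9, GENOME_SIZE)  # 171 Gb mapped in tumour

print(f"normal: {normal:.1f}x, tumour: {tumour:.1f}x")
```

[Both values round to the 46x and 60x given on the slide.]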

KRAS G12C mutation

What about the rest of the 2.7M SNVs?
– SomaticScore predicts SNV validation rates
– 67% are somatic by prediction
– more than 50,000 somatic SNV are projected

Selection and bias observed in the lung cancer genome by comparing somatic and germline mutations

GC to TA changes: Tobacco-associated DNA damage signature
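
[For anyone wanting to tabulate a signature like this: a G>T change on one strand is a C>A change on the other, so substitutions are usually collapsed into six classes before counting. The sketch below follows the common convention of reporting from the pyrimidine (C or T) strand. The function name and the toy SNV list are hypothetical, not data from the talk.]

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Collapse a substitution into one of six strand-independent classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change as seen on the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


# Toy SNVs as (reference base, alternate base) pairs.
snvs = [("G", "T"), ("C", "A"), ("A", "G"), ("T", "C"), ("G", "A")]
spectrum = Counter(substitution_class(ref, alt) for ref, alt in snvs)

for change, count in sorted(spectrum.items()):
    print(change, count)
```

[In this scheme the GC to TA changes all land in the C>A class, so an excess there is what the tobacco signature would look like.]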

Protection against mutations in coding and promoter regions.
– look at coding regions only – far fewer mutations than expected – there is probably strong selection pressure and/or repair

Fewer mutations in expressed genes.
– expressed genes have fewer mutations, with even fewer on the transcribed strand
– non-expressed genes have mutation rate similar to non-genic regions

Positive selection in subsets of genes
– KRAS is the only previously known mutation
– Genes also mutated in other lung cancers…
– etc

Finding structural variation by paired end reads
– median dist between pairs 300bp.
– distance almost never goes beyond 1kb.

Look for clusters of sequence reads where one arm is on a different chromosome or more than 1kb away
– small number of reads
– 23 inter-chr
– 56 intra-chr
– use FISH + PCR
– validate results
– 43/65 test cases are found to be somatic and have nucleotide level breakpoint junctions
– chr 4 to 9 translocation
– 50% of cells showed this fusion (FISH)
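
[A rough sketch of the discordant-pair idea described above, under my own assumptions rather than anything shown in the talk: flag pairs whose arms map to different chromosomes or more than 1kb apart, then keep only locations supported by several such pairs. The `Pair` record, the crude 1kb binning and the `min_support` threshold are all hypothetical; a real caller would use proper clustering and account for read orientation.]

```python
from collections import defaultdict
from typing import NamedTuple


class Pair(NamedTuple):
    """Mapped positions of the two arms of one paired-end read."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


MAX_DISTANCE = 1000  # arms almost never map more than 1kb apart
BIN_SIZE = 1000      # assumption: crude binning used to group nearby pairs


def is_discordant(pair):
    """True if the arms are on different chromosomes or over 1kb apart."""
    if pair.chrom1 != pair.chrom2:
        return True
    return abs(pair.pos2 - pair.pos1) > MAX_DISTANCE


def cluster_discordant(pairs, min_support=2):
    """Group discordant pairs by binned location; keep supported clusters."""
    bins = defaultdict(list)
    for pair in pairs:
        if is_discordant(pair):
            key = (pair.chrom1, pair.pos1 // BIN_SIZE,
                   pair.chrom2, pair.pos2 // BIN_SIZE)
            bins[key].append(pair)
    return {key: members for key, members in bins.items()
            if len(members) >= min_support}


pairs = [
    Pair("chr4", 10_100, "chr9", 50_200),    # inter-chromosomal
    Pair("chr4", 10_350, "chr9", 50_400),    # supports the same junction
    Pair("chr1", 5_000, "chr1", 5_300),      # normal 300bp pair
    Pair("chr15", 20_000, "chr15", 90_000),  # lone intra-chromosomal outlier
]

clusters = cluster_discordant(pairs)
for key, members in clusters.items():
    print(key, len(members))
```

[Requiring more than one supporting pair is what filters out isolated mis-mapped reads, which fits the small number of candidate events reported.]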

Possible scenario of Chr15 inversion and deletion investigated.
[got distracted, missed point.. oops.]

Genomic landscape:
– very nice Circos diagram
– > 1 mutation for every 3 cigarettes

In the process of doing more work with Complete Genomics
