Title: Massively Parallel Sequencing of Exomes and Transcriptomes in ClinSeq Participants.
ClinSeq: large-scale sequencing project of 1000 patients who have been phenotyped for clinical symptoms of coronary disease. Started in Jan 2007; participants are between 45 and 65 years old.
Nice slide illustrating balance between: Clinical data, genome breadth and # subjects. Hard to get all 3.
Project started with Targeted Gene approach, switched to Whole Exome and Whole Transcriptome. (403 exomes and 14 transcriptomes already done.)
Data analysis and workflow slide – Very similar to everyone else's – and they have a next-gen variant database. [no description given here for the db, unfortunately.] ERANGE and Cufflinks used for processing reads.
Many novel variants are singletons – most do not show up in multiple data sets. [expected, I suppose, given what we see elsewhere. Only polymorphisms (not novel) saturate quickly, by definition.]
Focus on differential allele expression: when each copy of a chromosome carries different alleles, they may be expressed differently, and that may relate to disease.
Whole exome gives you the ability to count reads and count frequency. [as you’d expect, really.] The distribution is generally similar (looks kinda like a normal distribution); the stuff in the tails is allele-specific expression.
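[A rough sketch of the read-counting idea, added for illustration. The talk did not say which statistical test was used, so the exact binomial test against a 50/50 expectation, the function names, the depth and significance thresholds, and the example counts are all assumptions, not the speaker's method.]

```python
from math import comb

def allele_fraction(ref_reads, alt_reads):
    """Fraction of reads at a heterozygous site that carry the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so floating-point ties count as "equally likely".
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))

def flag_ase(sites, alpha=0.01, min_depth=20):
    """Return (name, alt fraction, p-value) for sites whose allele balance
    departs from 50/50. alpha and min_depth are illustrative thresholds."""
    flagged = []
    for name, ref_reads, alt_reads in sites:
        depth = ref_reads + alt_reads
        if depth < min_depth:
            continue  # too few reads to say anything
        pval = binomial_two_sided_p(alt_reads, depth)
        if pval < alpha:
            flagged.append((name, allele_fraction(ref_reads, alt_reads), pval))
    return flagged

# Hypothetical read counts: (site name, reference reads, alternate reads).
sites = [("site_A", 52, 48), ("site_B", 90, 10), ("site_C", 3, 2)]
for name, fraction, pval in flag_ase(sites):
    print(name, round(fraction, 2), pval)
```

[Sites near 0.5 sit in the middle of the distribution; the ones that get flagged are the tails. A real analysis would also need multiple-testing correction and some handling of reference-allele mapping bias, neither of which is attempted here.]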
High amount of correlation of allele frequency for both variants, but at greater than 100x, you see more variation.
Example gene: ERAP2, which has previously been published and known to have differential ASE.
- refining methodologies… [I think I missed something with this point.]
- ASE is reproducible
- implementing integrative computational approaches on participants with both exome and transcriptome data.