Whole genome sequencing of human herpesviruses
Anne Palser, Wellcome Trust Sanger Institute. Sponsored by Agilent Technologies
Herpesvirus review: dsDNA, enveloped viruses. Three major subfamilies: alpha, beta, gamma.
Diseases include Kaposi’s sarcoma (KSHV, 140 kb genome) and Burkitt’s lymphoma (EBV, 170 kb genome).
Hard to isolate the viruses for sequencing. In some clinical samples, not all cells are infected, so when you sequence samples you get more human DNA than virus DNA. Little is known about genome diversity. All existing sequences come from cell lines and tumours. There is no wild-type full genome sequence.
Target enrichment method used to try to enrich for virus DNA.
Cell line samples used. Tried 5 primary effusion lymphoma cell lines (3 have EBV, all 5 have KSHV) and 2 Burkitt’s lymphoma cell lines (EBV).
Custom baits designed as 120-mers, with each base covered by 5 probes for KSHV. A similar design was done for EBV1 and EBV2. [skipping some details of how this was done.]
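The "each base covered by 5 probes" design can be illustrated with a small tiling sketch. The talk did not say how the baits were laid out, so the evenly spaced offset of 120 / 5 = 24 nt is an assumption, as are the function names and the toy sequence. Treat it as a rough illustration, not the actual SureSelect design procedure.

```python
def tile_baits(genome, bait_len=120, depth=5):
    """Return (start, sequence) baits tiled so interior bases get `depth`-fold coverage.

    Assumption: baits are evenly offset by bait_len // depth (24 nt for
    120-mers at 5x). The genome is treated as linear, so coverage tapers
    at the two ends.
    """
    step = bait_len // depth
    last_start = len(genome) - bait_len
    if last_start < 0:
        raise ValueError("genome is shorter than one bait")
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        # Add a final bait flush with the end so the last bases are covered.
        starts.append(last_start)
    return [(s, genome[s:s + bait_len]) for s in starts]


def coverage(genome_len, baits, bait_len=120):
    """Count how many baits cover each position."""
    cov = [0] * genome_len
    for start, _ in baits:
        for i in range(start, start + bait_len):
            cov[i] += 1
    return cov


# Toy 1.2 kb sequence standing in for a viral genome (hypothetical data).
toy_genome = "ACGT" * 300
baits = tile_baits(toy_genome)
cov = coverage(len(toy_genome), baits)
```

With this layout, every base further than 96 nt from either end of the toy sequence is covered by exactly 5 baits.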
Flow chart for “SureSelect target enrichment system capture process” from agilent.com illustration.
Multiplexed 6 samples per lane. Sequenced on Illumina GAII.
Walk through analysis pipeline. Bowtie and Samtools used at final stages.
Specific capture of virus DNA.
- KSHV: 77-91% of reads map to the reference sequence. Capture looked good.
- EBV: 52-82% of reads map to the reference sequence.
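A "percent of reads mapping to the reference" figure like the ones above can be computed from aligner output in SAM format. The sketch below is an assumption about how such a number might be derived and was not shown in the talk. It relies only on the SAM flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary). The reference name `KSHV_ref` and the toy records are hypothetical.

```python
def mapped_fraction(sam_lines):
    """Fraction of primary read records that are mapped, from SAM text lines."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0


# Toy records: three mapped reads, one unmapped read, one secondary alignment.
toy_sam = [
    "@SQ\tSN:KSHV_ref\tLN:140000",
    "r1\t0\tKSHV_ref\t100\t42\t50M\t*\t0\t0\t*\t*",
    "r2\t16\tKSHV_ref\t500\t42\t50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t0\tKSHV_ref\t900\t42\t50M\t*\t0\t0\t*\t*",
    "r4\t256\tKSHV_ref\t7000\t0\t50M\t*\t0\t0\t*\t*",
]
fraction = mapped_fraction(toy_sam)
```

Here `fraction` is 0.75, because the secondary alignment of `r4` is not counted as a separate read.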
Coverage looks good, and high for most of the genome. Typical for viral sequencing.
SNPs relative to the reference sequence: 500-700 for KSHV, 2-2.5k for EBV. Nice Circos-like figure showing distribution.
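To put those counts on a common scale, they can be divided by the approximate genome sizes quoted earlier (140 kb for KSHV, 170 kb for EBV). This per-kb density was not presented in the talk. It is back-of-envelope arithmetic on the numbers in these notes, reading "2-2.5k" as 2,000-2,500.

```python
# Approximate genome sizes (kb) and SNP count ranges as quoted in these notes.
GENOME_KB = {"KSHV": 140, "EBV": 170}
SNP_RANGE = {"KSHV": (500, 700), "EBV": (2000, 2500)}


def snps_per_kb(virus):
    """Return (low, high) SNP density per kb for the given virus."""
    low, high = SNP_RANGE[virus]
    size_kb = GENOME_KB[virus]
    return low / size_kb, high / size_kb


densities = {virus: snps_per_kb(virus) for virus in GENOME_KB}
```

That works out to roughly 3.6 to 5 SNPs per kb for KSHV and about 12 to 15 per kb for EBV, so on these figures the EBV samples sit about three times further from their reference.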
- Custom SureSelect capture successfully isolates virus DNA from human DNA.
- Full genome sequences of the viruses obtained.
- Analysing SNPs and minority species present.
- Currently looking at saliva samples to estimate genomic diversity.
- looking at clinical pathologies
- high throughput, cost effective, applicable as a method to analyse other pathogen sequences.