Talking about gene regulation: it has been well studied for a long time, but only recently on a genomic scale. The field still wants comprehensive, accurate, unbiased, quantitative measurements (DNA methylation, DNA-binding proteins, mRNA), and wants them to be cheap, fast and easy to get.
Next-gen sequencing has revolutionized the field: ChIP-Seq, mRNA-Seq and Methyl-Seq are just three of the applications. These also need to be integrated with genome-wide genetic analysis.
There are many versions of each of those technologies.
RNA-Seq: 20M reads give 40 reads per 1 kb-long mRNA present at as low as 1-2 mRNAs per cell. Thus, 2-4 lanes are needed for deep transcriptome measurement. PET (paired-end) plus long reads is excellent for phasing and junctions.
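The depth figure above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes reads are sampled in proportion to transcript mass (copies times length). The defaults of 500,000 mRNA molecules per cell and a 2 kb mean length are illustrative assumptions, not numbers from the talk.

```python
def expected_reads(total_reads, transcript_len_bp, copies_per_cell,
                   mrna_per_cell=500_000, mean_len_bp=2_000):
    """Expected read count for one transcript.

    Assumes reads are drawn in proportion to transcript mass.
    mrna_per_cell and mean_len_bp are assumed values, not from the talk.
    """
    total_mass = mrna_per_cell * mean_len_bp            # total mRNA bases per cell
    transcript_mass = copies_per_cell * transcript_len_bp
    return total_reads * transcript_mass / total_mass


# A 1 kb mRNA at 1 or 2 copies per cell, sequenced to 20M reads.
for copies in (1, 2):
    print(copies, expected_reads(20_000_000, 1_000, copies))
```

Under these assumptions a 1 kb transcript at 1-2 copies per cell collects roughly 20-40 reads, the same order of magnitude as the number quoted in the talk.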
ChIP-Seq: used for transcription factors and histones, but it could also be used for any DNA-binding protein. (Explanation of how ChIP-Seq works.) Using a no-antibody control generally gives you no background [?]. ChIP without a control gets you into trouble.
Methylation: Methyl-Seq. Cut at unmethylated sites, then ligate to adaptors and fragment. Size select and run. (Many examples of how it works.)
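To illustrate why a control matters, here is a minimal sketch that compares per-window read counts in a ChIP sample against a control scaled to the same library size. The function names, the pseudocount and the 4-fold threshold are all assumptions made for illustration; the talk did not describe a specific peak-calling method.

```python
def fold_enrichment(chip_counts, control_counts, pseudocount=1.0):
    """Per-window ChIP/control ratio after scaling the control to the ChIP depth."""
    if len(chip_counts) != len(control_counts):
        raise ValueError("ChIP and control must cover the same windows")
    control_total = sum(control_counts)
    # Scale control counts so both libraries have the same total depth.
    scale = sum(chip_counts) / control_total if control_total else 1.0
    return [(c + pseudocount) / (k * scale + pseudocount)
            for c, k in zip(chip_counts, control_counts)]


def call_enriched(chip_counts, control_counts, min_fold=4.0):
    """Indices of windows whose fold enrichment reaches min_fold (assumed cutoff)."""
    folds = fold_enrichment(chip_counts, control_counts)
    return [i for i, f in enumerate(folds) if f >= min_fold]


# Toy counts: window 4 is high in BOTH samples, so it is not a real peak.
chip = [5, 6, 60, 5, 50, 4]
control = [5, 5, 6, 5, 45, 4]
print(call_enriched(chip, control))
```

Without the control, windows 2 and 4 would both look like peaks. With it, only window 2 survives, because window 4 is equally high in the control.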
Studying human embryonic stem cells. (The cell lines are old and very different; hopefully there will be new ones available soon.) Using it to compare gene expression with methylation status: when you cluster by gene expression, they cluster by pathway. The DNA methylation patterns did not correlate well with that, and followed individual cell lines more than pathways. Thus, they believe methylation is not controlling the pathways, but that could be an artifact of the cell lines.
26,956 methylation sites. Many of them (7,572, about 28%) are in non-CpG regions.
Another study: cortisol, a steroid hormone made by the adrenal gland. It controls 2/3 of all biology, helps restore homeostasis and affects a LOT of pathways: blood pressure, blood sugar, immune suppression, etc. It fluctuates throughout the day. Pharma is very interested in this.
Levels are also tied to mood, etc.
Glucocorticoid receptor binds hormone in cytoplasm, translocates to nucleus. Activates and represses transcription of thousands of genes.
ChIP-Seq in A549: GR (- hormone): 579 peaks. GR (+ hormone): 3,608 peaks. Low levels of endogenous cortisol in the cells probably account for the background. (Of the peaks, ~60% are repressive, ~40% are inducing.) When investigating the motifs, using the top 500 hits really changes the binding site motif! It is no longer as fixed as originally thought, and this led to the discovery of new genes controlled by the GRE. They also show that there’s co-occupancy with AP1.
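The point about the motif depending on which hits are used can be illustrated with a small sketch: build a position frequency matrix from pre-aligned sites and compare the consensus from the top-ranked sites with the consensus from all sites. The sequences are invented toy data, and the helper names are hypothetical; real motif discovery would use a dedicated tool.

```python
from collections import Counter


def position_frequency_matrix(sites):
    """Count bases per column for equal-length, pre-aligned binding sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(s[i] for s in sites) for i in range(length)]


def consensus(pfm):
    """Most frequent base per column (ties resolved in A, C, G, T order)."""
    return "".join(max("ACGT", key=lambda b: col[b]) for col in pfm)


# Invented sites, ordered strongest peak first.
ranked_sites = ["AGAACA", "AGAACA", "AGAACA",
                "TGTACA", "TGTACA", "TGTACA", "TGTACC"]

print(consensus(position_frequency_matrix(ranked_sites[:3])))  # top hits only
print(consensus(position_frequency_matrix(ranked_sites)))      # all hits
```

In this toy set the top three sites give a different consensus from the full list, which is the kind of shift the speaker described when restricting to the top 500 hits.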
[Method for expression quantification: use windows over exons.]
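A minimal sketch of counting reads in windows over exons follows. The notes only say that windows over exons are used; normalizing to reads per kilobase per million mapped reads is an assumption added here, as are the function name and the half-open coordinate convention.

```python
def exon_rpkm(exons, read_starts, total_mapped_reads):
    """Reads per kb of exon per million mapped reads for one gene.

    exons: list of (start, end) half-open intervals on one chromosome.
    read_starts: start positions of reads mapped to that chromosome.
    The normalization is an assumed choice, not stated in the talk.
    """
    exon_len = sum(end - start for start, end in exons)
    if exon_len <= 0 or total_mapped_reads <= 0:
        raise ValueError("need positive exon length and read total")
    # Count reads whose start falls inside any exon window.
    in_exons = sum(1 for p in read_starts
                   if any(start <= p < end for start, end in exons))
    return in_exons / (exon_len / 1_000) / (total_mapped_reads / 1_000_000)


# Two exons totalling 1 kb; three of the five reads start inside them.
exons = [(100, 600), (1000, 1500)]
reads = [150, 599, 600, 1200, 2000]
print(exon_rpkm(exons, reads, total_mapped_reads=1_000_000))
```

Restricting the count to exon windows keeps intronic and intergenic reads from inflating the expression estimate.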
Finally: a few more little stories. Mono-allelic transcription factor binding turns out to occur frequently: only one allele is bound in ChIP, and the other is not bound at all. (In the case shown, the SNP creates a methylation site, which changes binding.) The same type of event also happens at methylation sites.
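One common way to flag mono-allelic binding is to count ChIP reads carrying each allele at a heterozygous SNP and test against the 50/50 split expected if both alleles are bound equally. The exact binomial test below is a sketch under that assumption; the talk did not say which test was used, and the function name is hypothetical.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, p=0.5):
    """Two-sided exact binomial p-value for an allelic read split.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null of equal binding (p = 0.5).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small tolerance so outcomes tied with the observed one are included.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


print(allelic_imbalance_p(10, 10))  # balanced: no evidence of imbalance
print(allelic_imbalance_p(20, 0))   # all reads from one allele
```

A site where all 20 reads come from one allele gets a very small p-value, while a 10/10 split does not.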
Still has time, so just raises the point of Copy Number Variation. Interpretation is very important, and can be skewed by CNVs. Cell lines are particularly bad for this. If you don’t model this, it will be a significant problem. They are just on the verge of incorporating this.
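As a rough illustration of how a CNV can skew interpretation, the sketch below rescales a region's read count to what it would be at the expected ploidy. This linear correction is an assumption made for illustration; the notes do not say how the group plans to model copy number.

```python
def cnv_corrected_signal(read_count, copy_number, expected_ploidy=2):
    """Rescale a read count to the expected ploidy (assumed linear model)."""
    if copy_number <= 0:
        raise ValueError("copy_number must be positive")
    return read_count * expected_ploidy / copy_number


# A region amplified to 4 copies shows 200 reads; a diploid region shows 100.
print(cnv_corrected_signal(200, copy_number=4))
print(cnv_corrected_signal(100, copy_number=2))
```

After correction both regions carry the same per-copy signal, so the apparent two-fold difference was due to copy number alone.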
They are going to 40-80M reads for RNA-Seq. Their version of RNA-Seq is good, and doesn’t give background. The deeper you go, the more you learn. Not so much with ChIP-Seq, where you saturate sooner.