AGBT talk: Zhong Wang, Joint Genome Institute

Title: Massive Metagenomic Discovery of Biomass-Degrading Genes and Genomes from Cow Rumen

First slide: “biofuels”, “cellulosic Ethanol”, and “genomics” [I think I see where this is going. This was the hot topic in 2003, but I haven’t heard much about it since.]

Overview of lignocellulose structure and cellulase.  [My shorthand – lignocellulose is broken down by a whole series of enzymes, each of which breaks a different bond in the chain depending on branch points, etc.  It is also in a semi-crystalline state, which is hard to break down.]

All cellulase we use industrially comes from one source: fungi.

[Oh my… fistulated cow.  I remember that from visiting universities when I was in high school. It’s a cow with a hole in its side so you can get into its stomachs any time.]

Using cow to digest switchgrass, and looking for microbes that do the breakdown.

[Odd, wouldn’t you want to do that with corn cellulose, which is plentiful and a waste product of animal feed, etc.?]

Did 3 billion reads, 300,000Mb  (1/4 TB of sequence).  Hoping to find new enzymes in this.  [On the other hand, wtf do you do with that much sequence?]  This was like a monster!
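[A quick sanity check on those numbers, as a minimal sketch. It uses only the two figures quoted above; the one-byte-per-base storage assumption is mine, not the speaker's.]

```python
# Figures as quoted in the talk notes.
total_reads = 3_000_000_000   # "3 billion reads"
total_megabases = 300_000     # "300,000Mb"

total_bases = total_megabases * 1_000_000

# Implied average read length in bases.
mean_read_length = total_bases / total_reads

# Same total expressed in gigabases.
total_gigabases = total_bases / 1e9

# Assumption (mine): one byte per base, ignoring quality scores and headers.
decimal_terabytes = total_bases / 1e12    # 1 TB  = 10**12 bytes
binary_tebibytes = total_bases / 2**40    # 1 TiB = 2**40 bytes

print(f"mean read length: {mean_read_length:.0f} bases")
print(f"total sequence:   {total_gigabases:.0f} Gb")
print(f"storage estimate: {decimal_terabytes:.2f} TB ({binary_tebibytes:.2f} TiB)")
```

[So the figures imply reads of about 100 bases each, and the "1/4 TB" is roughly what you get if you count in binary terabytes.]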

Taming the monster: prediction needed huge hardware. [skipping this…] More cellulases found than in other studies. A comparison with the Carbohydrate-Active Enzymes (CAZy) database: more were found in the rumen than were collected in the database between 1975 and 2009.

Diversity: very pretty picture of family tree of cellulases.  Found many new branches – and those found were highly diverged [which makes sense to me, since the microbiome sequencing talk this morning said that gut bacteria were the most strongly diverged…]

Functional validation.  Panel of cellulase substrates, plus cow rumen enzymes.  The higher the activity, the more novel.

Did they get to the bottom of the metagenome? No: the saturation plot is linear and never levels off.
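[For anyone unfamiliar with saturation (rarefaction) plots, here is a minimal sketch of how one is built. The gene labels are simulated toy data and the function name is hypothetical; this is not the speaker's pipeline, just the general idea of counting distinct items as sampling depth grows.]

```python
import random


def rarefaction_curve(labels, n_points=10, seed=0):
    """Return (depth, distinct_labels_seen) pairs for a shuffled sample."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    step = max(1, len(shuffled) // n_points)
    seen = set()
    curve = []
    for depth, label in enumerate(shuffled, start=1):
        seen.add(label)
        # Record a point every `step` items and at the very end.
        if depth % step == 0 or depth == len(shuffled):
            curve.append((depth, len(seen)))
    return curve


# Toy data: 20,000 hits drawn from a pool of 50,000 possible genes, so the
# sample is far too shallow to see everything in the pool.
toy_rng = random.Random(1)
toy_hits = [f"gene_{toy_rng.randrange(50_000)}" for _ in range(20_000)]

curve = rarefaction_curve(toy_hits)
for depth, distinct in curve:
    print(depth, distinct)
```

[If the community had been sampled to the bottom, the distinct count would flatten out as depth increases. A curve that keeps climbing means more sequencing would keep turning up new genes.]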

Image “Look what I found in the cow!”

Summary: a large number of cellulases were predicted, found and tested and many have excellent potential for new industrial uses.

Community complexity: cow rumen is intermediate between extreme environments (mine water runoff) and soil communities.

Assembly: Used Velvet, 1.93 Gb of sequence assembled. 47 scaffolds match NCBI, which is only 0.03%.  We know very little about this community.

[On a side note, does “fistulating” a cow change the gut flora community?  That would raise other odd questions about the diversity of the cow, particularly for oxygen-sensitive members of the community, but I guess those enzymes are mostly useless to us.]

Were able to estimate completeness of some assemblies – one example shown at 89.8% with “genome binning”.  With random binning, you do worse.
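[The talk notes don't record how completeness was calculated. One common approach, which I'm assuming here purely for illustration, is to check what fraction of an expected set of marker genes shows up in a bin. The marker names and function below are hypothetical.]

```python
def estimate_completeness(expected_markers, observed_genes):
    """Percent of expected marker genes that are present in a genome bin."""
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected_markers must not be empty")
    found = expected & set(observed_genes)
    return 100.0 * len(found) / len(expected)


# Hypothetical marker set and bin contents, for illustration only.
markers = [f"marker_{i}" for i in range(10)]
bin_genes = [f"marker_{i}" for i in range(9)] + ["some_other_gene"]

completeness = estimate_completeness(markers, bin_genes)
print(f"{completeness:.1f}% complete")
```

[A bin built by random assignment of scaffolds would be expected to recover fewer of the markers, which fits the remark that random binning does worse.]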

From the cow rumen metagenome, were able to assemble 15 good draft genomes (1.8-3.3 Mb).

Did “single cell genome sequencing”.  Matched reads to assembled scaffolds: they come from a single organism.  So, it works.

Conclusion: despite super deep sequencing, were only able to assemble 15 genomes.  Pac Bio may help.  Have already tested some Pac Bio long reads, which do help further assemble.  90% of Pac Bio reads to validate and resolve outstanding assembly problems.

[Neat and thought-provoking talk!]

Question: Have you sampled other cows? (Nope, this was all from one cow!)
