PgmNr Y3187: Assembling whole eukaryotic genomes from mixed microbial communities using Hi-C.

Authors:
I. Liachko 1 ; J. Burton 1 ; L. Sycuro 2 ; A. Wiser 2 ; D. Fredricks 2 ; J. Shendure 1 ; M. Dunham 1


Institutes:
1) University of Washington, Seattle, WA; 2) Fred Hutchinson Cancer Research Center, Seattle, WA.


Keyword: Other Yeasts

Abstract:

De novo assembly of whole genomes from next-generation sequencing data is hindered by the lack of long-range contiguity information in short-read shotgun sequencing. This limitation also impedes metagenome assembly, since one cannot tell which sequences originate from the same species within a mixed population. We have overcome these bottlenecks by adapting a chromosome conformation capture technique (Hi-C) for the deconvolution of metagenomes and the scaffolding of de novo assemblies of complex genomes.

Chromosome conformation capture techniques such as Hi-C were developed to model the 3D structure of a genome by measuring long-range interactions of DNA molecules in physical space. These methods crosslink chromatin in intact cells and then perform intra-molecular ligation, joining DNA fragments that were physically nearby at the time of crosslinking. Subsequent deep sequencing of these DNA junctions generates a genome-wide contact probability map that allows the 3D modeling of genomic conformation within a cell. Because Hi-C signal is strongly enriched between loci that lie near each other on the same chromosome, it can be used to scaffold entire chromosomes from fragmented draft assemblies. Because crosslinking takes place in intact cells, Hi-C signal also preserves the cellular origin of each DNA fragment and its interacting partner, allowing for deconvolution and assembly of multi-chromosome genomes from a mixed population of organisms.
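
The deconvolution idea can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: it assumes hypothetical inputs (a dictionary of contig lengths and a dictionary of Hi-C read-pair counts between contigs), normalizes link counts by the product of contig lengths, and substitutes a simple threshold plus union-find grouping for a full clustering algorithm. The function name, parameter names, and threshold are illustrative assumptions.

```python
from collections import defaultdict


def cluster_contigs(lengths, links, min_density):
    """Group contigs that share enough Hi-C links to plausibly come from one cell.

    lengths:     dict contig -> length in bp (hypothetical input)
    links:       dict (contig_a, contig_b) -> number of Hi-C read pairs joining them
    min_density: minimum links per kb^2 for two contigs to be merged (assumed cutoff)
    """
    parent = {c: c for c in lengths}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), n in links.items():
        # Normalize by the product of contig lengths (in kb) so that
        # long contigs do not attract merges simply by being long.
        density = n / ((lengths[a] / 1000) * (lengths[b] / 1000))
        if density >= min_density:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for c in lengths:
        clusters[find(c)].add(c)
    return sorted(sorted(members) for members in clusters.values())


# Toy data: c1/c2 and c3/c4 are densely linked; c2/c3 share only stray links.
lengths = {"c1": 10_000, "c2": 20_000, "c3": 10_000, "c4": 5_000}
links = {("c1", "c2"): 400, ("c3", "c4"): 100, ("c2", "c3"): 2}
groups = cluster_contigs(lengths, links, min_density=0.5)
print(groups)
```

In this toy input the two densely linked pairs form separate groups, while the sparse c2/c3 links fall below the cutoff and are treated as noise. Real data would call for a more careful normalization (for example by restriction site count) and a clustering method that does not depend on a single hard threshold.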

We have used Hi-C to scaffold high-quality genomes of animals, plants, and fungi, as well as bacteria and archaea, from very fragmented de novo assemblies. We have also been able to use these data to annotate functional features of microbial genomes, such as centromeres in many fungal species, including over a dozen yeasts. Additionally, we have applied our technology to diverse metagenomic populations such as craft beer, bacterial vaginosis infections, soil, and tree endophyte samples to discover and assemble the genomes of novel strains of known species as well as novel prokaryotes and eukaryotes. This method's ability to reconstruct multi-chromosome genomes has led to the discovery of novel yeast hybrids directly from mixed communities.

The high quality of Hi-C-based assemblies allows the simultaneous assembly and scaffolding of numerous unculturable genomes, placement of plasmids within host genomes, and microbial strain deconvolution in a way not possible with other methods.
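
Plasmid placement can be sketched in the same spirit. The Python example below is an assumption-laden illustration rather than the method used here: it takes hypothetical genome clusters, contig lengths, and Hi-C link counts, and assigns a plasmid contig to the cluster with which it shares the most links per kb of cluster sequence. All names are illustrative.

```python
def assign_plasmid(plasmid, clusters, lengths, links):
    """Return the name of the genome cluster most densely linked to a plasmid contig.

    plasmid:  name of the plasmid contig
    clusters: dict cluster_name -> list of member contigs (hypothetical input)
    lengths:  dict contig -> length in bp
    links:    dict (contig_a, contig_b) -> number of Hi-C read pairs joining them
    Returns None when the plasmid shares no links with any cluster.
    """
    best_name, best_density = None, 0.0
    for name, members in clusters.items():
        # Link pairs may be stored in either orientation, so check both.
        total_links = sum(
            links.get((plasmid, m), 0) + links.get((m, plasmid), 0) for m in members
        )
        # Normalize by cluster size (kb) so large genomes are not favored by default.
        total_kb = sum(lengths[m] for m in members) / 1000
        density = total_links / total_kb
        if density > best_density:
            best_name, best_density = name, density
    return best_name


# Toy data: plasmid p1 shares 50 links with genome_A (100 kb) and 2 with genome_B (20 kb).
clusters = {"genome_A": ["a1", "a2"], "genome_B": ["b1"]}
lengths = {"a1": 50_000, "a2": 50_000, "b1": 20_000}
links = {("p1", "a1"): 30, ("a2", "p1"): 20, ("p1", "b1"): 2}
host = assign_plasmid("p1", clusters, lengths, links)
print(host)
```

A plasmid carried by several hosts would show comparable link densities to more than one cluster, so a practical version would report all clusters above a cutoff instead of a single best match.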

- Burton JN, Liachko I, Dunham MJ, Shendure J. Species-level deconvolution of metagenome assemblies with Hi-C-based contact probability maps. G3. 2014, May 22;4(7):1339-46.

- Varoquaux N, Liachko I, Ay F, Burton JN, Shendure J, Dunham MJ, Vert JP, Noble WS. Accurate identification of centromere locations in yeast genomes using Hi-C. Nucleic Acids Res. 2015, Jun 23;43(11):5331-9.