Shedding light on microbial dark matter: reference-based species identification in microbiomes.


Nicole Gay

Publication Date

Summer 2016

JAX Location

In: Student Reports, Summer 2016, Jackson Laboratory


Sequence-based analysis of microbial communities within the human microbiome has yielded valuable insight into the microbial diversity and function of different body niches. Genomic reference-based methods of taxonomic classification have identified a great diversity of microbes that inhabit different regions of the human skin. However, a significant fraction of metagenomics reads fail to map to any reference genome, often exceeding the fraction of mappable reads, which suggests that significant biodiversity remains to be discovered. This unknown sequence space is termed microbial ‘dark matter’. Here, I hypothesized that expanding the reference genome database would reduce this dark matter. To test this, I created a UNIX-based pipeline for rapid construction of reference genome databases, which I then used to analyze skin microbiome metagenomics data. I observed a significant reduction in dark matter for the vast majority of skin samples tested as a direct result of including significantly more genomes in my reference database. Finally, I benchmarked my analyses against other reference-based algorithms for validation and observed strong correlation in abundant species, suggesting that the results are reproducible. This work has uncovered a significant reservoir of previously uncharacterized biodiversity in the human skin microbiome and provides a user-friendly pipeline for genome database compilation.
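As a rough illustration of the quantity discussed in the abstract, the dark matter fraction of a sample can be thought of as the share of reads that map to no reference genome. The sketch below is not taken from the report: the function name and the read counts are hypothetical, chosen only to show how a larger reference database would lower that share for a single sample.

```python
def dark_matter_fraction(total_reads: int, mapped_reads: int) -> float:
    """Return the fraction of reads that did not map to any reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must lie between 0 and total_reads")
    return (total_reads - mapped_reads) / total_reads


# Hypothetical counts for one skin sample (assumed values, not data from the report).
TOTAL_READS = 1_000_000
MAPPED_SMALL_DB = 400_000   # reads mapped against a smaller reference database
MAPPED_LARGE_DB = 650_000   # reads mapped against an expanded reference database

small_db_fraction = dark_matter_fraction(TOTAL_READS, MAPPED_SMALL_DB)
large_db_fraction = dark_matter_fraction(TOTAL_READS, MAPPED_LARGE_DB)

# Absolute reduction in dark matter attributable to the expanded database.
reduction = small_db_fraction - large_db_fraction

print(f"dark matter, smaller database:  {small_db_fraction:.2%}")
print(f"dark matter, expanded database: {large_db_fraction:.2%}")
print(f"absolute reduction:             {reduction:.2%}")
```

In this made-up example the unmapped share exceeds the mapped share with the smaller database, which mirrors the situation the abstract describes before the database was expanded.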

Please contact the Joan Staats Library for information regarding this document.