Faculty Research 2018

ReprDB and panDB: minimalist databases with maximal microbial representation.

Wei Zhou, The Jackson Laboratory
Nicole R Gay, The Jackson Laboratory
Julia Oh, The Jackson Laboratory

Document Type

Article

Publication Date

1-18-2018

JAX Source

Microbiome 2018 Jan 18; 6(1):15

ISSN

2049-2618

PMID

29347966

DOI

https://doi.org/10.1186/s40168-018-0399-2

Grant

AI119231

Abstract

BACKGROUND: Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes.

RESULTS: We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets.

CONCLUSIONS: reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.

Recommended Citation

Zhou W, Gay N, Oh J. ReprDB and panDB: minimalist databases with maximal microbial representation. Microbiome 2018 Jan 18; 6(1):15
