Document Type

Article

Publication Date

7-12-2021

Publication Title

Nat Commun

Keywords

Chromosome Mapping, Gene Expression Regulation, Genetic Loci, Genetic Variation, Genetics, Population, Genome, Human, Humans, Minisatellite Repeats, Nucleotide Motifs, Quantitative Trait Loci

JAX Location

JGM

JAX Source

Nat Commun 2021 Jul 12;12(1):4250

Volume

12

Issue

1

First Page

4250

Last Page

4250

ISSN

2041-1723

PMID

34253730

DOI

https://doi.org/10.1038/s41467-021-24378-0

Abstract

Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein-coding sequences and are associated with clinical disorders. It has been difficult to incorporate VNTR analysis into disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build an RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this approach to discover VNTRs whose length is stratified by continental population, as well as expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease.

Comments

Dr. Lee and Dr. Zhu are members of the consortium.

This article is licensed under a Creative Commons Attribution 4.0 International License.
