Faculty Research 2019

Cleaning Genotype Data from Diversity Outbred Mice.

Karl W Broman
Daniel M. Gatti, The Jackson Laboratory
Karen L. Svenson, The Jackson Laboratory
Śaunak Sen
Gary Churchill, The Jackson Laboratory

Document Type

Article

Publication Date

5-7-2019

Keywords

JMG

JAX Source

G3 (Bethesda) 2019 May 7; 9(5):1571-1579

Volume

9

Issue

5

First Page

1571

Last Page

1579

ISSN

2160-1836

PMID

30877082

DOI

https://doi.org/10.1534/g3.119.400165

Grant

GM070683

Abstract

Data cleaning is an important first step in most statistical analyses, including efforts to map the genetic loci that contribute to variation in quantitative traits. Here we illustrate approaches to quality control and cleaning of array-based genotyping data for multiparent populations (experimental crosses derived from more than two founder strains), using MegaMUGA array data from a set of 291 Diversity Outbred (DO) mice. Our approach employs data visualizations that can reveal problems at the level of individual mice or with individual SNP markers. We find that the proportion of missing genotypes for each mouse is an effective indicator of sample quality. We use microarray probe intensities for SNPs on the X and Y chromosomes to confirm the sex of each mouse, and we use the proportion of matching SNP genotypes between pairs of mice to detect sample duplicates. We use a hidden Markov model (HMM) reconstruction of the founder haplotype mosaic across each mouse genome to estimate the number of crossovers and to identify potential genotyping errors. To evaluate marker quality, we find that missing data and genotyping error rates are the most effective diagnostics. We also examine the SNP genotype frequencies with markers grouped according to their minor allele frequency in the founder strains. For markers with high apparent error rates, a scatterplot of the allele-specific probe intensities can reveal the underlying cause of incorrect genotype calls. The decision to include or exclude low-quality samples can have a significant impact on the mapping results for a given study. We find that the impact of low-quality markers on a given study is often minimal, but reporting problematic markers can improve the utility of the genotyping array across many studies.
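Two of the sample-level diagnostics named in the abstract, the proportion of missing genotypes per mouse and the proportion of matching SNP genotypes between pairs of mice, can be illustrated with a minimal sketch. It is written in Python with NumPy and is not the authors' code. The genotype coding (0/1/2 for the three SNP genotypes, -1 for a missing call), the function names, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def missing_proportion(geno):
    """Proportion of missing calls per sample (rows = samples, columns = markers)."""
    return np.mean(geno < 0, axis=1)

def matching_proportion(geno):
    """Pairwise proportion of matching genotypes, using only markers called in both samples."""
    n = geno.shape[0]
    called = geno >= 0
    match = np.full((n, n), np.nan)  # NaN where a pair shares no called markers
    for i in range(n):
        for j in range(i, n):
            both = called[i] & called[j]
            if both.any():
                match[i, j] = match[j, i] = np.mean(geno[i, both] == geno[j, both])
    return match

# Toy data under the assumed coding: 0/1/2 = SNP genotypes, -1 = missing call
geno = np.array([
    [0, 1, 2, 0, 1, 2],     # sample 0: no missing calls
    [0, 1, 2, 0, 1, -1],    # sample 1: matches sample 0 wherever both are called
    [2, 1, 0, -1, -1, -1],  # sample 2: half of the calls are missing
])

miss = missing_proportion(geno)
match = matching_proportion(geno)
print(miss)
print(match)
```

In this toy example, sample 2 stands out for its missing data and samples 0 and 1 stand out as a possible duplicate pair. The cutoffs for flagging a sample are left to the analyst and are not taken from the article.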

Comments

This open access article is licensed under a Creative Commons Attribution 4.0 International License

Recommended Citation

Broman K, Gatti DM, Svenson KL, Sen Ś, Churchill G. Cleaning Genotype Data from Diversity Outbred Mice. G3 (Bethesda) 2019 May 7; 9(5):1571-1579

Included in

Life Sciences Commons, Medicine and Health Sciences Commons
