Large-scale discovery of mouse transgenic integration sites reveals frequent structural variation and insertional mutagenesis.

Leslie Goodwin, The Jackson Laboratory
Erik Splinter
Tiffany Leidy-Davis, The Jackson Laboratory
Rachel Urban, The Jackson Laboratory
Hao He
Robert E Braun, The Jackson Laboratory
Elissa J Chesler, The Jackson Laboratory
Vivek Kumar, The Jackson Laboratory
Max van Min
Juliet Ndukum, The Jackson Laboratory
Vivek M. Philip, The Jackson Laboratory
Laura G Reinholdt, The Jackson Laboratory
Karen L. Svenson, The Jackson Laboratory
Jacqueline K White, The Jackson Laboratory
Michael Sasner, The Jackson Laboratory
Cathleen Lutz, The Jackson Laboratory
Stephen A. Murray, The Jackson Laboratory

The authors thank Brianna Caddle and Larry Bechtel for their technical assistance in isolating spleen cells and Kevin Peterson for his helpful and thoughtful comments on the manuscript.


Transgenesis has been a mainstay of mouse genetics for over 30 yr, providing numerous models of human disease and critical genetic tools in widespread use today. Because transgenes are generated through the random integration of DNA fragments into the host genome, transgenesis can lead to insertional mutagenesis if a coding gene or an essential element is disrupted, and there is evidence that larger-scale structural variation can accompany the integration. The insertion sites of only a tiny fraction of the thousands of transgenic lines in existence have been discovered and reported, due in part to limitations in the discovery tools. Targeted locus amplification (TLA) provides a robust and efficient means to identify both the insertion site and content of transgenes through deep sequencing of genomic loci linked to specific known transgene cassettes. Here, we report the first large-scale analysis of transgene insertion sites from 40 widely used transgenic mouse lines. We show that the transgenes disrupt the coding sequence of endogenous genes in half of the lines, frequently involving large deletions and/or structural variations at the insertion site. Furthermore, we identify a number of unexpected sequences in some of the transgenes, including undocumented cassettes and contaminating DNA fragments. We demonstrate that these transgene insertions can have phenotypic consequences that could confound certain experiments, emphasizing the need for careful attention to control strategies. Together, these data show that transgenic alleles display a high rate of potentially confounding genetic events and highlight the need for careful characterization of each line to ensure interpretable and reproducible experiments.