Document Type


Publication Date




JAX Source

PLoS Comput Biol 2019 Aug 19; 15(8):e1007293







The Long interspersed nuclear element 1 (LINE-1) is a primary source of genetic variation in humans and other mammals. Despite its importance, LINE-1 activity remains difficult to study because of its highly repetitive nature. Here, we developed and validated a method called TeXP to gauge LINE-1 activity accurately. TeXP builds mappability signatures from LINE-1 subfamilies to deconvolve the effect of pervasive transcription from autonomous LINE-1 activity. In particular, it apportions the multiple reads aligned to the many LINE-1 instances in the genome into these two categories. Using our method, we evaluated well-established cell lines, cell-line compartments and healthy tissues and found that the vast majority (91.7%) of transcriptome reads overlapping LINE-1 derive from pervasive transcription. We validated TeXP by independently estimating the levels of LINE-1 autonomous transcription using ddPCR, finding high concordance. Next, we applied our method to comprehensively measure LINE-1 activity across healthy somatic cells, while backing out the effect of pervasive transcription. Unexpectedly, we found that LINE-1 activity is present in many normal somatic cells. This finding contrasts with earlier studies showing that LINE-1 has limited activity in healthy somatic tissues, except for neuroprogenitor cells. Interestingly, we found that the amount of LINE-1 activity was associated with the with the amount of cell turnover, with tissues with low cell turnover rates (e.g. the adult central nervous system) showing lower LINE-1 activity. Altogether, our results show how accounting for pervasive transcription is critical to accurately quantify the activity of highly repetitive regions of the human genome.


Open access under the terms of the Creative Commons Attribution License.