Variant Caller Performance on Diverse Strains: Exploring the significance of the reference genome on caller performance
Document Type
Article
Publication Date
Summer 2023
Keywords
JMG
JAX Location
In: Student Reports, Summer 2023, The Jackson Laboratory
Sponsor
Beth Dumont, Ph.D., Laura Blanco-Berdugo, M.S., and Alexis Garretson, M.S.
Abstract
Variant calling tools identify, with varying levels of success, sequence variation relative to a specified reference genome. Reference genomes have generally been assembled from a single strain or population of a species, but newer assemblies are increasingly specific to the strains and populations for which they are relevant. In this study, we explore how the choice of reference genome affects variant caller performance. We compare the performance of four variant callers using two reference genomes: a strain-specific genome and the standard mm39 reference genome. We find that variant callers perform much better when strain-specific references are used, demonstrating the importance of population-specific reference genome assemblies for the analysis of next-generation sequencing data. Further, among the callers tested, Freebayes was the most conservative with its calls, while Mpileup recovered the most variants. The best caller for a project will therefore depend on the researchers’ priorities, for example precision versus recall.
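The trade-off between a conservative caller and one that recovers more variants can be illustrated with a small sketch. It is not the method used in the report: it assumes each call set and a truth set are available as sets of (chromosome, position, reference allele, alternate allele) tuples, and the names `score_calls`, `Metrics`, and the toy variants are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref allele, alt allele)
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    precision: float  # fraction of calls that are in the truth set
    recall: float     # fraction of truth-set variants that were called


def score_calls(calls: Set[Variant], truth: Set[Variant]) -> Metrics:
    """Score one call set against a truth set by exact variant match."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return Metrics(precision, recall)


# Toy truth set (invented data, not results from the report)
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr2", 900, "T", "C"),
}

# A conservative caller: few calls, all of them correct
conservative_calls = {("chr1", 100, "A", "G"), ("chr2", 40, "G", "A")}

# A permissive caller: finds every true variant plus two false positives
permissive_calls = truth | {("chr1", 500, "G", "C"), ("chr3", 7, "A", "T")}

conservative = score_calls(conservative_calls, truth)
permissive = score_calls(permissive_calls, truth)

print(f"conservative: precision={conservative.precision:.2f} recall={conservative.recall:.2f}")
print(f"permissive:   precision={permissive.precision:.2f} recall={permissive.recall:.2f}")
```

In this toy example the conservative call set has higher precision and lower recall, while the permissive one shows the reverse. A real comparison would also need to normalize variant representation before matching, which this sketch omits.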
Recommended Citation
Roberts, Aleisha, "Variant Caller Performance on Diverse Strains: Exploring the significance of the reference genome on caller performance" (2023). Summer and Academic Year Student Reports. 2763.
https://mouseion.jax.org/strp/2763