Document Type
Article
Publication Date
12-27-2023
Original Citation
Harvey W, Ebert P, Ebler J, Audano P, Munson K, Hoekzema K, Porubsky D, Beck C, Marschall T, Garimella K, Eichler E. Whole-genome long-read sequencing downsampling and its effect on variant-calling precision and recall. Genome Res. 2023;33(12):2029-40
Keywords
JGM, Genomics, INDEL Mutation, Nanopores, Whole Genome Sequencing
JAX Source
Genome Res. 2023;33(12):2029-40
ISSN
1549-5469
PMID
38190646
DOI
https://doi.org/10.1101/gr.278070.123
Abstract
Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS. We compare the genetic variant-calling precision and recall of Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi platforms over a range of sequence coverages. For read-based applications, LRS sensitivity begins to plateau around 12-fold coverage with a majority of variants called with reasonable accuracy (F1 score above 0.5), and both platforms perform well for SV detection. Genome assembly increases variant-calling precision and recall of SVs and indels in HiFi data sets, with HiFi outperforming ONT in quality as measured by the F1 score of assembly-based variant call sets. While both technologies continue to evolve, our work offers guidance to design cost-effective experimental strategies that do not compromise on discovering novel biology.
Comments
This article is subject to HHMI’s Open Access to Publications policy. HHMI lab heads have previously granted a nonexclusive CC BY 4.0 license to the public and a sublicensable license to HHMI in their research articles. Pursuant to those licenses, the author-accepted manuscript of this article can be made freely available under a CC BY 4.0 license immediately upon publication.