Document Type

Article

Publication Date

3-1-2026

Keywords

JGM, Humans, Protein Isoforms, Sequence Analysis, RNA, Cell Differentiation, Software, Human Embryonic Stem Cells

JAX Source

Nat Biotechnol. 2026;44(3):477-89.

ISSN

1546-1696

PMID

40461779

DOI

https://doi.org/10.1038/s41587-025-02633-9

Grant

R01HG011469 to P.T., Q.G., C.-L.W., V.S., R.S., A.C. and C.S.; R01HG006137 and R56AG081351 to N.R.Z.

Abstract

RNA sequencing has been widely applied for gene isoform quantification, but limitations exist in quantifying isoforms of complex genes accurately, especially for short reads. Here we identify genes that are difficult to quantify accurately with short reads and illustrate the information benefit of using long reads to quantify these regions. We present miniQuant, which ranks genes with quantification errors caused by the ambiguity of read alignments and integrates the complementary strengths of long reads and short reads with optimal combination in a gene- and data-specific manner to achieve more accurate quantification. These results are supported by rigorous mathematical proofs, validated with a wide range of simulation data, experimental validations and more than 17,000 public datasets from the GTEx, TCGA and ENCODE consortia. We demonstrate that miniQuant can uncover isoform switches during the differentiation of human embryonic stem cells to pharyngeal endoderm and primordial germ cell-like cells.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
