A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer.

Document Type

Article

Publication Date

1-21-2022

Publication Title

Sci Adv

Keywords

JGM, Alternative Splicing, Breast Neoplasms, Female, High-Throughput Nucleotide Sequencing, Humans, Protein Isoforms, Sequence Analysis, RNA, Transcriptome

JAX Source

Sci Adv 2022 Jan 21; 8(3):eabg6711

Volume

8

Issue

3

First Page

6711

Last Page

6711

ISSN

2375-2548

PMID

35044822

DOI

https://doi.org/10.1126/sciadv.abg6711

Grant

CA034196, GM133600, CA248137

Abstract

Tumors display widespread transcriptome alterations, but the full repertoire of isoform-level alternative splicing in cancer is unknown. We developed a long-read (LR) RNA sequencing and analytical platform that identifies and annotates full-length isoforms and infers tumor-specific splicing events. Application of this platform to breast cancer samples identifies thousands of previously unannotated isoforms; ~30% affect protein coding exons and are predicted to alter protein localization and function. We performed extensive cross-validation with -omics datasets to support transcription and translation of novel isoforms. We identified 3059 breast tumor–specific splicing events, including 35 that are significantly associated with patient survival. Of these, 21 are absent from GENCODE and 10 are enriched in specific breast cancer subtypes. Together, our results demonstrate the complexity, cancer subtype specificity, and clinical relevance of previously unidentified isoforms and splicing events in breast cancer that are only annotatable by LR-seq and provide a rich resource of immuno-oncology therapeutic targets.

Comments

We thank C. Lee and E. Liu for critically reading this manuscript and for support in the development of this analytical platform; T. Helenius for scientific editing; R. Maurya, J. Idol, C. Y. Ngan, and the Genome Technologies Core at JAX-GM for help with LR-seq and RNAseq; members of the genomic core facility at the Icahn School of Medicine at Mount Sinai for help with LR-seq; PDX core at JAX-MG for providing samples; Research IT at JAX-GM for maintaining the website running the R/Shiny application; P. Singh, F. O’Neill, V. Ochoa, and members of the Anczuków laboratory for discussions. The results published here are in whole or part based upon data generated by TCGA managed by the NCI and NHGRI. Information about TCGA can be found at http://cancergenome.nih.gov. The GTEx Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health; more details can be found at commonfund.nih.gov/GTEx. The datasets used for the analyses described in this manuscript were obtained from dbGaP at www.ncbi.nlm.nih.gov/gap.
