Combinatorial Ranking of Gene Sets to Predict Disease Relapse: The Retinoic Acid Pathway in Early Prostate Cancer.

Document Type


Publication Date


JAX Source

Front Oncol 2017 Mar 15; 7:30




BACKGROUND: Quantitative high-throughput data deposited in consortia such as International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) present opportunities and challenges for computational analyses.

METHODS: We present a computational strategy to systematically rank and investigate a large number (2(10)-2(20)) of clinically testable gene sets, using combinatorial gene subset generation and disease-free survival (DFS) analyses. This approach integrates protein-protein interaction networks, gene expression, DNA methylation, and copy number data, in association with DFS profiles from patient clinical records.

RESULTS: As a case study, we applied this pipeline to systematically analyze the role of ALDH1A2 in prostate cancer (PCa). We have previously found this gene to have multiple roles in disease and homeostasis, and here we investigate the role of the associated ALDH1A2 gene/protein networks in PCa, using our methodology in combination with PCa patient clinical profiles from ICGC and TCGA databases. Relationships between gene signatures and relapse were analyzed using Kaplan-Meier (KM) log-rank analysis and multivariable Cox regression. Relative expression versus pooled mean from diploid population was used for z-statistics calculation. Gene/protein interaction network analyses generated 11 core genes associated with ALDH1A2; combinatorial ranking of the power set of these core genes identified two gene sets (out of 2(11) - 1 = 2,047 combinations) with significant correlation with disease relapse (KM log rank p < 0.05). For the more significant of these two sets, referred to as the optimal gene set (OGS), patients have median survival 62.7 months with OGS alterations compared to >150 months without OGS alterations (p = 0.0248, hazard ratio = 2.213, 95% confidence interval = 1.1-4.098). Two genes comprising OGS (CYP26A1 and RDH10) are strongly associated with ALDH1A2 in the retinoic acid (RA) pathways, suggesting a major role of RA signaling in early PCa progression. Our pipeline complements human expertise in the search for prognostic biomarkers in large-scale datasets.

Front Oncol 2017 Mar 15; 7:30.