Transfer learning of a multitask deep neural network to predict enhancer activity of regulatory variants in the HCT116 cell-line


Frank Liu

Publication Date

Summer 2022



JAX Location

In: Student Reports, Summer 2022, The Jackson Laboratory


Massively parallel reporter assays (MPRAs) are high-throughput experimental systems that provide functional expression data for cis-regulatory elements (CREs) and their variants. CREs are drivers of complex human traits and disease, and the ability to understand and predict their activity would bring patients closer to the goal of genomic medicine. In this work we expand upon MPRA-Cerberus, a sequence-to-activity multi-task convolutional deep neural network (CNN) that predicts the MPRA activity of CREs with high accuracy. We demonstrate that MPRA-Cerberus can be adapted to a new prediction task: expression quantitative trait loci (eQTLs) fine-mapped in the Genotype-Tissue Expression (GTEx) library and tested with MPRA in the human colorectal cancer cell-line HCT116. The adapted model predicts this task with high accuracy and captures transcription factor (TF) motif grammar. Reaching a test-set prediction correlation of 0.86 in HCT116, we use the model to generate de novo cell-line-specific enhancers and sequence contribution scores that reveal motif insights learned during training, including AP-1/ZEB1 interactions. We also evaluate the feasibility of future branch additions, presenting a comprehensive study on the use of MPRA-Cerberus in additional cell-lines to investigate regulatory polymorphism.
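The idea of adding a new task branch to a shared sequence model can be illustrated with a minimal NumPy sketch. It is not the MPRA-Cerberus architecture or the training procedure used in the report. The assumptions are: random convolutional filters stand in for a pretrained, frozen trunk; the new head is a linear layer fit by ridge regression rather than by gradient-based fine-tuning; and the sequences and activities are synthetic. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array in A, C, G, T order."""
    idx = np.array(["ACGT".index(base) for base in seq])
    return np.eye(4)[idx]

def trunk_features(x, filters):
    """Frozen shared trunk: 1D convolution, ReLU, then global max pooling.

    x: (length, 4) one-hot sequence; filters: (n_filters, width, 4).
    Returns one pooled activation per filter.
    """
    width = filters.shape[1]
    windows = np.stack([x[i:i + width] for i in range(len(x) - width + 1)])
    acts = np.einsum("pwc,fwc->pf", windows, filters)
    return np.maximum(acts, 0.0).max(axis=0)

def add_bias(features):
    """Append a constant column so the head can learn an intercept."""
    return np.hstack([features, np.ones((len(features), 1))])

def fit_new_head(features, y, l2=1.0):
    """Fit a linear head for the new task by ridge regression.

    The trunk filters are never updated, which mimics adding a branch
    on top of a frozen, previously trained model.
    """
    X = add_bias(features)
    return np.linalg.solve(X.T @ X + l2 * np.eye(X.shape[1]), X.T @ y)

def predict(features, w):
    return add_bias(features) @ w

def pearson_r(a, b):
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-ins (assumptions, not data from the report).
n_seqs, seq_len, n_filters, width = 200, 50, 8, 6
seqs = ["".join(rng.choice(list("ACGT"), size=seq_len)) for _ in range(n_seqs)]
filters = rng.normal(size=(n_filters, width, 4))   # "pretrained" trunk, kept frozen
feats = np.stack([trunk_features(one_hot(s), filters) for s in seqs])

# Synthetic activity for the new task: linear in trunk features plus noise.
true_w = rng.normal(size=n_filters)
activity = feats @ true_w + 0.1 * rng.normal(size=n_seqs)

# Hold out the last 50 sequences as a test set.
train, test = slice(0, 150), slice(150, None)
w = fit_new_head(feats[train], activity[train])
r_test = pearson_r(predict(feats[test], w), activity[test])
print(f"held-out Pearson r for the new head: {r_test:.2f}")
```

Because only the head's weights are estimated, the new task reuses whatever sequence features the trunk already encodes. In a real setting the trunk would come from the trained multi-task model and the head would typically be trained, and possibly the trunk fine-tuned, by gradient descent.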

Please contact the Joan Staats Library for information regarding this document.