Benchmarking algorithms to infer chromatin interactions from single-cell ATAC-seq data

Document Type


Publication Date

Summer 2021

JAX Location

In: Student Reports, Summer 2021, The Jackson Laboratory


Identifying the target genes of cis-regulatory elements (cis-REs), which control cells’ gene expression and play a role in disease progression, is a current challenge in epigenomics. In this project, I apply and benchmark a recently developed computational method, Cicero, which infers regulatory connections between cis-REs and their target genes (i.e., chromatin interactions) at single-cell resolution from single-cell ATAC-seq (scATAC-seq) data. I describe a pipeline for pre-processing scATAC-seq data and applying Cicero, formulate appropriate performance metrics, and describe two efficient approaches to match interactions from Cicero with those in a reference interaction set or vice versa. I assess the efficacy of Cicero using human peripheral blood mononuclear cell (PBMC) scATAC-seq data and compare its inferences with chromatin interaction data obtained via promoter capture Hi-C (PCHi-C). Depending on the chosen threshold parameters, Cicero achieves precision of up to 0.2 and recall of up to 0.4 on the PBMC data, after filtering roughly 90% of either data set (either by Cicero peaks or by PCHi-C baits) to interactions that the other technique can discover. Overall, there is a large degree of disagreement between the two techniques. The effects of the Cicero and PCHi-C threshold parameter choices are discussed. Lastly, running Cicero on peaks corresponding to CD4 T cells is observed to increase precision and decrease recall for this data.
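As a rough illustration of how precision and recall against a reference interaction set might be computed, the following Python sketch treats each interaction as a pair of genomic intervals and counts a predicted interaction as matched when both of its anchors overlap the anchors of a reference interaction, in either order. This matching rule, the names `Interval`, `overlaps`, `matches`, and `precision_recall`, and the toy coordinates are all assumptions made for illustration; they are not taken from the report. The brute-force comparison shown here is not one of the efficient matching approaches the report describes, and the filtering step is omitted.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


# An interaction is an unordered pair of anchors (e.g. a peak and a promoter).
Interaction = Tuple[Interval, Interval]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def matches(p: Interaction, r: Interaction) -> bool:
    """Assumed rule: both anchors must overlap, in either orientation."""
    same = overlaps(p[0], r[0]) and overlaps(p[1], r[1])
    swapped = overlaps(p[0], r[1]) and overlaps(p[1], r[0])
    return same or swapped


def precision_recall(predicted: List[Interaction],
                     reference: List[Interaction]) -> Tuple[float, float]:
    """Brute-force O(n*m) comparison; fine for a toy example only."""
    matched_pred = sum(any(matches(p, r) for r in reference) for p in predicted)
    matched_ref = sum(any(matches(p, r) for p in predicted) for r in reference)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_ref / len(reference) if reference else 0.0
    return precision, recall


# Toy data (hypothetical coordinates, not from the PBMC experiment).
predicted = [
    (Interval("chr1", 100, 200), Interval("chr1", 1000, 1100)),
    (Interval("chr1", 300, 400), Interval("chr1", 5000, 5100)),
]
reference = [
    (Interval("chr1", 950, 1050), Interval("chr1", 150, 250)),
    (Interval("chr2", 0, 100), Interval("chr2", 500, 600)),
    (Interval("chr1", 7000, 7100), Interval("chr1", 9000, 9100)),
]

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case one of two predicted interactions is matched and one of three reference interactions is recovered. On real data, an interval index or a sorted sweep would likely replace the nested loops.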

Please contact the Joan Staats Library for information regarding this document.