Integrative model of genomic factors for determining binding site selection by estrogen receptor-α.

Roy Joseph
Yuriy L Orlov
Mikael Huss
Wenjie Sun
Say Li Kong
Leena Ukil
You Fu Pan
Guoliang Li
Michael Lim
Jane S Thomsen
Yijun Ruan
Neil D Clarke
Shyam Prabhakar
Edwin Cheung
Edison T Liu


A major question in transcription factor (TF) biology is why a TF binds to only a small fraction of motif-eligible binding sites in the genome. Using the estrogen receptor-α (ERα) as a model system, we sought to explicitly define parameters that determine TF-binding site selection. By examining 12 genetic and epigenetic parameters, we find that an energetically favorable estrogen response element (ERE) motif sequence, co-occupancy by the TF FOXA1, the presence of the H3K4me1 mark and an open chromatin configuration in the pre-ligand state provide specificity for ER binding. These factors can model estrogen-induced ER binding with high accuracy (ROC-AUC=0.95 and 0.88 using different genomic backgrounds). Moreover, when assessed in another estrogen-responsive cell line, this model was highly predictive for ERα binding (ROC-AUC=0.86). Variance in binding site selection between MCF-7 and T47D resides in sites with suboptimal ERE motifs, but is modulated by the chromatin configuration. These results suggest a definable interplay between sequence motifs and local chromatin in selecting TF binding.
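
For readers unfamiliar with the ROC-AUC figures quoted above, the following is a minimal sketch of how such a value can be computed from per-site model scores. It is not the authors' implementation: the function name `roc_auc`, the labels and the scores are all hypothetical, and the abstract does not state which model type produced the scores. The sketch relies only on the standard interpretation of ROC-AUC as the probability that a randomly chosen bound site receives a higher score than a randomly chosen unbound site.

```python
def roc_auc(labels, scores):
    """Return ROC-AUC via pairwise comparison of positives and negatives.

    labels: 1 for a bound site, 0 for a background (unbound) site.
    scores: model score for each site; higher means "more likely bound".
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative site")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data for illustration only (not taken from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of bound from background sites, which is why the choice of genomic background affects the reported number.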