Enhancer occlusion transcripts regulate the activity of human enhancer domains via transcriptional interference: a computational perspective

Abstract Analysis of ENCODE long RNA-Seq and ChIP-seq (Chromatin Immunoprecipitation Sequencing) datasets for HepG2 and HeLa cell lines uncovered 1647 and 1958 transcripts that interfere with transcription factor binding to human enhancer domains. TFBSs (Transcription Factor Binding Sites) intersected by these ‘Enhancer Occlusion Transcripts’ (EOTrs) displayed significantly lower relative transcription factor (TF) binding affinities compared to TFBSs for the same TF devoid of EOTrs. Expression of most EOTrs was regulated in a cell line specific manner; analysis for the same TFBSs across cell lines, i.e. in the absence or presence of EOTrs, yielded consistently higher relative TF/DNA-binding affinities for TFBSs devoid of EOTrs. Lower activities of EOTr-associated enhancer domains coincided with reduced occupancy levels for histone tail modifications H3K27ac and H3K9ac. Similarly, the analysis of EOTrs with allele-specific expression identified lower activities for alleles associated with EOTrs. ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag Sequencing) and 5C (Carbon Copy Chromosome Conformation Capture) uncovered that enhancer domains associated with EOTrs preferentially interacted with poised gene promoters. Analysis of EOTr regions with GRO-seq (Global run-on) data established the correlation of RNA polymerase pausing and occlusion of TF-binding. Our results implied that EOTr expression regulates human enhancer domains via transcriptional interference.


INTRODUCTION
Transcriptional interference (TI) encompasses cisregulatory processes involving two adjacent promoters, where the actual regulation is exerted via the act of transcription itself (1). Active RNA polymerases interfere with e.g. pre-initiation complex formation or prevent transcription factor (TF) binding (2)(3)(4). TI is correlated with the relative promoter strength: stronger promoters, which initiate transcription at higher (relative) frequency, exert greater impact on regulated downstream promoters (3,4). However, this type of transcriptional control might not be exclusive to TF/DNA interactions within eukaryotic promoter regions; rather all cellular activities that depend on TF/DNA binding could analogously be regulated via TI or related mechanism (5).
Recently, we identified TFbiTrs (Transcription Factor binding interfering Transcripts) within proximal promoter regions (PPRs: 1 kb upstream regions from the transcription start site [TSS]) of human protein-coding RefSeq (hg19) genes. These transcripts modulate TF/DNA binding via transcriptional interference within intersected PPRs (4). Enhancers are regulatory hubs of clustered TFBSs (9,(16)(17). Here, we analyzed the potential of TI to regulate the activity of human enhancers. GENCODE cDNA datasets were screened for transcripts that are associated with human enhancer domains and intersected TFBSs of lower relative TF-binding affinity in HepG2 and HeLa cell lines (18,19). These RNAs represent candidates of potentially occluding RNA transcription. Resulting cDNA datasets were further investigated to establish chromatin environments for the occluding transcripts, associated enhancers and regulated genes.
Local enrichments of H3K27ac or H3K9ac histone tail modifications indicate augmented activities of eukaryotic enhancer domains (20)(21)(22)(23)(24). Enhancer domains of our datasets were characterized by lower activities as revealed by reduced occupancy levels for either histone tail modification compared to genome-wide controls. In summary, reduced relative DNA binding affinities for transcriptintersected TFBSs and decreased enhancer activities suggested the detection of TI acting on human enhancer domains. Analysis of GRO-seq (Global run-on) datasets established a correlation of RNA polymerase pausing and decreased TF-binding affinities at intersected TFBSs (25). This suggested that the local competition between RNAPII and TF defines the underlying mechanism for TI acting on human enhancer domains. Accordingly, we designated these RNAs as 'Enhancer Occlusion Transcripts' or, in short, 'EOTrs'.
Interactome analysis for regions intersected by EOTrs via ChIA-PET (Chromatin Interaction Analysis with Paired-End-Tag sequencing) or 5C (Carbon Copy Chromosome Conformation Capture) revealed that enhancer domains associated with EOTrs preferentially interacted with poised gene promoters (26,27). These findings emphasized the potential of EOTrs to regulate gene expression within the human genome. All relevant features of EOTrs were compared and contrasted with the more established class of eRNAs within the same cell lines (12,15,28). To the best of our knowledge, we demonstrated, for the first time the regulation of enhancer domains via TI-related mechanism on a genome-wide scale and provided insights into regulatory networks established by EOTrs (5).

CEAS for enrichment of transcripts devoid of known function
The identification of transcripts with unknown function (GENCODE v2 GTF-files) was performed with CEAS (ciselement annotation system, v 1.0.2) and RefSeq/HAVANA exon annotations, to exclude cDNA contigs, which overlap with known human mRNAs (Supplementary file Methods) (32).

Identification of EOTrs--intersection of candidate transcripts with domains of H3K36me3/H3K4me3 and H3K4me1 enrichment
RNA datasets for the identification of EOTrs were the result of two initial filtering steps to provide enrichment for npcR-NAs (non-protein coding RNAs) intersecting with human enhancer domains: (1) Intronic and intergenic regions corresponding to polyadenylated and non-polyadenylated long RNAs (GSE26284) were overlaid with domains of H3K36me3/H3K4me3 (GSE29611) enrichment, using BEDTools' intersectBed command (33,34). Results were statistically evaluated with the Genome Structure Correction (GSC) to test the significance of the observed intersection (Supplementary File 2 Tables S1a-d) (35). Jaccard indices were calculated for the 200 bp flanking regions surrounding (i.e. up-and downstream) the annotated 3 termini of GENCODE cDNA contigs (Supplementary file Methods, Supplementary File 1 Figure S1). The detection of H3K4me3 peaks aided in the identification of promoter regions (1 kb upstream to cDNA-contigs) for the selected transcripts.  Figure S1) (36,37).

Analysis of EOTr PPRs (proximal promoter regions)
Proximal promoters for EOTr candidates were identified within 1 kb upstream regions relative to the representative CAGE cluster peak via the analysis of H3K4me3 enrichments (22,36,45) (Supplementary File 1 Figure S1). Intersections were computed with the BEDTools' (v 2.14.3) intersectBed command (GSE29611) (33). Resulting H3K4me3 enrichments were compared to those obtained for the corresponding 1 kb downstream regions for the same set of TSSs via the GSC to test the significance of the identified H3K4me3/PPR associations (Supplementary File 1 Figure S1, Supplementary File 2 Table S5a-d).

Analysis of differential expression for RNA-Seq and ChIPseq datasets across cell lines
We utilized edgeR (v 3.24.0) and DiffBind (v 2.10.0) for the analysis of differential TF-binding, histone modifications and RNA expression (29)(30)(31). edgeR implements generalized linear models, which were utilized for the quantification of effects associated with candidate transcription.  Table S1) (15,(51)(52). These combined epigenetic features were utilized to scan human HepG2 and HeLa cell lines for the de novo identification of eRNAs. The directionality of eRNA transcription was deduced from the orientation of intersecting CAGE tags and RNA-Seq cDNA data sets that were utilized for eRNA identification (Supplementary File 3 section 3.1 and Table S1).

Analysis of transcription factor binding and relative binding affinities
URLs to all datasets are provided in Supplementary File 2 Table S1.2.

Measurement of relative transcription factor/DNA binding affinities
Sequence to affinity prediction tools (STAP) were utilized to quantify the effects of EOTr expression on TF-binding (18). STAP provides as output: (i) The binding parameter (in-FactorIntMat), which represents a relative measure of how strongly a TF binds to its corresponding binding sites: values greater than 1 signify 'favorable', i.e. stronger, interactions, and less than 1 'unfavorable', i.e. weaker, binding. (ii) The maxBindingWts parameter, which represents the PWM (position weight matrix) scores of analyzed TFBSs and (iii) the Pearson's correlation coefficient for predicted and observed binding scores (expRatios) (18). Analysis of relative affinities of TF/DNA-binding within EOTr and non-EOTr regions was conducted as previously described with the following minor modifications (4).  (19).

Thresholds of EOTr expression for occlusion of TF-binding
STAP command line option '-dt' aided in the identification of EOTrs associated with sites of favorable or unfavorable TF-binding affinities (4,18). Sorting of expression levels for candidate transcripts associated with favorable and unfavorable binding enabled the identification of expression thresholds, which were the minima required to effectively occlude TF-binding.

GRO-seq (global run-on sequencing) data
GRO-seq monitors the distribution of active RNA polymerase via the detection of cDNA peaks along the entire genome (25,(55)(56) Correlation of the proportion of significant AE sites was carried out using a binomial P < 0.05 test for EOTr+ and EOTr-alleles. Active allelic EOTrs were defined by H3K4me1+H3K27ac over-representation. Analysis of differential binding for H3K27ac within EOTr+/EOTr-allelic variants was performed via DiffBind (v 2.10.0) (Supplementary file Methods).

ChIA-PET and 5C data analysis
URLs to all datasets are provided in Supplementary File 2 Table S1.5. Genome-wide ChIA-PET and 5C data for HeLa and HepG2 cell lines are accessible via GSM970288 and GSM970211. Interactome targets (pairs of interacting enhancers and promoters) were identified using TargetFinder (62).

ChIA-PET analysis for HeLa cells.
For HeLa, ChIA-PET analysis comprises: (i) linker filtering, (ii) short read mapping, (iii) PET (paired-end tag) classification, (iv) binding site identification and (v) interaction cluster analysis with the ChIA-PET tool v2 (63). The interaction library was derived from RNAPII. ChIA-PET interaction clusters were intersected with EOTr BED regions in HeLa. For subsequent classification, all interactions within EOTr-overlapped regions were scanned for specific enrichment of promoter or insulator marks (ChIP-seq peaks for H3K4me3 and CTCF, respectively). The EOTr interactome was visualized with Circos v 0.69 (64). ChIA-PET data was normalized for differential peak enrichment and genomic proximity using Mango v 1.2.0 and default parameters (Supplementary file Methods).

Statistical data analysis
Results were analyzed via the Genome Structure Correction tool (Supplementary file Methods) (35). Functional relationships between any two sets of genomic features are statistically defined on the basis of their proximity/overlap/nearness to each other. Deviations from expected distributions potentially are indicative of biologically relevant associations or non-associations. Datasets in HeLa and HepG2 cell lines were split into intronic and intergenic domains as distribution of analyzed features (size of intronic and intergenic enhancers along with the overlap of other features, e.g. H3K27ac and P300) and their frequency within the genome vary and are dependent on the length of inspected genomic regions. Jaccard indices (the analysis of which is part of the GSC tool package) were estimated as the number of intersections between any two genomic features, divided by their union. The larger the coefficient, the more similar two datasets are in terms of local overlap. Genomic features within analyzed regions were compared to Jaccard indices for the same feature within control regions (domains and associated features are illustrated in Supplementary File 1 Figure S1). Probability values calculated via permutation tests indicated for each region whether the actual overlap was smaller (TRUE) or larger (FALSE) than what would be expected by chance. Results for the relative distance Kolmogorov-Smirnov test, absolute distance test, Jaccard indices and Kullback-Leibler as well 2 -tests are detailed in Supplementary File 2 and the main text (35,67).

Data visualization
All boxplots were drawn with DiffBind (v 2.10.0). Notches indicate 95% confidence intervals (CI) for the median, calculated as ±1.58 × IQR/ √ n. IQR is the interquartile range or distance between the first and third quartiles, where n is the number of cells (68). The first and third quartiles relate to the lower and upper hinges of the boxplots (the 25th and 75th percentiles). The upper and lower whiskers extend from the hinge to ±1.5 * IQR of the hinge. Different contrast groups were devised for analysis of histone modification combinations (metric in Supplementary File Methods) within EOTr+ and control groups (EOTr-and eRNAs). Three parameters were considered for data representation via Boxplots for contrast group calculations: (1) Difference of pair-wise group (2) Differences of group mean and (3) Difference of group difference.

EOTr candidate datasets and RNA-Seq
Deep sequencing datasets for polyadenylated and nonpolyadenylated RNAs in HepG2 and HeLa cell lines were assembled with GENCODE v2 GTF files and quantified with edgeR (30) ( Figure 1A-C). For enrichment of transcripts devoid of known functions, only those RNAs, which did not intersect with available gene annotations, entered the analysis (Table 1). We incorporated CPAT to exclude transcripts, which displayed significant protein coding potential ( Figure 1A) (0.364 (Coding Potential ≥ 0.364 indicates coding sequence, CP < 0.364 indicates non-protein coding sequences) (10)(11)39). These results were contrasted with training datasets consisting out of RefSeq (hg19) CDS (coding sequence) exons (Supplementary File 1 Figure S2). Finally, 12 093 and 13 191 transcripts in HepG2 and HeLa cell lines defined the input datasets for this survey ( Table  1). The resulting cDNA contigs represented polyadenylated

H3K36me3/H3K4me3 enrichments encompassed regions of candidate transcripts
These preselected GENCODE datasets were further analyzed for enrichments of H3K4me3/H3K36me3 histone tail modifications. Variations of this procedure were previously utilized for detecting long npcRNAs (34,69). H3K36me3 overrepresentations mark domains of active transcription; in addition, this histone tail modification preferentially is associated with exons of actively transcribed genes (70). This analytical procedure, therefore helped to exclude in-tronic degradation products from input datasets. We analyzed Jaccard indices for the intersection of H3K36me3 histone tail modifications with 200bp flanking regions surrounding the candidate TTS (    H3K36me3 and H3K4me3 enrichment (Table 1). Notably, H3K36me3 and H3K4me3 enrichments are not detectable in eRNA-intersected domains ( Figure 1D-G), which generally are associated with enhancer domains of higher activity. Our analytical procedure, therefore, provided additional means to exclude RNAs that arguably are not related to transcriptional interference. These data implied that EOTr candidates are bona fide long npcRNAs, which possess their own promoter for independent transcription (71,72).

EOTr candidate datasets overlapped with human enhancer domains
We computed the local intersection of candidate transcripts and enhancer domains as revealed by combined H3K4me1/p300 enrichments (22,(36)(37)73 Table S1]; both types of enhancers were characterized by the absence of H3K36me3).

CEAS and TFBSs intersected by EOTr candidates
Analysis of TI requires count data to quantify RNA expression and to monitor its impact on TF-binding (4,46). We used the ENCODE repository and CEAS to restrict datasets to candidate EOTrs that intersected TFBSs with available ChIP-seq data for HepG2 and HeLa cell lines (32,74). Finally, about 45% (intronic 20%; intergenic 25%) of our preselected candidate datasets in HepG2 and 63% HeLa (intronic 38%; intergenic 25%) cell lines entered the next stage of analysis.

STAP analysis for relative TF/DNA binding affinities for c-Myc, c-Jun and BRCA1 in candidate EOTr regions
TI is associated with lower relative binding affinities for TF-BSs intersected by candidate transcripts (4). We compared TF-binding in relation to RNA expression within candidate EOTr regions with affinities for the same TF in enhancer domains devoid of our transcript datasets (EOTr-). Capped and polyadenylated candidate RNAs, which acted via TI were selected and are referred to as 'Enhancer Occlusion  Transcripts' (Table 1). STAP tools for the analysis of relative TF-binding affinities revealed that the majority of EOTrintersected TFBSs, in HepG2 and HeLa cell lines displayed 'unfavorable' binding (Table 3, Supplementary File 2 Table  S6 for intergenic regions) (18). Indeed, depending on investigated cell line and TF, relative binding affinities (inFac-torIntMatscores) were up to 100-times weaker in EOTr domains compared to control regions. Associated PWMs were computed separately for case and control datasets to ensure that detectable differences in TF-binding were not the result of potentially skewed sequence compositions or binding sites of particularly low affinity in case of EOTrs (compare columns 1 and 2 in Table 3). We also utilized the KLtest (Kullback-Leibler) to quantify the difference between distributions for TFBSs between case and control data (Table 4) (67,75). In summary, these results confirmed that the composition of TFBSs within either dataset were similar (Figure 2). Major differences in TF-binding between both datasets were therefore attributable to the presence of potentially interfering RNA expression (given that the size and number of the investigated EOTr and non-EOTr sites were the same), which suggested the detection of TI acting on the intersected TFBSs ( Figure 3A and B for intronic EOTrs, Sup-plementary File 1 Figure S3 for intergenic EOTrs).

Analysis of relative TF/DNA binding affinities for TFs c-Myc, c-Jun and BRCA1 within EOTr loci across cell lines
Effects related to EOTr expression were also directly inferred from the calculation of relative TF-binding affinities for sets of identical TFBSs and as a function of occluding transcription (4). Collections of EOTrs, which were expressed in a cell line specific manner (i.e. either in HepG2 or HeLa), served as input data for this analysis (EOTrs with cell line specific expression: HepG2 = 1598 and HeLa = 1909). We ensured that the corresponding TFBSs in both cell lines resided within regions of H3K4me1 and histone acetyltransferase p300 enrichment, suggesting that the analyzed genomic domains acted as bona fide enhancers in both cell lines (36,73).
This approach allowed the approximation of TF/DNA interactions in correlation to EOTr transcription, moni- tored across cell lines. STAP/TRAP analysis tools for quantification of relative TF-binding affinities for the same TF-BSs in the absence or presence of EOTr expression (Supplementary file Methods Figure 1C, Table 5 for intronic EOTrs, Supplementary File 2 Table S7 for intergenic EOTrs) revealed 'unfavorable' binding for cell lines and TFBSs intersected by EOTrs. Reduced ChIP-seq signal intensities as observed by the analysis of differential TF-binding for TF-BSs with overlapping EOTrs confirmed these observations (Supplementary file Methods Figure 1C, Figure 4A-C).

Threshold levels for EOTr expression and occlusion of TFbinding
The occlusion of TF/DNA interactions by EOTrs also implied that there might be expression thresholds, which are minimally required for effective TI (4). Depending on cell line and occluded TF, STAP analysis revealed that EOTrs expressed below log 2 2.53-3.6 (HepG2) and 3.8-4.2 (HeLa) CPM did not cause the occlusion of TF-binding. Interestingly, for the non-EOTr datasets (eRNAs) expression values ranged between log 2 0.42-1.70 (HepG2) and 0.53-1.21 CPM (HeLa) (Supplementary File 3 section 3.2 and Table  S2).

Proximal promoter regions of EOTrs reveal histone tail modifications indicating transcriptional activity
Combined enrichments of histone tail modifications H3K4me3 and H3K27ac or H3K4me3 and H3K27me3 characterize active or poised promoters, respectively (37,49). The vast majority of EOTr PPRs, i.e. 89% in HepG2 (1465/1647; intronic 48%; intergenic 41%) and 92% in HeLa (1810/1958; intronic 56%; intergenic 36%) was enriched for H3K27ac, a result, which is strongly indicative of active promoters and, in turn, transcription ( Figure 5A and B, left panels for intronic EOTrs, Supplementary File 1 Figure S5A and B for intergenic EOTrs). The association of EOTr PPRs and domains of H3K27ac enrichment was statistically corroborated via the GSC. Jaccard indices (Jaccard P-value <0.01) for intronic and intergenic transcripts underscored that the intersection of H3K27ac and EOTr PPRs was specific (Supplementary  File 2 Table S8a-d). In summary, these results reconfirmed that our transcript datasets contained active promoters within 1 kb upstream regions. As an additional control, we compared local enrichments for predictive promoter marks including RNA polymerase in 1 kb up-and downstream regions from EOTr TSS and TTS, respectively ( Figure 5A and B, right panels for intronic EOTrs, Supplementary File 1 Figure S5C and D for intergenic EOTrs). No enrichments could be established downstream of the TTS.

Proximal promoter regions of eRNAs display distinct characteristics from EOTrs
Epigenetic footprints of EOTr PPRs were also compared to potential eRNA promoter regions in HepG2 and HeLa cell lines. The corresponding 1 kb upstream regions of eRNAs did not display chromatin environments indicative of active transcription (Supplementary File 3 Figure S1) (15,42). This is consistent with the absence of CAGE clusters preselected by HMMs within the 1 kb upstream regions of eRNAs. The detection of H3K4me3/H3K27ac landscapes within proximal promoter regions of EOTrs also--albeit indirectly--indicated that transcription for these RNAs   Figure S1). We concluded that EOTrs and eRNAs represent distinct classes of enhancer-associated transcripts.
is initiated outside of intersected enhancer domains. Enhancer RNAs, on the other hand, resided entirely confined within these domains of H3K4me1 enrichment and, hence, are considered to be "enhancer-derived" (7,9,15). In summary, epigenetic footprints for PPRs of eRNAs and EOTrs set these two classes of enhancer-associated transcripts apart, thereby implying alternative routes of RNA generation (Pande et al., in preparation, Supplementary file Methods Figure 1A, Supplementary File 3 Figure S1).

RNA polymerase pausing: indication of enhancer occlusion
RNA deep sequencing captures transcript steady state levels within living cells (76). Concentrations, as revealed by the analysis of RNA expression, represent the outcome of several--in part even competing--processes acting on RNA: transcription, processing and degradation. Technologies that specifically monitor the act of transcription provide the most appropriate tools for analysis of TI acting on intersected TFBSs. Many of these methods are based on the original protocol for nuclear run-on assays but represent technical advancements (77); in particular, by incorporating RNA high-throughput sequencing to allow genomewide analysis. Global nuclear run-on assays coupled with cDNA deep sequencing (GRO-seq) enabled the analysis of RNAPII distributions within EOTr-transcribed domains (25). Complementary DNA peaks representing local enrichments of RNAPII within EOTr-transcribed domains were considered to be indicative of RNA polymerase pausing (55,78). As a control, we utilized the ChIP-seq analysis to determine occupancy levels for phosphorylated RNA polymerase II within the same EOTr-intersected enhancer domains (79,80). Both approaches resulted in the same bimodal distribution of RNAPII with a major peak surrounding the EOTr TSS and a second peak intersecting the occluded TFBSs ( Figure 6A, B, E and F). Potentially, TF-BSs that overlapped with paused RNA polymerases, are less accessible to TF-binding. Intersections of occluded TFBSs and paused RNAPII were compared to enrichments within immediate flanking regions to demonstrate the specificity of our results. In order to quantify transcriptional paus- . Analysis of ChIP-seq peaks for the intersected TFs provided an indirect measure for the affinity of analyzed TFBSs. We uncovered a reverse correlation between RNAPII pausing and TF-binding ( Figure 6C and D). These findings are consistent with the portrayed mechanism of TI and suggested that local competitions of RNAPII and TF potentially cause reduced affinities at intersected sites.

Poised and active enhancers: EOTrs occupy domains of lower activity and TI exerted via EOTrs represents a locally confined mechanism
Enrichments of H3K9ac and/or H3K27ac modifications signify active enhancer domains (24,37,81). Analysis of EOTr-associated enhancers with ENCODE ChIP-seq datasets, revealed significantly lower occupancy levels for both modifications compared with domains devoid of EOTr expression. This was most prominent in case of the enrichments for H3K27ac histone tail modifications and might also be a consequence of lower p300 occupancy levels within EOTr-associated domains ( Figure 7A and B for intronic EOTrs, Supplementary File 1 Figure S6A and B for intergenic EOTrs).
In summary, reduced relative TF-binding affinities were accompanied by diminished enhancer activities (Supplementary file Methods Figure S1B). As anticipated, the comparison of EOTr to eRNA-associated enhancer domains  Figures S4a and b). Therefore, eRNA but not EOTr expression correlated with activities of intersected enhancer domains. Confirming results were also obtained by the analysis of enhancer activities for EOTrassociated domains across cell lines (see below for intronic EOTrs, Supplementary File 1 section 1.1 and Figure S7 for intergenic EOTrs).
TI acting by the occlusion of TFBSs represents a locally confined mechanism. In the case of EOTr-associated enhancers, on average only ∼60-67% (intronic: HepG2 and HeLa) to 70-73% (intergenic: HepG2 and HeLa) of the entire domain were intersected by our transcript datasets. For the same set of EOTr-associated enhancers, H3K27ac and H3K9ac enrichments were consistently lower for transcriptoverlaid domains compared to immediate flanking regions (which are devoid of RNA expression). In line with the mechanism of TI, effects related to EOTr expression were locally confined and barely detectable outside the tran-scribed regions. The significance of this finding was statistically evaluated with the aid of 2 -tests. We analyzed the number of H3K27ac starting peaks for the same enhancer domains within EOTr and non-EOTr regions (i.e. domains downstream from EOTr TTS but contained within the same enhancer). The results underscored the significance (P < 0.01) of lower activities associated with EOTr-intersected domains compared to surrounding regions devoid of interfering transcription (Supplementary File 2 Table S9a-d).

Histone acetyltransferase p300 binding to DNA at EOTrassociated enhancer domains across cell lines
Histone acetyltransferase p300 deposits H3K27ac marks on histones of active enhancers (9,36,81,82). The connection of p300 binding and H3K27 acetylation for EOTr-intersected enhancer domains was further investigated across cell lines. The positive association of eRNAs with domains of higher p300 and H3K27ac enrichment is well established (81,82). Therefore, eRNAs with cell line specific expression served Notched boxplots for H3K9ac and H3K27ac enrichments monitored with p300 peaks as reference for enhancer domains associated with intronic EOTrs (EOTr+) and non-EOTr (EOTr-) regions in HepG2 (A) and HeLa (B) cell lines. H3K27ac and H3K9ac occupancies were lower for enhancers associated with EOTrs compared to non-EOTr control regions (for definition see main text). Therefore, EOTr-intersected enhancer domains exhibited lower activity. Differential binding across cell lines: enrichments of p300 and H3K27ac were calculated with domains associated with eRNAs or EOTrs in HepG2 cells and the same domains devoid of intersecting transcipts in HeLa cells. Detected fold changes revealed that binding of p300 (C) and H3K27ac (D) was elevated in eRNA-associated domains. However, in case of EOTr-intersected enhancers within the same cell lines, we observed the reverse correlation: in the absence of EOTrs p300 (E) and H3K27ac (F) binding was significantly increased. In summary, EOTrs and eRNAs represent functionally distinct classes of enhancer-associated transcription (multiple testing correction p300-HepG2→HeLa n = 1027 and HeLa→HepG2 n = 1412, Log base 2-fold change (L2FC) and P-values corrected for multiple testing (q) HepG2 ≥ HeLa, q-value = 2.3 × 10 −2 , HeLa ≥ HepG2, q-value. = 2.6 × 10 −3 , H3K27ac-HepG2→HeLa n = 1347 and HeLa→HepG2 n = 1762, Log base 2-fold change (L2FC) and P-values corrected for multiple testing (q) HepG2 ≥ HeLa, q-value = 4.8 × 10 −2 , HeLa ≥ HepG2, q-value. = 5.2 × 10 −3 ). Fold changes: +1 = up-regulated, 0 = not differentially regulated and -1 = down-regulated. as a control. Analysis of differential binding across cell lines for p300 and H3K27ac indicated that EOTrs preferentially resided within domains of lower activity. Domains devoid of EOTrs were enriched in p300 and concomitantly displayed higher activities ( Figure 7C-F). As anticipated, the reverse correlation was observed for the eRNA-associated enhancers in control datasets. Potentially, the higher p300 occupancy levels in the event of eRNAintersected domains are responsible for the higher activities as revealed by H3K27ac.
These findings agreed with the fact that eRNA transcription did not occlude TF-binding within intersected domains ( Supplementary File 3 sections 3.3 and 3.4) and once more established that eRNAs and EOTrs represent distinct classes of enhancer-associated transcription, which possess entirely different regulatory potential.

Allele-specific RNA expression to monitor effects of EOTr expression on enhancer activities
In order to further demonstrate the regulatory impact of EOTrs on human enhancer domains, we monitored effects caused by allele-specifically expressed EOTrs (60). With H3K27ac enrichments as analytical read out, enhancer activities for alleles occupied by EOTrs were compared with those devoid of occluding RNA expression (24). This design reflected the outcome of EOTr transcription on virtually the same enhancer domain and cell line. Only domains that displayed H3K4me1/p300 enrichments for both alleles qualified as input data. For RNA deep sequencing data, allelespecific expression and single nucleotide variants (SNVs) were called with GATK and MAMBA tools (Materials and Methods) (60,83). We identified for 47% (intronic 21%; intergenic 26%) and 53% (intronic 28%; intergenic 25%) of EOTr-containing enhancer domains in HepG2 and HeLa cell lines transcripts with allele specific expression. The isolation of 1527 (HepG2 = 814, HeLa 713) SNVs within EOTr-associated enhancer domains permitted the distinction of enrichments for histone tail modifications between individual EOTr alleles. Results for differential H3K27ac binding across alleles, i.e. in the absence or presence of EOTrs, once more suggested the association of lower enhancer activities for alleles intersected by EOTr datasets (Figure 8 and Supplementary file Methods Figure 1D). This further supported our model that EOTrs act via TI at human enhancer domains.

Interactome (ChIA-PET and 5C) analysis of occluded enhancers
Enhancers and linked promoters reside jointly embedded in networks of, often cell line specific, long-range chromatin interactions (84). Interactome maps enable the characterization of these enhancer domains and their interaction partners. We established the corresponding interactomes for HepG2 and HeLa cell lines with ENCODE 5C and ChIA-PET datasets for EOTr-associated enhancers and their corresponding RefSeq gene (hg19) interaction partners (27,(85)(86)(87). The vast majority of enhancer domains intersected by EOTrs, i.e. 92% in HepG2 (1515/1647; intronic 51%; intergenic 41%) and 80% in HeLa (1566/1958; intronic 46%; intergenic 34%) were recovered within these interactome maps and preferentially revealed intra-chromosomal looping interactions (intra-chromosomal to inter-chromosomal contact ratio was 5:2). Subsequent investigation of epigenetic footprints associated with these interacting domains discovered enrichments for H3K4me3. This result is strongly indicative of eukaryotic promoters (Materials and Methods, GSC: Supplementary File 2 Tables S10 and 11). The association was statistically evaluated via the GSC (Jaccard index for H3K4me3 compared to CTCF < 0.01). The results confirmed the specificity of our findings (For enhancers associated with cell line specific EOTrs: we identified in HepG2 2343 interacting RefSeq gene promoters [out of these 1327 with intronic enhancers and 1016 with intergenic enhancers] and 3548 interacting RefSeq gene promoters in HeLa [out of these 1812 with intronic enhancers and 1736 with intergenic enhancers]). Therefore, EOTr-associated enhancers participate in the formation of intra-chromosomal looping interactions with human gene promoters.

Analysis of allele-specific interactomes: the impact of EOTr expression on chromatin looping interactions
Chromatin interactomes for EOTr loci were investigated in order to identify potential effects related to TI on the formation of enhancer-promoter (E/P) networks. Enhancer domains associated with allele-specific EOTr expression and the corresponding non-EOTr alleles (i.e. enhancer domains devoid of EOTrs for control) were input for this analysis (HepG2-728, HeLa-816). This approach bypassed cell line specific influences on enhancer activities that poten-tially hamper the analysis across cell lines, and revealed more interacting loci for enhancer alleles devoid of EOTrs (E/P loops EOTr alleles HepG2-1, HeLa-2; non-EOTr alleles HepG2-11, HeLa-23). As a consequence of TI, the resulting interactome complexities, i.e. the number of corresponding E/P interactions, were compromised ( Figure 9, 2 -test, P-value < 0.01).

RefSeq gene promoters that interact with enhancer domains overlapping EOTrs are predominantly poised
As demonstrated, enhancer domains associated with EOTrs displayed lower activities compared to enhancers devoid of EOTrs. We therefore investigated whether this decrease was accompanied by lower promoter activities of linked protein coding RefSeq genes (hg19). ChIA-PET and 5C analysis allowed the identification of interacting gene promoters for HeLa and HepG2 cell lines. Corresponding PPRs were scanned for histone tail modification H3K4me3 in combination with either H3K27ac or H3K27me3, indicative of active or poised promoter states, respectively (45). For proximal promoter regions, which were linked to EOTr-regulated enhancer domains, enrichments of H3K27me3 histone tail modification were uncovered, thereby implying that the corresponding gene promoters (HepG2 = 87%; intronic 52% and intergenic 35% and HeLa = 92%; intronic 58% and intergenic 34%) were transcriptionally silent (Supplementary File 1 Figure S8 for intergenic EOTrs). These results were analyzed with the aid of the GSC to validate the statistical significance of the identified intersection (  Figure S9 for intergenic EOTrs).  These results demonstrated how the regulation of enhancer domains via TI is transduced to the connected gene promoters. Confirming results were also obtained with the analysis of differential expression for the same set of RefSeq genes. Messenger RNA expression levels were significantly lower for RefSeq genes that interacted with EOTr-regulated enhancer domains ( Figure 10E and F). For intronic EOTr-associated enhancers, ChIA-PET/5C (in HeLa and HePG2 cell lines) analysis revealed that subsets of these intronic enhancers interacted with host gene (host genes are genes with introns containing EOTrregulated enhancer domains) promoters (63% intronic enhancers in HeLa and HepG2). Interestingly, even these host gene promoters displayed preferentially poised characteristics (Supplementary File 2 Table S13). The resulting network comprising EOTr-associated enhancer and regulated target gene are summarized in Figure 11.

DISCUSSION
Enhancer domains within multicellular eukaryotes often are transcribed, leading to different classes of potentially regulatory RNAs (6,9,12,14,88). The expression levels of these transcripts are positively correlated with the activity of corresponding enhancer domains and interacting genes (6,(23)(24). Therefore, enhancer RNAs and 'RNAs with enhancer-like function' increase enhancer activity and in turn gene expression. However, for many of these transcript classes, the underlying mechanisms and functions remain subject to further analysis. Enhancer domains represent complex arrays of TFBSs and participate in the formation of pre-initiation complexes for RNA polymerase II transcription. Potentially, RNA polymerase template switching (between promoter and enhancer) could cause spurious transcriptional initiation and, in turn, lead to the generation of these enhancer-derived RNAs. In this context, it is noteworthy that only a minor fraction of elongating RNAPII is associated with conventional promoter dependent transcription (89). A finding, which might suggest that RNA polymerase template switching is a major source of spurious transcription. Notably, eRNAs are not associated with bona fide promoter structures (Supplementary File 3 Figure  S1). Therefore, whether and to what extent these enhancerassociated RNAs are by-products of enhancer activity and as such represent products of spurious transcription, still remains to be investigated (9,(89)(90).
Recently, we uncovered TFbiTrs, transcripts that occlude TF/DNA binding in proximal promoter regions (PPRs) and, in turn, inhibit gene expression (4). Key features of TI encompass the reduction of TF/DNA binding in regions intersected by occluding transcription (3,4). Here, we addressed whether enhancer-associated transcription could also underlie transcriptional interference and reduced activities of intersected domains (5). This process would ultimately lead to the inhibition or reduction of gene expression. We utilized STAP/TRAP analysis for the quantification of relative TF-binding affinities (4,(18)(19). Relative TFbinding affinities were calculated for the same TF with TF-BSs in EOTr (case) and non-EOTr (control) (i.e. TFBSs in enhancers without EOTrs) datasets. The outcome of this analysis depends upon the actual TFBSs and their distri-bution within both datasets. Therefore, PWMs for TFBSs in EOTr and non-EOTr domains were independently analyzed with KL tests and WebLogo. The high similarities of TFBSs in EOTr and non-EOTr datasets implied that major differences in TF-binding were attributable to EOTr expression acting on otherwise identical TFBSs. Direct analysis of TF-binding as an effect of potentially occluding RNA expression required the comparison of binding affinities for the same set of TFBSs in the absence or presence of EOTrs. For this purpose, we examined EOTrs with cell line and even allele-specific expression. In most EOTr regions, TFbinding was significantly less effective--as revealed by unfavorable binding--compared to the corresponding sites devoid of EOTrs (i.e. across two different cell lines). Apart from EOTr expression, cell line specific heterochromatin states might contribute to the observed TF-binding affinities (91). For the same set of enhancers and cell lines, we compared H3K27me3 enrichments as a marker of facultative heterochromatin, without observing significant differences (Supplementary File 1 section 1.2 and Figure S10). Consequently, altered heterochromatin states between cell lines did not impact on the analysis of relative TF-binding affinities via STAP.
Enhancer activities are positively correlated with H3K27ac enrichments (22,24). The analysis for EOTrassociated enhancer domains, in relation to H3K27ac occupancy levels, was conducted in three-fold: (i) within HepG2 or HeLa cell lines in comparison to non-EOTr enhancers, (ii) across cell lines for the same enhancers and (iii) for allelic pairs. Our results consistently indicated that enhancers associated with EOTrs exhibited significantly lower activity levels. Therefore, EOTr expression is not only reflected by reduced relative binding affinities; in addition, we identified that TI is even correlated with activity levels of intersected enhancer domains.
Pervasive transcription within genomes generates sites of overlapping or interleaved transcription (9,92). The enhancer landscape occupies a substantial fraction of the intronic or intergenic sequence space. The presented analysis was conducted with only three TFs, which were utilized to identify occluding RNA transcription within the human genome. The results, therefore, represent a subset of all occluding transcripts. Potentially, there are many more EOTrlike transcripts, and we propose that TI or related mechanisms could even account for a considerable fraction of the RNA 'dark matter transcription' (93,94).
We propose that EOTrs, as enhancer regulating transcripts, are highly flexible molecular tools and as such could serve many functions depending on the regulated enhancer module or occluded TF. Future analysis with more comprehensive datasets will enable to address their functional relevance in broader detail. Also, gene expression in HepG2 and HeLa cell lines is substantially different from noncancer tissues (95,96). The extrapolation of EOTr functions to non-cancer cells or intact organisms is therefore tentative.
In case of TI, the act of transcription itself is accompanied by the local competition between TFs and RNAPII for DNA binding within overlapping regions. This mechanism could be the underlying cause of lower TF-binding affinities in EOTr regions. To test these assumptions, we inspected the correlation of RNA polymerase pausing and TF-binding within EOTr-intersected domains (55,(79)(80). The identification of reverse correlations for RNA polymerase pausing and TF-binding argues in favor of TI as the defining mechanism of EOTr action. The concept of RNA decoys for transcription factors is less likely to explain EOTr functions because RNAs are diffusible macromolecules (90,97,98). Accordingly, EOTrs would sequester available TFs and globally reduce the effective concentration of free nuclear TFs. This hypothetical mode of EOTr/TF action would alter TF-binding even in non-EOTr regions. Consequently, there should be no detectable difference for TF-binding at EOTr-intersected TFBSs and sites within domains devoid of EOTrs. Similarly, lower occupancy levels for H3K27ac histone tail modifications were strictly confined to domains associated with EOTrs. This local restriction of apparent EOTr-related effects is consistent with TI as opposed to mechanisms based on sequestration, which are not limited to certain domains or binding sites. To gain further insights into the regulation of human enhancer domains via transcription, we analyzed and compared two entirely different transcript classes: EOTrs, which act via TI and eRNAs (for identification and analysis of eRNAs in HepG2 and HeLa cell lines, Supplementary File 3 sections 3.1-3.6) that map to domains of higher enhancer activity (23)(24)37,49). Initially, this observation might seem contradictory, prompting questions as to why eRNAs do not occlude TF-binding in ways similar to EOTrs. TI (or related mechanisms) depends on the relative promoter strength of occluding RNAs and only promoters that initiate transcription at sufficiently high rates are capable to occlude TF-binding (3,4). Once more, we used STAP/TRAP analysis tools for the identification of expression thresholds that cause TI. Expression levels of eRNAs within HepG2 and HeLa cell lines ranked substantially below these critical thresholds (Supplementary File 3 section 3.2 and Table S2). This finding might, at least in part, explain why EOTrs but not eRNAs are capable of occluding TF-binding within the analyzed domains. Congruently, relative binding affinities for TFBSs within eRNA domains suggested favorable binding compared to EOTrs as well as genome-wide controls (Supplementary File 3 sections 3.4 and 3.5). Lower expression levels of eRNAs were also mirrored by the epigenetic landscapes that typically were associated with this class of transcripts: For instance, the substantially lower occupancy levels for H3K36me3 within transcribed regions. In addition, the 1 kb upstream regions of eRNAs lacked H3K4me3 signatures (Figure 2 and Supplementary File 3 Figure S1). EOTrs, on the other hand, displayed epigenetic key features of long npcRNA transcription and hence differ substantially from the enhancer derived eRNA transcripts (9,15,34).
Decreased activities for EOTr-associated enhancers, were correlated with reduced enhancer/promoter interactions as revealed by ChIA-PET and 5C. Fewer interacting loops were identified for EOTr-intersected enhancers per regulated promoter than for enhancer domains devoid of the occluding transcription. Therefore, even enhancer/promoter interactions are reduced in number and might be inhibited by interfering EOTr expression (Supplementary File 2 Table S14a and b). Notably, RefSeq gene promoters that interacted with EOTr-associated enhancer domains displayed poised characteristics. This association, however, is entirely reversed when the same PPRs were analyzed in the absence of regulatory EOTr expression. In this case, enhancers and corresponding RefSeq gene promoters were embedded in domains of transcriptional activity. In contrast, RefSeq genes linked to enhancer domains associated with eRNAs consistently ranked amongst the highly expressed genes (Supplementary File 3 section 3.6 and Figure S5). To the best of our knowledge, this analysis represents the first genome-wide investigation of TI-related mechanisms acting on human enhancer domains (Supplementary File 1 Figure  S11). EOTrs add a further level to the intricate regulation of eukaryotic networks of gene expression.