The transcription factor network of E. coli steers global responses to shifts in RNAP concentration

Abstract The robustness and sensitivity of gene networks to environmental changes is critical for cell survival. How gene networks produce specific, chronologically ordered responses to genome-wide perturbations, while robustly maintaining homeostasis, remains an open question. We analysed if short- and mid-term genome-wide responses to shifts in RNA polymerase (RNAP) concentration are influenced by the known topology and logic of the transcription factor network (TFN) of Escherichia coli. We found that, at the gene cohort level, the magnitude of the single-gene, mid-term transcriptional responses to changes in RNAP concentration can be explained by the absolute difference between the gene's numbers of activating and repressing input transcription factors (TFs). Interestingly, this difference is strongly positively correlated with the number of input TFs of the gene. Meanwhile, short-term responses showed only weak influence from the TFN. Our results suggest that the global topological traits of the TFN of E. coli shape which gene cohorts respond to genome-wide stresses.


INTRODUCTION
Gene regulatory networks (GRNs) receive, process, act upon and send out information, while being robust to random fluctuations. How signals targeting one to a few genes are processed is relatively well understood (1,2). Meanwhile, many cellular environments fluctuate (sometimes unpredictably) in nutrient availability, pH, temperature, salts, community of other cells or species they live with, etc., which may cause genome-wide stresses. We investigate how GRNs produce chronologically ordered responses to genome-wide perturbations, while robustly maintaining homeostasis.
Evidence suggests that genome-wide stresses initially perturb hundreds to thousands of genes (3) but are quickly processed. As a result, after a transient period, only specific gene cohorts of tens to a few hundred genes (4,5) (usually sharing common feature(s)) participate in the responsive short-, mid-and long-term transcriptional programs (6). For example, when Escherichia coli suffers a cold shock, a specific cohort exhibits a fast, short-term response (∼ 70 genes), while another has a longer-term response (∼ 35 genes). Other genes remain relatively passive (7,8). Since cells exhibit predictable, temporally ordered, beneficial phenotypic changes, these response programs have likely been positively selected during evolution.
It has been shown that global regulators (GRs) (9)(10)(11)(12), DNA supercoiling (13) and small RNAs (sRNAs) (14), among other, can select large cohorts of stress-specific, responsive genes. It was also reported that 60-90% of E. coli genes respond to changing growth conditions following a constant global scaling factor (15). Further, measurements using fluorescent reporters and small circuits (16,17) showed that the effects of RNA polymerase (RNAP), and other GRs, can be separate from the effects of input transcription factors (TFs) during genomewide responses. Nevertheless, establishing whether and how the global topology and logic of transcription factor networks (TFNs) affect genome-wide responses remains challenging (18), despite successes in establishing that genegene interactions generate gene-gene correlated dynamics (19)(20)(21)(22).
To investigate the influence of the topology and logic of TFNs on large transcriptional programs, we study what occurs following genome-wide perturbations. We consider that, in E. coli, the concentration of the key genome-wide regulator, the RNAP, naturally differs with medium composition (23). Also, it is well established how transcription kinetics differs according to RNAP concentration (24,25). We expect that these changes can have genome-wide effects. We thus obtain increasingly dilute media to alter systematically and rapidly (26,27) the abundance of RNAP (illustrated in Figure 1A), and measure the genome-wide, shortand mid-term changes in transcript abundances.
Decreases in RNAP abundance should, in the shortterm, cause quick genome-wide decreases in transcription rates, and thus in RNA abundances ( Figure 1B 1 , Supplementary Results section Expected effects of shifting RNA polymerase concentration on a gene's transcription dynamics). Such shifts, likely diverse in magnitudes, should then cause downshifts in the corresponding protein abundances. Thus, in the case of input TFs, their 'output' genes will, later on, be affected as well ( Figure 1B 2 ), causing further (here named 'mid-term') changes in RNA abundances (Figure 1B 4 ). Meanwhile, the short-term changes most likely are only affected by the genes' individual features affecting their responsiveness to RNAP, since the protein abundances have not yet changed significantly.
We focus on the mid-term changes in single-gene expression levels. Specifically, we hypothesize that, in the shortterm, a reduction in RNAP concentration will be followed by a reduction in most genes' expression levels. Subsequently, in the mid-term, genes with one input TF will have their RNA abundance either further decreased or, instead, increased, depending on whether their input TF is an ac-tivator or a repressor, respectively. Meanwhile, the average magnitude of the mid-term response of genes with multiple input TFs should be correlated with the difference between the numbers of their activator and repressor input TFs. This difference is here named 'bias' in the regulatory effect (i.e. activation or repression) of the input TFs of a gene. A schematic and a predictive model of this regulatory mechanism of the genome-wide single-gene mid-term responses are shown in Figure 1B and C, respectively. At the single-gene level, the magnitude of the mid-term changes in RNA abundances should be influenced by the magnitude of the shift in RNAP and in input TFs concentrations, as well as by the specifics features of each gene and its input TFs (bindings affinities, etc.). However, many of these features are largely unknown. As such, here we only study empirically if the average responses of gene cohorts can be explained by the mean bias in the regulatory effect of their input TFs, along with the magnitude of the shift in RNAP concentration ( Figure 1C 2 ). In detail, we interpret the data on the genome-wide kinetics based on the information on the TFN structure (logic and topology) ( Figure  1B 3 ).
We use E. coli to validate this hypothesis since its gene expression mechanisms have been largely dissected and the kinetics of transcription, translation, and RNA and protein degradation are well known (26,32,33). Also, its TFN is extensively mapped, with RegulonDB (34) informing on ∼ 4700 TF interactions between ∼ 4500 genes (and on their activating or repressing regulatory roles). Consequently, since we know the regulatory network a priori, instead of using the data on gene expression for network inference, we use it solely to quantify the genes' responsiveness with respect to the TFN topology and logic. We then investigate whether the mid-term responsiveness to shifting RNAP concentration is in accordance with the presently known topology and logic of the TFN, as hypothesized (Figure 1B 1 -B 4 ). Supplementary Table S1 has a description of the variables used throughout the manuscript.
In summary, here we show that changes in RNAP concentration due to medium dilutions are followed by short-and mid-term genome-wide changes in RNA abundances (obtained by time-lapsed RNA-seq). These RNA changes, globally, cannot be explained by potential influences from GRs, factors, (p)ppGpp, non-coding RNAs, or post-translation regulators (e.g. due to lack of correlation with their output genes, lack of RNA changes, etc.). Instead, we find that genes directly linked by TF interactions show correlated changes in RNA abundances. Further, the average magnitude of the mid-term responses of gene cohorts can be explained by the mean bias in the regulatory effect of their genes' input TFs (obtained from RegulonDB and shown to be correlated to the number of input TFs, K TF ). Also influential is the magnitude of the shift in RNAP concentration (obtained by flowcytometry and western blot) and operons and transcription units (TUs) organization (from RegulonDB). Meanwhile, short-term responses are less influenced by TFs. Finally, we show the same phenomenon for opposite shifts in medium concentration that cause the same shifts in RNAP concentration. and 68% CB (shadow) relative to control (LB 1.0x ) after diluting the medium (LB 0.5x ). Vertical red lines mark when the transcriptome measurements at 60, 125 and 180 min. Given the RNAP levels and the kinetics of RNA and protein abundances, these moments are named 'prior to RNAP changes' and 'short-', and 'mid-term' changes in RNA abundances. (B 1 ) Known TF-gene interactions (red and green lines, if repressing and activating, respectively) and genes with (pink) and without (blue) input TFs of E. coli. (B 2 ) Schematics of the expected effects of a local topology of activating (green) and repressing (red) input TFs on mid-term responses. Genes (balls) are coloured (blue, yellow, and green) according to the events in B 4 . (B 3 ) Data collected on the genome-wide kinetics as well as data collected on the TFN structure. (B 4 ) Following a medium dilution, intracellular RNAP concentrations (black arrow) decrease after a time lag, and RNA abundances (red arrow) will decrease accordingly. Compared to when at ∼ 0 min, the RNAP at ∼ 120 min and corresponding RNAs at ∼ 125 min should be lower (25,28). Given translation times (∼ 50 min (29-31)), at ∼ 175 min, the protein abundances, including input TFs, coded by the perturbed RNAs (green arrow) should differ as well. Fluctuations in these input TFs abundances will then propagate to nearest neighbour 'output' genes, further shifting their RNA abundances (blue arrow) depending on whether the input TF is an activator or a repressor. Finally, the yellow arrow represents (not measured) long-term changes (∼ 230 min or longer). We performed RNA-seq at ∼ 60 min (prior to RNAP changes), ∼ 125 min (short-term RNA changes), and ∼ 180 min (mid-term RNA changes, affected by input TFs). Finally, the green dashed line marks when the RNAP level already differs significantly from the control (see example Figure 1A). (C 1 ) Predictive model of the expected biases in sets of input TFs of individual genes. Considering TF-gene interactions as either repressions (regulatory effect of -1) or activations (regulatory effect of +1), the overall effect of a set of input TFs during these stresses should be predictable from the sum of the regulatory effect of the input TFs, named 'bias', (b). Regulatory effects obtained from RegulonDB. (C 2 ) Example average response (μ |LFC| from RNA-seq) at 180 min of gene cohorts with a given μ |b| and how they are expected to relate to the biases. Figures created with BioRender.com.

Bacterial strains, media, growth conditions and curves, and intracellular concentrations
We used wild type MG1655 cells as a base strain to study the transcriptome. In addition, we used an RL1314 strain with RpoC endogenously tagged with GFP (generously provided by Robert Landick) to measure RNAP levels, and 20 YFP fusion strains with genes endogenously tagged with the YFP coding sequence (25) to measure single-cell protein levels (Supplementary Table S4). Further, we used a strain carrying an rpoS::mCherry gene (generously provided by James Locke), shown to track RpoS (35), to measure RpoS levels. In addition, we measured the protein levels of the spoT gene, which is one of the genes responsible for (p)ppGpp synthesis (3), using the YFP fusion library. Finally, we measured single-cell levels of the crl gene using a low-copy plasmid fusion library of fluorescent (GFP) reporter strain (36).
From glycerol stocks (at -80 • C), cells were streaked on lysogeny broth (LB) agar plates with antibiotics and kept at 37 • C overnight. Next, a single colony was picked, inoculated into fresh LB medium and, kept at 30 • C overnight with appropriate antibiotics and aeration at 250 rpm. From overnight cultures (ONC), cells were diluted to 1:1000 in tailored LB media (see below) with antibiotics, incubated at 37 • C with aeration, and allowed to grow until reaching an optical density of ≈ 0.4 at 600 nm (OD 600 ).
To measure cell growth curves and rates, ONC of the RL1314 strain were diluted to an initial optical density at 600 nm (OD 600 ) of ≈ 0.05 into independent fresh media (LB 1.0x , LB 0.75x , LB 0.5x, LB 0.25x , LB 1.5x , LB 2.0x and LB 2.5x ). The cultures were aliquoted in a 24-well flat bottom transparent plate and incubated at 37 • C with continuous shaking in a Biotek Synergy HTX Multi-Mode Reader. Growth was monitored every 10 min for 10 h.
Finally, the RNAP and factor concentrations were estimated by measuring their average abundances from singlecell fluorescence levels (of RpoC-GFP and rpoS::mCherry, respectively, by flow-cytometry). Next, we divided that abundance by the mean single-cell area (from phasecontrast microscopy images of cell populations in the same condition), used as a proxy for cell volumes, to obtained concentrations in fluorescence intensity per pixel (not shown in the figures). In all cases, for each condition, we obtained images from, on average, 2500 cells (from three biological replicates). The images are provided in Supplementary Data.

Microscopy
To measure single-cell RNAP levels, ONC RL1314 cells were pre-inoculated into LB 1.0x , LB 0.75x , LB 0.5x and LB 0.25x media. Upon reaching mid-exponential growth phase, cells were pelleted by quick centrifugation (10000 rpm for 1 min), and the supernatant was discarded. The pellet was resuspended in 100 l of the remaining medium. Next, 3 l of cells were placed in between 2% agarose gel pad and a coverslip and imaged by confocal microscopy with a 100× objective (example images in Supplementary Figure S1). GFP fluorescence was measured with a 488 nm laser and a 514/30 nm emission filter. Phase-contrast images were simultaneously acquired. MG1655 cells were imaged to measure cell size in LB 1.0x , LB 0.75x , LB 0.5x , LB 0.25x , LB 1.5x , LB 2.0x and LB 2.5x media. Finally, MG1655 cells was also imaged in LB 1.0x during stationary growth. Finally, we imaged cells of the YFP strain library to assess if their morphology and physiology were consistent with healthy cells during measurements.

Flow-cytometry
We performed flow-cytometry of RL1314 cells to measure single-cell RNAP over time. ONC were diluted at 1:1000 into respective fresh media (LB 1.0x , LB 0.75x , LB 0.5x and LB 0.25x ) and grown as described in Methods section Bacterial strains, media,growth conditions andcurves, and intracellular concentrations. Flow-cytometry data was recorded every 30 min (three biological replicates), up to 210 min. Data was also captured in the mid-exponential phase (at 180 min), in the media studied (LB 1.0x , LB 0.75x , LB 0.5x, LB 0.25x , LB 1.5x , LB 2.0x and LB 2.5x ), with 3 biological replicates each. We used a similar protocol to perform flow-cytometry of several strains of the YFP library (25) in LB 1.0x and LB 0.25x (three biological replicates, Supplementary Table S4), including to measure single-cell SpoT levels in LB 1.0x , LB 0.75x , LB 0.5x , LB 0.25x , LB 1.5x , LB 2.0x and LB 2.5x (three biological replicates each).
Meanwhile, we measured single-cell levels of the crl gene in LB 0.5x at 0 and 180 min, using a strain from the GFPpromoter fusion library. Further, to measure RpoS levels, we performed flow-cytometry of cells of the MGmCherry strain in LB 1.0x , LB 0.75x , LB 0.5x and LB 0.25x during the exponential (180 min) and stationary growth phases (LB 1.0x ,14 h after pre-inoculation) . In these measurements, as well as the measurements above, we recorded FSC-H, SSC-H and Width, to be used as proxies for cell size and density (i.e. composition), as they are positively correlated with these features (37).
In addition, data from measurements of MG1655 cells were used to discount background fluorescence from cells of the MGmCherry and the YFP strains. Similarly, measurements of the W3110 strain were used to discount the background fluorescence from the RL1314 strain.
For performing flow-cytometry, 5 l of cells were diluted in 1 ml of PBS, and vortexed. In each condition, 50000 events were recorded. Prior to the experiments, QC was performed as recommended by the manufacturer. Measurements were conducted using an ACEA NovoCyte Flow Cytometer (ACEA Biosciences Inc., San Diego, USA) equipped with yellow and blue lasers.
For detecting the GFP and YFP signals, we used the FITC channel (-H parameter) with 488 nm excitation, 530/30 nm emission, and 14 l/min sample flow-rate with a core diameter of 7.7 m. PMT voltage was set to 550 for FITC and kept the same for all conditions. Similarly, to detect the mCherry sinal, we used PE-Texas Red channel (-H parameter) having an excitation of 561 nm and emission of 615/20 nm and sample flow-rate of 14 l/min, with a core diameter of 7.7 m. PMT voltage was set to 584 for PE-Texas Red and kept the same for all conditions. To remove background signal from particles smaller than bacteria, the detection threshold was set to 5000. All events were collected by Novo Express software from ACEA Biosciences Inc.

Protein isolation and western blotting
Western blotting was used to quantify relative RNAP levels of MG1655 cells and GFP levels of RL1314 RNAP-GFP cells (Supplementary Figure S2 and Table S2). Briefly, cells were diluted from ONC into respective fresh media and incubated at 37 • C with aeration and grown until reaching an OD 600 ≈ 0.4. Next, cells were harvested by centrifugation (8000 rpm for 5 min) and pellets were lysed with B-PER bacterial protein extraction reagent, added with a protease inhibitor for 10 min at room temperature (RT). Following lysis, centrifugation was done at 14 000 rpm for 10 min and the supernatant was collected. Next, the supernatant was diluted in 4X Laemmli buffer with ␤-mercaptoethanol and samples were boiled at 95 • C for 5 min.
Samples with ∼ 30 g of soluble total proteins were loaded on 4-20% TGX stain-free precast gels (Biorad). These proteins were then separated by electrophoresis and transferred on PVDF membrane using TurboBlot (Biorad). Next, membranes were blocked with 5% non-fat milk at room temperature (RT) for 1 h and probed with primary antibodies at 1:2000 dilutions (Biolegend) at 4 • C overnight. The antibodies used for the MG1655 strain were against RpoC (␤ prime subunit of RNAP), while for the RL1314 strain it was used antibodies against GFP. As a control we also subjected MG1655 cells to antibodies against GFP (Supplementary Figure S2B 2 ). HRP-secondary antibody (1:5000) treatment was then done (Sigma Aldrich) for 1 h at RT. Excess antibodies were removed by washing with buffer. The membrane was treated with chemiluminescence reagent (Biorad) for band detection. Images were obtained by the Chemidoc XRS system (Biorad) and band quantification was done using the Image Lab software (v.5.2.1).

RNA-seq
Sample preparation. RNA-seq was performed thrice, for decreasing [LB 0.75x , LB 0.5x , and LB 0.25x , at 180 min; LB 0.5x at 60 and 125 min] and for increasing (LB 1.5x , LB 2.0x and LB 2.5x , at 180 min) medium richness relative to a control (LB 1.0x ) (an independent control was used for each three sets of conditions). Cells from 3 independent biological replicates of MG1655 in each modified medium were treated with RNA protect bacteria reagent (Qiagen, Germany), to prevent degradation of RNA, and their total RNA was extracted using RNeasy kit (Qiagen). RNA was treated twice with DNase (Turbo DNA-free kit, Ambion) and quantified using Qubit 2.0 Fluorometer RNA assay (Invitrogen, Carlsbad, CA, USA). Total RNA abundance was determined by gel electrophoresis, using a 1% agarose gel stained with SYBR safe (Invitrogen). RNA was detected using UV with a Chemidoc XRS imager (Biorad).

Sequencing.
Part 1: For shifts from LB 1.0x to LB 0.75x , LB 0.5x and LB 0.25x , at 180 min. Sequencing was performed by Acobiom (Montpellier, France). The RNA integrity number (RIN) of the samples was obtained with the 2100 Bioanalyzer (Agilent Technologies, Palo Alto, USA) using Eukaryotic Total RNA 6000 Nano Chip (Agilent Technologies). Ribosomal RNA depletion was performed using Ribo-Zero removal kit (Bacteria) from Illumina. RNA-seq libraries were constructed according to the Illumina's protocol. Samples were sequenced using a single-index, 1 × 75 bp singleend configuration (∼ 10M reads/library) on an Illumina MiSeq instrument. Sequencing analysis and base calling were performed using the Illumina Pipeline. Sequences were obtained after purity filtering. For shifts from LB 1.0x to LB 1.5x , LB 2.0x and LB 2.5x at 180 min, samples were sequenced using a single-index, 2 × 150 bp paired-end (PE) configuration (∼ 350M raw paired-end reads per lane) on an Illumina HiSeq 4000 instrument. Image analysis and base calling were conducted with HiSeq Control Software (HCS). Raw sequence data (.bcl files) were converted into fastq files and de-multiplexed using Illumina bcl2fastq v.2.20. One mismatch was allowed for index sequence identification.

Part 2: For shifts from LB 1.0x to LB 1.5x , LB 2.0x and LB 2.5x at 180 min, and from LB 1.0x to LB
For shifts from LB 1.0x to LB 0.5x at 60 and 125 min, samples were sequenced using a single-index, 2 × 150 bp pairedend (PE) configuration (∼ 10M raw paired-end reads per lane) on an Illumina NovaSeq 6000 instrument. Image analysis and base calling were conducted with NovaSeq Control Software v1.7. Raw sequence data (.bcl files) were converted into fastq files and de-multiplexed using Illumina bcl2fastq v.2.20. One mismatch was allowed for index sequence identification.
Data analysis. Regarding the RNA-seq data analysis pipeline: (i) RNA sequencing reads were trimmed to remove possible adapter sequences and nucleotides with poor quality with Trimmomatic (38) v.0.36 (for data from sequencing part 1) and v.0.39 (for data from sequencing part 2). (ii) Trimmed reads were then mapped to the reference genome, E. coli MG1655 (NC 000913.3), using the Bowtie2 v.2.3.5.1 aligner, which outputs BAM files (39). (iii) Then, featureCounts from the Rsubread R package (v.1.34.7) was used to calculate unique gene hit counts (40). Genes with less than five counts in more than three samples, and genes whose mean counts are less than 10 were removed from further analysis. (iv) Unique gene hit counts were then used for the subsequent differential expression analysis. For this, we used the DESeq2 R package (v.1.24.0) (41) to compare gene expression between groups of samples and calculate p-values and log2 of fold changes (LFC) of RNA abundances using Wald tests (function nbinomWaldTest). P-values were adjusted for multiple hypotheses testing (Benjamini-Hochberg, BH procedure, (42)) and genes with adjusted P-values (False discovery rate (FDR)) < 0.05 were selected to be further tested as being differentially expressed (DE) (Methods section RNA-seq d).
For logistical reasons, the sequencing platform for the RNA-seq data in Methods section RNA-seq b differ from one another. Consequently, the data sets first mentioned in the Results sections Genome-wide mid-term responses correlate with shifts in RNAp concentration and Further increases in medium richness do not decrease RNAP concentration and RNA numbers also do not change, respectively, cannot be compared quantitatively nor be used to infer genespecific conclusions.
Finally, to analyse the data from LB 1.0x and LB 0.5x at 60 and 125 min and compare its results with the results from the data of Methods section RNA-seq b Part 1 at 180 min, their raw count matrices were merged. Also, only genes that passed the filtering were studied. The filtering removed genes with less than 5 counts in more than 6 samples, and genes whose mean counts were less than 10.
Moreover, we expect the overall sums of LFCs from each perturbation to equal zero since, in DEseq2, the median-ofratios normalization calculates the normalizing size factors assuming a symmetric differential expression across conditions (i.e. same number of up-and down-regulated genes) (43). Further, it fits a zero-centered normal distribution to the observed distribution of maximum-likelihood estimates (MLEs) of LFCs over all genes (41). Both steps (perhaps related) force the mean LFC to be 0.
LFC criteria for differentially expressed genes. From past methods (44)(45)(46), we classified genes as statistically significantly DE following perturbations, by setting a maximum FDR threshold for adjusted P-values (Methods section RNA-seq c) and a minimum threshold for the absolute LFC of RNA numbers of individual genes (|LFC|).
From the μ |LFC| of genes whose FDR > 0.05, named μ |LFC| (F DR > 0.05), we identified DEGs (DE Genes) as those that, in addition to having FDR < 0.05, also have |LFC| > μ |LFC| (F DR > 0.05). Specifically, we added the conditions: |LFC| > 0.4248 for LB 0.75x , > 0.4085 for LB 0.5x , > 0.4138 for LB 0.25x , > 0.2488 for LB 1.5x , > 0.2592 for LB 2.0x , and > 0.2711 for LB 2.5x , for accepting a gene as being significantly DE. Meanwhile, for the data in LB 0.5x at 60 and 125 min, we added the conditions: |LFC| > 0.2171 for LB 0.5x 60 min, and > 0.2977 for LB 0.5x 125 min. This allows removing genes whose FDR < 0.05 but have a negligible LFC. Noteworthy, in no condition did we remove from the set of DEG more than 5 genes by applying this rule.
RNA-seq vs Flow-cytometry. RNA and protein abundances are expected to be positively correlated in bacteria, since transcription and translation are mechanically bound (47)(48)(49). Further, most regulation occurs during transcription initiation (50), which is the lengthiest sub-process (24).
To validate that this relationship holds during the genome-wide stresses, we randomly selected a set of genes whose LFC's, as measured with RNA-seq, cover nearly the entire spectrum of LFCs observed genome-wide. Next, we measured their LFC in protein levels, using the YFP strain library (25) (Methods section Bacterial strains, media,growth conditions andcurves, and intracellular concentrations) and flow-cytometry (Methods section Flowcytometry), at 180 min after shifting the medium. The list of selected genes is shown in Supplementary Table S4. For the fold change of 1/8×, 1/4×, 1/2×, 1×, 2×, 4 × and 8×, we selected three genes whose LFC in RNA abundances is closest to that value (except for the 8× fold change, since only two genes were available). This range of values covers nearly the whole LFC spectrum observed by RNA-seq (Supplementary Figure S9).

Transcription factor network of Escherichia coli
We assembled a directed graph of the network of TF interactions between the genes present in our RNA-seq data, based on the data in RegulonDB v10.9 (34), as of 28 January 2022. We used all reported TF-gene, TF-TF, TFoperon and TF-TU interactions. These equally contribute to our network of gene-gene directed interactions. In detail, a TF or regulatory protein is a complex protein that activates/represses transcription of a TU upon binding to specific DNA sites. A TU is one or more genes transcribed from a single promoter. Similarly, an operon are one or more genes and associated regulatory elements, transcribed as a single unit.
The TFN graph was analysed using MATLAB (2021b) and Network Analyzer v.3.7.2 plug-in in cytoscape (51) to extract the following network parameters: number of nodes and directed edges, number of connected components, number of isolated nodes and self-loops, and single-gene in-and out-degree, edge-count, clustering coefficient, eccentricity, average minimum path length, betweenness and stress centrality, and neighbourhood connectivity. The statistics considered are shown in Supplementary Tables S5 and S19.

2-Sample T-test, 2-sample KS-test and 1-sample Z-test.
The 2-sample T-test evaluates the null hypothesis that the two samples come from independent random samples from normal distributions with equal means and unequal and unknown variances. For this, we have set a significance level of 10% significance level (P-value < 0.10) when applying the MATLAB function ttest2.
The 2-sample KS-test returns a test decision for the null hypothesis that the data from two data sets are from the same continuous distribution, using the two-sample Kolmogorov-Smirnov test. As above, we have set the null hypothesis at 10% significance level (P-value < 0.10).
The one-sample Z-test tests for the null hypothesis that the sample is from a normal distribution with mean m and a standard deviation . In this case, m and σ are estimated from the genes with K TF = 0. As above, we have set the null hypothesis at 10% significance level (P-value < 0.10).
Fisher test. The Fisher test evaluates the null hypothesis that there is no association between the two variables of a contingency table. We reject the null hypothesis at 10% significance level (P-value < 0.1), meaning that the variables are significantly associated.
Correlations between data sets. The correlation between two data sets with known uncertainties (standard error of the mean (SEM) in each data point) was obtained by performing linear regression fitting using Ordinary Least Squares. The best fitting line along with its 68.2% confidence interval/bounds (CB) and statistics was obtained as described in Supplementary Materials and Methods 1.4 of (52). In short, the uncertainty of each of the N empirical data points was represented by m points, resulting in n = N × m points. Each of these points is obtained by random sampling from a normal distribution whose mean () and standard deviation () equal the mean and error of the empirical data point, respectively. It was set m = 1000, as it was sufficient to represent the error bars of the actual data points. We obtained the coefficient of determination (R 2 ) and the root mean square error (RMSE) of the fitted regression line, and the p-values of the regression coefficients. The P-value of x (P-value 1 ) was obtained of a T-test under the null hypothesis that the data is best fit by a degenerate model consisting of only a constant term. If P-value 1 is smaller than 0.1, we reject the null hypothesis that the line is horizontal, i.e. that one variable does not linearly correlate with the other. When there are more than three data points, we also calculated regression coefficient of x 2 (P-value 2 ) of a T-test under the null hypothesis that the second order polynomial fit is no better than lower order polynomial fit, i.e. coefficient of x 2 = 0. If P-value 2 is smaller than 0.1, we reject the linear model favouring the quadratic.
To obtain the overall best non-linear fit (and its 68.2% CB) for the empirically measured datasets with uncertainties, Monte Carlo simulations (1000 iterations) were performed. In particular, to obtain Figure 2B, on each iteration, we randomly sampled each data point from a normal distribution whose mean and standard deviation are equal to the mean (actual value) and SEM of the corresponding empirical data point, respectively. Then a sigmoid (logistic) curve fitting (R P (2020). sigm fit (https://www.mathworks. com/matlabcentral/fileexchange/42641-sigm fit), MAT-LAB Central File Exchange. Retrieved 6 August 2020) was used to obtain the best fitting curve and its 68.2% CB for each iteration. Finally, the best fitting curve along with their 68.2% CB is obtained by averaging the respective values from the 1000 iterations.
Finally, to create null-models of how variable X affects variable Y, we performed random sampling without replacement of both X and Y datapoints. The number of samplings and the sampling size (number of samples in each sampling) are set to the maximum array size possible to us (∼ 45980 × 45980, 15.8 GB). The sampling size is set to 5% of the number of datapoints (size XY) and the number of samplings (K) is set according to Max size/(0.05 × size XY) where Max size = 45980/2. Next, for both X and Y, we combine the sampled datapoints in a vector (sample X, sample Y) and calculate the correlation between sample X and sample Y by linear regression fitting using ordinary least squares. To correct for over-representation of the original datapoints, we corrected the degrees of freedom to be (size XY -C), where C is the number of parameters. In detail, for the linear regression fitting, C equals to 2 (intercept and slope of best fitting line).

ANCOVA test to evaluate if two lines can be distinguished.
To evaluate if two lines are statistically different, we performed the analysis of covariance (ANCOVA) test (53). ANCOVA is an extension of the one-way ANOVA to incorporate a covariate. This allows comparing if two lines are statistically distinct in either slope or intercept. This is done by evaluating the significance of the T-test under the null hypothesis that both the slopes and intercepts are equal.

Effects of medium dilution on cell growth, morphology, and RNAP concentration
We first studied how the RNAP concentration changes with medium dilutions. Concentration of RNAP (as well as of other molecular species) was obtained as described in Methods section Bacterial strains, media, growth conditions and curves, and intracellular concentrations. From a control medium (LB 1.0× ), we moved cells to diluted media (LB 0.75× , LB 0.5× , and LB 0.25× , Methods section Bacterial strains, media,growth conditions andcurves, and intracellular concentrations). RNAP levels start changing ∼ 75 min later, based on a as yet to be identified mechanism, stabilizing at ∼ 165 min ( Figure 2B). Given this timing of events, measurements to assess the effects on the RNA population should be performed after ∼ 165 min.
We also considered that at ∼ 180 min ( Figure 2A) the cells are at late mid-log phase. Thus, measuring the effects of changing RNAP should occur prior to ∼ 180 min, since leaving the mid-log phase will involve significant, unrelated genome-wide changes in RNA abundances (54)(55)(56)(57). From the point of view of cell division, from the moment when the RNAP starts changing, up to the moment when we measure the short-and the mid-term changes in RNA abundances, on average, less than one cell cycle and less than two cell cycles should have passed, respectively.
Interestingly, this time moment (∼ 180 min) matches our predictions of when, on average, RNA abundances have changed due to changes in the abundances of both RNAP as well as direct input TFs. In detail, from the timing of the changes in RNAP ( Figure 2B) and from known rates of RNA and protein production and degradation in E. coli (25,28-31), widespread heterogenous short-term changes in RNA abundances should occur, on average, at ∼ 120-135 min after shifting the medium (at which moment the RNAP has already changed significantly). Changes in the corresponding protein abundances should then occur tenths of minutes later, i.e. at ∼ 160-175 min (29)(30)(31).
Soon after, we expect additional changes in RNA abundances, now due to changes in direct input TFs abundances. This second stage of events, here classified as 'midterm', should occur between ∼ 165 and 180 min. This is also when cells are in the late mid-log phase (Figure 2A), while cell growth rates do not yet differ between conditions ( Figure 2C) and cell sizes only differ slightly ( Figure 2F-H and Supplementary Figures S3 and S4, Methods sections Microscopy and Flow-cytometry). This is relevant since growth rates affect protein concentrations due to dilution in growth and division (58,59).
Finally, at 180 min, the 38 concentration is lower than at 0 min ( Figure 2I inset and Supplementary Figure S5), in agreement with previous reports (27,35,60,61), suggesting that the cells are not committed to the stationary growth phase. The same is observed for the Crl protein (Supplementary Figure S6). This protein contributes to the expression of genes whose promoter is recognized by 38 and is known to be at higher abundance during stationary phase (62), as confirmed here (Supplementary Figure S6).
Given the above, to capture the average mid-term effects of RNAP shifts, we measured the transcriptome at 180 min ( Figure 2A). This timing should allow discerning the average genes' behaviour under the influence of their local network of TF interactions, albeit the diversity in RNA and protein production and decay kinetics, etc. RNAP levels at that moment are shown in Figure 2D (flow-cytometry data) and 2E (flow-cytometry versus western blot data), Supplementary Figures S2A 1 and S2A 2 (western blot of RNAP) and Table S2 (absolute values extracted from western blot). Similar RNAP downshifts have been observed in natural conditions (63) and described in (23,26,27).
Finally, we performed an additional western blot to measure RNAP-GFP using antibodies against GFP alone. From Supplementary Figures S2B 1 and S2B 2 and Table  S2, the RNAP-GFP does not appear to be significantly degraded by cleavage, with the strongest bands being observed for molecular weights between 150 and 250 kDa. As such, these strongest bands should correspond to GFP (known to be 27 kDa (64)) fused with the ␤' unit (known to be 155 kDa (65,66)). Moreover, no clear bands appear in the region between 150 and 250 kDa for the MG1655 strain. Meanwhile, the weak band just above 25 kDa in some samples from RL1314 cells (particularly in LB 0.75x ) might correspond to GFP that has been cleaved off from the chimeric protein but, given that it is only a small fraction compared to the amount of RpoC-GFP in the same cells, one can conclude that its contribution to the total cell fluorescence signals is negligible.

Genome-wide mid-term responses correlate with shifts in RNAP concentration
Transcription rates are expected to follow the free RNAP concentration in a cell, rather than the total RNAP con-centration (which is the sum of the free RNAP with the RNAP engaged with the DNA). We here measured the total RNAP concentration. However, within the range of conditions studied, the fractions of free and DNA-bound RNAP remain rather constant (26). Therefore, the total RNAP is a good proxy for the free RNAP. Specifically, using modified strains and plasmids controlled by lac and tet mutant promoters (67-69), whose regulatory mechanisms have been dissected, it was shown that their transcription rates are linearly correlated with the total RNAP concentration (26). From here on, when mentioning RNAP concentration, we refer to the total RNAP concentration.
The increasing medium dilution and corresponding decreases in RNAP concentration ( Figure 3A) cause increasingly broad distributions of single-gene LFCs at 180 min (Supplementary Figures S7 and S8A-C and Table S3). Specifically, the mean absolute LFC (μ |LFC| ) of the 4045 genes and the number of DEGs increased with medium dilution ( Figures 3B and C).
These RNAs changes correlate with subsequent changes in proteins levels (Supplementary Figure S9, Methods sections Flow-cytometry and RNA-seq). This suggests that no significant translational or post-translational regulation is taking place in between the perturbation and the measurements, that would alter proteins abundances significantly.
Interestingly, while both μ |LFC| and DEGs numbers follow the RNAP concentration (Supplementary Figures S8D  and S10B), these relationships are not strictly linear (pvalue of 0.29, Supplementary Figure S8D). This suggests that, in addition to RNAP, the direct input TFs are also influential. In this regard, we note that the assumption of linearity in the absence of the influence of input TFs (observed and discussed in (26)) is only expected to occur within a narrow range of parameter values.
Notably, some of the genes may be also influenced by sources other than RNAP and direct input TFs, such as supercoiling buildup. Also, some input TFs other than the direct input TFs maybe be influential. However, we show evidence below that this does not affect the average results ( Figure 4C and Supplementary Figure S18).
We also performed RNA-seq prior to when most signals, generated by the shift in RNAP, propagated in the TFN. First, we measured LFCs at 60 min after diluting the medium (Figure 1). From Figure 2B, at this moment, RNAP abundances have not yet changed relative to the control. In agreement, the genome-wide μ |LFC| is very weak ( Figure 3D). We further performed RNA-seq at 125 min. At this moment, RNAP levels have already reduced significantly ( Figure 2B), but we do not expect input TF abundances to have changed significantly given protein production times (Figure 1). In agreement, |LFC|s at 125 min are stronger than at 60 min, but much weaker than at 180 min ( Figure 3D). We conclude that the mid-term changes in the TFN have not occurred yet (further evidence is provided below). Given this, from here onwards, we focus on the state of the TFN at 180 min.

Influences from regulators other than RNAP
We investigated whether other factors influenced the global response of the TFN. We considered GRs, factors, (p)ppGpp and non-coding RNAs. We assumed the classification of GR in (70,71) as an input TF that regulates a large number of genes that rarely regulate themselves and participate in metabolic pathways. Meanwhile, we did not account for promoters' close proximity (e.g. tandem formation), since a recent study (72) showed that, under similar stress, while close proximity causes transcription interference, it does not influence the genes' input TF regulation.
First, the RNA-seq shows large numbers of DEGs (> 1000 for the two strongest dilutions ( Supplementary Figure S10A)) as well as a linear correlation between these numbers and μ |LFC| (Supplementary Figure S10C). Thus, we argue that the responsive genes are not constrained to a specific cluster, such as genes responding to a GR other than RNAP (the most influential is, arguably, 70 with 1555 genes recognizing it, while other GRs control less than 510 genes each (34)).
Second, from the RNA-seq, we analysed the relative abundances of GRs, factors and of their output genes. From Supplementary Figures S26A and S26C, apart from rpoS (an input TF recognized by 321 genes) and flhC (an input TF recognized by 75 genes), GRs and factors did not change significantly (Supplementary Figure S26). Further, those two changes (rpoS and flhC) were positively correlated with the RNAP concentration ( Figure 2I inset and Supplementary Figure S5), not allowing to separate their effects. Noteworthy, alternative factors did not change significantly relative to 70 (Supplementary Figure S26E), which would have changed the competition for RNAP binding.
We thus failed to find evidence that the factors and GRs were influential, globally, in the mid-term responses. Supplementary Table S15 lists the conclusion for each specific GR and factor and Supplementary Figure S27 shows these results at 125 min.
We then investigated if (p)ppGpp could be influential since, under some nutrient starvation conditions, they affect ∼ 1000 genes by binding RNAP and altering its affinity for their promoters (3). Reports suggest that the effects are rapid (5-10 min (3)). In agreement, genes responsive The equations of the red fitting lines with '±' inform on the standard error of the slope. (C) Scatter plots between |LFC| of output and input genes distanced by a minimum path length L of 1, 2 and 3 input TFs (edges) in the TFN, respectively (data from LB 0.5x ). Only for L = 1 do the activities of output and input genes correlate (P-value 1 > 0 and R 2 > 0). (D) μ |LFC| of all genes, DEG, non-DEG, and cohorts of randomly selected genes of the same size ('same sized cohorts') for K TF = 0 to 7, using merged data from all shifts (LB 0.75x , LB 0.5x and LB 0.25x ). Black error bars are the SEM and red error bars are the 95% CB of the SEM. Best fitting lines and 68% CB obtained using FITLM (MATLAB). p-values, obtained using the null hypothesis that the data is best fit by a horizontal line, are not rejected at 10% significance level. (B) and (C) do not include a few data points to facilitate visualization. See Supplementary Figures S14 and S18 for complete data. Table S20). However, their mid-term responses at 180 min were no longer atypical and, instead, followed the RNAP changes. The expression of spoT, one of the genes responsible for ppGpp synthesis, also followed the RNAP (Supplementary Figure S28). As such, we could not establish a long-lasting global influence from (p)ppGpp in response to growth-medium dilution. Nevertheless, the LFCs of the 14 out of the 22 genes coding for rRNAs listed in RegulonDB did reveal atypical behaviors (Supplementary Table S22).

to (p)ppGpp (3) exhibited abnormal short-term responses (Supplementary
Finally, we searched for unique behaviors in sRNAs by analyzing the LFC of the 93 sRNAs reported in Regu-lonDB. Their behavior was not atypical, neither at 180 min after the perturbations (Supplementary Table S21), nor at 125 min. Further, we analysed if their output genes followed their behaviour. We found that the LFCs of genes directly regulated by the sRNAs were not correlated with their input TFs, neither at 125 min, nor at 180 min after the medium shifts. Specifically, of the 93 sRNAs, 37 of them have known output genes (in a total of 145 outputs). The RNA-seq data provided information on the LFC of 40 of the 145 outputs. Finally, we searched for linear correlations between the pairs of LFCs of sRNAs and their output genes, respectively, in the short-term (125 min in LB 0.5x ) and in the mid-term (180 min in LB 0.5x ). We found an R 2 of 0.03 (P value = 0.18) at 125 min and an R 2 of 0.05 (P value = 0.10) at 180 min, respectively. We thus cannot conclude that sR-NAs were influential during the short-and mid-term responses to the stresses.

Input TFs influence the transcriptional response
If the TFN influences the genes' mid-term response to the shift in RNAP concentration, this should cause genes with and genes without input TFs to behave differently. Particularly, since the latter should only be affected by the RNAP abundances.
In agreement, genes with input TFs had higher μ |LFC| than genes without input TFs ( Figure 4A, Supplementary Figure S13 and Table S6). Also, the |LFC| of output genes and of genes coding for their direct input TFs correlate statistically ( Figure 4B, Supplementary Figure S14 and Table  S7). Therefore, on average, TF-gene interactions affected the single-gene, mid-term responses as hypothesized (Figure 1).

Input TFs influence all genes within operons
When considering the TFN topology, we have accounted for TF-gene interactions both between the input TF and the first gene of an operon or TU. Further, we have also accounted for the interactions between the same input TF and the other genes of the operon or TU (illustration of TUs and operons in Supplementary Figure S11B, which follows the standard definition of a group of two or more genes transcribed as a polycistronic unit (1)).
If we had not account for all these interactions, we would have failed to correlate the activities of genes interacting with each other. For example, consider an operon consisting of genes X 1 and X 2 . Then, assume that gene A represses X 1 and X 2 , by repressing their common promoter. If X 1 is an input TF to gene C, while X 2 is an input TF to gene D, then gene A should indirectly affect both genes C and D. If we had ignored the interaction between A and X 2 , because it is not the first gene in its operon, we would explain why A affects C, but fail to explain why A affects D.
Further, many operons contain sets of genes whose RNAs code for subunits of the same protein complex (73,74). However, the opposite is also true. Moreover, the fraction of complexes encoded by proteins from different TU's is higher than those encoded from the same operon (75). Thus, we need to track interactions between input TFs and genes in any position in an operon or TU.
We tested if the positioning of the genes in the operon influenced their responsiveness to their input TFs. As a case study, we considered operons with 3 genes. These account for ∼ 21% of all operons with more than 1 gene (34). We found that the genes' positioning did not affect how they relate to the input TFs (Supplementary Figure S16). We obtained similar results for TUs (Supplementary Figure S17). The tests of statistical significance are shown in Supplementary Tables S9-S12.

Genes expressing TFs are correlated with their nearest neighbour output genes
Consider the interval between the shift in RNAP levels and the sampling for RNA-seq (Figure 1). From these, we hypothesized that, on average, at 180 min (i.e. ∼ 70 min after the RNAP changed relative to the control), mostly only genes directly linked by input TFs should exhibit correlated responses. Nevertheless, the genome-wide diversity in the kinetics of gene expression and in RNA and protein lifetimes will allow for correlations between genes more distanced in the TFN. The number of such correlations should decrease rapidly with the path length between the gene pairs considered.
Results in Figure 4C support this. Genes distanced by 1 input TF (L = 1, i.e. directly linked) have related |LFC|s, while genes distanced by two input TFs in the TFN have much less correlated responses (albeit still statistically significant). Finally, we found no correlations between the |LFC|s, of genes distanced by three input TFs (Supplementary Figure S18).
Noteworthy, the lack of correlation between genes separated by L > 1 could also be partially due to interference from the TFs of the 'intermediary' genes between the gene pairs. However, this is only a possibility when all input TFs involved can change in abundance in less than 60 min, which is likely uncommon in E. coli. This is supported by the RNA-seq data at 125 min after medium dilution (Supplementary Figure S19), when even direct input TFs and output genes are weakly correlated. This suggests that this shorter time interval was insufficient for most signals to have propagated between nearest neighbours (Supplementary Figure S19).

The number of input TFs, K TF , of a gene correlates to the magnitude of its transcriptional response
We investigated if the genes mid-term responses are sensitive to their K TF (Supplementary Figure S20A). When averaging the results from the three perturbations ( Figure 4D), we found that the average of the absolute LFCs, μ |LFC| , increases with K TF (Supplementary Figure S22 shows the goodness of the linear fits). The result was the same whether considering all genes or just the DEGs ( Figure 4D, Supplementary Figure S21 and Table S13). Further, it holds true even for non-DEGs ( Figure 4D), which justifies also considering these genes when studying the genome-wide effects. In agreement, we found no trend in the fraction of DEGs when plotted against K TF (Supplementary Figure S23). For comparison, neither at 60 min nor 125 min do the genes' response and their K TF correlate (Supplementary Figures S14 and S15 and Tables S7 and S8).
We verified that the relationship between μ |LFC| and K TF at 180 min is not an artifact caused by a decrease in cohort size with K TF . We used bootstrapping to obtain cohorts of randomly sampled genes with increasing K TF (10000 cohorts). We imposed a cohort size equal to the number of genes with K TF = 7 (27 genes). The new, estimated μ |LFC| was always within the SEM of the μ |LFC| of the cohorts of all genes ( Figure 4D). Finally, we again verified that considering only the first gene of each operon does not affect how μ |LFC| and K TF relate (Supplementary Figure S25).

The correlation between input and output genes responses decreases with the number of input TFs
Most input TFs discernibly affect the output genes (Supplementary Figure S14), except when K TF > 5 (perhaps due to saturation). Nevertheless, the correlation between input and output genes appears to be decreasing with K TF .
Nucleic Acids Research, 2022, Vol. 50, No. 12 6813 Namely, the average slopes of the fitted lines between |LFC| of the output and |LFC| of each input ( Supplementary Figure S14) decreased with the K TF of the output gene (Supplementary Figure S20B). Also decreasing was the R 2 between input-output pairs (Supplementary Table S7). This could explain why, when plotting |LFC| against the RNAP concentration, there is a weak trend towards increased slope with K TF (Supplementary Figures S20C and S21).

The variability in single-gene absolute LFC increases with K TF
We also investigated if the variability in |LFC|s, as quantified by its standard deviation σ |LFC| , relates with K TF. There should exist (at least) four sources of this variability: (a) RNA-seq measurement noise (76,77); (b) intrinsic and (c) extrinsic noise in gene expression (78,79), and (d) TF and non-TF dependent regulatory mechanisms.
Overall, we observed that, from a genome-wide perspective, σ |LFC| increases with K TF (Supplementary Figure  S24A) similarly to μ |LFC| , and the two values are also related (Supplementary Figure S24B). Examples of the variability are shown in Supplementary Figures S24C (genes with null K TF ), S24F (genes with two GRs, FNR and ArcA) and S24D and S24E (genes controlled by the GRs FIS or CRP) (see also Supplementary Table S14).

Other topological features of the TFN do not influence midterm responses
Globally, the TFN of E. coli has in-and out-degree distributions that are well fit by power laws (Supplementary Figures S12E 1 , S12E 2 , S12F 1 and S12F 2 ) (80,81). This may explain its relatively short mean path length (Supplementary Figure  S12G and Table S5).
Having established a relationship between the response kinetics and the indegree of the TFN, we next searched for correlations between |LFC| and other single-gene topological traits (Methods section Transcription Factor Network of Escherichia coli). We considered average shortest path length, betweenness, closeness and stress centrality, clustering coefficient, eccentricity, out-degree, neighbourhood connectivity and edge-count (51). Of these, only the clustering coefficient was statistically correlated with the |LFC| (Pvalue < 0.1) (Supplementary Table S20). However, it should not be influential, since the corresponding R 2 is nearly zero (R 2 = 0.01).

The numbers of activating and repressing input TFs differ in most genes
In our original hypothesis, the mid-term response (|LFC|) of a gene should follow from the bias in the numbers of activators and repressors in its set of input TFs ( Figure 1B 4 and Supplementary Figure S11A). In detail, we predicted that if the sum of the regulatory effect (r ) of the input TFs (i.e. bias b = | r |) is null (unbiased), then the gene should have weak or zero mid-term LFC. Also, the |LFC| should increase with b.
We tested this hypothesis by extracting information on the input TFs and corresponding r values for each gene from RegulonDB. We set r of an input TF to + 1 if it is activating, to -1 if it is repressing, and to 0 if it is unknown (Supplementary Figure S12B). Then, we obtained the absolute sum of the regulatory effect of the input TFs for each gene: |b|.
From the data in RegulonDB, while the gene-TF interactions that are repressions and activations exist in similar numbers, the numbers of repressor TFs exist in larger numbers ( Supplementary Figures S12A-C). Also, of the genes with input TFs, most (∼ 85%) have a non-zero |b| (Supplementary Figure S12D and Table S16).
This can explain why so many are mid-term responsive ( Figure 3C), even though the genome-wide numbers of activation and repression interactions are similar (Supplementary Figure S12B). It may also explain why genes with K TF ≥ 1 have higher |LFC| than genes with K TF = 0 ( Figure  4A).

The bias in the input TFs follows the number of input TFs
Using information from RegulonDB, we found that the mean bias, μ |b| , increases with K TF ( Figure 5A, light blue), except for K TF > 5, which includes only ∼ 64 out of 4045 genes (Supplementary Table S17). The same is observed if considering only the first gene of each operon (Supplementary Figure S29).
To test if these results were affected by local topological specificities, we employed an ensemble approach (Supplementary Results section Estimation of the expectedμ K TF andμ |b| using an ensemble approach), to reduce their influence (82). We sampled genes (with replacement) to form cohorts with a given average K TF (from 1 to 5, due to insufficient samples for higher K TF ). This made the relationship between μ |b| and K TF more stable ( Figure 5A). Thus, from here onwards, we use the ensemble approach to study the influence of the global logical and topological features on the response's dynamics to the RNAP shifts.

The bias of the sets of input TFs can explain the mid-term responses of individual genes
From the data in RegulonDB, using the ensemble approach (Supplementary Results section Estimation of the expectedμ K TF andμ |b| using an ensemble approach), we formed random cohorts of genes with an imposed average |b|. Next, from the mid-term RNA-seq data, we calculated the average μ |LFC| of the set of cohorts with a given μ |b| . We found that μ |LFC| increases with μ |b| ( Figure 5B).
Interestingly, μ |b| and μ K TF are strongly correlated in the TFN of E. coli ( Figure 5C). To assert which one controls μ |LFC| , we assembled cohorts differing in μ K TF , but not in μ |b| . In these, μ |LFC| does not increase with μ K TF ( Figure  5D). We also assembled cohorts differing in μ |b| , but not in μ K TF . In these, μ |LFC| increases with μ |b| ( Figure 5E). Thus, the increase of μ |b| with μ K TF (Figure 5C), is what explains the increase in μ |LFC| with K TF ( Figure 4D).
Finally, for comparison, we also investigated the relationship between μ |b| and μ K TF prior to the perturbation and in the short-term (at 60 min and at 125 min after shifting the medium, respectively Figure 1A). From Figure 5F, first, the  Table S17 shows the fractions of genes with equal |b| and K TF . Black error bars are the SEM, and red error bars are the 95% CB of the SEM. Dark blue bars not shown for K TF > 5 due to small sample sizes. (B) Mid-term μ |LFC| as a function of μ |b| , obtained using the ensemble approach (Supplementary Results section Estimation of the expectedμ K TF andμ |b| using an ensemble approach, Supplementary Figures S30 and S31 and Table S18). Each blue cross is the average outcome from up to 24750 cohorts of 10 genes. (C) μ |b| plotted against the corresponding μ K TF , mean of K TF of the cohorts in (B). The inset shows the inverse correlation plot for the cohorts in Supplementary Figure  S30, assembled based on μ K TF (Supplementary Results section Estimation of the expectedμ K TF andμ |b| using an ensemble approach). Shown are best fitting lines and 68% CB (shadow areas, barely visible), R 2 , RMSE, and P-value (Methods section Statistical tests c). (D) μ |LFC| of gene cohorts with increasing μ K TF , but constant μ |b| (from 1 to 5) (Supplementary Results section Estimation of the expectedμ K TF andμ |b| using an ensemble approach). (E) μ |LFC| of gene cohorts with increasing μ |b| , but constant μ K TF (from 1 to 5) (Supplementary Results section Estimation of the expectedμ K TF andμ |b| using an ensemble approach). (F) μ |LFC| as a function of μ |b| prior to RNAP changes (60 min) as well as the short-term (125 min) and the mid-term responses (180 min) to RNAP changes when shifting to LB 0.5x . Each blue cross is the average outcome from up to 24829 cohorts of 10 genes. In (D) and (E), the data is merged from the three conditions corresponding to (B). In all figures, the error bars are the SEM. Since the three conditions differ slightly in mean values ( Figure  5B), the SEM is larger than when observing each condition separately. μ |LFC| at 125 min is stronger than at 60 min. This agrees with the expectation that shifts in RNAP suffice to shift the |LFC| of many genes. Second, the μ |LFC| at 180 min is stronger than at 125 min. This agrees with our expectation that, at 125 min, input TFs numbers have not yet changed significantly in order to enhance the |LFC| of their output genes ( Figure 1).

RNA numbers follow the RNAP concentration, not the medium composition
We next increased growth medium richness, instead of diluting it (Methods section Bacterial strains, media,growth conditions andcurves, and intracellular concentrations). As before, we limited this to not alter growth rates significantly in the first 180 min ( Figure 6A and B), while altering RNAP levels ( Figure 6C).
As before ( Figure 4C), at mid-term, only genes directly linked by input TFs showed correlation in their |LFC| (Figure 6D 1 -D 3 and Supplementary Figure S32). This supports the previous assumption on the kinetics of transcription, translation, and signals propagation via shifts in input TFs numbers ( Figure 1).
Meanwhile, in contrast to above, shifting cells from LB 1.0x to the richer LB 1.5x medium was accompanied by a decrease in the RNAP concentration ( Figure 7A). This was followed by substantial alterations in the RNA populations, with a large number of DEGs and high μ |LFC| ( Figure 7B and C, respectively). As previously, in the mid-term, genes with input TFs reacted more strongly ( Figure 7C).
These results support the initial assumption that the changes in RNA abundances follow the RNAP concentration, rather than the medium richness.

Further increases in medium richness do not decrease RNAP concentration and RNA numbers also do not change
Finally, we further increased growth-medium richness (to LB 2.0x and to LB 2.5x ). This caused no significant change in RNAP levels and concentration ( Figures 6C and 7A). Here, we also did not observe significant changes in DEGs or μ |LFC| at mid-term, when compared with the LB 1.5x con- Data from the LB 2.5x condition. Shown are the best fitting line and its 68% CB (blue shadow), and the R 2 and RMSE of the fitted regression line, along with its P-value at 10% significance level under the null hypothesis that this line is horizontal. From (A) to (C), the black error bars are the SEM and red error bars represent the 95% CB of the SEM. dition ( Figure 7B and C, respectively). This is in agreement with the assumption that the shifts in the RNAP concentration caused the short-term changes in RNA abundances, which then caused the mid-term changes.
Finally, as before, μ TF |LFC| follows μ |b| ( Figure 7D and Supplementary Figure S33). Moreover, it does so almost identically in the three perturbations, as expected from the original assumptions (Figure 1).

DISCUSSION
We investigated if the mid-term responses to genome-wide perturbations of E. coli's TFN are mediated by its topology and logic. We diluted LB medium since this dramatically and reproducibly affects the RNAP concentration (26,27). The increasingly strong nature of the dilutions facilitated the verification of how the RNAP concentration and singlegene, mid-term |LFC|s related. We focused on mid-term transcriptional responses (Figure 1), since short-term responses are unlikely to have been influenced by the TFN due to protein folding and maturation times, etc. Meanwhile, long-term responses were most likely affected by the TFN. However, dissecting them would have been onerous, due to the complicating effects of loss, backpropagation, and co-alescence of possibly dozens of signals from origins other than direct input TFs.
We lack information on the affinity between each gene and their input TFs, on how the input TFs operate, and on how the de novo presence of an input TF alters the binding or activity of other input TFs on the same promoter. Thus, we would have failed to predict the behaviour of individual genes with accuracy. Instead, we predicted the responses of gene cohorts, since their behaviour is less influenced by particular single-gene features (other than cohort-specific features), which should average out at the cohort level. Further, as in (18), we were only able to correlate absolute LFCs of input and output genes ( Figure 4B), likely due to limitations in RNA-seq technology and the analysis, and/or missing information on the TFN. Nevertheless, the present information on input TFs and their regulatory effect sufficed to relate the TFN with the genes' response.
From the RNA-seq data on three time points, we provided evidence that both the TFN and the RNAP affect the results at mid-term (∼ 180 min), and not before that. In addition, while other factors also influenced genes' behaviour at mid-term, including single-gene features, they only had minor, local effects. In detail, first, we could not find evidence of GRs (including 38 ) and (p)ppGpp being material in the global mid-term behaviour (although (p)ppGpp may be significant in the short-term response). Second, we excluded the medium as directly influencing RNA abundances. Third, we excluded global network parameters, other than K TF , as being influential as well (none of them correlated to single-gene responses). Fourth, we did not find evidence for significant translational or posttranslational regulation. Namely, RNA and protein abundances correlated well, and so did the RNA levels of input TFs and of output genes. Finally, sRNAs did not respond atypically to the RNAP shifts neither in the short-term, nor in in the mid-term.
We have made six key observations on the influence of the logic and topology of the TFN on the mid-term response. First, genes without input TFs were less responsive. Second, the |LFC| of input and output genes correlated positively. Thus, we argue that, on average, input TFs enhanced the |LFC| of individual genes. Third, only nearest neighbour genes in the TFN consistently correlated in |LFC|'s. Thus, either the effects of the shift in RNAP only reached nearest neighbour genes or they 'dissipated' beyond that. Since the correlations between nearest neighbours were weaker in the short-term than in the mid-term, the first possibility is more likely. This observation also suggests that there is a degree of genome-wide homogeneity in how long input TF abundance take to change (likely due to physical limitations on the rates constants controlling bacterial gene expression). This agrees with the constraints on timing variability reported in (6). Fourth, the behaviour was orderly (rather than chaotic), with most genes responsive to the weak perturbations also responding to the stronger perturbations. This suggests the existence of features (on genes and/or the TFN) affecting the responsiveness ( Supplementary Figure S34). Similarly, there is a good overlap between the sets of genes responsive in the short-and in the mid-term, but weak overlap to those responsive prior to the perturbation (Supplementary Figure S35). Fifth, on average, as K TF increased, the correlation between the input and each output gene decreased. This is likely unavoidable and may be a limiting factor in how many input TFs genes can have. Finally, it is μ |b| that (partially) controls the genes' responsiveness to the stress, while the apparent relationship between μ K TF and μ |LFC| is due to the linear correlation between μ |b| and μ K TF . Nevertheless, the possible values of μ |b| are limited by the values of μ K TF .