Crowding-induced transcriptional bursts dictate polymerase and nucleosome density profiles along genes

Abstract During eukaryotic transcription, RNA polymerase (RNAP) translocates along DNA molecules covered with nucleosomes and other DNA binding proteins. Though the interactions between a single nucleosome and RNAP are by now fairly well understood, this understanding has not been synthesized into a description of transcription on crowded genes, where multiple RNAP transcribe through nucleosomes while preserving the nucleosome coverage. We here take a deductive modeling approach to establish the consequences of RNAP–nucleosome interactions for transcription in crowded environments. We show that under physiologically crowded conditions, the interactions of RNAP with nucleosomes induce a strong kinetic attraction between RNAP molecules, causing them to self-organize into stable and moving pelotons. The peloton formation quantitatively explains the observed nucleosome and RNAP depletion close to the initiation site on heavily transcribed genes. Pelotons further translate into short-timescale transcriptional bursts at termination, resulting in burst characteristics consistent with instances of bursty transcription observed in vivo. To facilitate experimental testing of our proposed mechanism, we present several analytic relations that make testable quantitative predictions.


Heuristic solution for of the hierarchical TASEP model
The average TASEP motor velocity For the TASEP, gap sizes are geometrically distributed as (21) for some constant a < 1. Unless a motor is blocked by another motor (g = 0), it will hop forward with rate k, and the average velocity of motors can be calculated as v b = k ∞ g=1 P TASEP (g; a) = ka.
From this it follows that a = v b /k, and we can write

Intra-and trans-peloton gap sizes
The inclusion of dynamic roadblocks will split the motor dynamics into an intra-peloton and a trans-peloton TASEP describing gap sizes below and above the roadblock shadow ∆. In line with our heuristic argument, we assume there to be no roadblocks within a peloton. Apart from the leading motor, all motors within a peloton thus attempt to hop forward with rate k ip . The pelotons themselves are controlled by the leading motor, which faces a trans-peloton gap filled with roadblocks and thus attempts to hop forward with rate k tp . Assuming that the gap-size distribution is geometric both below and above the roadblock shadow, we can now write our normalized heuristic gap-size distribution for the intra-and trans-peloton regimes as The above equations are valid when (k ip /k tp ) ∆ is large, which we refer to as the stable peloton regime (SPR). The condition for the SPR can intuitively be seen as combining the strength of the attraction between motors and roadblocks (k ip /k tp ) and its range (∆). Due to the SPR conditions exponential dependence on the roadblock shadow size, we expect physiological systems where the roadblock size is substantially larger than the motor step to always be in the SPR. In this limit, we have P ip (g; v b /k ip ) = P TASEP (g; v b /k ip ), g < ∆, With these conditional distributions, we can now calculate the average gaps sizes for both regimes Defining p b as the probability of a gap in the bulk being a trans-peloton gap, we can write the average gap between motors as Taking the average motor size into account, the average motor density is the inverse of the typical distance between the fronts of neighboring motors, The relative fraction of trans peloton gaps Combining the intra-and trans-peloton distributions we can write the complete gap-size distribution of the hierarchical TASEP as In steady state, the probabilistic flow from intra-to trans-peloton gaps (a peloton is split in two) and from trans-to intra-peloton gaps (two pelotons merge) should balance. The motor ahead of a gap of size ∆−1 hops with an average rate v and extends the gap to size ∆, inducing the mean- In turn, a motor behind a gap of size hops forward with an average rate and decreases the gap to size ∆ − 1, inducing the probabilistic flow In the steady state these two flows should balance, and equating these rates gives Taken together, Equations 6, 7, and 9 relate the average density in the system to the velocity and the microscopic model parameters. In Figure 2 C in the main text we compare the gap-size distribution resulting from our heuristic arguments (solid lines) with ones generated through simulations of the BRM (dots) at different motor densities 2 Relation between heuristic arguments and mean-field solution of BRM For periodic boundary conditions and irreversible roadblock binding, there exists a mean-field solution for the special case δ m = δ rb = 1 [2]. It is informative to consider our more general but less precise heuristic approach in the light of this mean-field solution. We first present a short summary of the mean-field solution, and then show how a few approximations give our heuristic results. The gap-size distribution for the BRM in the mean-field is given by Here k(g) is the average stepping rate of a motor into a gap of size g. This average stepping rate can be determined by calculating the probability of a roadblock with binding rate k b being bound to a lattice site after a time t since the last motor passed, P rb (t) = 1 − e −k b t . If we make the mean-field assumption that the motor moves with the average velocity v of the system, the probability of a roadblock being bound can be written in terms of the gap size to the motor ahead P rb (g) = 1 − e −gk b /v b . The effective stepping rate of a motor as a function of the gap ahead is now given by where we in the last step introduced the roadblock shadow ∆ = 1 + v b /k b for the BRM, and in direct analogy with our heuristic approach. If we now approximate k(g) by a step function the gap-size distribution is given by Enforcing normalization, Equation 13 implies Equation 4.

Observable bulk quantities
In addition to the motor density, there are other interesting observables that can be calculated if we know the average velocity in the bulk. Among them are the current of motors, J = ρ b m v b , and the average number of motors in a peloton, n p b = 1/p b . Further, only gaps between pelotons are filled with roadblocks, and then typically only beyond the roadblock shadow. If we let ρ eq rb be the equilibrium roadblock occupancy in the absence of motor activity, we can estimate the average roadblock occupancy as In supplementary Figure 1 we illustrate the relationships derived for various observables and check our arguments against simulations of the BRM. In Figure 1 A-D we show the effect of varying the trans-peloton hopping rates for long roadblock equilibration times. As predicted by our analytical arguments, up to a critical motor density ρ 1 the velocity remains approximately constant (Figure 1 A), and the total current of motors grows linearly with the motor density (Figure 1 B), while the roadblock occupancy decreases linearly with motor density (Figure 1 C). At the critical density, and long before the track is completely covered by motors, all roadblocks are evicted. For motor densities above the critical density, the velocity and motor current follows the relationship for the TASEP without roadblocks (the ipTASEP) ( Figure 1 A and B). In Figure 1 D we plot the typical peloton size up to the critical density, after which whole system acts as one large roadblock-excluding peloton. In Figure 1 E-H we vary the roadblock equilibration times. For rapid roadblock equilibration (red curves Figure 1 E-H) the roadblock shadow is small, gaps are largely filled with roadblocks, and the system is well described by a single TASEP with roadblocks in every gap (the tpTASEP) (dashed line in Figure 1 E). In this regime, the total density of roadblocks decreases weakly with motor density (red curve Figure 1 G), as roadblock shadows are small and the motor footprints are all that excludes the roadblocks. For intermediate roadblock equilibration times (the blue curves in Figure 1 E-H) the roadblock shadow is larger, resulting in peloton formation, a velocity that is less sensitive to motor density, and a system that is better at evicting roadblocks. The breakdown of our predictions for roadblock densities and peloton size in the case of fast roadblock binding (red and blue curves in Figure 1 G and H) is not surprising given that we here have small enough roadblock shadows to push the system outside the SPR.

Asymptotic behavior in the SPR
We here show how Equations 6, 7, and 9 can be solved explicitly in terms of the motor density rather than velocity in the limit SPR. For notational convenience we introduce the average excess trans-peloton gap g b tp and the critical density ρ 0 , to write Equation (9) as In the SPR, the transition density ρ 0 is very small by definition. In the limit ρ 0 g b tp 1, pelotons are small and p b is close to one, and Equation 3 can be written as Here we have a system controlled by the tpTASEP. In the limit ρ 0 g b tp 1, pelotons are large, p b is small, and Equation 7 can be written as If the first two terms on the right hand side dominate, we have the standard ipTASEP. If the last term dominates, we have a composite system. Taken together, there are three limits given by In the middle regime is large, and the average velocity (see Equation 6) is close to k tp , Using Equations 18 and 19, all observables can be written to leading order in 1/ρ 0 as In this regime, (ρ 0 ρ m ρ 1 ), the average bulk peloton size is large, and that is why we refer to this as the stable peloton regime.

The bulk state is never reached
With the parameter values in Table 1 in the main text, the bulk state is given by From the above it is clear that the average steady-state peloton size n p b is in general enormous throughout the experimentally accessible range, and that the true bulk-dynamics will never be reached over a finite gene. Judging by the size of bulk pelotons, polymerases that meet along the gene stay together until termination, invariably producing bursts of RNA production.

Initiation limited dynamics
Without roadblocks, and if the initiation rate k in is limiting, the density and the distribution of motors at the start of the lattice are the same as in the bulk [3]. With the inclusion of roadblocks, the microscopic organization among motors and roadblocks can change from the start of the lattice to the bulk. Here we assume that the density of motors is low enough, such that once a motor is initiated it is only slowed down by other motors when it encounters a peloton. This condition means that once the initiation site is cleared, motors step away much faster at the beginning of the lattice than a new motor typically initiates. We will refer to this regime as the slow initiation regime, and we detail its extent below. Figure 2. At the start of the lattice, the time gaps between newly initiated motors are exponentially distributed with rate k in . As the motors move into the system, those motors that happened to have a roadblock bound ahead will start inducing peloton-forming traffic jams. For convenience we here call these jams proto-pelotons, and they will grow until all motors between roadblocks are absorbed into one peloton. Once all motors are collected into pelotons, we will refer to these as the fully formed pelotons. We here set out to determine the nature of this peloton formation, and what effects it has on both motor and roadblock density profiles.

A schematic kymograph for the TASEP with roadblocks and open boundary conditions is shown in supplementary
In all the expressions below, the superscript 'in' refers to the first site after the initiation site for which x = 0. For a roadblock to bind to the first site after a motor just left, the motor first has to take δ rb steps and then a roadblock has to bind, all before another motor initiates. Considering the splitting probabilities for each step, we can write the probability of a roadblock binding between two motor initiation events as Here τ in m is the average time it takes a motor to take a step at the start of the track, The definition of the slow initiation regime implies τ in m 1/k in , and Equation 22 can be simplified as Equation 23 and 24 can be used to solve for p in explicitly in the steady state, where W is the Lambert W function. In the limit of low initiation rates we can write Next we calculate the typical time it takes for n * p motors to aggregate into a peloton. Let ∆x be the typical distance that the proto-peloton back end moves between two motors joining (supplementary Figure 2). From the geometry of typical times and distances sketched in the kymograph of supplementary Figure 2, we can write where Λ is given in Equation (25). In the last step above, we used the fact that we are considering the slow initiation regime, k in k ip , k tp . Proto-pelotons grow as more and more motor catch up, and again ignoring correlations, the final size n * p of a proto peloton is geometrically distributed as Here the first factor in the probability function accounts for the probability of having a roadblock in a gap, and the following factors accounts for the probability of having no roadblocks in the preceding n * p − 1 gaps. The probability that a proto-peloton is still growing at position x, or equivalently the probability that n * p > x/∆x, is then given by Letting n p (x) be the size of the average proto-peloton at a distance from the initiation site, we can now write down the discrete evolution equation n p (x + ∆x) = n p (x) + 1 · P grow , with n p (0) = 1, giving Though strictly only true for x in multiples of ∆x, we take the above equation to be valid for any position x ≥ 0.

The macroscopic effects of peloton formation
We now turn to calculate how the gradual growth of the proto pelotons impacts the motor density and velocity. The motor number density ρ m (x) at any position x is defined as the fraction of time that a site is occupied by (say) the front of a motor. The proto-peloton size at position x tells us that the fraction n p (x) p in motors take an average time 1/k tp to step, while the rest take 1/k ip , giving the average stepping time The total time T between the seeding of two pelotons (supplementary Figure 2) averages over peloton sizes to The fraction of time that the track is occupied by motors at position x is then where the ρ * m denotes the motor density where the proto pelotons have fully formed. The average motor hopping rate can similarly be written as Though true for the TASEP, note that here the average motor hopping rate (Equation 34) is typically not the same as the average velocity v(x) = τ −1 m (x) in our system. With the velocity defined this way, it satisfies the standard relation J = ρ m (x)v(x) where J = n p * / T is the flux of motors.
In supplementary Figure 2 we indicate the times and positions where there typically are roadblocks in pink. Moving away from the initiation site, the fraction of time a site is occupied by roadblocks grows because motors that have not yet caught up with a proto peloton move faster than the proto pelotons themselves. The moving front of equilibrating roadblocks (intersection of white and pink region in supplementary Figure 2) is typically offset with respect to the last motor of the peloton by a distance δ rb (see dashed square in supplementary Figure 2). Taking the offset δ rb into account, Equation 29 implies that a fractionP grow (x+δ rb ) of the proto pelotons is still evolving when the front of equilibrating roadblocks is at position x. The increase of the average time a site at position x (see supplementary Figure 2) is occupied by a roadblock then grows with distance as This expression can be summed in the same manner as we previously summed to solve for n p (x) in Equation 30, yielding The total fraction of time a site is covered by roadblocks is now Relative changes are often easier to measure experimentally than absolute changes, therefore we here also give the relative changes in velocity and density at the beginning, and compared to well after the pelotons are formed To have an appreciable motor density and velocity evolution, we need only the typical peloton to have a size of a few motors. In the limit of low initiation rates, the above can be written as We see that relative changes along the track grow in magnitude with the initiation rate, but that the effect saturates around k in ≈ 1/τ for motor-density and velocity changes, while the roadblock density saturates later, around k in ≈ e δ rb /xp /τ . Interestingly, we see that the total shift in motor density and velocity along the track is set solely by the ratio of motor stepping rates with and without roadblocks ahead. Figure 2: A schematic kymograph of the system with open boundary conditions. Motors (grey) initiate on the left side of the lattice, with an initiation site free of roadblocks. As motors travel into the system, their speed depends on if there is a roadblock (pink) ahead of them. With a roadblock ahead, the motor speed is k tp , while without roadblock ahead it is k ip . The typical time that a roadblock at position x is bound before being evicted by the next peloton is given by t rb (x). The right figure is a magnification of the region defining ∆x, which is the typical distance a motor travels to catch up with the proto peloton since the last motor caught up.

Bursts from terminating pelotons
The peloton-forming dynamics of our model will manifest as burst of motor activity if viewed from a specific position (for example the transcription-termination site). Using the average velocity of the system, we can translate the average gap-sizes to average time gaps between motors arrivals Letting k tr be the rate of reaction when the process is in the on state, k off be the rate at which the system transitions to the off state, and k on be the rate at which the system transitions back to the on state, we can relate our first-principles model to the phenomenological two-state model traditionally used to describe transcriptional bursts (Figure 4 C in the main text) [4]. As both models generate double exponentials, we relate them to each other by equating the time constant and the corresponding relative probabilistic weight. Since we are interested in describing transcriptional bursts, we here consider the limit where the two-state model gives clearly separated bursts, k on k off + k tr . In this limit we have, to leading order in k on , The relations in Equation 41 can be inverted to give For the bulk we can in principle calculate the corresponding two-state model using Equation 6, though this state is likely never reached. The physiologically more interesting situation is just after the initial pelotons have fully formed. The relative probability of a gap being between pelotons is then (see Equation 26) Once the initial pelotons have formed beyond x p , we know that the average velocity is k tp , and we have The trans-peloton gaps can be written as (see supplementary Figure 2) Combining Equations 42-45 we have k tr = k inτ 1 + k inτ In the main text we are interested in the case where motors are large and the initiation rate is low compared to motor stepping rates. In this limit we have k tr = k inτ 1 + k inτ k tp δ m , k off = 1 1 + k inτ k tp δ m , and k on = 1 Here the approximate relation should be valid in the case of transcription through nucleosomes as here the roadblock are substantially larger than the motors, andτ is consequently substantially larger than δ m /k tp .

Monte Carlo Simulations
We validate our heuristic arguments using a random-sequential-update Monte Carlo scheme with fixed time step dt to simulate our model. During a Monte Carlo step on a lattice of size L + 1 there are L + 1 possible events: all the motors on the lattice can make a step forward, bound roadblocks can unbind, roadblocks in solution can bind to an empty lattice site, and a motor can bind at the start of the lattice. The time step is chosen small enough that the probability of any event occurring with rate k in a time dt can be approximated as kdt. In our simulation k ip is the fastest rate, and we choose k ip dt = 0.1. We verified that our results are robust towards changes in dt. Without roadblocks, the time to equilibration for periodic systems scales with the system size as L 3/2 [1]. With roadblocks, the time to equilibration is expected to be larger due to the slow peloton dynamics. For the simulations with periodic boundary conditions we waited L 2 /dt iterations for the system to reach steady state and let simulations run a total amount of 10L 2 /dt iterations, and checked that the peloton size did not change for longer equilibration times. The velocities presented in supplementary Figure 1 and Figure 5 in the main text are calculated by averaging over the instantaneous hopping rates of the motors.