Multiple LacI-mediated loops revealed by Bayesian statistics and tethered particle motion

The bacterial transcription factor LacI loops DNA by binding to two separate locations on the DNA simultaneously. Despite being one of the best-studied model systems for transcriptional regulation, the number and conformations of loop structures accessible to LacI remain unclear, though the importance of multiple coexisting loops has been implicated in interactions between LacI and other cellular regulators of gene expression. To probe this issue, we have developed a new analysis method for tethered particle motion, a versatile and commonly used in vitro single-molecule technique. Our method, vbTPM, performs variational Bayesian inference in hidden Markov models. It learns the number of distinct states (i.e. DNA–protein conformations) directly from tethered particle motion data with better resolution than existing methods, while easily correcting for common experimental artifacts. Studying short (roughly 100 bp) LacI-mediated loops, we provide evidence for three distinct loop structures, more than previously reported in single-molecule studies. Moreover, our results confirm that changes in LacI conformation and DNA-binding topology both contribute to the repertoire of LacI-mediated loops formed in vitro, and provide qualitatively new input for models of looping and transcriptional regulation. We expect vbTPM to be broadly useful for probing complex protein–nucleic acid interactions.


S1 vbTPM workflow
The workflow of vbTPM, summarized in Fig. S1, is based on runinput files that contain all analysis parameters, including information about where the TPM data files are located, and where various results should be written to. These files can therefore be used as handles to an ongoing analysis and to intermediate results.
The three main tools for handling the analysis, marked in yellow in Fig. S1, are VB7_batch_run.m, which manages the VB analysis of raw position traces using the simple HMM model, VB7_batch_manage.m, a tool to collect the analysis results, and also to clean up and reset intermediate result files in case the analysis is interrupted, and finally VB7_batch_postprocess.m, a graphical tool to aid the manual state classification and construct factorial models based on this classification.
More advanced analysis beyond this step, including the EB procedure, currently require custom Matlab scripting. Further details are given in the software manual.

S2 The emission parameters
To gain more physical intuition about the parameters K, B that model the bead motion, we derive the corresponding ex-pressions for the standard deviation (or RMS value) and the correlation time, for the case with no hidden states. The bead motion model, Eq. (2) in the main text, can be expressed as a stochastic difference equation whose parameters depend on the hidden state, where w t are independent vectors of Gaussians with two independent components and unit variance, With no hidden states, this simplifies to and to compute the corresponding RMS value, we first substitute Eq. (S3) into ⟨ Now, the equation of motion (S3) means that w t and x t-1 are independent, so that ⟨w t · x t-1 ⟩ = 0, and since x t is also stationary it follows that ⟨ Finally, noting result file from postprocessing: state labels, factorial models, segmented traces.
HMM results for all trajectories in a single file TPM position data runinput file (parameters) HMM result files for single trajectories

VB7_batch_run
VB analysis of single trajectories.

VB7_batch_manage
Collect results and manage parallell jobs. Figure S1: Work flow for TPM analysis using vbTPM. The yellow boxes indicate the three main tools of the vbTPM toolbox and their functions, as described in the text. Solid lines indicate that a file is written by another file, or passed as argument to it. Dashed lines indicate flow of information handled internally by reference to the runinput file.
which leads to the expression for the RMS value of Eq. (3), To derive the correlation time, we similarly start with the equation of motion (S3) to compute ⟨x t · x t-1 ⟩. After applying the same type of arguments, we get ) Repeated application to longer times, and division by ⟨ where the last step is just the definition of the correlation time τ in terms of the sampling time ∆t. This is indeed the correlation time given in Eq. (3).

S3 Choice of priors
We would like to choose uninformative prior distributions in order to minimize statistical bias. This is unproblematic for the emission parameters K, B, since the amount of data in all states is large enough to overwhelm any prior influence. As derived in the software manual 1 , prior distributions for K, B are given by with the range B j ≥ 0, -∞ < K j < ∞. Throughout this work, we useμ j =0.6,ñ j =1, (S11) v j =5.56 nm 2 ,c j =30000 nm 2 , (S12) The prior for the initial state probabilities are Dirichlet distributed, p(π|N) = Dir(π|w (π) ), and these variables are unproblematic for the opposite reason: the long length of the trajectories makes the initial state relatively unimportant to describe the data. We use a constant prior of strength 5, i.e., where N is the number of hidden states. The transition probabilities need more care, because the potentially low number of transitions per trajectory makes the prior relatively more influential. The prior for the transition matrix A are independent Dirichlet distributions for each row, parameterized by a pseudo-count matrixw (A) ij . Following Ref. [1], we parameterize this prior in terms of an expected mean lifetime and an overall number of pseudo-counts (prior strength) for each hidden state. In particular, we define a transition rate matrix Q with mean lifetime t D , and then construct the prior based on the transition probability propagator per unit timestep, Here, t A is the prior strength; both t A and t D are specified in time units to be invariant under a change of sampling frequency. Further, the timestep is given by ∆t = n downsample /f sample , where f sample is the sampling frequency (30 Hz in our case), and n downsample is the downsampling factor (we use 3). Numerical experiments in Ref. [1] show that choosing the strength too low compared to the mean lifetime produces a bias towards sparse transition matrices. This is not desirable in our case, and we therefore use t D = 1 s, and t A = 5 s throughout this work.

S4 Performance on synthetic data
Here, we test the abilities of vbTPM to resolve close-lying states in synthetic data, and compare it to the RMS histogram method. We also verify that model parameters are recovered correctly, and that these results are insensitive to the factor three downsampling that we use for analysis on real data.
Our test model, depicted in Fig. S2A, has one unlooped (U) and two looped (M and B) states, and the difficulty of resolving states M and B can be tuned by decreasing either their RMS difference ∆RMS or their average life-time τ L (the lifetime of the aggregated state B+M is fixed at 30 s, same as the unlooped state). For each parameter setting, we generated and analyzed ten 45 minute trajectories. Fig. S2B-C shows a comparison of temporal resolution, using ∆RMS=40 nm and varying τ L . Resolving states using histograms means resolving peaks, and three distinct peaks emerge at τ L = 4 -8 s. In contrast, vbTPM resolves the correct number of states already at τ L = 0.5 s. This order-ofmagnitude improvement mainly reflects the detrimental effects of the low-pass filter used in the RMS analysis, and is insensitive to downsampling by a factor of three. The vbTPM limit can instead be compared to the bead correlation time τ , which were set to 0.1, 0.17, and 0.25 s for the B,M and U states in this data.
We also compared vbTPM to the histogram method for resolving states that interconvert slowly (τ L = 30 s) with varying degrees of separation in RMS. The result is shown in Fig. S3, and indicates that vbTPM does not significantly outperform the histogram method in this case.
To summarize, we mapped out the resolution of vbTPM in the range 0.0625 s ≤ τ L ≤ 30 s, 5 nm ≤ ∆RMS ≤ 40 nm. The results, in Fig. S4, show a nonlinear relation between the spatial and temporal resolution.
Next, we verify that model parameters are also well reproduced and insensitive to downsampling in this situation. in the data when those states interconvert too quickly to be resolved. The three-state models generally reproduce the input parameters with a slight downward bias that is more noticeable at high RMS values. We believe that this is an effect of the drift-correction filter we applied to the data. Note that the results with and without downsampling are almost indistinguishable. The mean lifetimes (Fig. S6) show similar trends of good fit and almost no difference with and without downsampling. Two-state models that do not resolve the two looped states learn their aggregated mean lifetime, which is indeed 30 s in the true model. The tendency to overestimate the short lifetimes can be rationalized by noting that short sojourns are more difficult to resolve, and therefore do not contribute as much to the estimated parameter values.
Individual transition probabilities (elements A ij ) are presented in Fig. S7. Here there is a clear difference with and without downsampling, since the latter estimates transition probability per timestep, while the former per three timesteps. Low transition probabilities suffer significant fluctuations due to small number statistics, while the higher transition probabilities are well reproduced.   Posterior mean value ± std. (an estimate of the parameter uncertainty) for two-and three-state models shown separately, according to which model size got the best score for each trajectory. Most error bars are smaller than the symbols. Analysis without (left) and with (right) downsampling give almost identical results.   Fig. S5. Due to symmetries of the underlying kinetic model it only contains three distinct transition probabilities. The difference with and without downsampling is due to the fact that the downsampled model effectively estimates transition probabilities per three timesteps, given by A 3 (blue dashdotted lines), instead of the single-step probabilities (black dashed lines) used to produce the data. Relative to these different targets, however, the analyses with and without downsampling give very similar results.

S5 Effect of short-lived spurious states
vbTPM is able to detect many short-lived spurious states that cannot be detected in RMS trajectories, and one might wonder if the presence of these states poses a problem for earlier results where they were not detected [2]. To test this, we compute some properties of our E8 constructs subjected to our standard screening process [2] (which does not detect short-lived artifacts), and compare them to a population where the trajectories are subjected to additional screening, namely, where trajectories with the most frequent short-lived spurious events are removed. The differences turn out to be small.
For this additional screening, we looked at the average frequency of transitions from genuine to spurious states and the fraction of time spent in spurious states. As shown in Fig. S8, the distributions of these properties for the E8x and TAx trajectories have distributions that are fairly broad. For this comparison, we set thresholds of at most 6 spurious transitions per minute and 5% spurious occupancy (dashed lines in Fig. S8A,B), which removed about 30% of all trajectories (although the fraction varied significantly between different constructs). Figure S9 shows average state occupancies, mean dwell times, and average rates of loop-loop interconversions computed from models converged with the EB algorithm on all trajectories in our E8x constructs (assuming simple threestate kinetics, and should thus be interpreted with care). Solid lines show results for trajectories passing the standard screen- ing, while dashed lines represent the results after the additional screening to remove trajectories with many short spurious events. As seen in Fig. S9, the presence or absence of these "most spurious" trajectories generally have a small effect on the analyzed average properties.

S6 Equilibration analysis
As discussed in the main text, analysis of both E8x and TAx experiments shows that a significant number of trajectories populate only one of two looped states, along with the unlooped state, whereas others populate all three states. One possible explanation for the apparent existence of 2-state and 3-state populations is that we are simply observing equilibration effects, and that every trajectory would eventually populate all three states, provided a bead is observed over a sufficiently long measurement interval. In order to test this null hypothesis-that is, the hypothesis that all trajectories observed for a particular construct are actually drawn from Note that the occupancy values in (A) are not directly comparable to those in our earlier analysis [2], since the effect of trajectories without looping activity was not corrected for in this plot. a single, three-state population, and some of them end up only exploring one of the two looped states due to the finite observation time-we have generated datasets consisting of simulated state trajectories, drawn from an underlying 3-state population, and compared the number of 2-state and 3-state trajectories in the simulated data to those found in the analysis of the experimental data. The procedure for this analysis is as follows. We first perform vbTPM analysis of the experimental data, and then use the EB analysis to estimate a distribution p(A | α) over the transition rates and a distribution p(π | ρ) over the initial state probabilities. (As shown in Fig. S14-S18 below, the EB analysis tends to give more accurate state assignments than the VB analysis, since it uses information from multiple trajectories at once. It also describes the variability between individual beads.) Note that this EB analysis implicitly assumes all trajectories belong to a single 3-state population, though not all trajectories are required to populate each of the 3 states. For each trajectory n = 1 . . . N in the experiment, we now simulate a trajectory s n,t with a number of time points T n that is identical to that of the n-th trajectory in the experimental data. To do so, we first sample A n ∼ p(· | α) and π n ∼ p(· | ρ).
We then sample s n,1 ∼ p(· | π n ) and s n,t ∼ p(· | A s n,t-1 ) for t = 2 . . . T n . We repeat this procedure 100 times, using new values A n and π n on each sweep. Figure S10 shows the number of 2-state and 3-state trajectories obtained through an EB analysis of real data as compared to the corresponding numbers in simulated datasets. In this analysis we define a trajectory as having 3 states when ∑ t E[s n,t,k ] > 5 for all 3 states k. In other words, a trajectory must assign at least 5 time points to each state in order to be classified as having 3 states. This threshold was empirically chosen to exclude instances where a brief transition to a spurious state may be misinterpreted as a transition to an actual state. However, we verified that analysis results were not qualitatively different when this threshold was lowered to 1 time point for each state. Note that the EB analysis can sometimes find a third genuine state that the VB algorithm missed, as shown in Fig. S16, and thus TA105 does show a few three-state trajectories.
Analysis of the E8x trajectories shows a significantly lower number of 3-state trajectories than in equivalent simulated data. The TAx constructs show a similar, if less pronounced, trend; we believe that this is due to the poor statistics for these constructs, in which the 2+3 pattern is less extreme (that is, fewer TAx constructs have a robust mixture of 2-and 3-state populations, compared to the E8x constructs). Note that for the TAx constructs that do have a significant number of both 2-and 3-state trajectories (e.g., TA100, TA103, TA106, TA109), the trend follows that of the E8x constructs, with more 2-state trajectories observed experimentally than in simulated data. Note also that the error bars on simulated counts show an interval of two standard deviations (95% confidence), which represents a very conservative estimate of the uncertainty. In other words, under the null hypothesis where all trajectories are described by a single 3-state model, we would expect to see a significantly higher number of 3-state trajectories than we actually observe experimentally, suggesting that the 2-state trajectories and the 3-state trajectories in our data are not drawn from the same underlying population. These results lead us to hypothesize that we are observing three looped states, not two, as detailed in the main text. The EB method appears unbiased in all cases, indicating that the tendency of EB to undercount transitions in the synthetic E8106 data is indeed an effect of rare transitions rather than a systematic downward bias.

S8 Loop-loop interconversions in all constructs
Having established in Fig. 7 that the EB algorithm can reliably detect direct loop-loop interconversions, we present the corresponding analysis for our other constructs. For every construct, we ran EB analysis on all trajectories identified as three-state by the VB algorithm and manual classification, and counted the posterior expectation of the number of BMinterconversions. However, TA105 had no three-state trajectories in this analysis (Fig. 5), so for this construct we instead counted BM-transitions in three-state trajectories based on the EB analysis done for Fig. S10, an analysis that includes all trajectories. The results are shown in Figs. S12-S13. The EB analysis detects loop-loop interconversions in all constructs. However, since the total number of three-state trajectories tend to decrease with decreasing loop length, the evidence is most convincing for the longer constructs.

S9 Example trajectories
In Figs. S14-S18, we provide a few examples of analyzed trajectories from the E8106 construct. Each example shows the RMS trace (black), the sequence of most likely hidden states from the VB analysis ("HMM", yellow), the sequence of most likely genuine states from the corresponding factorial model (magenta), and the sequence of most likely states from the empirical Bayes (EB) algorithm, converged with three genuine states on all two-and three-state trajectories (cyan). In some cases, we also show short sections of driftcorrected position traces (x(t), y(t) in blue and red), where the segmentation indicated the presence of short-lived spurious states.    Figure S17: An example of a three-state trajectory with a significant number of spurious states, which the factorial HMM successfully ignores. The x(t), y(t) positions of some of these spurious events are shown in the panels below the main trace, and are probably caused by the bead transiently sticking to the surface. The slow drift towards the origin during these events are caused by the drift-correction filter.  Figure S18: An example of a two-state trajectory with a significant number of spurious states, which the factorial HMM successfully ignores.