Precise tuning of bacterial translation initiation by non-equilibrium 5′-UTR unfolding observed in single mRNAs

Abstract Noncoding, structured 5′-untranslated regions (5′-UTRs) of bacterial messenger RNAs (mRNAs) can control translation efficiency by forming structures that either recruit or repel the ribosome. Here we exploit a 5′-UTR embedded preQ1-sensing, pseudoknotted translational riboswitch to probe how binding of a small ligand controls recruitment of the bacterial ribosome to the partially overlapping Shine-Dalgarno (SD) sequence. Combining single-molecule fluorescence microscopy with mutational analyses, we find that the stability of 30S ribosomal subunit binding is inversely correlated with the free energy needed to unfold the 5′-UTR during mRNA accommodation into the mRNA binding cleft. Ligand binding to the riboswitch stabilizes the structure to both antagonize 30S recruitment and accelerate 30S dissociation. Proximity of the 5′-UTR and stability of the SD:anti-SD interaction both play important roles in modulating the initial 30S-mRNA interaction. Finally, depletion of small ribosomal subunit protein S1, known to help resolve structured 5′-UTRs, further increases the energetic penalty for mRNA accommodation. The resulting model of rapid standby site exploration followed by gated non-equilibrium unfolding of the 5′-UTR during accommodation provides a mechanistic understanding of how translation efficiency is governed by riboswitches and other dynamic structure motifs embedded upstream of the translation initiation site of bacterial mRNAs.


Global fitting
All association and dissociation data for each condition were fitted globally in OriginPro to improve the accuracy of the double-exponential fits (1,2). To reduce the number of independent parameters in the double-exponential fits of both the association and dissociation rates, and to account for the heterogeneity of 30S binding to R-mRNA +30 under each condition, the shorter time constant (fast component) of the double-exponential fit was shared across all preQ1 and mutant conditions. Association and dissociation data were fitted separately, and each global fit yielded two rate components: k_on,slow and k^shared_on,fast for association, and k_off,slow and k^shared_off,fast for dissociation. In each pair, one component was variable and the other was shared across all conditions. While the shared rate constant (k^shared) was held common to all conditions, the variable component responded strongly to preQ1. The variable rate constants and their relative contributions were used to compare conditions and to determine the effect of preQ1 and of strategic mutations on 30S ribosome binding.
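The parameter sharing described above can be illustrated with a minimal Python sketch of a global least-squares objective. This is not the OriginPro procedure used for the analysis. The normalized double-exponential form of the cumulative distribution, the parameter layout, and the names `double_exp_cdf` and `global_sse` are all assumptions made for illustration.

```python
import math


def double_exp_cdf(t, amp_fast, k_fast, k_slow):
    """Normalized cumulative dwell-time distribution with two exponential phases.

    The functional form is an assumption; the fitting model is not spelled out
    in the text beyond being double-exponential.
    """
    fast = amp_fast * (1.0 - math.exp(-k_fast * t))
    slow = (1.0 - amp_fast) * (1.0 - math.exp(-k_slow * t))
    return fast + slow


def global_sse(params, datasets):
    """Sum of squared residuals over all conditions with one shared fast rate.

    params   = [k_fast_shared, amp_1, k_slow_1, amp_2, k_slow_2, ...]
    datasets = list of (times, cumulative_fractions), one entry per condition
    """
    k_fast = params[0]  # single value shared by every condition
    total = 0.0
    for i, (times, fractions) in enumerate(datasets):
        amp_fast = params[1 + 2 * i]  # condition-specific amplitude
        k_slow = params[2 + 2 * i]    # condition-specific (variable) rate
        for t, y in zip(times, fractions):
            total += (y - double_exp_cdf(t, amp_fast, k_fast, k_slow)) ** 2
    return total
```

Passing `global_sse` to any general-purpose minimizer fits all conditions at once. With N conditions the fit has 2N + 1 free parameters instead of 3N, because the fast rate appears only once in the parameter vector.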
Error analysis: The errors of the kinetic parameters were estimated by bootstrapping using a MATLAB script. Briefly, a data subset was created by choosing fifty molecules at random from the pool of available molecules for a particular experiment, and the kinetic parameters were calculated for those molecules. This process was repeated for ten iterations. The standard deviation of the parameter across the ten subsets was used as the error of that measurement.
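The resampling procedure can be sketched in Python as follows. This is an illustrative stand-in, not the MATLAB code used for the analysis. The function name `bootstrap_error` and the example estimator are hypothetical. Drawing each subset without replacement and using the sample standard deviation are also assumptions, since the text does not specify either choice.

```python
import random
import statistics


def bootstrap_error(molecules, estimator, n_molecules=50, n_iterations=10, seed=None):
    """Estimate the error of a kinetic parameter by repeated random subsetting.

    molecules : list of per-molecule records (here, lists of dwell times)
    estimator : function mapping a list of molecules to a single number
    Assumption: each subset is drawn without replacement from the pool.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_iterations):
        subset = rng.sample(molecules, n_molecules)  # fifty molecules at random
        estimates.append(estimator(subset))
    # Sample standard deviation across the subsets (an assumed choice).
    return statistics.stdev(estimates)


def mean_rate(subset):
    """Hypothetical estimator: inverse of the mean dwell time of the pooled events."""
    pooled = [dwell for molecule in subset for dwell in molecule]
    return 1.0 / statistics.mean(pooled)
```

A call such as `bootstrap_error(molecules, mean_rate, seed=0)` needs a pool of at least fifty molecules. In practice the estimator would be the double-exponential fit rather than the simple mean rate shown here.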

Distributions of High (H), Mid (M), Low (L) groups of molecules
For 30S binding to R-mRNA +30, three types of 30S binding time distribution were observed when the total 30S binding time per molecule was plotted as a cumulative distribution for all conditions pooled together (Supplementary Fig. S7).
The cumulative binding time histograms were fitted with three Gaussian functions (R² = 0.9914), indicating three populations of total binding time, which were assigned as H, M, and L according to the binding regime they cover. Population H represented the high range of 30S binding time (>30% of the total observation window), M represented the mid range (20-30% of the total observation window), and L represented the low range (<10% of the total observation window). These regions were used as cutoffs for calculating the cumulative or percentage population for each condition, which was obtained by summing the histogram counts falling within each regime.
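A minimal Python sketch of how such cutoffs can be applied to per-molecule total binding times is given below. The function names are hypothetical, and treating the 20% and 30% edges as belonging to M is an assumption. The stated cutoffs do not assign the 10-20% range to any group, so the sketch reports those molecules as unassigned rather than guessing.

```python
from collections import Counter


def classify_molecule(total_bound_time, window):
    """Assign H, M, or L from the fraction of the observation window spent bound."""
    fraction = total_bound_time / window
    if fraction > 0.30:
        return "H"
    if 0.20 <= fraction <= 0.30:  # edge handling is an assumption
        return "M"
    if fraction < 0.10:
        return "L"
    return "unassigned"  # 10-20% is not covered by the stated cutoffs


def population_percentages(total_bound_times, window):
    """Percentage of molecules falling in each group for one condition."""
    counts = Counter(classify_molecule(t, window) for t in total_bound_times)
    n = len(total_bound_times)
    return {group: 100.0 * counts[group] / n
            for group in ("H", "M", "L", "unassigned")}
```

Both functions take the bound time and the window in the same units, for example seconds or frames.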

Rastergram distributions
To determine the nature and number of standby (short) and cleft-accommodated (long) binding events, we displayed a random selection of 100 molecules for each condition in a rastergram (3,4). First, the molecules were clustered into H, M, and L groups according to the total 30S binding time of each molecule, using the cutoffs estimated from the total binding time histogram distributions. To categorize individual binding events as standby (red) or cleft-accommodated (blue), we used as the threshold the geometric mean of the two binding time components obtained from the double-exponential fit of the cumulative binding time plot. We then counted the red and blue events for the relevant conditions and used these counts to compare changes in the proportion of standby and cleft-accommodated binding events under the influence of preQ1 or of strategic mutations. The 30S binding behavior of each individual molecule was plotted within its H, M, or L group, with binding events shown in red (standby) or blue (cleft-accommodated) for each condition. MATLAB scripts for raster plots are available upon request.
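The event categorization can be sketched in Python as follows. This is an illustration, not the MATLAB script mentioned above. The function names are hypothetical, and assigning events that fall exactly on the threshold to the standby class is an assumption.

```python
import math


def dwell_threshold(tau_short, tau_long):
    """Geometric mean of the two binding time components from the double-exponential fit."""
    return math.sqrt(tau_short * tau_long)


def label_events(dwell_times, tau_short, tau_long):
    """Label each binding event as standby (short) or cleft-accommodated (long)."""
    threshold = dwell_threshold(tau_short, tau_long)
    return ["cleft-accommodated" if dwell > threshold else "standby"
            for dwell in dwell_times]


def count_events(dwell_times, tau_short, tau_long):
    """Count standby (red) and cleft-accommodated (blue) events for one molecule or condition."""
    labels = label_events(dwell_times, tau_short, tau_long)
    return {"standby": labels.count("standby"),
            "cleft-accommodated": labels.count("cleft-accommodated")}
```

The geometric mean lies midway between the two time constants on a logarithmic scale, which makes it a natural cutoff when the components differ by an order of magnitude or more.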

Supplementary Figures
Supplementary Fig. S1| R-mRNA truncations to determine RNA required for 30S IC formation.
Comparison of 30S IC formation efficiency on full-length R-mRNA FL, the R-mRNA +30 truncation, and the R-mRNA -11 truncation (no SD/ORF). The IC was first formed from mRNA, 30S, initiation factors, 32P-labeled fMet-tRNA, and GTP. The reaction was then pelleted by ultracentrifugation through a sucrose cushion. The ratio of tRNA after and before pelleting represents the extent of 30S IC formation. As expected, ICs do not form on mRNA that lacks an SD and ORF, whereas the efficiency of initiation is similar for full-length R-mRNA and R-mRNA +30. A Blocked-R-mRNA +30 construct was formed by extending the capture strand to base-pair with the entire P1 stem-loop. Thus, pseudoknot formation (including P2) is blocked, while the SD region remains unimpaired. Accordingly, increased 30S binding was observed for this blocked mRNA +30, where 52% of mRNA molecules show high accessibility (lower panel) compared to 26% for the unblocked mRNA +30 (upper panel).
Supplementary Fig. S9| Plots of cumulative unbound and bound dwell times for 30S binding without and with preQ1 in ligand-jump experiments. (a) Cumulative frequency plot of t_unbound for 30S binding to R-mRNA +30 without preQ1 (red), monitored for the first 8000 frames before the dark period, and with 1 µM preQ1 (black), monitored for the next 8000 frames after the dark period. 1 µM preQ1 was added during the dark period, which lasted 5000 frames (1 frame = 0.1 s). (b) Cumulative frequency plot of t_bound for 30S binding to R-mRNA +30 without preQ1 (red) and
WT (first lane) shows the presence of 30S with and without S1. S1 was depleted from 30S to form 30S ΔS1 (second lane). S1 was then reconstituted by adding purified S1 (stoichiometric; third lane).
Supplementary Fig. S13| Reconstitution of S1 into 30S subunits. Incorporation of S1 at various molar ratios to ΔS1-30S increased 30S binding to R-mRNA +30.

Supplementary Tables
Supplementary Table S1. List of RNA sequences used for microscopy and biochemistry. The surface-captured part of the mRNA is shown in gray, the aptamer is italicized, the SD sequence is non-italicized and underlined, an insertion is bolded and highlighted in gray, and the start codon is italicized and green.