Assaying RNA structure with LASER-Seq

Abstract Chemical probing methods are crucial to our understanding of the structure and function of RNA molecules. The majority of chemical methods used to probe RNA structure report on Watson–Crick pairing, but tertiary structure parameters such as solvent accessibility can provide an additional layer of structural information, particularly in RNA-protein complexes. Herein we report the development of Light Activated Structural Examination of RNA by high-throughput sequencing, or LASER-Seq, for measuring RNA structure in cells with deep sequencing. LASER relies on a light-generated nicotinoyl nitrenium ion to form covalent adducts with the C8 position of adenosine and guanosine. Reactivity is governed by the accessibility of C8 to the light-generated probe. We compare structure probing by RT-stop and mutational profiling (MaP), demonstrating that LASER can be integrated with both platforms for RNA structure analyses. We find that LASER reactivity correlates with solvent accessibility across the entire ribosome, and that LASER can be used to rapidly survey for ligand binding sites in an unbiased fashion. LASER has a particular advantage in this last application, as it readily modifies paired nucleotides, enabling the identification of binding sites and conformational changes in highly structured RNA.


INTRODUCTION
RNA molecules play essential roles in nearly every step of gene regulation, from chromatin modification and transcription to translation regulation. RNA molecules fold into complex three-dimensional structures that can impart unique functionalities, from phosphodiester bond cleavage to protein binding (1,2). Several existing chemical methods directly measure RNA structure, both inside and outside of living cells. Conventional chemical probes such as dimethyl sulfate (DMS, which methylates the Watson-Crick face of single-stranded adenosine and cytosine residues, as well as the N7 position of guanosine) and SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension, which modifies any nucleotide by 2′-hydroxyl acylation at flexible sites) report primarily on the Watson-Crick pairing status of individual nucleotides (3)(4)(5). A critical component of the RNA structure toolbox is the ability to interrogate the surface opposite the Watson-Crick face to obtain a more general map of nucleobase solvent accessibility. Hydroxyl radical footprinting (HRF) has been used for decades to assay solvent accessibility by cleaving the sugar-phosphate backbone of the RNA at accessible nucleotides (6). While HRF is easily implemented in vitro, in vivo probing requires a synchrotron X-ray source (7).
We recently reported the development of Light Activated Structural Examination of RNA, or LASER (8). LASER takes advantage of light-activated aroyl azides such as nicotinoyl azide (NAz), which can form aroyl nitrenium ions in solution. Nitrenium ion electrophiles can react with electron-rich purine residues in RNA, through an electrophilic aromatic substitution reaction, to form C8 amide products with adenosine and guanosine (Figure 1A). These C8 adducts can induce a reverse transcription stop, likely due to isomerization (trans-to-cis) along the C1′-N9 bond of adenosine and guanosine, and these RT-stops can be used to map solvent accessibility onto the primary structure of an RNA. Because the aroyl azides are readily taken up by cells, LASER can be utilized to footprint unique structural states and protein-RNA interactions within living cells.
The merging of chemical methods to measure RNA structure with deep sequencing has opened the door to large-scale analyses of RNA structure. DMS, N-cyclohexyl-N'-(2-morpholinoethyl)carbodiimide metho-p-toluenesulfonate (CMCT), and SHAPE have now been utilized by many labs to probe pools of RNAs or entire transcriptomes (9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19). HRF was also recently adapted for use with high-throughput sequencing (20), and we have recently modified this high-throughput technique to identify in vitro protein binding sites on the ribosome by localized generation of hydroxyl radicals in situ (21). Since hydroxyl radicals cause cleavage of the RNA backbone, they can only be identified by RT stop approaches, without the benefits of recent mutational profiling (MaP) technologies (22). MaP relies on the propensity of some covalent nucleotide modifications to cause mutations in addition to RT stops during reverse transcription, which can be quantified by high-throughput sequencing. MaP approaches have been used to identify sites of modification by dimethyl sulfate (DMS) (23,24), SHAPE reagents (22,24), and other probes that covalently modify RNA (19,25).

Figure 1. Chemical probing by LASER-seq and LASER-MaP. (A) Nicotinoyl azide (NAz) is activated by long-wavelength UV light to form C8 adducts on A and G residues. Adduct formation is thought to result in trans-to-cis isomerization of the nucleobase. Such isomerization provides a molecular explanation for the production of RT-stops, as observed previously with denaturing gel electrophoresis, and for nucleotide misincorporations. (B) LASER-Seq and LASER-MaP methods. Ribosome complexes or intact cells were treated with NAz and UV light, followed by RNA extraction, fragmentation, and size selection. After adaptor ligation and reverse transcription, cDNAs were size selected and separated into full-length and truncated products, which were separately circularized and subjected to high-throughput sequencing.
Despite its utility in DMS and SHAPE structure probing, expansion of MaP to the many other chemical probes has yet to be realized. To expand LASER to studies of large structured RNAs, we developed LASER-Seq and LASER-Mutational Profiling (LASER-MaP). LASER reactivity should report on the solvent accessibility of the C8 position, providing an additional layer of information and allowing identification of binding sites or conformational changes in base-paired regions. Here we use the ribosome, a large ribonucleoprotein of complex but well-defined structure, as a test case for LASER-Seq and LASER-MaP. We find that LASER reactivity generally agrees with computed solvent accessibility across the ribosome, and that LASER can be used to rapidly survey the ribosome for ligand binding sites in an unbiased fashion. LASER has a particular advantage in this last application, as it readily reacts with paired nucleotides, enabling the identification of binding sites and conformational changes in highly structured RNA.

Synthesis and storage of chemical reagents
NAz was synthesized as in (8), and 1M7 was synthesized as in (26). NAz was stored in powder form, wrapped in aluminum foil, at −20 °C. NAz was dissolved in anhydrous DMSO at 3 M and used within a few days. 1M7 was dissolved in anhydrous DMSO immediately before use. Onc112 peptide (VDKPPYLPRPRPPR{d-ARG}IYN{d-ARG}) was synthesized by GenScript (Piscataway, NJ, USA), resuspended in water to 20 mM, and stored at −20 °C.

Nucleic Acids Research, 2019, Vol. 47, No. 1

In vitro NAz treatment of E. coli ribosomes
In vitro LASER reactions were performed in 25 µl volumes in 1.5 ml microcentrifuge tubes (Axygen). Reactions contained 1× HEPES modification buffer (30 mM K-HEPES pH 7.5, 7 mM magnesium acetate, 100 mM KCl), 1-2 µM crude E. coli 70S ribosomes, and where indicated 4 µM EF-G, 0.5 mM GDPNP, or onc112. For the first batch of experiments we performed one reaction each of 70S alone, 70S+EF-G, and 70S+EF-G+GDPNP. For the second batch, we performed two reactions with 70S alone with and without NAz, and one reaction at each onc112 concentration. Reactions were brought to the final volume of 25 µl by adding 2 µl of 300 mM NAz in DMSO or DMSO alone. Reducing agents (besides those present in ribosome and protein stocks) were omitted to prevent potential reactivity with NAz. Reactions were incubated for 5 min at 37 °C, arranged uniformly around a UV lamp (20 watt Zilla Desert 50 UVB Fluorescent Coil Bulb) with the tube bottoms pointed towards the bulb, and exposed to UV light for 3 min. Reactions were brought to 300 µl with 0.3 M sodium acetate pH 5.5, isopropanol precipitated, extracted twice (or until the protein interface was gone) with phenol/chloroform/isoamyl alcohol and once with chloroform, ethanol precipitated, and resuspended in 40 µl water. Precipitations (30) were performed in 1.5 ml microcentrifuge tubes by bringing solutions to 0.3 M sodium acetate, adding 5-10 µg of glycogen (if RNA is being quantified) or glycoblue (Invitrogen) and an equal volume of isopropanol or 2.5 volumes of ethanol, chilling for 15 min on dry ice, centrifuging at 20 000 × g for 30 min, and removing the supernatant. Pellets were washed by adding 400 µl of ice-cold 70% (for intact RNA) or 80% (for fragmented RNA) ethanol, centrifuging again for 5 min, and removing the supernatant.

In vivo NAz treatment of K562 cells
K562 cells were grown in advanced RPMI 1640 media (Gibco) supplemented with 10% FBS (Gibco) and 2 mM L-glutamine (Gibco). 500 000 cells were pelleted in a 1.5 ml microcentrifuge tube (Axygen), washed with 500 µl of 37 °C PBS, pelleted again, and resuspended in 100 µl PBS + 10% DMSO, or PBS + 300 mM NAz. One biological replicate of each condition was exposed to UV light for 1 or 3 min as described for in vitro reactions. 300 µl of Trizol (Invitrogen) was added to each reaction and RNA was purified with the Zymo Direct-Zol RNA miniprep kit, following the manufacturer's directions for total RNA extraction and omitting the DNase treatment.

In vitro NAz, 1M7 and BzCN treatment of S. cerevisiae ribosomes
25 µl reactions were assembled containing 1× HEPES modification buffer (30 mM K-HEPES pH 7.5, 3 mM magnesium acetate, 100 mM KCl, 2 mM DTT) and 0.5 µM each of S. cerevisiae 40S and 60S ribosomal subunits. After incubating at 25 °C for 5 min, reactions were brought to the final volume of 25 µl by adding 2.5 µl of 100 mM 1M7, 100 mM benzoyl cyanide (BzCN, Sigma Aldrich 115959) or 3 M NAz in anhydrous DMSO, or DMSO alone, and incubated at 25 °C for 6 min (1M7) or 30 s (BzCN), or exposed to UV light for 3 min (NAz). Reactions were brought to 500 µl with 0.3 M sodium acetate pH 5.5, isopropanol precipitated, resuspended in 200 µl 0.3 M sodium acetate pH 5.5, extracted twice (or until the protein interface was gone) with phenol/chloroform/isoamyl alcohol and once with chloroform, ethanol precipitated, and resuspended in 40 µl water. The datasets presented in Figure 2E and Supplementary Figure S4 are each from a single reaction.

Processing of sequencing data
Raw reads were trimmed of 3′ adaptor sequence (CACTCGGGCACCAAGGAC) with skewer (32). ShapeMapper 2.0 (33) was used to trim low-quality sequences, align reads to the consensus MRE600 (34) or S. cerevisiae (Saccharomyces Genome Database) rRNA sequence, and count mutations and read coverage at each position. Mutation fractions were defined as the number of reads not matching the reference at a given nucleotide position, divided by the total number of reads overlapping the position. A number of rRNA positions have nucleotide modifications or vary between rRNA copies within an organism, causing a high background of apparent mutations at that position. These positions were detected by manual inspection of mutation traces in a genome browser (35,36) and excluded from downstream analysis. [...] (37), and 5′ ends were counted. Soft-clipped nucleotides were ignored when determining read ends to reduce the effect of untemplated nucleotides added during RT. For background subtraction, the RT stop RPM or mutation rate for the UV-only control was subtracted from that of the NAz-treated sample for each nucleotide. A data analysis pipeline that performs these processing steps and outputs tables of RT stops and mutations is available on GitHub (https://github.com/borisz264/LASER_seq_2018). Raw reads and processed data have been deposited in the Gene Expression Omnibus with accession GSE113529.
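The mutation-fraction definition and the background subtraction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the list-based input layout, the use of None for masked or uncovered positions, and the toy numbers are all assumptions made for the example.

```python
def mutation_fractions(mismatch_counts, coverage):
    """Per-position mutation fraction: mismatching reads / overlapping reads.

    Positions with zero coverage yield None rather than a number.
    """
    return [m / c if c > 0 else None for m, c in zip(mismatch_counts, coverage)]


def background_subtract(treated, control, excluded=frozenset()):
    """Subtract the UV-only control signal from the NAz-treated signal.

    `excluded` holds 0-based indices of positions masked from analysis
    (e.g. modified or heterogeneous rRNA positions); these yield None.
    """
    result = []
    for i, (t, c) in enumerate(zip(treated, control)):
        if i in excluded or t is None or c is None:
            result.append(None)
        else:
            result.append(t - c)
    return result


# Toy numbers for illustration only (not taken from the deposited datasets).
naz = mutation_fractions([5, 0, 20, 300], [1000, 1000, 2000, 1000])
uv = mutation_fractions([1, 0, 4, 290], [1000, 1000, 2000, 1000])
# Position 3 mimics a modified nucleotide with high background in both samples.
corrected = background_subtract(naz, uv, excluded={3})
```

The same subtraction would apply to RT stop RPM values, since the text treats both signals identically at this step.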

Counting mutation co-occurrences in sequencing reads
In order to identify mutations in each read, we used the 'output-parsed-mutations' flag in ShapeMapper 2.0 to produce a per-read list of mutations and their positions within the rRNA. We parsed this list to count the number of mutations in each read, only counting mutations as separate if their positions were separated by five or more nucleotides in the rRNA reference.
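A minimal Python sketch of this counting rule follows. It assumes each read has already been reduced to a list of mutation positions in rRNA coordinates (a hypothetical intermediate format, not the actual ShapeMapper output layout), and it interprets "separated by five or more nucleotides" relative to the previously counted mutation, which is one possible reading of the rule.

```python
from collections import Counter

MIN_SEPARATION = 5  # nucleotides, as stated in the text


def count_separate_mutations(positions, min_separation=MIN_SEPARATION):
    """Count mutations in one read, merging those closer than min_separation.

    A mutation is counted only if it lies at least `min_separation`
    nucleotides past the previously counted mutation (an assumption).
    """
    count = 0
    last_counted = None
    for pos in sorted(positions):
        if last_counted is None or pos - last_counted >= min_separation:
            count += 1
            last_counted = pos
    return count


def mutation_count_fractions(reads):
    """Fraction of reads carrying 0, 1, 2, ... separate mutations."""
    tally = Counter(count_separate_mutations(r) for r in reads)
    total = sum(tally.values())
    return {n: tally[n] / total for n in sorted(tally)}


# Four hypothetical reads: positions 10 and 12 in the second read are
# closer than 5 nt and therefore count as a single mutation.
reads = [[10], [10, 12, 30], [], [5, 50, 100]]
fractions = mutation_count_fractions(reads)
```

Comparing such fractions between NAz-treated and UV-only libraries gives the fold enrichment of multiply mutated reads.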

Detection of ligand-dependent reactivity changes
For computation of mutation rate change, fold change, or significance, background subtraction was not used. To determine significantly protected or deprotected nucleotides we applied the analysis method presented in (38) with the following minor modifications. For each nucleotide, mutation rates ($M$) were normalized by dividing by the average mutation rate ($A$) across all A and G nucleotides in the rRNA, giving the normalized mutation rate $N = M/A$. For each nucleotide, the difference in the normalized mutation rate between bound ($N_b$) and unbound ($N_u$) samples is $\Delta N = N_b - N_u$. The standard error ($\sigma$) of the mutation rate for a nucleotide is the square root of the mutation rate divided by the read depth ($C$), which is further scaled by the average mutation rate: $\sigma = \sqrt{M/C}/A$. Z factors were computed as $z = 1 - 1.96\,(\sigma_b + \sigma_u)/|\Delta N|$. Standard reactivity change scores were computed as $S = (\Delta N - \mathrm{mean}(\text{all } \Delta N))/\mathrm{stdev}(\text{all } \Delta N)$. A nucleotide was considered to have a significant reactivity change if $|S| > 0$ and $z > 0$. For MA plots, fold changes were computed as $M_b/M_u$ and average mutation counts were computed as $(M_u C_u + M_b C_b)/2$. Since LASER does not modify all nucleotides, we did not average the signal over a sliding window or require multiple affected nucleotides within a range to call a nucleotide as protected or deprotected. Since this error model only accounts for the raw error inherent in read counting, and not any biological, biochemical, or experimental noise, we limited our hits to those found in all of the datasets. We used the same control dataset for all ligands within a batch for a given RT.
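The error model above can be illustrated with a short Python sketch. It follows the formulas as written in this section, taking the difference as bound minus unbound; the function names, the per-sample averages `a_b` and `a_u`, the use of the population standard deviation, and the toy inputs are assumptions for illustration rather than details of the published pipeline.

```python
import math
import statistics


def standard_error(m, c, a):
    """sigma = sqrt(M / C) / A for one nucleotide in one sample."""
    return math.sqrt(m / c) / a


def z_factor(delta_n, sigma_b, sigma_u):
    """z = 1 - 1.96 * (sigma_b + sigma_u) / |delta_N|."""
    if delta_n == 0:
        return float("-inf")  # no difference, so z > 0 can never hold
    return 1 - 1.96 * (sigma_b + sigma_u) / abs(delta_n)


def reactivity_changes(bound, unbound, a_b, a_u):
    """Return a list of (delta_N, z, S) tuples, one per nucleotide.

    `bound` and `unbound` are lists of (M, C) pairs (mutation rate, read
    depth); a_b and a_u are the average purine mutation rates per sample.
    """
    deltas, zs = [], []
    for (m_b, c_b), (m_u, c_u) in zip(bound, unbound):
        delta = m_b / a_b - m_u / a_u  # bound minus unbound
        deltas.append(delta)
        zs.append(z_factor(delta,
                           standard_error(m_b, c_b, a_b),
                           standard_error(m_u, c_u, a_u)))
    mean = statistics.mean(deltas)
    sd = statistics.pstdev(deltas)  # population st. dev. (assumption)
    scores = [(d - mean) / sd if sd > 0 else 0.0 for d in deltas]
    return list(zip(deltas, zs, scores))


# Toy data: the first nucleotide is strongly protected in the bound sample.
bound = [(0.002, 10000), (0.010, 10000), (0.011, 10000)]
unbound = [(0.020, 10000), (0.010, 10000), (0.010, 10000)]
results = reactivity_changes(bound, unbound, a_b=0.01, a_u=0.01)
significant = [i for i, (d, z, s) in enumerate(results) if abs(s) > 0 and z > 0]
```

In this toy case only the first nucleotide passes the z > 0 criterion, because the small difference at the third nucleotide is within the counting error.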

Computation of solvent-accessible surface area and ROC curve analysis
To determine the solvent accessible surface area of the C8 or 2′-OH positions of purines, we used the 'get_area()' command in PyMol (Schrödinger, LLC). We used PDB ID 4ybb for E. coli and 4v88 for S. cerevisiae ribosomes. We set the 'solvent_radius' parameter to either 3, 4 or 5 Å, 'dot_solvent' to 1, and 'dot_density' to 3. Solvent accessible surface area values are reported in Å². Nucleotides that were excluded from MaP quantification (see 'Processing of sequencing data' above) or unresolved in the structure were excluded from ROC analysis. A ROC curve was generated by iterating a mutation rate threshold from 0 to 1 in steps of 0.00001 and counting the number of nucleotides below this threshold with SASA ≥ 5 Å² (true positives) or SASA < 5 Å² (false positives).
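The threshold sweep can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: it treats nucleotides with a mutation rate at or above the threshold as predicted accessible (the usual ROC convention), integrates the curve with the trapezoid rule, and uses made-up inputs; the function names are hypothetical.

```python
def roc_points(rates, sasa, sasa_cutoff=5.0, step=1e-5):
    """(FPR, TPR) pairs from sweeping a mutation-rate threshold from 0 to 1."""
    positives = [r for r, s in zip(rates, sasa) if s >= sasa_cutoff]
    negatives = [r for r, s in zip(rates, sasa) if s < sasa_cutoff]
    if not positives or not negatives:
        raise ValueError("need at least one accessible and one buried nucleotide")
    points = set()
    n_steps = int(round(1 / step))
    for i in range(n_steps + 1):
        threshold = i * step
        # Nucleotides at or above the threshold are called accessible.
        tpr = sum(r >= threshold for r in positives) / len(positives)
        fpr = sum(r >= threshold for r in negatives) / len(negatives)
        points.add((fpr, tpr))
    points.add((0.0, 0.0))  # close the curve even if a rate equals 1.0
    return sorted(points)


def auc(points):
    """Trapezoidal area under a sorted list of (FPR, TPR) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: two accessible purines (SASA >= 5) and two buried ones.
rates = [0.02, 0.001, 0.01, 0.002]
sasa = [10.0, 0.0, 0.0, 12.0]
example_auc = auc(roc_points(rates, sasa))
```

An AUC of 1.0 corresponds to a threshold that separates all accessible from all buried nucleotides, and 0.5 to no predictive value.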

LASER-Seq and LASER-MaP detect solvent-accessible nucleotides in vitro and in vivo
To adapt LASER to high-throughput sequencing methodology, we performed a pilot experiment with purified E. coli ribosomes (Figure 1B). We equilibrated ribosomes with 300 mM NAz, or an equal volume of DMSO, and exposed the mixture to ultraviolet (UV) light for 3 min. We prepared sequencing libraries using two different reverse transcriptases (RTs), Superscript II (SSII) and TGIRTIII, under conditions which were previously optimized to detect DMS and SHAPE modifications by MaP analysis (22,23). We gel-purified full-length cDNA for MaP analysis, and truncated cDNA to enrich for RT stops. The isolated RT products were subjected to Illumina sequencing and the resulting sequences were mapped back to the consensus ribosomal RNA (rRNA) sequence to count RT stops (5′ ends of reads) and mutations. For RT stop analysis, NAz reactivity is expressed as Reads Per Million (RPM): the number of reads with 5′ ends mapping 1 nt 3′ of the nucleotide divided by the number of reads mapping to the rRNA (in millions). For MaP analysis, NAz reactivity is expressed as the number of mutations at a nucleotide position divided by the number of sequencing reads overlapping that position.
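As a small worked illustration of the RPM definition, the following Python sketch assigns each read to the nucleotide immediately 5′ of its mapped 5′ end. The input (a list of 1-based 5′-end positions, one per rRNA-mapped read) and the function name are assumptions for the example, not the format used by the published pipeline.

```python
from collections import Counter


def rt_stop_rpm(five_prime_ends, rrna_length):
    """RT-stop RPM per nucleotide, keyed by 1-based rRNA position.

    A read whose 5' end maps at position p is assigned to nucleotide p - 1,
    because the read begins 1 nt 3' of the nucleotide that stopped the RT.
    Reads starting at position 1 have no upstream nucleotide and are not
    assigned, but still count towards the total of mapped reads.
    """
    total_reads_millions = len(five_prime_ends) / 1e6
    stops = Counter(p - 1 for p in five_prime_ends)
    return {pos: stops[pos] / total_reads_millions
            for pos in range(1, rrna_length + 1) if stops[pos]}


# Four hypothetical reads mapped to a 30-nt reference.
rpm = rt_stop_rpm([11, 11, 21, 1], rrna_length=30)
```

Here two of four reads begin at position 11, so nucleotide 10 receives half of all reads, scaled to a per-million value.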
In the absence of NAz, a few prominent peaks of mutations were visible across the rRNA (Figure 2A and Supplementary Figure S1). These sites include posttranscriptionally modified nucleotides and sites of heterogeneous rRNA sequence. There is a substantial background of RT stops spread across the rRNA, presumably due to structure or sequence dependent RT stops ( Figure 2B and Supplementary Figure S2). This background was reduced upon NAz treatment, as sequencing space became occupied by NAz-dependent RT stops, and was further reduced by subtracting the UV control ( Figure 2B and Supplementary Figure S2). UV treatment alone caused little change in RT stops or mutations, indicating that UV-induced RNA damage is not a major source of background in this assay ( Supplementary Figures S1 and S2). After combined NAz and UV treatment, mutations and RT stops became evident at many other positions. For example, 16S G530, a highly-accessible nucleotide involved in recognition of codon-anticodon pairing during translation (39), displays strong NAz-dependent peaks of mutations and RT stops (Figure 2A and B), demonstrating the strong signal over background for this technique at accessible nucleotides.
In accordance with the specificity of NAz for A and G nucleotides, LASER treatment caused an increase in mutations and RT stops at A and G (Figure 2C and D). This effect was substantially better for MaP analysis compared to RT stop analysis, where background RT stops are a clear problem. Subtraction of RT-stop background leads to a substantial improvement in detection of NAz-dependent RT stops at A and G (Supplementary Figures S2 and S3B), while the MaP data were only marginally affected by background subtraction (Supplementary Figures S1A and S3A) due to the low background mutation rate of LASER-MaP (Figure 2C, Supplementary Figure S1). SSII yielded higher rates of mutations at purines than TGIRTIII (Figure 2C), while RT stops produced by TGIRTIII were substantially more enriched for purines, indicative of the higher RT stop background with SSII (Figure 2D, Supplementary Figure S3B). This is readily visible in Supplementary Figure S2, where the 16S rRNA landscape is dominated by a small number of high-intensity RT stops for TGIRTIII, consistent with the generally low accessibility of a highly structured and protein-bound RNA, while the SSII sample has a more uniform background of RT stops resembling the untreated control. This striking difference could be due to the higher temperature of reverse transcription for TGIRTIII (60 °C, compared to 42 °C for SSII), which would unfold RNA structure and might reduce structure-dependent RT stops in favor of NAz-dependent ones. For these reasons, we recommend the use of TGIRTIII for LASER-seq, and the use of SSII for LASER-MaP.
We found that mutations and RT stop RPMs were reproducible within a given RT (Supplementary Figure S3C and D) and between the different RTs (Supplementary Figure  S3E). RT stops were poorly correlated with mutations, but the correlation increased when the background signal was subtracted (Supplementary Figure S3F). This indicates that the two assays have different sources of background noise, but some real NAz-dependent signal can still be detected at the same nucleotides by both methods, despite obvious differences.
These results show that both LASER-Seq and LASER-MaP are capable of detecting NAz-reactive nucleotides across a structure as large as the ribosomal RNA. However, our MaP datasets clearly have fewer positions with non-specific background compared to the RT stop datasets (Compare Supplementary Figures S1 and S2). It is exciting that LASER is well-suited to MaP analysis, as modification detection methods based on RT stops suffer from RT shadowing (read coverage is reduced immediately downstream of a heavily modified base) (23) and biases in library preparation due to the fact that the sequence at the RT stop determines the efficiency of capture during circularization (40)(41)(42). MaP-based methods suffer much less from these issues, making them a potentially more accurate measure of nucleotide modification. Most importantly, for MaP analysis, the mutation fraction is internally normalized by read coverage at the position being analyzed, and can be described by a rigorous error model (22)(23)(24)38). For these reasons, we focused on LASER-MaP for further experiments.
To contextualize the mutation rates seen in LASER-MaP, we compared LASER-MaP to other structure probing strategies that are amenable to mutational profiling. With purified S. cerevisiae ribosomes as our target, we performed LASER-MaP as well as SHAPE-MaP (22) using the reagents 1-methyl-7-nitroisatoic anhydride (1M7) and benzoyl cyanide (BzCN). SHAPE monitors internucleotide flexibility through 2′-OH acylation and previous reports have demonstrated that SHAPE reactivity is not governed by solvent accessibility (43). We also compared these datasets to our previously published DMS-MaP probing of the same ribosomes (44). NAz produced more mutations at G than all other tested reagents, and more than the SHAPE reagents, but less than DMS, at A (Figure 2E, Supplementary Figure S4).
MaP can identify correlated mutations on RNA by identifying multiple mutations in a single sequencing read, because RT does not necessarily terminate at the first modified nucleotide. This information can be used to refine RNA tertiary structures, or to deconvolute a mixed RNA population (24). As a preliminary test of the suitability of NAz for this application, we counted the number of reads with multiple mutations, as a fraction of the whole sequencing library. We found a 1.7-to 2.5-fold enrichment in reads with two mutations, and a 3-to 5-fold enrichment in reads with three mutations in NAz-treated samples, compared to the UV control (Supplementary Figure S5). This indicates that LASER-MaP is potentially suitable for correlated probing analyses such as RING-MaP (24).
NAz is cell-permeable (8), allowing for probing to be done inside living cells. We performed LASER-MaP on live human K562 cells and compared the data to our previously published LASER RT-stop data from radioactive primer extension of ribosomes in HeLa cells (8). As shown in Figure 2F, the two methods agree with each other, further demonstrating that LASER works to modify RNA inside living cells and that LASER-MaP can recapitulate the results from manual probing of RNA.

LASER-MaP is specific for measuring solvent accessibility
To test the utility of LASER-MaP as a predictor of solvent accessibility, we compared the mutation rate at each A or G nucleotide in the E. coli ribosome with the computed solvent-accessible surface area (SASA) of the C8 atom for the same nucleotide, based on a high resolution X-ray crystal structure (45). We computed SASA with NAz approximated as a sphere with a 4 Å radius and defined solvent-accessible (true positive) nucleotides as purines with a SASA of 5 Å² or more. We generated receiver operating characteristic (ROC) curves (Figure 3A) that test how well the LASER-MaP signal can separate true positives from false positives (nucleotides with SASA < 5 Å²) at different thresholds of mutation rate. The area under the ROC curve (AUC) quantifies the predictive value of the measurement, with 1.0 indicating the existence of a mutation rate threshold that detects all true positives with no false positives, and 0.50 indicating no predictive value above random chance. NAz reactivity was a good predictor of solvent accessibility (AUC = 0.75) compared to our DMSO control (AUC = 0.49). Our reported AUC is lower than the value found in a similar experiment performed on yeast ribosomes with DMS (14); however, for DMS the true positives/negatives can be defined by the base-pairing status of the nucleotide as well as SASA, while we are using SASA alone, which requires arbitrary choices of cutoff and solvent radius for its computation. The ROC curves were robust to the choice of probe radius or SASA cutoff. Slight increases in sensitivities were observed as the SASA cutoff for true positives was increased (Supplementary Figure S6A-D), and we only observed slight variations at probe radii > 4 Å (Supplementary Figure S6A-D). To further test our approach, we compared our LASER-MaP data for S. cerevisiae ribosomes to the S. cerevisiae ribosome crystal structure (46). Our ROC curves indicate a similar trend as seen for E. coli ribosomes, with an AUC of 0.82 (Figure 3A).
Broadly speaking, these results demonstrate that LASER is an accurate tool for measuring solvent accessibility.
Differences in nucleotide reactivity between reagents can be used to predict more accurate structures of a given RNA (47). Using S. cerevisiae ribosomes as our model (46), we calculated the SASA of the 2′-OH and generated ROC curves with the LASER-MaP and SHAPE-MaP (1M7) mutation frequencies (Figure 3B). As expected, SHAPE reactivities are poor predictors of the solvent accessibility of 2′-OH positions (AUC = 0.56). LASER-MaP accurately detects the solvent accessibility of C8 positions (AUC = 0.81) but not of the 2′-OH in the same nucleotide (AUC = 0.59). Unexpectedly, SHAPE reactivity was weakly predictive of C8 accessibility (Figure 3B). We reason that this is due to a correlation of positional flexibility with C8 accessibility, leading to an increase in modification of the 2′-OH. These results show that LASER provides structural information complementary to that provided by SHAPE.
In the course of our analysis, we found the computation of C8 SASA to be relatively crude, yielding few C8 atoms with measurable SASA, even in exposed regions of the crystal structures. To further examine NAz reactivity in solvent exposed regions, we superimposed LASER-MaP reactivity onto the X-Ray structure of the S. cerevisiae ribosome (Figure 3C) (46). Upon inspection, the majority of unreactive residues appeared protected from solvent, and exposed regions of the 25S rRNA displayed various degrees of protections. For example, residues 1395-1414 of helix 46 display low solvent accessibility due to protection by ribosomal protein L32 ( Figure 3D). Nucleotides from the loop (G1404, A1406, A1407) contact L32, and the rest of helix 46 is buried inside the complex. Slight reactivity was observed for residue A1399 which is deeper in a ribosome pocket but has its C8 atom exposed to solvent. Adjacent residues A438 and G494 are not covered by either rRNA or proteins and had high NAz reactivity. Similar protections occurred around helix 45 ( Figure 3E), whose loop (G1349, A1350, A1352, G1354, A1355) is fully exposed and whose stem (A1343-G1346) is buried within the ribosome with C8 atoms pointing towards ribosomal proteins L4A and L18A.
We also examined regions with no computed solvent accessibility, but high NAz reactivity, some of which are depicted in Supplementary Figure S6. In each of these cases, such as G763 and A2222, manual inspection revealed C8 positions that were exposed to solvent with the residues not base paired in the crystal structure. The high mutation rate at these positions indicates that there could be multiple conformations in solution susceptible to NAz modifications. SASA computation uses one conformation of the structure and could overlook residues with multiple conformations in solution. As such, these discrepancies could be due to a combination of studying the static structure and crude SASA modeling, but with a local structure highly open and reactive to NAz in solution. These observations further support the notion that NAz is reacting with solvent exposed residues.

LASER-MaP can be used to monitor binding of ligands to ribosomal RNA
Differential chemical probing analysis is a powerful technique for identifying conformational changes, as well as the binding sites of proteins and small molecules in RNA complexes such as the ribosome (4,5,48). Upon ligand binding, nucleotides in the vicinity of the binding site are 'protected', becoming less accessible and thus unreactive to covalent modifying agents. Secondary protections and deprotections, further away from the binding site, may be indicative of larger conformational changes in the RNA induced by ligand binding. Probing with reagents such as DMS, kethoxal and CMCT has yielded enormous insight into ribosome structure and function (4,(48)(49)(50), but has been limited by the small number of unpaired nucleotides in the rRNA that are reactive to these agents, as well as the large number of primer extension gels required to survey such a large structure (48). This second bottleneck has recently been resolved by high-throughput sequencing methodologies (10)(11)(12)(14)(15)(16)(17)19,20,25), but the underlying issue of probe reactivity persists.
To test the utility of LASER-Seq for the identification of ligand binding sites, we performed LASER-MaP on purified E. coli ribosomes incubated with elongation factor G (EF-G), with or without the non-hydrolyzable GTP analog GDPNP. GDPNP is expected to lock EF-G in a ribosome-bound state (51) and thus increase the likelihood of observing protections. We did not see major differences in mutation rate in the presence or absence of GDPNP, so we treated these samples as replicates. To identify sites of altered NAz reactivity upon EF-G binding, we adapted a Poisson counting error model that was previously used for differential SHAPE-MaP analysis (38). In this analysis, larger absolute numbers of mutations and absolute differences in mutation rate are more likely to be called real changes. A number of nucleotides were reproducibly protected or deprotected in the EF-G bound samples (Figure 4A, Supplementary Figure S7A, Supplementary Table S2) with both RT enzymes. We also produced MA plots comparing the fold change in mutation for each nucleotide upon EF-G incubation to its average number of mutations between both datasets (Figure 4B, Supplementary Figure S7B). These plots show that many of the statistically detected reactivity changes were in nucleotides with large numbers of mutations, but with small fold changes within the spread observed for other nucleotides with similar mutation rates. This suggests that these statistical calls are spurious.

Figure 4. [...] (59). (E) Differences in LASER-MaP mutation fraction (SSII, EF-G bound minus unbound ribosomes) overlaid on the GTPase activation center of the EF-G-bound E. coli ribosome (PDB ID 3J9Z), viewed from the A site side of the ribosome. C8 atoms of purines in the GAC are shown as spheres, EF-G is shown in pink cartoon diagram, and pyrimidines are gray. Coloring of RNA by difference values was performed with RiboVis (https://ribokit.github.io/RiboVis/). Arrows highlight a subset of EF-G protected nucleotides, as well as A1095 which shows increased reactivity that was not determined to be statistically significant.
The remaining protected nucleotides cluster in the GTPase-activation center (GAC) of the 23S rRNA, immediately adjacent to the known EF-G binding site ( Figure  4C, D and E) (52). Crystal structures of the E. coli ribosome alone or bound to EF-G (45,52) show this entire region of the GAC moving upon EF-G binding (Supplementary Figure S7C), consistent with the large number of protections in this region. The protections further from the contact site with EF-G could indicate compression of the GAC RNA upon EF-G binding, which might limit the ability of NAz to access C8 atoms therein. The interaction of EF-G with the ribosome was previously analyzed by probing with DMS and primer extension (53), but only two protections (A1067, A1069) were identified in this region. This is probably due to the paucity of unpaired DMS-reactive nucleotides in this region, as well as the reduced sensitivity of gel-based RT stop measurement. LASER-MaP, however, readily identified EF-G induced conformational changes at both paired and unpaired nucleotides in this region. These results demonstrate the utility of LASER-MaP as a tool to interrogate protein binding to large and base-paired RNAs.
To test the reproducibility and quantitative nature of LASER-MaP more directly, we performed an additional batch of in vitro probing experiments with E. coli ribosomes. We incubated 1 µM ribosomes with several concentrations of the proline-rich antimicrobial peptide onc112 (54)(55)(56) and performed LASER-MaP with SSII. We found that the reproducibility within a batch of samples was higher than between batches (Supplementary Figure S8A). This difference disappeared upon background subtraction. Using the same analytical method as for EF-G, we identified a number of rRNA residues that were protected or deprotected by onc112 binding (Figure 5A and Supplementary Figure S8B). We superimposed these nucleotide positions onto the previously determined structure of the Thermus thermophilus 70S ribosome co-crystallized with onc112 (56) (Supplementary Table S3). These positions cluster around the binding site of onc112 in the peptide exit tunnel (Figure 5B). The ribosome concentration used in this experiment was too high for accurate measurement of binding constants, but many nucleotides displayed monotonic increases or decreases in LASER-MaP signal with onc112 concentration (Figure 5C), indicating that LASER-MaP can provide a semi-quantitative, if not fully quantitative, measure of ligand binding. Other nucleotides behaved in more complex ways, with the signal plateauing or changing direction at high onc112 concentrations. This could mean, among other explanations, that these nucleotides have different dynamics for onc112 binding, or that background noise is dominating the reduced mutational signal at high onc112 concentrations.
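The distinction drawn above between monotonic and more complex dose responses can be sketched as follows. This is an illustrative sketch only: the function names, the use of a single untreated-control rate as the background, the tolerance parameter, and all numbers are assumptions rather than details taken from the analysis pipeline.

```python
def background_subtract(treated_rates, control_rate):
    """Subtract an assumed untreated-control mutation rate from each treated rate."""
    return [rate - control_rate for rate in treated_rates]


def classify_trend(signal, tolerance=0.0):
    """Label a signal measured at increasing ligand concentrations.

    Returns "increasing", "decreasing", "flat", or "complex". Steps smaller
    than `tolerance` in absolute value are treated as no change.
    """
    steps = [later - earlier for earlier, later in zip(signal, signal[1:])]
    if all(abs(step) <= tolerance for step in steps):
        return "flat"
    if all(step >= -tolerance for step in steps):
        return "increasing"
    if all(step <= tolerance for step in steps):
        return "decreasing"
    return "complex"


# Hypothetical mutation rates for one nucleotide at four rising
# onc112 concentrations, plus a hypothetical background rate.
treated = [0.0120, 0.0100, 0.0070, 0.0050]
corrected = background_subtract(treated, control_rate=0.0020)
print(classify_trend(corrected))
print(classify_trend([0.010, 0.006, 0.007, 0.009]))
```

In this toy example the first nucleotide loses signal steadily as the ligand concentration rises and would be labeled as decreasing (progressively protected), while the second reverses direction and would fall into the complex category. A nonzero tolerance would be one simple way to avoid labeling noise-level fluctuations as a change of direction.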

DISCUSSION
The recent advent of high-throughput RNA structure analysis methods has greatly advanced our ability to analyze the structures of transcriptomes and large RNA molecules. Here we expand the existing structure probing toolbox by adapting LASER into LASER-Seq and LASER-MaP. LASER has reactivity preferences that depend on solvent accessibility, making it orthogonal to other probing methods that depend on RNA base-pairing. We have thoroughly characterized these methods with ribosomes from three different species, both in vitro and in vivo, to show that they produce RT stops or nucleotide mutations that agree with solvent accessibilities computed from high-resolution crystal or cryo-EM structures, as well as with manual gel-based LASER analysis.
LASER-MaP is sensitive and able to modify base-paired nucleotides, making it well suited to detecting ligand binding sites on RNAs. We demonstrated this by recapitulating the binding sites of EF-G and onc112 on E. coli ribosomes and identifying more protections in the same rRNA region than were detected by older methods. Our results suggest a large-scale movement and compression of the GAC caused by EF-G binding, which is supported by structures of EF-G-bound ribosomes, whereas the earlier DMS probing study detected only the proximal binding site of EF-G. These distal protections raise the possibility that NAz reactivity is affected by the spacing or curvature of RNA helices, a facet of NAz probing that requires further analysis. If true, this effect could be useful for determining additional constraints on unknown RNA structures. The ability to detect translation-factor and small-molecule binding means that LASER-MaP could aid in determining the mechanism of action of ribosome-targeting antibiotics in vitro or in vivo, by monitoring the conformational state of the ribosome after treatment with a drug (44,49,57) while simultaneously detecting the direct binding site.
LASER-Seq and LASER-MaP can be readily adapted to the transcriptome-wide probing of mRNA structures with the addition of rRNA depletion or poly-A selection. This will enable more precise predictions of mRNA structure in combination with existing SHAPE- and DMS-based approaches and may be able to provide information on protein binding sites in RNA that are not detectable by other methods. Our pilot experiments indicate that mutations occur at NAz-reactive nucleotides at a frequency greater than SHAPE reagents, and comparable to DMS for G nucleotides, so sequencing coverage requirements should be no higher than for other MaP techniques. Recent analyses of DMS probing data suggest that RT-stops and mutations provide complementary information, and both may be needed to provide a complete picture of the state of chemical modification (58) (BioRxiv: https://doi.org/10.1101/292532, https://doi.org/10.1101/176883). More work is required to determine if this is true for LASER, and to integrate RT-stop and MaP data from complementary probes into a single analytical framework for RNA structure prediction. We envision that LASER-Seq and LASER-MaP will be immediately applicable to many existing problems, from the identification of protein and small-molecule binding sites in large RNAs, to transcriptome-wide prediction of RNA structure and solvent accessibility.

DATA AVAILABILITY
The data analysis pipeline and output tables of RT stops and mutations are available on GitHub (https://github.com/borisz264/LASER_seq_2018). Raw reads and processed data have been deposited in the Gene Expression Omnibus with accession GSE113529.