Keisuke Fukunaga, Takamasa Teramoto, Momoka Nakashima, Toshitaka Ohtani, Riku Katsuki, Tomoaki Matsuura, Yohei Yokobayashi, Yoshimitsu Kakuta, Structural insights into lab-coevolved RNA–RBP pairs and applications of synthetic riboswitches in cell-free system, Nucleic Acids Research, Volume 53, Issue 6, 11 April 2025, gkaf212, https://doi.org/10.1093/nar/gkaf212
Abstract
CS1–LS4 and CS2–LS12 are ultra-high affinity and orthogonal RNA–protein pairs that were identified by PD-SELEX (Phage Display coupled with Systematic Evolution of Ligands by EXponential enrichment). To investigate the molecular basis of the lab-coevolved RNA–RBP pairs, we determined the structures of the CS1–LS4 and CS2–LS12 complexes and the LS12 homodimer in an RNA-free state by X-ray crystallography. The structural analyses revealed that the lab-coevolved RNA–RBPs have acquired unique molecular recognition mechanisms, whereas the overall structures of the RNP complexes were similar to the typical kink-turn RNA–L7Ae complex. The orthogonal RNA–RBP pairs were applied to construct high-performance cell-free riboswitches that regulate translation in response to LS4 or LS12. In addition, by using the orthogonal protein-responsive switches, we generated an AND logic gate that outputs staphylococcal γ-hemolysin in a cell-free system and carried out a hemolysis assay and a calcein leakage assay using rabbit red blood cells and artificial cells, respectively.

Introduction
Archaeal L7Ae and its bacterial and eukaryotic homologs are RNA-binding proteins (RBPs) that bind to a structural motif called a kink-turn (k-turn) [1–4] found in various RNAs such as ribosomal RNA [5], small nucleolar RNA [6], RNase P RNA [7, 8], introns [9], and untranslated regions (UTRs) of messenger RNA (mRNA) [10, 11]. Since Saito and colleagues reported protein-responsive riboswitches using L7Ae and the k-turn motif RNA, which function in a cell-free protein synthesis (CFPS) system and in mammalian cells [12], the RNA–RBP pair has been widely utilized as a building block for the development of biological tools (e.g. cell-free riboswitches [12, 13], mRNA delivery [14, 15], mRNA loading into exosomes [16], mammalian RNA switches [17–27], and synthetic nanostructures [28–32]). Engineering natural RNA–RBP pairs to generate orthogonal pairs can significantly expand the utility of these tools as well as provide fundamental insights into RNA–RBP interaction. Therefore, we previously developed a novel directed evolution method, PD-SELEX (Phage Display coupled with Systematic Evolution of Ligands by EXponential enrichment), to alter and improve the binding selectivity of the existing RNA–RBP pair (i.e. k-turn motif RNA and L7Ae) [33]. PD-SELEX is a library-versus-library in vitro selection method, and we conducted it using an N20 RNA library and a phage-displayed L7Ae mutant library. As a result, we identified two orthogonal RNA–RBP pairs, CS1–LS4 and CS2–LS12, where CS and LS are abbreviations of Consensus Sequence and L7Ae Scaffold, respectively. The two RNA–RBP pairs exhibited strong binding affinity (KD = 7 pM) and high selectivity with >4000-fold difference in KD. Here, we report the co-crystal structures of the lab-coevolved CS1–LS4 and CS2–LS12 pairs, which provide the structural basis of the ultra-high affinity and selectivity of the engineered RNA–RBP pairs.
Despite their potential utility, synthetic biology applications of the CS1–LS4 and CS2–LS12 pairs remain scarce: while the CS1–LS4 pair has been reported to function in mammalian cells [34], applications of these pairs in CFPS systems have not been reported. Guided by the structural information determined for the engineered RNA–RBP pairs, we designed and created two protein-responsive cell-free riboswitches that are orthogonal to each other. The orthogonal cell-free riboswitches enable programmable and tunable control over gene expression in a test tube and inside artificial cells encapsulating the CFPS system [35–38]. Although cell-free riboswitches that respond to small molecules are relatively well documented [39–41], only three papers have reported cell-free protein-responsive riboswitches [12, 13, 42]. Using the PURE (Protein synthesis Using Recombinant Elements) system, a prokaryotic CFPS system reconstituted from defined components (small molecules and recombinant proteins only, with the exception of native ribosomes and transfer RNAs from Escherichia coli) [43, 44], we have identified LS4- and LS12-responsive cell-free riboswitches with high ON/OFF ratios exceeding 100. To the best of our knowledge, these cell-free riboswitches exhibit the highest ON/OFF ratios reported to date. Furthermore, by utilizing the orthogonality of the riboswitches, we report the construction of an AND logic gate to control functional expression of the two-component γ-hemolysin (γHL) in vitro.
Materials and methods
Protein expression and purification for structural analysis
LS4 and LS12 genes were subcloned into the pE-SUMOpro vector (LifeSensors), which encodes an N-terminal His6-SUMO tag. LS4 and LS12 were expressed in E. coli strain BL21-CodonPlus (DE3)-RIL (Agilent Technologies) at 20°C overnight after induction with 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG). The cells were collected by centrifugation, and the pellets were resuspended in lysis buffer (50 mM Tris–HCl, pH 8.0, 500 mM NaCl) and stored at −80°C until use. The cells were disrupted by sonication, followed by centrifugation to remove cell debris. The soluble fraction was applied to a Ni-NTA agarose column and thoroughly washed with lysis buffer containing 20 mM imidazole-HCl. The target SUMO fusion protein was eluted with lysis buffer containing 400 mM imidazole-HCl. The eluate showed higher absorbance at 260 nm (A260) than at 280 nm (A280), indicating that endogenous nucleic acids from E. coli were co-purified with the recombinant proteins. Therefore, we used multi-step chromatography to prepare nucleic acid-free LS4 and LS12 proteins. The fusion protein was cleaved overnight with 0.2 mg of Ulp1 protease and dialyzed against a buffer containing 50 mM Tris–HCl, pH 8.0, 100 mM NaCl, and 0.5 mM tris(2-carboxyethyl)phosphine (TCEP). The protein was then loaded onto a HiTrap SP column (Cytiva). We found that high-A260 fractions (contaminating nucleic acids) eluted in the flow-through, whereas nucleic acid-free forms of the LS4/LS12 proteins were eluted by a linear gradient from 0.1 to 1.0 M NaCl in 50 mM Tris–HCl, pH 8.0 (Supplementary Fig. S1, left panels). Peak fractions containing the target proteins were pooled, and the proteins were purified further using a HiLoad 16/60 Superdex 75 pg column (Cytiva) equilibrated with 50 mM Tris–HCl, pH 7.0, 100 mM NaCl, and 0.5 mM TCEP.
The final purified LS4 and LS12 proteins were concentrated using an ultrafiltration device (Amicon Ultra 15 ml filter, 10 kMWCO, Merck-Millipore) to 1030 μM and 1290 μM, respectively. The absence of nucleic acid contamination was confirmed by measuring the absorbance spectrum (the A260/A280 ratio was around 0.6).
For crystallization of the CS1–LS4 complex, LS4 protein (Supplementary Table S2) was purified further using a Superdex 75 10/300 GL column (Cytiva) equilibrated with 50 mM Tris–HCl, pH 8.0, 200 mM NaCl, and 1 mM TCEP. The CS1 RNA refolding solution contained 100 μM CS1 RNA, 50 mM Tris–HCl, pH 8.0, 200 mM NaCl, and 5 mM MgCl2. The CS1 RNA (5′-GGUGGCAGAGAAAGGCGAAAGCCUUGUGAGGCCAUC-3′; Hokkaido System Science Co., Ltd., desalting grade) was refolded by heating to 70°C for 5 min and then slowly cooling to room temperature. The RNA was purified using a Superdex 75 10/300 GL column equilibrated with 50 mM Tris–HCl, pH 8.0, 200 mM NaCl, and 5 mM MgCl2. Peak fractions containing LS4 or CS1 were pooled and concentrated using an ultrafiltration device (Amicon Ultra 15 ml filter, 10 kMWCO, Merck-Millipore) to 775 and 800 μM, respectively. RNA and protein were mixed at a ratio of ∼1:1 (final concentration: 400 and 387 μM, respectively), and the mixture was incubated at 4°C for 2 h. For crystallization of the CS2–LS12 complex, the CS2 RNA was refolded and purified by the same procedure as the CS1 RNA. Finally, CS2 (5′-GGAUGCAGAGAACGAAAGUUCCAUGACGCAUCC-3′; Hokkaido System Science Co., Ltd., desalting grade) was concentrated to 780 μM. RNA and protein were mixed at a ratio of ∼1:1 (final concentration: 390 and 375 μM, respectively), and the mixture was incubated at 4°C for 2 h.
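The final concentrations quoted for the CS1–LS4 mixture are what one would get from combining equal volumes of the two concentrated stocks. The sketch below checks that arithmetic; equal-volume mixing is an assumption (the text only states a ∼1:1 ratio), and the function name is illustrative.

```python
def mixed_concentration(stock_um: float, stock_volume: float, total_volume: float) -> float:
    """Concentration (in uM) after diluting `stock_volume` of a stock into `total_volume`."""
    if total_volume <= 0 or not 0 <= stock_volume <= total_volume:
        raise ValueError("volumes must satisfy 0 <= stock_volume <= total_volume, total > 0")
    return stock_um * stock_volume / total_volume


# Assumption: one volume of RNA stock plus one volume of protein stock.
cs1_final = mixed_concentration(800.0, 1.0, 2.0)  # CS1 RNA stock at 800 uM
ls4_final = mixed_concentration(775.0, 1.0, 2.0)  # LS4 protein stock at 775 uM

print(f"CS1 RNA: {cs1_final:.1f} uM, LS4 protein: {ls4_final:.1f} uM")
```

Under this assumption the values come out at 400 and 387.5 μM, matching the reported 400 and 387 μM.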
Crystallization, data collection, structure determination, and refinement
Crystallization was performed by the sitting-drop vapor diffusion method at 4°C. Sitting drops contained 250 nl of sample mixed with 250 nl of reservoir solution. Crystals of LS12 alone were obtained in a reservoir solution of 1600 mM sodium citrate tribasic. Crystals of the CS1–LS4 complex were obtained in a reservoir solution of 4% (v/v) Tacsimate™ pH 4.0 and 12% (w/v) polyethylene glycol 3350. Crystals of the CS2–LS12 complex were obtained in a reservoir solution of 40% PEG300, 100 mM sodium cacodylate/hydrochloric acid, pH 6.5, and 200 mM calcium acetate. Prior to data collection, crystals of CS1–LS4 were transferred to a cryoprotectant solution containing 20% glycerol and flash cooled to −180°C, while crystals of CS2–LS12 and LS12 alone were picked up directly from the crystal trays and flash cooled to −180°C, as these crystals were grown under cryoprotective conditions. X-ray diffraction data were collected at beamline BL45XU at SPring-8 (Hyogo, Japan) at 100 K with a wavelength of 1.000 Å. These data were processed using the ZOO system [45, 46]. Phases were determined by molecular replacement using the program Phaser [47] with the previously determined Archaeoglobus fulgidus L7Ae structure (PDB ID: 4BW0) as the search model. Model building was carried out using the program Coot [48]. The program Phenix.refine [49] was used for refinement. The structures displayed good geometry when analyzed by MolProbity [50]. X-ray data collection statistics and refinement statistics are shown in Supplementary Table S1.
Preparation of protein ligands for riboswitch assay
LS4 and LS12 genes were subcloned into the pET-His-SUMO vector. LS4 and LS12 mutants were generated by inverse polymerase chain reaction (PCR) using a high-fidelity DNA polymerase (Q5 High-Fidelity 2× Master Mix, New England Biolabs). The DNA sequences were verified by Sanger sequencing. The bacterial expression vectors were transformed into E. coli BL21 (DE3) cells (Champion™ 21, SMOBiO Technology). Transformants were pre-cultured overnight at 37°C in 2 ml of Lysogeny Broth (LB) medium (Miller) supplemented with 100 mg ml⁻¹ ampicillin and transferred to 200 ml of fresh LB medium supplemented with 50 mg ml⁻¹ ampicillin. After 5 h of culture at 37°C, IPTG was added to a final concentration of 0.5 mM, and the cells were cultured further at 20°C for 20 h. The cells were harvested by centrifugation at 5000 × g for 5 min at 4°C and suspended in 20 ml of ice-cold His-tag purification buffer A (50 mM Tris–HCl, pH 8.0, 500 mM NaCl, 1 mM 2-mercaptoethanol, and 10 mM imidazole-HCl). After cell disruption by ultrasonication, Triton X-100 [0.5% (v/v)], MgCl2 (5 mM), and TURBO™ DNase (10 units, Ambion) were added to the crude cell extracts, which were then incubated for 30 min at room temperature. The cell extracts were centrifuged at 10 000 × g for 10 min at 4°C to remove cell debris. Soluble proteins were loaded onto TALON metal affinity resin (2 ml, Clontech) in an Empty Gravity Flow Column (Bio-Rad) and washed with 12 ml of His-tag purification buffer A. His-tagged proteins were eluted with 8 ml of His-tag elution buffer (50 mM Tris–HCl, pH 8.0, 500 mM NaCl, 1 mM 2-mercaptoethanol, and 200 mM imidazole-HCl). The eluates were desalted and concentrated using an ultrafiltration device (Amicon Ultra 0.5 ml filter, 10 kMWCO, Merck-Millipore) and buffer-exchanged into a low-salt buffer (20 mM Tris–HCl, pH 8.0, 100 mM NaCl, and 1 mM dithiothreitol).
The proteins were further purified with a cation-exchange column (Mono S 4.6/100PE, GE Healthcare) using ÄKTA pure 25 (GE Healthcare) with mobile phase A (50 mM Tris–HCl, pH 8.0, 100 mM NaCl, and 1 mM dithiothreitol) and mobile phase B (50 mM Tris–HCl, pH 8.0, 1 M NaCl, and 1 mM dithiothreitol): a linear gradient of 0%–100% mobile phase B was used. The purified proteins were again desalted and concentrated using an ultrafiltration device (Amicon Ultra 0.5 ml filter, 3 kMWCO or 10 kMWCO, Merck-Millipore) and buffer-exchanged into protein storage buffer (10 mM Tris–HCl, pH 8.0, 200 mM NaCl, 1 mM dithiothreitol, and 5% (v/v) glycerol). The purities of the proteins were confirmed by absorption measurement and by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SuperSep™ Ace 15%, Wako-Fujifilm) followed by Coomassie blue staining (AE-1340 EzStain AQua, ATTO). The concentrations of the recombinant proteins were determined using the standard curve generated from the band intensities of serially diluted bovine serum albumin. The protein sequences are shown in Supplementary Table S2.
Plasmid DNA construction and preparation of template DNAs for CFPS
Riboswitch sequences (Supplementary Table S4) were inserted immediately downstream of the T7 promoter and upstream of the enhanced green fluorescent protein (EGFP) gene in pIVEX2.3-EGFP [40]. First, the pIVEX2.3-CS1-ON1-EGFP and pIVEX2.3-CS2-ON1-EGFP vectors were generated, and then the other riboswitch variants were generated using the Q5 site-directed mutagenesis kit (New England Biolabs). Hlg2 and LukF genes [51] were subcloned into the pIVEX2.3-CS1M2b-3c-ON8 and pIVEX2.3-CS2-3c vectors, respectively. The plasmid DNA sequences were verified by Sanger sequencing.
Template DNAs (Supplementary Table S3) for the CFPS were prepared by 25 cycles of PCR using a high-fidelity DNA polymerase (KOD Plus Neo, TOYOBO). The primer sets used are shown in Supplementary Table S3. The PCR products were purified with the NucleoSpin Gel and PCR Clean-up kit (MACHEREY-NAGEL) and eluted with ultrapure water. DNA concentrations were determined by absorbance at 260 nm.
In vitro transcription-coupled translation from DNA template
The reconstituted CFPS system (PUREfrex® 1.0) was purchased from Gene Frontier. The CFPS reactions (10 μl) were carried out as described previously [40]. We mixed the following reagents in a 0.2 ml PCR tube: 1.5 μl ultrapure water, 5 μl solution I, 0.5 μl solution II, 0.5 μl solution III, 0.5 μl Alexa Fluor™ 647 hydrazide (1 μM), 1 μl protein ligand (10×), and 1 μl template DNA (50 ng μl⁻¹). The tubes were incubated for 3 h at 37°C in a thermal cycler (T100, Bio-Rad), and then equal volumes (10 μl) of PBS-T [D-PBS(-) supplemented with 0.1% (v/v) polyoxyethylene(20) sorbitan monolaurate (Tween-20)] were added. Eighteen microliters of the mixture was transferred to a 384-well black plate, and fluorescence (490 nm excitation, 520 nm emission for EGFP; 650 nm excitation, 680 nm emission for Alexa Fluor™ 647) was measured by a microplate reader (Synergy H1, BioTek). Background fluorescence was subtracted from the EGFP and Alexa Fluor™ 647 fluorescence values. EGFP fluorescence was normalized by Alexa Fluor™ 647 fluorescence to account for well-to-well variations. Relative fluorescence values were calculated from the EGFP expression level without riboswitch as a standard. The ON/OFF ratio was calculated from the EGFP/Alexa Fluor™ 647 value with protein ligand, divided by the value without protein ligand. Bar graphs and data plots were generated using GraphPad Prism software 10 (GraphPad Software), and nonlinear curve fitting was carried out using the following equation.
Ymin: minimum fluorescence signal
Ymax: maximum fluorescence signal
EC50: half maximal effective concentration of ligand
X: molar concentration of ligand
h: Hill coefficient
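The variables listed above (minimum and maximum signal, EC50, ligand concentration X, and Hill coefficient h) are those of the standard four-parameter Hill (variable-slope) dose-response model, Y = Ymin + (Ymax − Ymin) · X^h / (EC50^h + X^h). Assuming that this is the form used for the fit, the data processing described in this section can be sketched as follows. The function names are illustrative, and the sketch only evaluates the model; it does not reproduce the fitting performed in GraphPad Prism.

```python
import math


def normalized_signal(egfp: float, alexa: float,
                      egfp_background: float = 0.0,
                      alexa_background: float = 0.0) -> float:
    """Background-subtracted EGFP signal divided by the background-subtracted
    Alexa Fluor 647 signal (the internal standard for well-to-well variation)."""
    return (egfp - egfp_background) / (alexa - alexa_background)


def on_off_ratio(signal_with_ligand: float, signal_without_ligand: float) -> float:
    """Normalized signal with protein ligand divided by the value without ligand."""
    return signal_with_ligand / signal_without_ligand


def hill_response(x: float, y_min: float, y_max: float, ec50: float, h: float) -> float:
    """Four-parameter Hill model (assumed form of the fitted equation).

    x is the molar ligand concentration; x and ec50 must share the same units.
    """
    if x < 0 or ec50 <= 0 or h <= 0:
        raise ValueError("expected x >= 0, ec50 > 0 and h > 0")
    return y_min + (y_max - y_min) * x**h / (ec50**h + x**h)


# At X = EC50 the model returns the midpoint between Ymin and Ymax,
# whatever the Hill coefficient.
midpoint = hill_response(x=0.8, y_min=0.0, y_max=100.0, ec50=0.8, h=1.5)
print(f"response at X = EC50: {midpoint:.1f}")
```

Keeping normalization, ratio, and model evaluation as separate functions mirrors the order in which the quantities are defined in the text.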
Hemolysis assay
Heparin-treated rabbit red blood cells (RBCs) were purchased from Nacalai Tesque. The RBCs (20 ml) were centrifuged once at 2000 × g for 10 min at 4°C, and the supernatant was removed. The precipitated RBCs were resuspended in 25 ml of D-PBS(-). The buffer exchange was repeated twice. Finally, the RBCs were resuspended in 20 ml of D-PBS(-), stored at 4°C, and used within one month.
Hemolysin was synthesized by the CFPS (PUREfrex® 1.0) on a 10 μl scale: ultrapure water (2.5 μl), solution I (5 μl), solution II (0.5 μl), solution III (0.5 μl), 50 μM LS4 (0.5 μl), 50 μM LS12 (0.5 μl), and 100 nM template DNA (0.5 μl). The tubes were incubated for 3 h at 37°C. The hemolysis assay was carried out in a 0.2 ml PCR tube: RBCs (95 μl) and CFPS products (5 μl). To prepare the positive hemolysis control, polyoxyethylene(10) octylphenyl ether (Triton™ X-100) was added to the RBCs at a final concentration of 1% (v/v). The tubes were incubated at 37°C for 10 min. After centrifugation at 8000 × g for 5 min at 4°C, the supernatants were subjected to absorption measurement using a NanoDrop spectrophotometer (Thermo Fisher Scientific). Absorbance at 541 nm was used for the evaluation of hemolysis. The bar graph was generated using GraphPad Prism software 10 (GraphPad Software).
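As a quick check of the reaction composition listed above, the component volumes can be summed and the final concentrations of the protein ligands and template DNA derived from their stock concentrations. This is a minimal sketch using only the volumes and stock concentrations given in the text; the dictionary keys and function name are illustrative.

```python
# Component volumes (ul) of the 10 ul hemolysin synthesis reaction, as listed in the text.
HEMOLYSIN_REACTION_UL = {
    "ultrapure water": 2.5,
    "solution I": 5.0,
    "solution II": 0.5,
    "solution III": 0.5,
    "LS4": 0.5,
    "LS12": 0.5,
    "template DNA": 0.5,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration, in the same units as `stock`, after dilution into `total_ul`."""
    return stock * volume_ul / total_ul


total_ul = sum(HEMOLYSIN_REACTION_UL.values())
ls4_um = final_concentration(50.0, HEMOLYSIN_REACTION_UL["LS4"], total_ul)             # 50 uM stock
ls12_um = final_concentration(50.0, HEMOLYSIN_REACTION_UL["LS12"], total_ul)           # 50 uM stock
dna_nm = final_concentration(100.0, HEMOLYSIN_REACTION_UL["template DNA"], total_ul)   # 100 nM stock

print(f"total volume: {total_ul} ul")
print(f"LS4: {ls4_um} uM, LS12: {ls12_um} uM, template DNA: {dna_nm} nM")
```

The volumes sum to 10 μl, giving 2.5 μM of each protein ligand and 5 nM template DNA in the reaction.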
Statistical analysis
Prism software (GraphPad Software) was used for the statistical analyses. The analytical information (standard deviation, n values, and analytical methods) is shown in the figure legends.
Results and discussion
Structural determination of CS1–LS4 and CS2–LS12 complexes
We determined the crystal structures of LS4 and LS12 in complex with their cognate RNA aptamers, CS1 and CS2, respectively (Fig. 1A). The structures revealed that both LS4 and LS12 proteins possess nearly identical structures to the parental protein, A. fulgidus L7Ae [LS4: root mean square deviation (RMSD) value of 0.39 Å over 108 Cα atoms, LS12: RMSD value of 0.44 Å over 106 Cα atoms] (Fig. 1B). Furthermore, CS1 and CS2 RNAs adopt a k-turn-like conformation characterized by a helix–internal loop–helix structure and a pronounced kink along the helical axis at the internal loop (Fig. 1C). The nucleotide numbering follows a systematic nomenclature for k-turn nucleotides [52]. In the standard k-turn, a three-nucleotide bulge (L1–L3) is present, with the L3 base oriented toward the solvent (flipped out) and making no contact with the k-turn RNA. The strand containing the L1–L3 bulge is designated the “bulge” side, while the opposite strand is termed the “nonbulge” side. Each crystal structure contained two independent crystallographic complexes within an asymmetric unit (Supplementary Fig. S2).
Characteristics and overall structures of the CS1–LS4 complex and the CS2–LS12 complex. (A) The overall structures of the CS1–LS4 complex (left, PDB ID: 9L6X) and the CS2–LS12 complex (right, PDB ID: 9L6Y). The models of the complexes are shown as cartoon representations. (B) Comparison between A. fulgidus L7Ae, lab-coevolved LS4, and LS12 proteins. Superimposition of overall structures of the L7Ae (light orange), LS4 (green), and LS12 (cyan) (left), close-up views of the region randomized in PD-SELEX (middle), and the sequences selected by PD-SELEX (right). The PD-SELEX-identified residues are colored in magenta. (C) Comparison of typical k-turn (Haloarcula marismortui Kt-7, PDB ID: 4BW0) [53], CS1, and CS2 RNAs. The overall structures (left) and the interaction network of each RNA's internal loop (right) are shown as cartoon representations and secondary structure diagrams, respectively. The internal loop is colored in blue, and the LS4-specific loop regions are shown in red. In the secondary structure diagrams, Watson–Crick base pairs and non-Watson–Crick base pairs are shown as solid and dashed lines, respectively. Bases with their edges facing each other are shown as open circles. The circled nucleotides are oriented toward the solvent.
Comparison of our structures of the complexes with the parental L7Ae-canonical k-turn (Kt-7) complex revealed their similarities and differences (Supplementary Fig. S3). The L7Ae protein makes interactions highly selective for the k-turn structure through nonspecific backbone interactions, specific interactions by Asn33 and Glu34 residues with the base portions of guanine (1b and 2n) in the conserved G•A pairs in the k-turn, and interactions by a loop (residues 88–92) with the L1 and L2 bases [53]. The nonspecific backbone interactions are also observed in our structures of the complexes. Although the 34th residue of L7Ae and the conserved G•A pair in the k-turn were randomized in PD-SELEX, the G•A pairs were maintained in CS1 and CS2 RNAs and the selected 34th residues (Arg34 of LS4 and Ser34 of LS12) retained their specific interactions with the G nucleobase (Supplementary Figs S3 and S4). Because the G•A pairs are conserved in k-turn RNAs, loss of this non-Watson–Crick base pairing would be expected to affect the k-turn-like structure of CS1 and CS2. Therefore, the recognition of their G nucleobases is a structural basis for the k-turn-like RNA structures bound by LS4 and LS12. In contrast, the loop (residues 88–92) is another region randomized in PD-SELEX, resulting in major differences between the parental L7Ae–RNA pair and the PD-SELEX-evolved pairs. This difference could be the structural reason why LS4 and LS12 bind specifically to the cognate RNA aptamers CS1 and CS2, respectively.
In the CS1–LS4 complex, the CS1 RNA contains five nucleotides on both the bulge and nonbulge sides of the internal loop (Fig. 1C, CS1); compared with typical k-turn RNAs, it has one nucleotide fewer on the bulge side and two nucleotides more on the nonbulge side. Consequently, the L2 nucleotide is absent on the bulge side, while Ln1 and Ln2 on the nonbulge side form a larger loop compared to Kt-7 [53] and CS2 RNAs. The A(Ln2) base is oriented toward the solvent. Notably, a unique feature of this complex is the formation of a non-Watson–Crick base pair between G(L1) and G(Ln1), which is not observed in the other k-turn RNAs (Fig. 1C, CS1). The structure of the complex revealed that LS4 captures this distinctive G(L1)–G(Ln1) base pair through the Trp89 residue (Fig. 2A, upper panel; Supplementary Fig. S4), which was introduced through PD-SELEX [33]. Trp89 aligns parallel to the G(L1)–G(Ln1) base pair, forming a stacking interaction. The loop (substituted residues 88–92) in LS4, including Trp89, has a different conformation from the loops in L7Ae and LS12 (Fig. 1B). Therefore, interaction with the distinctive G(L1)–G(Ln1) base pair by the PD-SELEX-identified Trp89 residue is the molecular basis underlying the highly specific recognition of CS1 RNA by the LS4 protein.

RNA–protein interactions of the CS1–LS4 complex and the CS2–LS12 complex. (A) Close-up view of the RNA-binding mode specific to LS4 or LS12. LS4 and LS12 proteins are shown in green and cyan, respectively. The PD-SELEX-identified residues and their recognized bases are shown as stick representations. Yellow and blue dashed lines indicate side-chain-RNA base interactions and base-pairing interactions, respectively. Electron density maps for the CS1–LS4 and CS2–LS12 interactions are shown in Supplementary Fig. S4. (B) Schematic representation of the interactions between LS4 and LS12 and their cognate RNAs. Only RNA interactions involving amino-acid side chains are displayed. Red solid and green dashed double lines indicate hydrophilic and stacking interactions, respectively. The substituted amino acids are shown in purple. (C) Target RNA selection strategies of LS4 and LS12.
In the CS2–LS12 complex, the CS2 RNA exhibits a base-pair network and a structure that closely resemble those of a canonical k-turn RNA such as Kt-7 [53] (Fig. 1C). The structure of the complex suggested that LS12 restricts the base type at the L1 position of the CS2 RNA. The PD-SELEX-identified Arg88 residue forms hydrophilic interactions with the C(L1) and A(L2) base portions. In addition, the Arg88 residue seems to make a cation-π interaction with the G(-1n) base portion (Fig. 2A and 2B, lower panel; Supplementary Fig. S4). These interaction networks fix the position of the Arg88 guanidino group. Therefore, the L1 position is restricted to pyrimidine bases, because a purine at the L1 position would cause steric hindrance with Arg88. Additionally, the substituted Tyr90 residue establishes a hydrophilic interaction with the phosphate moiety of the A(L2) nucleotide (Fig. 2A and 2B, lower panel); this interaction was introduced by PD-SELEX, as the corresponding residue in L7Ae is valine. Notably, we discovered that LS12 forms a homodimer in an RNA-free state, a characteristic not observed in LS4 or previously reported for L7Ae. Mutations of Lys37 and Glu40 in L7Ae to Leu37 and Ser40 in LS12 enable the formation of dimer interfaces (Supplementary Fig. S5A). This homodimeric configuration sterically inhibits RNA aptamer binding, since the second molecule would clash with the location of the k-turn RNA stem helix (Supplementary Fig. S5C). In the structure of the CS2–LS12 complex, this homodimer formation is eliminated upon target CS2 RNA binding. Consequently, the homodimer formation might interfere with binding to low-affinity nontarget k-turn RNAs. The molecular basis of LS12’s high specificity appears to rest on two key mechanisms: (i) pyrimidine restriction at the L1 position and (ii) binding interference for low-affinity RNA by homodimerization in the RNA-free state.
Our structures of the complexes provide compelling insights into the co-evolution of LS proteins and RNA aptamers through PD-SELEX (Fig. 2C).
Protein-responsive cell-free riboswitches
To utilize the orthogonal RNA–RBP pairs in a cell-free system, we next designed protein-responsive riboswitches based on the CS1 and CS2 aptamers: the RNA aptamer sequence (CS1 or CS2) for the protein ligand (LS4 or LS12) was inserted immediately downstream of the T7 promoter, and a leader sequence that is partially complementary to the aptamer (i.e. anti-aptamer) was inserted further downstream of the ribosome-binding site (RBS) (Supplementary Fig. S6). In this design, in the absence of the cognate protein ligand, the RBS is masked by the 3′ tail of the aptamer and the following few bases, and ribosomal translation is thus impeded. When the cognate ligand is present, the aptamer domain in the 5′ UTR of the mRNA forms a complex with the ligand, and the open conformation of the RBS allows ribosomal translation of the downstream gene (Fig. 3A). First, we optimized the strength of the complementary sequence against the RBS and examined the performances of the riboswitches by using the PURE system in the presence (10 μM) or absence of the cognate protein ligand. We used EGFP as a reporter gene and tested LS4 protein-responsive riboswitch variants (Fig. 3B, ON1 to ON8). Among the eight tested variants, the CS1-ON3 and CS1-ON8 riboswitches exhibited moderate ON/OFF ratios of 23 and 26, respectively. Second, we optimized the stem length of the CS1 aptamer by adding or removing a single nucleotide at the 5′ end of the CS1-ON3 and CS1-ON8 riboswitches (Fig. 3B, CS1-3c-ON3, CS1-5c-ON3, CS1-3c-ON8, and CS1-5c-ON8). The basal expression level of the CS1-3c-ON8 riboswitch-regulated EGFP reporter was ∼0.5% of that of the control without riboswitch, but the expression was upregulated 319-fold by the cognate protein ligand LS4, which corresponds to 160% of the control (Fig. 3B). The higher expression level compared to the control is presumably due to the insertion of the leader sequence (anti-CS1), which is translated as an N-terminal tag (Supplementary Table S4).
Next, we tested LS12 protein-responsive riboswitch variants (Fig. 3C, ON1 to ON8) and found that the CS2-ON3 and CS2-ON8 variants were promising (ON/OFF ratios = 46 and 26, respectively). Using the same strategy, we designed and tested CS2-ON3 and CS2-ON8 variants with a different stem length and found that CS2-3c-ON8 riboswitch exhibits a high ON/OFF ratio of 499 (Fig. 3C). The basal expression level of the CS2-3c-ON8 riboswitch was 0.7% compared to that of the control without riboswitch, and the maximum expression level was 330% of the control. The high ON/OFF ratios of the CS1-3c-ON8 and CS2-3c-ON8 riboswitches were also observed in real-time measurements of EGFP reporter expression (Supplementary Fig. S7).
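The basal level, maximum level, and ON/OFF ratio reported for each riboswitch are linked by a simple relation: maximum expression equals basal expression multiplied by the ON/OFF ratio. The sketch below back-calculates the basal level implied by the reported maximum and ratio; the function name is illustrative, and the calculation is only a consistency check on the rounded values quoted in the text.

```python
def implied_basal_percent(max_percent: float, on_off_ratio: float) -> float:
    """Basal expression (% of the no-riboswitch control) implied by the
    maximum expression level and the ON/OFF ratio."""
    return max_percent / on_off_ratio


# Reported values: (maximum expression in % of control, ON/OFF ratio)
REPORTED = {
    "CS1-3c-ON8": (160.0, 319.0),
    "CS2-3c-ON8": (330.0, 499.0),
}

for name, (max_percent, ratio) in REPORTED.items():
    basal = implied_basal_percent(max_percent, ratio)
    print(f"{name}: implied basal expression = {basal:.2f}% of control")
```

The implied basal levels, about 0.50% and 0.66%, round to the reported ∼0.5% and 0.7%.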
Development of protein-responsive cell-free riboswitches. (A) Schematic illustration of translational riboswitch. The riboswitch sequences are shown in Supplementary Fig. S6 and Supplementary Table S4. The structure of EGFP (PDB ID: 2Y0G) [81] was generated using Mol* Viewer [82]. Screenings of (B) LS4- and (C) LS12-responsive riboswitch variants. The bar graphs show the mean of three independent experiments (n = 3) with error bars representing the geometric standard deviation. Relative expression levels (%) of reporter protein (EGFP) are calculated by normalization with no riboswitch control in the absence of protein ligand. Numbers (red) above the bars indicate the ON/OFF ratio.
Next, to test the hypothesis that the RNA-binding surface residues of the LS4 and LS12 proteins are involved in the recognition of the cognate RNA aptamer, we carried out riboswitch activity-based protein mutant analyses (Supplementary Fig. S8). LS4 W89A mutant exhibited significantly reduced EGFP reporter expression (∼9% induction over the wild type), whereas other tested mutants, R34A and P90A, had no impact on the reporter expression levels, suggesting that the W89 residue strongly contributes to the interaction with the CS1 aptamer. On the other hand, the tested LS12 R88A and Y90A mutants exhibited significantly reduced reporter expression (∼14% and 6% inductions, respectively, over the wild type), suggesting that both residues are involved in the CS2 aptamer binding. Overall, these results are consistent with the structural analyses (Fig. 2).
Our cell-free riboswitch design is based on a kinetic trap mechanism [54–56]. Riboswitches are co-transcriptionally folded from the 5′ side and trapped in the OFF structure. However, in the presence of a cognate ligand, the 5′ aptamer domain is stabilized by the ligand, and the riboswitch is trapped in the ON structure. Since we added the DNA template and the protein ligand to the CFPS system at the start of the reaction (see Materials and Methods), the protein ligand can quickly bind to the aptamer domain during transcription, before the riboswitch is trapped in the OFF structure. To confirm the mechanism, we tested the switching activity of the CS1-3c-ON8 and CS2-3c-ON8 riboswitches from mRNA templates (Supplementary Fig. S9A). The induction levels (ON/OFF ratios) were 8 and 2, respectively. Thus, the riboswitches hardly respond to the ligand once they are trapped in the OFF structure (Supplementary Fig. S9B). The CS1-3c-ON8 and CS2-3c-ON8 riboswitches exhibited markedly higher ON/OFF ratios than previously reported cell-free riboswitches [12, 36, 39–42, 54, 57–59]. Previously, we developed cell-free riboswitches that respond to small molecules (histamine, ASP2905, and theophylline) based on the same riboswitch design strategy [36, 40]. However, the ON/OFF ratios never exceeded 40. We speculate that not only fine-tuning of the riboswitch structures but also the fast association (kon values, CS1–LS4: 2.00 × 10⁷ M⁻¹ s⁻¹, CS2–LS12: 4.96 × 10⁷ M⁻¹ s⁻¹) and slow dissociation (koff values, CS1–LS4: 1.36 × 10⁻⁴ s⁻¹, CS2–LS12: 3.26 × 10⁻⁴ s⁻¹) [33] of the RNA aptamers may have contributed to more stable kinetic trapping of the riboswitches, resulting in the high observed ON/OFF ratios. However, a more systematic study will be necessary to reveal the impact of the binding kinetics on riboswitch performance.
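As a quick consistency check on the kinetic constants quoted above, the following minimal sketch computes the equilibrium dissociation constant as KD = koff/kon and the half-life of the complex as ln(2)/koff. It assumes a simple 1:1 binding model, which is an assumption made here for illustration; the function and variable names are hypothetical and are not taken from the original study.

```python
import math

# Kinetic constants quoted in the text (reported in ref. [33]).
# kon is in M^-1 s^-1 and koff is in s^-1.
KINETICS = {
    "CS1-LS4": {"kon": 2.00e7, "koff": 1.36e-4},
    "CS2-LS12": {"kon": 4.96e7, "koff": 3.26e-4},
}


def dissociation_constant(kon: float, koff: float) -> float:
    """Return KD = koff / kon in molar units (assumes 1:1 binding)."""
    return koff / kon


def complex_half_life(koff: float) -> float:
    """Return the half-life of the complex in seconds, ln(2) / koff."""
    return math.log(2) / koff


for pair, k in KINETICS.items():
    kd_pm = dissociation_constant(k["kon"], k["koff"]) * 1e12  # M -> pM
    t_half_min = complex_half_life(k["koff"]) / 60.0           # s -> min
    print(f"{pair}: KD ~ {kd_pm:.1f} pM, complex half-life ~ {t_half_min:.0f} min")
```

Under this assumption both pairs come out at roughly 7 pM, in line with the KD reported for the two pairs, and the complexes persist for tens of minutes, which fits the proposed stable kinetic trapping.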
Next, we tested the dose-dependent response of the CS1-3c-ON8 riboswitch to the LS4 protein ligand, and the half-maximal effective concentration (EC50) was estimated to be 0.8 μM (Fig. 4A). We also tested the cross-reactivity of the LS12 protein against the CS1-3c-ON8 riboswitch and observed that EGFP reporter expression was induced in the 0.1–10 μM range (Fig. 4A). In our previous study, we found that the LS12 protein binds to the CS1 RNA aptamer with a KD of 28 nM. Although the interaction between the LS12 protein and the CS1 aptamer is ∼4000-fold weaker than that of the CS1–LS4 pair [33], the cross-reactivity is not negligible at high concentrations of LS12 protein. To suppress the cross-reactivity of the LS12 protein and improve the orthogonality of the riboswitches, we decided to utilize the CS1M2b aptamer, which contains a single point mutation in the internal loop (Fig. 4B). Our previous surface plasmon resonance analyses showed that the CS1M2b aptamer exhibits decreased binding affinity for both the LS4 and LS12 proteins (KD values, LS4: 87 pM, LS12: 110 nM) [33]. As expected, the cross-reactivity of LS12 was significantly suppressed, at least within the tested protein concentrations, although the maximum induction by the LS4 protein ligand decreased to 149-fold (Fig. 4B). Next, we tested the CS2-3c-ON8 riboswitch and found that EGFP reporter expression was induced by the LS12 protein ligand in a dose-dependent manner (EC50 = 0.6 μM), and no cross-reactivity with the LS4 protein was observed (Fig. 4C).

Figure 4. Dose-dependent responses and cross-reactivities of (A) CS1-3c-ON8, (B) CS1M2b-3c-ON8, and (C) CS2-3c-ON8 riboswitches against the protein ligands. The plots are the mean of two independent experiments (n = 2) with error bars representing the standard deviation. The solid lines indicate a nonlinear curve fitting. The sequence of the CS1M2b riboswitch is shown in Supplementary Table S4.
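To illustrate what an EC50 means for a dose-response curve of this kind, the sketch below evaluates a four-parameter Hill function. The text does not state which model was used for the nonlinear curve fitting, so the Hill form, the Hill coefficient of 1, and the normalized 0-to-1 response scale are all assumptions made here for illustration; only the EC50 values (0.8 μM and 0.6 μM) come from the text.

```python
def hill_response(ligand_um: float, bottom: float, top: float,
                  ec50_um: float, hill_coeff: float = 1.0) -> float:
    """Four-parameter Hill curve (assumed model, not confirmed by the source).

    ligand_um  : protein ligand concentration in micromolar
    bottom/top : response without ligand and at saturation
    ec50_um    : ligand concentration giving the half-maximal response
    hill_coeff : steepness of the curve (1.0 is a placeholder value)
    """
    if ligand_um <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50_um / ligand_um) ** hill_coeff)


# EC50 values quoted in the text; the response is normalized to 0..1 here.
EC50_UM = {"CS1-3c-ON8 + LS4": 0.8, "CS2-3c-ON8 + LS12": 0.6}

for name, ec50 in EC50_UM.items():
    for conc in (0.1, ec50, 10.0):
        frac = hill_response(conc, bottom=0.0, top=1.0, ec50_um=ec50)
        print(f"{name}: {conc:g} uM -> {frac:.2f} of maximal response")
```

By construction, the response is exactly halfway between the bottom and top plateaus when the ligand concentration equals the EC50, whatever Hill coefficient is chosen.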
AND logic gate to control γHL expression
We demonstrated the functions of the CS1M2b-3c-ON8 and CS2-3c-ON8 riboswitches with the EGFP reporter gene. To demonstrate the robustness of the orthogonal riboswitches, we next carried out gene expression control of the Staphylococcus aureus Hlg2 and LukF genes [60]. Hlg2 and LukF are components of γHL, which is an octameric nanopore-forming toxin [51]. Here, we aimed to design a two-input translational AND gate and cloned the Hlg2 and LukF genes downstream of the CS1M2b-3c-ON8 and CS2-3c-ON8 riboswitches, respectively (Fig. 5, Gene 1 and 2). The AND logic circuit outputs γHL only when both inputs, LS4 and LS12, are present. We expressed the riboswitch-regulated genes using the PURE system and then mixed the products with rabbit RBCs, which can sensitively detect γHL activity [60]. No hemolysis was observed when the protein ligands were absent, suggesting that the riboswitch regulation was very tight (Fig. 5, Entry 1 and 2). This is consistent with the result that no Hlg2 or LukF expression was observed by Western blot analysis (Supplementary Fig. S10, Entry 1 and 2). Marked hemolysis was observed when both protein ligands were present (Fig. 5, Entry 5). However, in the presence of only one protein ligand, the observed hemolysis was minimal and comparable to that of the no protein ligand control (Entry 2–4), suggesting that each protein ligand specifically activates the cognate riboswitch. Neither protein ligand alone (Entry 6) nor a single Hlg2/LukF expression (Entry 7–8 and 10–11) induced hemolysis. We confirmed that hemolysis was induced by an Hlg2/LukF mix that was pre-expressed separately (Entry 9 and 12), by α-hemolysin (αHL) expression (Entry 13), and by Triton X-100 detergent treatment (Entry 14).
We next applied the AND circuit to artificial cells encapsulating a high concentration of calcein (Supplementary Fig. S11). Here, we prepared large unilamellar vesicles (LUVs) composed of DOPC/cholesterol (1:1) by the extrusion method [61]. It is known that calcein at high concentration inside liposomes is self-quenched, and that the fluorescence is recovered by calcein release through nanopore formation [62, 63]. As expected, the calcein fluorescence significantly increased only when both protein ligands were present in the PURE system (Supplementary Fig. S11, Entry 2–4). We demonstrated that our cell-free riboswitches can control EGFP as well as Hlg2 and LukF. Taken together, the riboswitches are robust enough to regulate genes beyond the EGFP reporter.
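The logic of the two-input AND gate described above can be written out as a small truth table. This is an idealized Boolean sketch only: it treats each riboswitch as fully ON or fully OFF and ignores leak and cross-reactivity, which is an assumption made here for illustration. The function name `and_gate` is hypothetical.

```python
from itertools import product


def and_gate(ls4_present: bool, ls12_present: bool) -> bool:
    """Idealized model of the γHL AND gate (binary ON/OFF is an assumption).

    Gene 1: CS1M2b-3c-ON8 riboswitch controls Hlg2 and responds to LS4.
    Gene 2: CS2-3c-ON8 riboswitch controls LukF and responds to LS12.
    A functional γHL pore needs both subunits, so the output is their AND.
    """
    hlg2_expressed = ls4_present    # Gene 1 is translated only with LS4
    lukf_expressed = ls12_present   # Gene 2 is translated only with LS12
    return hlg2_expressed and lukf_expressed


print("LS4   LS12  -> γHL output")
for ls4, ls12 in product([False, True], repeat=2):
    print(f"{ls4!s:5} {ls12!s:5} -> {and_gate(ls4, ls12)}")
```

Only the row with both ligands present gives an output, matching the observation that marked hemolysis and calcein release require both LS4 and LS12.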
![Figure 5](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/nar/53/6/10.1093_nar_gkaf212/3/m_gkaf212fig5.jpeg?Expires=1748006345&Signature=nCkyon70sRmBDbWV0W5f07qdCNMrbkBPXuMLdF6rcFC7qFmvKSbNg25ZYSGE8mPhHb5ohnAzliDuZoN-2Ug6QLFLLgOkPyjMW-GpICubfkXvyZSnwhQorTBwVudOSOTnBGlq~sgnYqG0sXBP3Gzl5O6dwrDltnGMqvdr49BkKrT3EhbeB4X8QLe1IXBY~Gd427EAETOnGjS0MiK9MCREWjVPy5-U0wffH3sGkT9WjyUIF9UrzVrLaZz-Fmj8BasWV15NS07NsYJwoW8YGbkgWOLHIMsAAM9oJW5FGIU05YZ2DdxZn4JQNgb33k06BwYnfLH3DkLeNGs2TUHTHDGx~g__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Figure 5. Design and construction of a 2-input AND gate controlled by the two protein ligands, LS4 and LS12. The AND gate outputs γHL composed of Hlg2 and LukF. The hemolysis assay was carried out using rabbit RBCs, and hemolysis was evaluated by absorbance measurement at 541 nm. Bar graphs show the mean of three independent experiments (n = 3), and error bars represent the standard deviation. Statistical analyses were performed by one-way analysis of variance (one-way ANOVA) with Dunnett’s multiple comparison test. ****P < .0001. NS: not significant. Protein expressions are shown in Supplementary Fig. S10. Gene 1: CS1M2b-3c-ON8 riboswitch-regulated Hlg2, Gene 2: CS2-3c-ON8 riboswitch-regulated LukF, Gene 3: unregulated Hlg2, Gene 4: unregulated LukF, Gene 5: unregulated α-hemolysin. Entry 14: 1% (v/v) Triton X-100 treatment was used as a positive control for hemolysis. The dotted line square indicates that Hlg2 and LukF were synthesized separately by the PURE system and then mixed. The structure of γHL (PDB ID: 3B07) [51] was generated using Mol* Viewer [82].
Conclusions and future perspectives
In summary, we have elucidated the molecular recognition mechanisms of the CS1–LS4 and CS2–LS12 pairs by X-ray crystal structure and protein mutant analyses (Figs 1 and 2; Supplementary Fig. S8). The lab-coevolved LS4 and LS12 proteins are structurally similar to the parental protein L7Ae, and the cognate RNA aptamers CS1 and CS2 form k-turn-like structures. In particular, the G•A and A•G base pairs characteristic of the k-turn [1, 3] are indeed present in CS1 [33]; however, an internal loop with five nucleotides on each side (left/right = 5/5) has not been found in natural k-turn structures so far [52, 64]. We revealed that a unique G(L1)–G(Ln1) base pair in CS1 is stabilized by the Trp89 residue of LS4, giving rise to the specific CS1–LS4 interaction. On the other hand, the specific CS2–LS12 interaction is ensured by multi-point base recognition and LS12 homodimer formation in an RNA-free state. We expect that our lab-coevolved RNA–RBP pairs will be useful for the development of a variety of molecular devices, not only for cell-free synthetic biology [65–67] but also for mammalian synthetic biology [68–70]. In addition, we have successfully designed and developed protein-responsive ON switches that function in the CFPS system (Fig. 3). Recently, several mammalian riboswitches with triple-digit ON/OFF ratios have been reported [71–75], but cell-free riboswitches with such ON/OFF ratios have not been reported. It should be noted that a small molecule-responsive riboswitch with an ON/OFF ratio of 90, which functions in a eukaryotic CFPS system (wheat germ extract), was recently reported [54]. Remarkably, our protein-responsive cell-free riboswitches, CS1-3c-ON8 and CS2-3c-ON8, exhibited greater ON/OFF ratios of 319 and 499, respectively (Fig. 3). To our knowledge, this is the first report of cell-free riboswitches with ON/OFF ratios >100 (Supplementary Fig. S12).
Because protein-responsive riboswitches are activated by a genetically encodable protein ligand, our riboswitches and protein ligand genes can be incorporated into multilayered gene circuits [38, 76, 77] such as cascade circuits. We have also demonstrated that the orthogonal protein-responsive riboswitches can be used to control the content release of RBCs and LUVs by modulating the expression of γHL subunits (Fig. 5; Supplementary Fig. S11). Researchers have utilized the αHL nanopore for the functional control of artificial cells encapsulating the CFPS system [36, 38, 62, 78–80]. However, to our knowledge, the two-component γHL nanopore has not previously been used in the CFPS system. This demonstration of the application of γHL to the CFPS system opens new possibilities for cell-free synthetic biology and related fields.
Acknowledgements
We thank Mayumi Suzuki (OIST) and Nao Miyahira (OIST) for Sanger sequencing. The Hlg2 and LukF genes were a kind gift from Yoshikazu Tanaka (Tohoku University). The synchrotron radiation experiments were performed at SPring-8 (proposal nos. 2022B2549 and 2023A2549), and we thank the staff members of the beamline BL45XU facilities at SPring-8 for their help with data collection.
Author contributions: Conceptualization and project administration: K.F. and Y.K.; Funding acquisition: K.F., T.M., and Y.Y.; Investigation: K.F., T.T., M.N., T.O., and R.K.; Methodology and validation: K.F., T.T., M.N., and T.O.; Resources: T.M., Y.Y., and Y.K.; Supervision: K.F., T.M., Y.Y., and Y.K.; Data curation, formal analysis, visualization, and writing - original draft: K.F. and T.T.; Writing - review & editing: K.F., T.T., T.M., Y.Y., and Y.K.
Supplementary data
Supplementary data is available at NAR online.
Conflict of interest
None declared.
Funding
KAKENHI [19K15701 to K.F., and 21H05228 and 22K21344 to T.M.] from the Japan Society for the Promotion of Science (JSPS); fund from Tokyo Institute of Technology [to K.F. and T.M.]; fund from Institute for Tenure Track Promotion of University of Miyazaki [to K.F.]; Japan Association for Chemical Innovation (JACI) Prize for Encouraging Young Researcher [to K.F.]; Human Frontier Science Program (HFSP) Research Grant [RGP003/2023 to T.M.]; OIST [to Y.Y.]. R.K. is a recipient of the JSPS Research Fellowship for Young Scientists [DC1, 23KJ0969]. Funding to pay the Open Access publication charges for this article was provided by University of Miyazaki.
Data availability
The crystal structures of CS1–LS4, CS2–LS12, and LS12 homodimer in an RNA-free state have been deposited under PDB accession codes 9L6X, 9L6Y, and 9L6Z, respectively. The plasmid DNAs encoding His-SUMO–LS4, His-SUMO–LS12, CS1M2b-3c-ON8-EGFP, and CS2-3c-ON8-EGFP have been deposited to Addgene (ID: 228450, 228451, 228452, and 228453).
References
Author notes
The first two authors should be regarded as Joint First Authors.