Dynamic signal processing by ribozyme-mediated RNA circuits to control gene expression

Organisms have different circuitries that allow converting signal molecule levels to changes in gene expression. An important challenge in synthetic biology involves the de novo design of RNA modules enabling dynamic signal processing in live cells. This requires a scalable methodology for sensing, transmission, and actuation, which could be assembled into larger signaling networks. Here, we present a biochemical strategy to design RNA-mediated signal transduction cascades able to sense small molecules and small RNAs. We design switchable functional RNA domains by using strand-displacement techniques. We experimentally characterize the molecular mechanism underlying our synthetic RNA signaling cascades, show the ability to regulate gene expression with transduced RNA signals, and describe the signal processing response of our systems to periodic forcing in single live cells. The engineered systems integrate RNA–RNA interaction with available ribozyme and aptamer elements, providing new ways to engineer arbitrarily complex gene circuits.


Supplementary Table: Strains and plasmids used in this work.

Supplementary Movies
Supplementary Movie 1. Single cell dynamic response to theophylline of system theoHHAzRAJ12.

Supplementary Movie 2. Single cell dynamic response to aTc of system breakHHRzRAJ12.

Plasmid construction
To express and characterize the RNA-based signal transduction, we generated a modified plasmid vector, pSTC2, containing a pSC101m origin of replication (a mutated pSC101 ori giving a high copy number) and a kanamycin resistance selection marker (Supplementary Figure 1). The pSTC2 vector was derived from our previously reported vector pSTC1 1 by removing the mRFP coding sequence and tagging the carboxyl terminus of the superfolder GFP (sfGFP) 2 with the ssrA degradation tag (ASAANDENYALAA) 3 by polymerase chain reaction (PCR). The leading AS dipeptide is encoded by an NheI restriction site and acts as a linker, while the ssrA tag targets proteins to the ClpXP degradation pathway, significantly increasing their degradation rates and therefore enabling faster dynamic responses 3,4.
For engineering our gene cassettes, the pSTC2 vector was made so that independent promoters could drive the expression of the regazyme and the corresponding mRNA. We used the inducible promoters PLlacO1 (regulated by LacI and modulated externally by the chemical inducer isopropyl-β-D-thiogalactopyranoside, IPTG) and PLtetO1 (regulated by TetR and modulated externally by the chemical inducer anhydrotetracycline, aTc) 5. Note that the two promoters were placed in opposite directions to avoid transcriptional interference 1. All plasmid manipulations were performed using standard molecular biology techniques 6. All enzymes used for plasmid digestions were from Thermo Scientific, USA. All oligonucleotides were synthesized by Integrated DNA Technologies, USA. The different RNA devices (from the terminator of the regazyme to the 5' UTR of the mRNA, see Supplementary Figure 2) were chemically synthesized and cloned in plasmid pIDTSMART (pUC replication origin, ampicillin resistance marker) and then subcloned into pSTC1 or pSTC2.

PCR-based mutagenesis
Dysfunctional regazymes (both core catalytic mutations and theophylline binding activity mutations in the aptamers, see Supplementary Figure 7) were constructed using PCR-based site-directed mutagenesis with Phusion high fidelity DNA polymerase (Thermo Scientific, USA), followed by template digestion with DpnI (Thermo Scientific, USA) for 1 h at 37 °C.

Strains used in this study are listed in the Supplementary [...] , where brackets denote the average over samples).
The normalized fluorescence of plain cells (transformed with a plasmid without GFP) was also used for background subtraction when appropriate, yielding the stationary protein expression value (magnitude per cell).
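The background subtraction described above can be sketched as follows. This is only an illustration: the definition of the normalized fluorescence is not preserved in this text, so normalization by optical density (OD) is an assumption, and the function name and all readings are hypothetical.

```python
from statistics import mean


def background_subtracted_expression(sample_fluo, sample_od, plain_fluo, plain_od):
    """Average normalized fluorescence of the samples minus that of plain cells.

    ASSUMPTION: "normalized fluorescence" is taken as fluorescence / OD.
    Angle brackets in the text (<...>) correspond to mean() over replicates.
    """
    sample_norm = [f / od for f, od in zip(sample_fluo, sample_od)]
    plain_norm = [f / od for f, od in zip(plain_fluo, plain_od)]
    return mean(sample_norm) - mean(plain_norm)


# Hypothetical replicate readings (arbitrary units), for illustration only.
expression = background_subtracted_expression(
    sample_fluo=[1200.0, 1300.0, 1250.0],
    sample_od=[0.50, 0.52, 0.50],
    plain_fluo=[100.0, 110.0, 105.0],
    plain_od=[0.50, 0.55, 0.50],
)
print(expression)
```

Subtracting the plain-cell average removes cellular autofluorescence, so the remaining value reflects GFP expression only.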

Quantification of in vivo catalytic activity
To quantify in vivo the catalytic activity of the two versions of the RAJ12-based regazyme that sense theophylline (theoHHAzRAJ12) and an effector RNA (breakHHRzRAJ12), we [...] Incubation was continued at 37 °C with shaking, and 2 mL aliquots were taken at 0, 2 [...]

Microfluidics device construction
To understand the dynamic regulation of our regazyme devices, we examined in vivo gene expression with single-cell, time-lapse fluorescence microscopy using a microfluidics device (Supplementary Figure 16a). This was designed to support monolayer growth of [...] Instrumente, Germany). The bonded chips were cured in an 80 °C incubator for at least 2 days before experiments.

Time-lapse microscopy and image analysis
All images were acquired using a Zeiss Axio Observer Z1 microscope (Zeiss, Germany), [...]

Optimization algorithm
We developed an optimization algorithm to design regazymes given the sequences of an aptazyme and a riboregulator. On the one hand, the aptazyme responds to its ligand by cleaving the RNA sequence at a given point 10. On the other hand, the riboregulator is able to activate protein expression by inducing a conformational change in the 5' UTR of the mRNA.
The sequence of a regazyme is composed of prefix and suffix sequences flanking the aptazyme followed by the riboregulator (Supplementary Figure 6a). These sequences constitute part of the transducer module. The aptazyme and riboregulator sequences are kept fixed. The only premise for the design is that the riboregulator needs to have the seed region in its 5' tail. Hence, the prefix and suffix are designed to get the seed paired within the structure of the full regazyme (state OFF), and unpaired (and then exposed to the solvent for RNA-RNA interaction) within the resulting structure after cleavage induced by the ligand (state ON). These sequences are totally variable in nucleotide composition and length.

Starting from random sequences for the prefix and suffix, the algorithm implements a heuristic optimization based on Monte Carlo Simulated Annealing (Supplementary Figure 6b) 10 [...] where W is the thermodynamic work required to get the aptamer formed to sense the ligand, or to get the seed unpaired or paired, and it is assumed to be proportional to the activation free energy (ΔG_act). We weighted these states equally. Denoting by Λ and Γ_0 the structures of the aptamer and seed within the regazyme before cleavage, and by Γ the structure of the seed after cleavage, we can calculate these works by [...] where G_n is the average free energy contribution per nucleotide (here we consider G_n = 1.28 kcal/mol), and d is the Hamming distance between two secondary structures.
Typically, the convergence of the algorithm is fast, obtaining sequences with ΔG_score = 0 within seconds to minutes. We used the Vienna RNA package with default parameters 11.
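As a minimal sketch of how such a work term might be evaluated, the snippet below assumes that W is simply G_n multiplied by the Hamming distance d between a predicted and a target secondary structure, both written in dot-bracket notation. The exact expressions are not preserved in this text, so this form, the function names, and the example structures are assumptions for illustration.

```python
G_N = 1.28  # kcal/mol per nucleotide, the value quoted in the text


def hamming_distance(struct_a, struct_b):
    """Number of positions at which two dot-bracket strings differ."""
    if len(struct_a) != len(struct_b):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(struct_a, struct_b))


def work(predicted, target, g_n=G_N):
    """ASSUMED form of the work term: W = G_n * d(predicted, target)."""
    return g_n * hamming_distance(predicted, target)


# Hypothetical 6-nt seed: predicted as partly paired, target fully unpaired.
w_seed_on = work("((..))", "......")
print(w_seed_on)
```

Under this assumption a work of zero means the predicted structure of the region already matches its target, which is consistent with a score of zero marking a successful design.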
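The overall search can be illustrated with a generic Metropolis-style simulated annealing loop over the prefix and suffix. This is a sketch, not the authors' implementation: the move set (substitution, insertion, deletion), the geometric cooling schedule, and all parameter values are assumptions, and `toy_score` is a hypothetical stand-in for the real score, which would fold the full regazyme (for example with the Vienna RNA package) before and after cleavage.

```python
import math
import random

ALPHABET = "ACGU"


def mutate(seq, rng):
    """Substitute, insert, or delete one nucleotide (length is free to vary)."""
    move = rng.choice(("sub", "ins", "del"))
    pos = rng.randrange(len(seq))  # seq is never empty, see deletion below
    if move == "sub":
        return seq[:pos] + rng.choice(ALPHABET) + seq[pos + 1:]
    if move == "ins":
        return seq[:pos] + rng.choice(ALPHABET) + seq[pos:]
    # Deletion: keep at least one nucleotide.
    return seq[:pos] + seq[pos + 1:] if len(seq) > 1 else seq


def anneal(prefix, suffix, score, steps=2000, t_start=5.0, t_end=0.05, seed=0):
    """Minimize score(prefix, suffix); stop early if the score reaches 0."""
    rng = random.Random(seed)
    current = (prefix, suffix)
    current_score = score(*current)
    best, best_score = current, current_score
    for step in range(steps):
        if best_score == 0:
            break
        # Geometric cooling from t_start towards t_end (assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / steps)
        if rng.random() < 0.5:
            candidate = (mutate(current[0], rng), current[1])
        else:
            candidate = (current[0], mutate(current[1], rng))
        candidate_score = score(*candidate)
        delta = candidate_score - current_score
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


def toy_score(prefix, suffix):
    """Hypothetical stand-in score: NOT a folding energy.

    Counts mismatches and length differences against arbitrary target strings.
    """
    def dist(s, t):
        return abs(len(s) - len(t)) + sum(a != b for a, b in zip(s, t))
    return dist(prefix, "GGAC") + dist(suffix, "GUCC")


best, best_score = anneal("AAAA", "AAAA", toy_score)
print(best, best_score)
```

Passing the score as a function keeps the search loop independent of the folding backend, so the toy score can be swapped for one based on predicted secondary structures.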
Our systems are based on conformational changes, but algorithms for multi-state RNA design are scarce. Our approach tackles this problem, allowing sequence and structure specifications, and exploiting RNA folding algorithms such as Vienna RNA. In this work, we focused only on the computational design of the transducer module, but nothing would prevent a full design of the molecule, including the riboregulator. A strategy of nesting design processes (one for the riboregulator and another for the regazyme) enhances the convergence of the corresponding algorithms and makes the designs more modular. To this end, our approach has the advantage of leaving the sequence length unconstrained. However, our approach has the limitation of using only 2D structure to model RNA conformational change and catalysis.
Certainly, this type of mechanism could involve pseudoknot interactions and even noncanonical base pairs, for which 3D models could better capture the interaction and processing features. Nevertheless, although the ribozyme has tertiary contacts, the exposure or blockage of the seed region in our design is governed by secondary structure. In addition, our model does not take into account kinetic binding effects, which might have an impact on the designs.

Mathematical model
The full set of biochemical reactions (constants in brackets) of the system theoHHAzRAJ12 (small molecule sensing) is [...] Clearly, this model can be easily rewritten for the case of sRNA sensing. Then, to quantitatively model the protein synthesis in the cells, we could construct a system of differential equations based on those reactions. However, due to the lack of reliable values for many of the parameters, we decided to take a quasi-steady state approach.
We first assumed that the global and intracellular concentrations of external inducers are the same, and that they bind very fast (relative to other time scales in the system) to their target molecules (i.e., IPTG to LacI, aTc to TetR, and theophylline, Theo, to the RNA aptamer).
Therefore, the total concentrations of RNAs are taken in quasi-steady state, given by [...] To simplify the model of the aptazyme cleavage, we introduce the following term [...] where α_0 is the fraction of cleavage in absence of any ligand and α the maximal fraction in presence of theophylline. Moreover, λ is the rate at which the catalytic reaction takes place (it is related to γ [...] where β_0 is the translation rate in absence of riboregulator and β the rate in presence of it.
Finally, δ_g is the first-order protein degradation rate.
[GFP]_0 = 0 could be taken as initial condition in vitro, but because the system is expressed in a cellular context we should take [...]
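To illustrate how a translation term and first-order degradation combine into a GFP time course, the sketch below integrates a single ordinary differential equation with the forward Euler method. It is not the authors' equation, which is not preserved in this text. The assumed form is d[GFP]/dt = β_0 (1 − a) + β a − δ_g [GFP], where a(t) is a hypothetical fraction of mRNA bound by the riboregulator; the mRNA level is absorbed into the rates, dilution by growth is neglected, and all parameter values are arbitrary.

```python
def simulate_gfp(active_fraction, beta0=0.01, beta=1.0, delta_g=0.05,
                 gfp0=0.0, dt=0.1, t_end=400.0):
    """Forward-Euler integration of the ASSUMED model
    d[GFP]/dt = beta0*(1 - a(t)) + beta*a(t) - delta_g*[GFP].

    active_fraction: function t -> a(t) in [0, 1] (hypothetical input signal).
    All parameter values are arbitrary and for illustration only.
    """
    n_steps = int(round(t_end / dt))
    times, gfp = [0.0], [gfp0]
    for i in range(n_steps):
        t = i * dt
        a = active_fraction(t)
        production = beta0 * (1.0 - a) + beta * a
        g = gfp[-1]
        gfp.append(g + dt * (production - delta_g * g))
        times.append(t + dt)
    return times, gfp


def square_wave(period=100.0):
    """Periodic forcing: a(t) = 1 during the first half of each period, else 0."""
    return lambda t: 1.0 if (t % period) < period / 2 else 0.0


times, gfp = simulate_gfp(square_wave())
print(max(gfp), min(gfp[len(gfp) // 2:]))
```

With a constant input a = 1, this assumed model relaxes to β/δ_g, so a larger degradation rate lowers the plateau but lets the output follow a periodic input more closely.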

Mathematical analysis of the dynamic response
To analyze the dynamic response of the regazyme system, we assume that the only inducer with time dependence is theophylline and that the total concentrations of mRNA and [...] In case of producing a riboregulator directly from promoter PLtetO1, we would have [Riboregulator] = M. In addition, because the cis-repression is very efficient 1 and GFP has a degradation tag, we can take β_0 << β and δ_g >> µ. Finally, we can write the time dependence of GFP as [...] to be integrated numerically in combination with Eqs (S6) and (S9).

(c) Scheme of the composability of regazymes to implement several circuits.

Supplementary Figure 23: Schemes of expanded application of regazyme-based circuits.
We show the implementation of cascades (of small molecule-sensing regazymes coupled with sRNA-sensing regazymes), how to increase the fan-in or fan-out of a regazyme-based system, and also the implementation of feedback and feedforward loops (with sRNA-sensing regazymes).