A bottom-up characterization of transfer functions for synthetic biology designs: lessons from enzymology

Within the field of synthetic biology, a rational design of genetic parts should include a causal understanding of their input-output responses—the so-called transfer function—and how to tune them. However, a commonly adopted strategy is to fit data to Hill-shaped curves without considering the underlying molecular mechanisms. Here we provide a novel mathematical formalization that allows prediction of the global behavior of a synthetic device by considering the actual information from the involved biological parts. This is achieved by adopting an enzymology-like framework, where transfer functions are described in terms of their input affinity constant and maximal response. As a proof of concept, we characterize a set of Lux homoserine-lactone-inducible genetic devices with different levels of Lux receptor and signal molecule. Our model fits the experimental results and predicts the impact of the receptor's ribosome-binding site strength, as a tunable parameter that affects gene expression. The evolutionary implications are outlined.


INTRODUCTION
The advance of genetic engineering has made it possible to modify genetic programs inside cells by re-designing them in predefined ways (1). Synthetic biology has emerged as a discipline in which modular biological parts are used for the construction of genetic devices. As in any engineering discipline, mathematical and computational models provide the workbench to infer system-level behavior from the properties of the biological parts (2). Standard engineering predicts output responses of a device given a set of input signals and a specified internal set of pieces. Within synthetic biology, the proper characterization of simple blocks in a reliable way constitutes a major challenge for the building of complex genetic devices (3)(4)(5).
The transfer function, a term borrowed from electronics, is the representation of the relationship between the input and the output of a system (6,7). This concept has been translated within synthetic biology as the response of a regulable genetic device in the presence of a signal that acts as the control variable of the system. In most relevant scenarios, nonlinear responses are often desirable in order to implement the digital logic abstraction found in man-made circuits. This can be achieved using mechanisms such as saturation of biochemical systems (8), ultrasensitivity (9), multistability (10) and transcription factor cascades (11) among others. Hill functions have been commonly used for the fitting of experimental datasets in biochemistry (12), computational biology, (13), pharmacology (14), systems and synthetic biology (10,(15)(16)(17)(18). The success of this approach comes from the fact that fitting data require little a priori knowledge of the underlying biological mechanisms, and provide quantitative information about affinity and cooperativity of the system (8).
In genetics, Hill-like functions come from the assumption of cooperative effects due to transcription factor multimerization (19) and can be derived from equilibrium calculations on ligand-receptor binding. However, in most cases, its representation results from the correction of the hyperbolic Michaelis-Menten approach by adding an empirical exponent n (14), written as As a consequence of its empirical nature, neither the original Michaelis-Menten premises nor biological information remains in the model, losing the link between the kinetic parameters and biological mechanisms. Accordingly, models constructed by fitting have very limited predictive value beyond the exact conditions in which data were acquired. Thus, the approximation taken is largely a heuristic one. Design often requires iterative optimization steps. However, any device modification may lead to some type of unpredictable behavior, forcing further empirical characterization. Unfortunately, such a circumstance is not rare in the process of construction and testing of a genetic device (4). Hence, there is a need for a more suitable framework that allows predictions and avoids time-consuming data collection.
In this regard, the Michaelis-Menten approach (20) may offer an inspiring alternative to the broadly accepted Hill fitting. Interestingly, the transfer function concept fairly matches the substrate-velocity plot for enzymatic catalysis. This classical plot constitutes a clever characterization of enzyme kinetics, connecting a simple experimental setup with a biochemically grounded model, based on very precise premises. In that way, an analogous perspective for genetic devices would confer to transfer functions a desirable predictive value. The aim of this work is to establish a quantitative relation between input's affinity, signal amplitude and the variation of the control variable (i.e. induction molecules). In order to provide an experimental validation, we shall compare our model predictions with the characterization of an engineered device: the Lux system.
The quorum sensing Lux system has been extensively used in synthetic biology (15,16,21). With a sophisticated regulation in nature (22), its engineered versions have been restricted to the transcriptional level, to which a Hill-like behavior with a wide range of cooperativities has been reported (7,13,16,23).
When we look at the biochemical characterization, the interaction of LuxR dimer with 3-oxo-C 6 -homoserine lactone (3OC6HSL) induces the binding to promoter (24). This process is mediated in a noncooperative manner by 3OC6HSL, as suggested by studies in Lux and its Car homolog systems (25,26). Interestingly, receptor without lactone cognate is able to bind the DNA promoter (27), suggesting that some expression mediated by free receptor may occur. This scenario, schematically represented in Figure 1A, provides a starting point for a more biologically meaningful model of this system. However, one issue remains: the ability to control and manipulate the elements of the device.
From an engineering perspective, modularity and orthogonal function of genetic parts is the key for the construction of tailored devices. At this point, the ribosome-binding sites (RBSs) are useful elements to control the efficiency of the translation of the mRNA pool. Efforts on the characterization of RBSs variants for different organisms have provided valuable information for the choice of one or another RBS in a genetic system (28). A comparison of the effect of these parts in the expression of the final output is given by its relative strength, which is calculated using a standard value of expression as a reference for normalization (29). The use of different RBSs constitutes a common way to modulate the expression of a particular gene. But what is the impact of RBS changes on the behavior of a device?
To tackle this question, our work presents an enzymology-like approximation that allows us to explore the role of different RBSs in a genetic pLux-LuxR-inducible circuit. This study shows how tunable parts of the device, Figure 1. Schematic representation of genetic regulation for the inducible LuxR-pLux engineered devices used in this study. Notation follows the mathematical model: R LuxR receptor, L lactone, and P pLux promoter, R 2 dimerized receptor without lactone, Q dimerized receptor with lactone, S R transcriptional complex not mediated by lactone, S L lactone mediated transcriptional complex, mG mRNA of reporter gene and G reporter protein. The K i notation represents equilibrium constants, while k i refers to kinetic constants (A). Genetic architecture of the three constructs analyzed (B).
such as the expression of receptor, modulate the transfer function in a completely predictable manner. As a proof of concept, an experimental characterization and a further mathematical modeling of an inducible genetic device is presented. This picture more close to the biological mechanism suggests some limitations of the convencional Hill fitting approach.

Bacterial strains and growth conditions
Cloning and expression experiments were performed in Escherichia coli Top10 (Invitrogen, USA). Cells were grown in Lysogeny Broth (LB) at 37 • C and selected with appropriate antibiotics (chloramphenicol 340 g/ml; kanamycin 250 g/ml; or ampicilin 100 g/ml; Sigma, USA). Bacterial strains were preserved in LB glycerol 20% (v/v) at −80 • C.
Three genetic devices were built, composed of a common 3OC6HSL inducible part followed by three On one hand, the inducible construct had the following structure: B0014-R0062-B0032-E1010; on the other hand, there were three variants of the constitutive part, cloned as B0014-R0040-X-C0062-X-E0040-B0014, where X corresponds to the three aforementioned alternative variants: B0034, B0032 and B0033 (labeled as strong, medium and weak, respectively). Technical details about the relative strength of these RBSs are taken from http://parts.igem.org/Ribosome Binding Sites/ Prokaryotic/Constitutive/Community Collection. As detailed in the part registry, the B0034 part was used as a standard for normalization. Therefore the relative strengths used in this work were 1, 0.3 and 0.03 for B0034, B0032 and B0033, respectively. All constructs were included in the Biobricks high copy number plasmid (pSB1AK3) and transformed by chemical method. In the case of the experiments with no receptor (see Figures 3C and 5B), a construct bearing only the inducible part within the same plasmid was used. All genetic constructs were confirmed by Sanger sequencing.

Fluorescence assays for gene expression determination
Strains containing the plasmid of interest were grown overnight in LB ampicillin at 37 • C and continuous shaking. A 1000-fold dilution from overnight culture was grown until exponential phase, OD 660 ≈ 0.4. Cultures were centrifuged at 4000 g, during 5 min and resuspended in fresh LB ampicillin up to an OD 660 = 0.3. Incubation for in vivo measures was carried out by transferring 100 l of the diluted cultures and 100 l of LB ampicillin with the appropriate 3OC6HSL (N-[␤-ketocaproyl]-L-homoserine lactone; Cayman Chemical Company, USA) concentrations into a flat bottom 96-well microplate (Nunc, Thermo Fisher Scientific, USA). LB without cells was included in the incubation as a background control for both fluorescence and absorbance.
Gene expression was monitored in time for a battery of 3OC6HSL concentrations by quantification of the RFP. LuxR was indirectly reported by measuring the concomitant expression of GFP placed in tandem with LuxR (Figure 1B). Incubation and measures of bacterial cultures dur- ing characterization were performed on a Synergy MX microplate reader (BioTek Instruments, USA) every 10 min for 14 h. Fluorescence measures for RFP (ex: 578 ± 9 nm, em: 616 ± 9 nm) and GFP (ex: 478 ± 9 nm, em: 516 ± 9 nm) with gain 70 were carried out, as well as optical density (OD at 660 nm) measures. Incubation was done at 37 • C with continuous orbital shaking (medium intensity). 3OC6HSL concentration conditions were prepared from an initial stock at 10 −2 M (3:1, phosphate buffered saline:ethanol). Serial dilutions in LB ampicillin ranging from 10 −4 to 10 −10 M were prepared the day of the experiment.

Data transformation and Hill function fitting
Sample absorbance and fluorescence readings (OD 660 (S), f(S)) were corrected using signal background control (OD 660 (B), f(B)). Averaged data were obtained from six independent experiments. As described in (18), output signal Θ i was calculated according to the formula: where i refers to GFP or RFP. The value Θ corresponds, with a factor of proportionality, to the concentration of the fluorescent protein i per cell. Matlab R2013a software was used for fitting according to the following formula, assuming a cooperative behavior and signal basality: Nonlinear least squares were computed using the trust region algorithm with default settings.

Time series and signal variation computation
A normalized value of Θ RFP , Θ RFP (norm), was calculated for every time step as following: given a time, Θ RFP for each [3OC6HSL] was divided by the maximal value. We defined signal variation, S, as the coefficient of variation of Θ RFP (norm) in a time interval. Evolution of S over time was calculated using a moving window, consisting of five points of Θ RFP (norm), therefore capturing the information of 50 min in total.
The relation between the values of S for the different [3OC6HSL] was evaluated using the estimator α. This value, defined as the standard deviation of the different S curves, was used to establish the region of the steady-state condition for transfer function acquisition. According to α analysis, the time for the transfer function acquisition was set at 14 h, applying a gain 75 for fluorescence measures.

Transfer function acquisition
The exact time for transfer function acquisition is often arbitrary and still constitutes an open issue (7,30). As an illustrative case, Figure 2A shows the evolution of Θ RFP along time for different [3OC6HSL] for the construct with the RBS rptor (medium). Similar qualitative behavior was observed in the other two constructs (data not shown). Representation of Θ RFP (norm) of the different [3OC6HSL] curves allowed a qualitative comparison of the transfer function along time. Figure 2B shows how the transfer function converged (from black to red lines) to a more and more signal overlap.
After the introduction of 3OC6HSL, cells require time for protein production and maturation. At the steady state, i.e. Θ/ t ∼ 0, production is compensated by degradation processes, giving rise to a constant value of Θ along time. However, given any arbitrary interval time t, Θ consists of the genetic behavior due to the actual change associated to protein production and the noise associated to the measure in that interval. This imposes a limitation for the application of the steady state definition.
To overcome this limitation, we used S as a way to establish a practical definition for the steady-state condition from experimental data (see Materials and Methods section for definition). As at the steady state biological signal does not contribute to the variation of the output, this one must be given only by noise. In Figure 2C, where the time evolution of S for every [3OC6HSL] (labeled from black to green) is shown, one can see how dispersion values depended on time but also on [3OC6HSL] (inversely proportional). The differences among S curves were captured in the time evolution of α value (see Materials and Methods section), represented by a blue thick line in Figure 2C. Looking at α, we could arbitrarily select a threshold to define the time when the steady steady is assumed for the transfer function acquisition. This mathematical transformation allowed us to usefully collapse the information about the level of fluctuations in a single curve. In our study, we chose a final point at 14 h to perform the transfer function characterization and further modeling fitting.

Characterization of the pLux device varying RBS rptor
In order to modulate the strength of gene expression, we characterized a set of constructions using different RBS rptor as illustrated in Figure 1B. The characterization of the respective transfer functions is summarized in Figure 3. The amount of output in response to [3OC6HSL] suggested a noncooperative effect, i.e. n ≈ 1 (see Figure 3A and Table 1 for numerical details). Furthermore, the results showed a decrease of the turning point when stronger RBS rptor were used.
The expression of receptor monitored by fluorescence measurements of the GFP tandem construction ( Figure 1B) showed how the receptor expression was practically unaffected in RBS rptor (medium) and RBS rptor (weak) constructs. However, a significant decrease of Θ GFP was observed in the case of the RBS rptor (strong) construct.
Measures with zero ligand concentration produced a basal signal. Such a response was also dependent on the strength of the receptor, as shown in Figure 3C. The analysis of constructs without receptor also produced a basal expression, independently of lactone concentration. This suggests that the engineered pLux alone was able to promote the expression without receptor.

Enzymology-based model premises
The mathematical model for the inducible device was established according to the following premises: r The parts of the genetics device have an orthogonal functionality, i.e. the components perform one predefined function and they have not got undesirable interactions with the other cellular components. Therefore, they perform the expected behavior.
Besides these general assumptions, for our specific LuxR device and according to the results shown in the previous section, we shall consider the biological mechanism shown in Figure 1A. In agreement to this, RFP expression is modulated by lactone concentration and both luxR-independent and luxR-dependent leakiness are also considered. In order to provide a more understandable view of this characterization, we formulate a mathematical model using the former set of enzimology-inspired premises.

A simple model predicts the role of the RBS rptor strength in output amplitude and turning point
Departing from the genetic circuit illustrated in Figure 1A, our model considers a process of sensing a molecule L by joining with a dimeric receptor R 2 . It is worth mentioning that the requirement of more than one L molecule would introduce a power over [L]. For the sake of simplicity, we shall assume that one ligand molecule is enough for receptor's activation and further addition has no impact on the behavior of active receptor complex. This assumption is supported by the noncooperative process observed in our experimental characterization and by the reported evidence previously mentioned in the Introduction section.
In this simplified version, we assume that leakiness due to receptor binding and naked promoter does not occur, i.e. k m0 = 0 and K 2 = 0. The reactions considered in this simple scenario are: giving rise to the following differential equations:  In this simple model, [R T ] is a function of [R 2T ], as detailed in the mathematical appendix. For the sake of simplicity and according to the biology of the system, we assume that the process of dimerization is prior to the binding with the ligand. Furthermore, we mathematically impose that the ligand interaction has a negligible effect on the equilibrium between dimer and monomer. According to this, we define two conservation equations: where by the assumption of [R T ] [P T ], [S L ] is not considered in (10). Applying the equilibrium constant definitions (5) and (6), we rewrite the [S L ] expression and the conservation Equations (10) and (11) as According to these equations, we can rewrite [S L ] in function [R 2T ] and [P T ] in Equation (9), giving rise to the expres- where γ and g are groups of kinetics constants as detailed in Equations (35) and (36) of the mathematical appendix.
[R 2T ] can also be written as a function of the normalized RBS strength (φ), with respect to the canonical RBS part (see the mathematical appendix for further details). Now, by defining we obtain a more compact expression in a familiar Michaelis-Menten form: Interestingly, the model allows us to define the maximum expression of the reporter (G app m ) and the affinity constant of the device (K app ) as functions of the normalized strength of the RBS receptor (φ). According to the expressions (16), (17) and (18), Figure 4 illustrates the Michaelian effect of [L] and φ in the device. Notice that G app m reaches its maximum value when φ (K 3 r ) −1 is satisfied (gray region in the Figure 4B). Under such a condition, the affinity of the device increases, reducing the value of K app as φ −1 (see Figure 4B).
In this model, G app m ranges from zero at φ = 0 to a maximal value, determined by the efficiency of the machinery, formally γ gk mL [P T ]. Conversely, K app at φ = 0 acquires its maximum value K −1 1 , corresponding with G app m = 0. It is worth mentioning that extreme cases address an ideal trend and they would not match real behavior. This is due to considerations of the simple model, such as the availability of ligand and resources, as well as the extreme variation of the φ values, which are far to be satisfied under physiological conditions. Unlike Hill approximation, this simple equation allows us to see in a qualitative way the role of this device according to the variation of its components.
It is worth to note the similarity of equation (18) with the so-called reversible acompetitive inhibition in enzymology (8). Input affinity and signal amplitude is affected in a saturable way by the RBS strength as it occurs with the inhibitor in the enzymological model. However, in our case such a dependence occurs in an activatory fashion.

Mathematical model including leakiness predicts the effect of receptor on output basality
The model presented in this section is a generalization of the previous one. In contrast to the simplified version, this model incorporates two lactone-independent expression pathways: one mediated by the free receptor and another by the naked promoter. Departing from the chemical reactions previously presented in Equations (4), (5) and (6), we now add the following binding equilibrium: The concentration of the protein reporter [G] is determined by the equation based on the intermediates species--including the free promoter activity mediated by k m0 --as described in the following ODEs: Analogously to the simple model, we shall write an expression of [G * ] based on equilibrium constants and total concentration of the chemical species. As detailed in the mathematical appendix, the resulting expression in its compact version is: where a 1 , a 2 and a 3 are functions of φ (detailed in Equations (53), (54) and (55) of the mathematical appendix), accounting for amplitude signal, basality and ligand affinity, respectively. By considering the two limit cases [L] = 0 and [L] = ∞ of the (22) and defining the turning point as that [L] = L 0.5 for G m /2, we connect mathematical parameters with measurable values of the transfer function as follows: The parameters G 0 , G m and L 0.5 correspond to the basal expression, maximal expression and turning point, respectively. Continuous lines in Figure 5A show that the effect of φ is similar to the observed one from the simple model behavior for signal amplitude (G m ) and turning point (L 0.5 ) (see Figure 4A for comparison). Leakiness by free receptor gives rise to a basal signal at [L] = 0 that also depends on φ in a saturable way, according to Equation (23).
This expression, although less treatable than the simpler version, offers a predictable behavior useful for data fitting. Figure 5 shows the fitting of experimental data using values of normalized strength for the strong (φ = 1), medium (φ = 0.3) and weak (φ = 0.03) RBS parts, as are described in the parts registry collection (see Materials and Methods). The fitting using the Equation (22) gave a basal output and turning point that fairly matched the experimental values (see Table 1). In the same way, amplitudes from the model and the experiments followed a similar trend as suggested by Pearson correlation between data and the model, as detailed in Figure 5.

DISCUSSION
Despite the potential applicability of Synthetic Biology in biomedicine and environmental issues, a proper characterization of global behavior of devices in a reliable and predictable way is missing. In this regard, the use of biological meaningful models allows us to determine the behavior of tunable genetic devices.
Inspired by well established formal approaches from enzymology, our results show that the impact of tunable parts such as the RBS receptor, the plasmid copy number (which would modulate [P T ]) and even the RBS directly affecting RPF, can be predictably studied without the use of the Hill function approach. From an engineering perspective, the model reveals that the sensation by an inducible device can be adjusted by varying the receptor levels. A strategy based on the changes of the receptor expression is worth considering rather than complicated protein engineering. Interestingly, the model offers a suitable explanation about the lack of cooperativity observed in the experimental results: only one molecule of lactone is required for the activation complex. This is in agreement with evidence found in literature for Lux and homologous quorum sensing receptors. The free ligand receptor formation allows us to explain the leakiness mediated by free receptor described in this work. Such evidence provides some concerns about the behavior of the engineered Lux device, specially when positive cooperativity is desired (13,15).
Noteworthy, for the RBS rptor (strong) we observed a reduction in the receptor levels, despite its expression took place under a constitutive promoter. This reduction may affect the device response when scaling up the system. These experimental results may suggest the existence of an indirect negative interaction between inducible gene expression, in our case RFP, and other nonregulated genes, such as LuxR. We speculate that the metabolic load associated to device induction could be responsible for this behavior. Looking at Figure 5B, the higher levels of basal RFP expression in constructs with and without LuxR in absence of lactone can be interpreted in similar terms. This latter example illustrates the possibility that experimental setup may not adjust well to our model premises, being the nonorthogonality of genetic devices another possible source of discrepancies. The predictive value of the model precisely relies on the assumption that its premises are fulfilled, and therefore deviations from the expected behavior could be a hint that some premises are broken. Further research incorporating factors such as metabolic load and crosstalks might improve the predictive capabilities of the model.
Bearing all this in mind, principles of organization described in this work may offer an evolutionary insight, in particular, in processes of adaptation or even the emergence of some type of diseases. According to our results, mutations at the level of receptor expression may offer a finer tuning process than those occurring at the polypeptidic chain of the receptor. In this context, the work provides a proof of concept for an interesting evolutionary perspective of the principles of biological design extracted from a synthetic biology approximation.

CONCLUSION
The use of an enzymology-based approach provides a framework for the study and reliable characterization of synthetic devices uncovering interesting connections of the principles of organization of natural systems. Further work on the extension of an enzymological approach to the study Nucleic Acids Research, 2014, Vol. 42, No. 22 14067 of more complex genetic behaviors would be of great interest for the controllability and development of new synthetic genetic devices.