A tightly regulated and adjustable CRISPR-dCas9 based AND gate in yeast

Abstract The robust and precise on and off switching of one or more genes of interest, followed by expression or repression, is essential for many biological circuits as well as for industrial applications. However, many regulated systems published to date influence the viability of the host cell, show high basal expression or enable only the overexpression of the target gene without the possibility of fine regulation. Herein, we describe an AND gate designed to overcome these limitations by combining the advantages of three well-established systems, namely the scaffold RNA CRISPR/dCas9 platform, which is controlled by Gal10 as a natural and by LexA-ER-AD as a heterologous transcription factor. We hence developed a predictable, modular and versatile expression control system. The chosen reporter setup, which couples a gene of interest (GOI) to a fluorophore via the ribosomal skipping T2A sequence, allows the system to be adapted to any gene of interest without losing reporter function. In order to obtain a better understanding of the underlying principles and the functioning of our system, we backed our experimental findings with a mathematical model and single-cell analysis.

: Induction of LexA-ER-B112 fusion protein transcription activation by ES.
The fusion protein containing the LexA binding domain, the human estrogen receptor ER and the transcription activator B112 is bound to Hsp90 in the absence of ES. After addition of ES, Hsp90 is displaced and the complex can enter the nucleus and bind to the lexA boxes preceding the MCP-VP64 gene. The gene is transcribed to mRNA, which transits from the nucleus to the cytoplasm and is translated into a functional fusion protein.
Figure S2: scRNA design. The scRNA contains a 20 bp region for the specific targeting of tetO in front of a target gene, two loops designed for dCas9 recruitment and two loops for MCP binding (MS2, f6). Labels in the figure: wt, f6, MCP binding, dCas9 binding, tetO binding.
Figure S3: Induction of Venus expression using different tandem copies of tetO. Clones containing 1, 2, 3 or 7 copies of the tet operator sequence in front of the reporter gene Venus were designed after Zalatan et al. [1]. The tetO wt strain contained no CRISPRi genes, but the tetO reporter constructs. The samples called gRNA contained an uninducible system with a direct VP64 fusion to dCas9. The strains containing the MCP-VP64 fusion were called scRNA, SD for uninduced and scRNA, SG for galactose-induced samples. The controls display the results for the 7x tetO Venus strain that contains a genomically integrated rtTA (see Supplementary Table S1) induced with doxycycline (SD) [1] and the wild type wt, which contained no genes for Venus expression.
Figure S5: Influence of induction of reporter gene expression on cell growth (y-axis: cell density, OD at 600 nm). The strains were inoculated to an OD600 of 1 in SG medium and grown overnight in SD medium, SG medium or induction medium containing SG and 1 nM, 10 nM or 100 nM ES, respectively. For the Venus reporter system (filled) no significant influence on growth could be detected, whereas with increasing amounts of ES the cell density of the GOase-tGFP reporter system (hatched) decreased slightly.

Gating
In order to maintain comparability and comprehensibility, we followed a minimal gating strategy. We applied the following two gates to all of the data:
• To remove debris we excluded all events with a forward-scatter area (FSC-A) signal below an experiment-specific threshold (Fig. S13(left)).
• To remove possible doublets and cell groups we used a forward-scatter height (FSC-H) vs. forward-scatter area (FSC-A) density plot and excluded the cells that did not follow the expected linear relation (Fig. S13(right)) [2].
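The two gates can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used for the paper: the function name `gate_events`, the median-ratio estimate of the linear FSC-H vs. FSC-A relation and the tolerance `ratio_tolerance` are assumptions made here for illustration only.

```python
def gate_events(events, fsc_a_min, ratio_tolerance=0.15):
    """Return the events that pass the debris gate and the singlet gate.

    events          -- list of (fsc_a, fsc_h) tuples, one per event
    fsc_a_min       -- experiment-specific debris threshold on FSC-A
    ratio_tolerance -- hypothetical relative width of the singlet band
    """
    # Gate 1: remove debris, i.e. events with a low FSC-A signal.
    cells = [(a, h) for a, h in events if a >= fsc_a_min and a > 0]
    if not cells:
        return []
    # Gate 2: singlets follow FSC-H ~ slope * FSC-A. The slope is estimated
    # as the median FSC-H/FSC-A ratio (an assumption of this sketch).
    ratios = sorted(h / a for a, h in cells)
    slope = ratios[len(ratios) // 2]
    return [(a, h) for a, h in cells
            if abs(h / a - slope) <= ratio_tolerance * slope]


# Toy data: one debris event, three singlets and one doublet-like event.
events = [(50, 45), (1000, 900), (1200, 1080), (1100, 990), (2400, 1200)]
singlets = gate_events(events, fsc_a_min=200)
```

In practice the threshold and the width of the singlet band would be chosen per experiment from the density plots.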

Stepwise galactose
Yeast cells were grown in SD-URA medium overnight at 30 °C. The cell density was determined photometrically and 1 · 10^7 cells/mL were inoculated into synthetic complete medium lacking uracil and containing different amounts of galactose (SG). Induction was completed by addition of 100 nM ES. Samples were analyzed after 20 hours of induction. For this purpose, 300 µL of the cell culture were centrifuged at 8000 rpm in a Heraeus Biofuge Pico centrifuge to remove the medium. Cells were resuspended in PBS and analyzed by flow cytometry as described in the main paper. Data analysis was performed using Mathematica 11.

Serial dilution spotting assay
The double reporter strain was grown overnight in 5 mL SD-ura at 30 °C. The cell density was determined photometrically and the suspension was diluted to 1 · 10^7 cells/mL. Cells were induced in a total volume of 3 mL for 20 hours at 25 °C. Samples for growth in SD-ura, SD + 2000 nM ES, SG-ura, SG + 2000 nM ES, SG + 100 nM ES, SG + 10 nM ES and SG + 1 nM ES were prepared. Following [3], 25 µL of each sample were transferred to a 96-well plate after induction and a serial dilution was performed in sterile PBS. 5 µL of each dilution were dropped onto an SD-ura agar plate and cells were incubated at 30 °C for up to three days to allow growth differences to appear.

Construction, preparation and detection of the vNAR reporter strain
The strain containing an additional double reporter system is based on a vNAR-T2A-tGFP reporter construct. Cells were constructed as described in the main paper, whereby an exemplary anti-Matuzumab vNAR [4] gene was integrated instead of the GOase gene. The vNAR gene was designed with an additional C-terminal His-tag to allow purification via immobilized metal affinity chromatography (IMAC) followed by validation through immunostaining. After construction and verification of correct genomic integration, induction was performed similarly to the GOase construct as described in the main paper. Cells were grown overnight in 50 mL flasks in SD-URA medium at 30 °C. After growth, cells were inoculated to a cell density of 1 · 10^7 cells/mL into 500 mL SG-URA and induction was completed by addition of 500 µL of a 1 mM ES solution. Induction was performed at 30 °C for 20 hours and vNAR-tGFP expression was verified by flow cytometry (data not shown). The cells were pelleted and the supernatant was concentrated to 3 mL using Amicon Ultra-15 Centrifugal Filter Units (MWCO 3 kDa, Merck Millipore) (data not shown).
The cell pellet was suspended in IMAC buffer A (10 mM imidazole) and disrupted using a cell disrupter (Constant Systems Ltd), and cell debris was removed by centrifugation. For isolation of the His-tagged vNAR by IMAC, 30 mL of cell lysate were applied to a 1 mL HisTrap HP column (GE). After washing with 10 column volumes (CV) of buffer A, elution was performed with a linear gradient of 0-100% buffer B (1 M imidazole) over 20 CV while collecting 1 mL fractions.
After SDS-PAGE (performed as described in the main paper) of selected samples, gels were either stained with Coomassie Blue for visualization of the whole protein content or blotted onto a nitrocellulose membrane (semi-dry Western blot) for later immunostaining. 5 µL of the Blue Prestained Protein Standard, Broad Range (11-190 kDa) from NEB were used as marker. For each sample, 15 µL were applied. The membrane was blocked with 1.5% milk powder solution and incubated with monoclonal Penta-His antibody (produced in mouse, Qiagen) followed by Anti-Mouse IgG (whole molecule)-Alkaline Phosphatase antibody (produced in goat, Sigma-Aldrich), with washing steps (PBS with 0.05% Tween 20) between the individual steps. Addition of NBT/BCIP substrate solution resulted in a purple-black precipitate in the presence of alkaline phosphatase activity, enabling indirect detection of His-tagged proteins.

ROC-Curves
Our system represents a single-output logic gate that can be understood as a binary classifier. In signal detection theory, ROC (receiver operating characteristic) curves are used to characterize the performance of such classifiers; hence it was recently suggested to use ROC curves instead of the conventionally used fold activation to display a gate's performance [5]. While the fold activation only accounts for the bulk behavior, the ROC curves incorporate the cell-to-cell variability and thus give a better estimate of the device's functionality at the single-cell level. One important advantage of the ROC curve is that it characterizes the classifier performance independently of a particular choice of threshold. Following [5], we obtain the coordinates (x(T), y(T)) of the points on the ROC curve for each possible fluorescence threshold T by the following two equations:
\[ x(T) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left[ f_i > T \right], \qquad y(T) = \frac{1}{M} \sum_{i=1}^{M} \mathbf{1}\left[ \tilde{f}_i > T \right], \]
where N and M are the number of cells in the OFF and ON state, respectively, f_i and \tilde{f}_i denote the corresponding fluorescence values of the ith cell, and \mathbf{1}[\cdot] equals 1 if its condition holds and 0 otherwise.
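As a minimal sketch of how such a curve can be computed from single-cell data, the following Python snippet takes x(T) as the fraction of OFF cells and y(T) as the fraction of ON cells whose fluorescence exceeds the threshold T, which is the usual ROC convention and is assumed here. The function names `roc_points` and `auc` and the toy values are illustrative and not part of the original analysis.

```python
def roc_points(off_values, on_values):
    """Return the ROC points (x(T), y(T)) for all relevant thresholds T.

    x(T): fraction of OFF cells with fluorescence above T
    y(T): fraction of ON cells with fluorescence above T
    """
    n, m = len(off_values), len(on_values)
    # Every measured value is a candidate threshold; -inf closes the curve
    # at (1, 1), and the largest value yields the starting point (0, 0).
    thresholds = sorted(set(off_values) | set(on_values), reverse=True)
    thresholds.append(float("-inf"))
    points = []
    for t in thresholds:
        x = sum(f > t for f in off_values) / n
        y = sum(f > t for f in on_values) / m
        points.append((x, y))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy fluorescence values: perfectly separated OFF and ON populations.
off_state = [1.0, 2.0, 3.0, 4.0]
on_state = [5.0, 6.0, 7.0, 8.0]
area = auc(roc_points(off_state, on_state))
```

The area under the curve is a common threshold-independent summary of such a curve: it approaches 1 for well-separated OFF and ON populations and 0.5 for indistinguishable ones.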

Deterministic Model
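We start with the given system of two differential equations, in which the production rate is the Hill function
\[ f(es_n) = \nu \, \frac{(es_n/k)^h}{1 + (es_n/k)^h} . \]
In the stationary state the time derivatives can be set to zero and one obtains
\[ g = \varphi \, \frac{(es_n/k)^h}{1 + (es_n/k)^h} , \]
where \varphi = \nu/\delta. To be able to compare the model and the data, we introduce a scaling factor \rho that maps the number of g to the measured fluorescence value. Our fitting function is hence
\[ \rho \, g = \Theta \, \frac{(es_n/k)^h}{1 + (es_n/k)^h} , \]
where \Theta = \rho\varphi = \rho\nu/\delta is now a generalized scaling factor.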
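The relation between the Hill-type production rate, the stationary state and the fitting function can be checked numerically. The sketch below assumes that g follows dg/dt = f(es_n) - δ g, which is consistent with φ = ν/δ but is not spelled out in this document. All parameter values are illustrative and are not the fitted values.

```python
def hill(es, k, h):
    """Dimensionless Hill term (es/k)^h / (1 + (es/k)^h)."""
    x = (es / k) ** h
    return x / (1.0 + x)


def fit_function(es, theta, k, h):
    """Predicted fluorescence: generalized scaling factor times Hill term."""
    return theta * hill(es, k, h)


def integrate_g(es, nu, delta, k, h, t_end=200.0, dt=0.01):
    """Euler integration of the assumed equation dg/dt = nu*hill(es) - delta*g."""
    g = 0.0
    for _ in range(int(t_end / dt)):
        g += dt * (nu * hill(es, k, h) - delta * g)
    return g


# Illustrative parameter values only (not taken from the fit).
nu, delta, k, h, rho = 5.0, 0.1, 6.0, 1.5, 2.0
es = 10.0

g_numeric = integrate_g(es, nu, delta, k, h)       # long-time limit
g_stationary = (nu / delta) * hill(es, k, h)       # phi * Hill term
fluorescence = fit_function(es, rho * nu / delta, k, h)
```

For long integration times `g_numeric` approaches `g_stationary`, and `fluorescence` equals `rho * g_stationary`, which is the content of the generalized scaling factor Θ.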
Fitting this model to our experimental data we obtained the following parameter values:

Stochastic Model
Keeping in mind that the volume of the nucleus of a yeast cell is roughly 3 · 10^-15 L, one notices that the Michaelis-Menten constants obtained by the deterministic fit correspond to ≈ 2-30 ES particles per nucleus. In order to verify whether our model reproduces the bimodal distribution that was observed in the fluorescence histograms of the analyzed systems, we hence made use of a stochastic version of our model. The model is based on the chemical equations given in the main paper. For the parameter values of k and h we used the values obtained in the deterministic model (see section 3.4 in this document). The concentration of ES_m is given by the experimental setup; the parameters ρ, δ and ν are constrained by the scaling factor Θ. γ was manually fitted for both systems separately. While γ should be independent of the GOI, the obtained values differ by a factor of ten. This might be due to some natural variation in the data, but could also be an indication that there are other reversible processes operating at slow time scales. Due to the Hill-function rates used, our model does not have mass-action kinetics. It was recently shown that this can lead to a loss of accuracy, especially of the higher-order moments of the distribution [6,7]. Nevertheless, since we later added extrinsic noise with a comparably large variance to the data (described below), the error due to the nonlinear rates seems negligible.
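The conversion from a concentration to a particle number per nucleus is a one-line calculation. The sketch below uses the nuclear volume quoted above; the function name and the example concentrations are chosen for illustration and are not the fitted constants.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def particles_per_nucleus(conc_nM, volume_L=3e-15):
    """Convert a concentration in nM to a mean particle number per nucleus."""
    return conc_nM * 1e-9 * AVOGADRO * volume_L


# Illustrative concentrations in nM (not the fitted constants).
for conc in (1.0, 6.0, 16.0):
    print(f"{conc:5.1f} nM -> {particles_per_nucleus(conc):5.1f} particles")
```

With this volume, 1 nM corresponds to roughly 1.8 particles, so constants in the low nanomolar range translate into particle numbers small enough that a stochastic treatment is reasonable.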
In order to estimate the noise and the fluorescence offset generated by autofluorescence and the measurement device, we fitted Gaussian distributions to the negative control (only SD) for both reporter systems. The differences in the extrinsic noise distributions are mainly due to the different gain settings in the cytometer.
The final model data were obtained as follows:
\[ \text{model data} = \rho \, S + O + \mathcal{N}(0, \sigma^2) , \]
where S are the raw data points obtained from the stochastic simulation of (8), ρ is the scaling factor constrained by Θ, ν and δ, O denotes the offset of the negative control and \mathcal{N}(0, \sigma^2) is a normally distributed random variable with mean 0 and variance σ^2 (see Fig. S14).
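A minimal sketch of this post-processing step is given below. It reads the description as scaled simulation output plus the offset of the negative control plus Gaussian noise; the function name `model_data`, the seed and the numerical values are hypothetical.

```python
import random


def model_data(samples, rho, offset, sigma, seed=0):
    """Map raw simulation output to model fluorescence values.

    samples -- raw data points S from the stochastic simulation
    rho     -- scaling factor from particle number to fluorescence
    offset  -- offset O estimated from the negative control
    sigma   -- standard deviation of the extrinsic Gaussian noise
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    return [rho * s + offset + rng.gauss(0.0, sigma) for s in samples]


# Hypothetical numbers, for illustration only.
raw_counts = [0, 3, 12, 40]
noisy = model_data(raw_counts, rho=2.0, offset=5.0, sigma=1.5)
```

Because the noise is added after scaling, σ and O are in fluorescence units and can be taken directly from the Gaussian fit to the negative control.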

Detrimental effects
For the dual-reporter system we observed that most cells showed fluorescence as predicted by our model. Nevertheless, a small fraction of cells did not show any fluorescence signal above the autofluorescence value. In order to analyze the origin of this unexpected behavior, we included a generic detrimental effect in our model. To this end, we defined the function ζ(ES) that depends on the ES concentration and returns the fraction of cells that do not show any fluorescence. We set this fraction of randomly chosen data points of our stochastic simulation to a value of 0 and rescaled the Θ parameter by 1/(1 − ζ) to keep the mean constant. We asked whether ζ(ES) is a constant function (and does not depend on ES) or whether there is a functional dependency of ζ on ES. While the statistics are not good enough to provide exact parameters, we can observe that the detrimental effects depend on ES in the form of a Hill function. In Fig. S15(left), ζ was set to
\[ \zeta(ES) = 0.21 \cdot \frac{1}{1 + (6/ES)^{1.5}} \]
with ES given in nM. The inaccurate predictions for 0.1 nM do not indicate a failure of the model, but reflect the high sensitivity of the system in this concentration regime: a small increase of ES by only 0.1 nM results in large changes of the predicted distribution.
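The detrimental effect can be sketched as follows, using the Hill-type ζ(ES) with the parameters quoted for Fig. S15 (maximum 0.21, half-saturation at 6 nM, exponent 1.5). The function names are hypothetical. Note that the sketch rescales the remaining values by 1/(1 − silent fraction), since that is the factor that keeps the population mean unchanged when a fraction of the cells is set to zero.

```python
import random


def zeta(es_nM):
    """Fraction of non-fluorescent cells as a function of ES (in nM)."""
    if es_nM <= 0:
        return 0.0  # limit of the Hill form for ES -> 0
    return 0.21 / (1.0 + (6.0 / es_nM) ** 1.5)


def apply_detrimental_effect(values, es_nM, seed=0):
    """Silence a random fraction zeta(ES) of cells and rescale the rest."""
    rng = random.Random(seed)
    n_zero = round(zeta(es_nM) * len(values))
    silent = set(rng.sample(range(len(values)), n_zero))
    # Rescaling by 1 / (1 - silent fraction) keeps the mean constant.
    scale = len(values) / (len(values) - n_zero)
    return [0.0 if i in silent else v * scale for i, v in enumerate(values)]


# Toy population with identical values, so the mean can be checked exactly.
population = [100.0] * 1000
affected = apply_detrimental_effect(population, es_nM=6.0)
```

At ES = 6 nM the Hill term equals one half, so about 10.5% of the cells are silenced in this example while the population mean stays at its original value.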