MegaTevs: single-chain dual nucleases for efficient gene disruption

Targeting gene disruptions in complex genomes relies on imprecise repair by the non-homologous end-joining DNA pathway, creating mutagenic insertions or deletions (indels) at the break point. DNA end-processing enzymes are often co-expressed with genome-editing nucleases to enhance the frequency of indels, as the compatible cohesive ends generated by the nucleases can be precisely repaired, leading to a cycle of cleavage and non-mutagenic repair. Here, we present an alternative strategy to bias repair toward gene disruption by fusing two different nuclease active sites from I-TevI (a GIY-YIG enzyme) and I-OnuI E2 (an engineered meganuclease) into a single polypeptide chain. In vitro, the MegaTev enzyme generates two double-strand breaks to excise an intervening 30-bp fragment. In HEK 293 cells, we observe a high frequency of gene disruption without co-expression of DNA end-processing enzymes. Deep sequencing of disrupted target sites revealed minimal processing, consistent with the MegaTev sequestering the double-strand breaks from the DNA repair machinery. Off-target profiling revealed no detectable cleavage at sites where the I-TevI CNNNG cleavage motif is not appropriately spaced from the I-OnuI binding site. The MegaTev enzyme represents a small, programmable nuclease platform for extremely specific genome-engineering applications.


INTRODUCTION
The rapid pace of development in the genome-editing field has led to a number of competing technologies, each with its own benefits and limitations (1,2). The technologies can be broadly characterized based on the nuclease domain used to introduce a double-strand break (DSB) or nick at a target site. Two common reagents are zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) that utilize the dimeric and non-specific FokI nuclease domain (3–6). Two ZFN or TALEN monomers must be designed in a head-to-head orientation to target a single site and positioned such that the FokI domains can dimerize to introduce a DSB (7–9), typically with 4-nt 5′ overhangs. The non-specific cleavage activity of the FokI nuclease domain facilitates targeting of a wide range of sequences, but can lead to off-target cleavage (10–12). The recently developed CRISPR/Cas9 system has received significant attention due to the ease of programming targeting (13,14). In this system, a ribonucleic acid (RNA) guide molecule (the crRNA) targets the Cas9 nuclease to a deoxyribonucleic acid (DNA) target through an RNA/DNA heteroduplex (15–18). A blunt-ended DSB results from two independent nicking reactions, one by an HNH nuclease domain and the other by a RuvC-like domain. An alternative nuclease architecture utilizes the naturally occurring meganucleases, or LAGLIDADG family homing endonucleases, which are typically encoded within self-splicing group I introns and inteins, and are characterized by an extensive protein-DNA interface (19,20). The nuclease active site is formed at the interface of two parallel α-helices, with cleavage generating a DSB with 4-nt 3′ overhangs. Recently, a number of recombinase (21,22) and sequence-specific nuclease domains have been developed as alternatives to the non-specific FokI nuclease domain (23–26). In particular, we and others showed that the monomeric and sequence-tolerant GIY-YIG nuclease domain from the homing endonuclease I-TevI could be fused to zinc fingers, meganucleases and TAL effectors to create novel monomeric enzymes (27,28). The I-TevI-based reagents are active on substrates that contain a preferred CNNNG cleavage motif, generating 2-nt 3′ overhangs.
Regardless of the technology, one common application of genome-editing nucleases is the generation of gene disruptions whereby mutagenic repair at the targeted DSB introduces frameshift mutations into a coding region. The mutagenic DNA repair events occur in the absence of an exogenously provided DNA repair template and result mainly from the non-homologous end-joining (NHEJ) pathway (29–31). However, many DSBs are repaired without mutation, as the compatible cohesive ends generated by the nucleases are re-ligated through the canonical NHEJ pathway, leading to a cycle of persistent cleavage and precise repair events that are non-productive for genome engineering. One strategy to bias repair events toward gene disruption is to co-express a DNA end-processing enzyme with the genome-editing nuclease (32,33). For instance, Trex2, a 3′-5′ exonuclease, dramatically increases gene disruption when co-expressed with ZFNs, TALENs and meganucleases by processing DSBs before DNA repair. One potential limitation of this strategy is the requirement to transfect the Trex2 coding region with the ZFN, meganuclease or dimeric TALEN constructs, which may be problematic in size-constrained vectors. Overexpression of Trex2 could also enhance mutagenic repair at unwanted off-target sites, although no increases in cellular toxicity or off-target cleavage were observed with Trex2-overexpressing cell lines (32,33). Gene disruption could be enhanced by targeting two reagents to the same locus, positioning two DSBs to effectively excise the intervening sequence and introduce a deletion. Such multiplexing of genome-editing reagents is constrained by the dimeric architecture of the ZFNs and TALENs (34) and, in the case of the CRISPR/Cas9 system, requires the use of nicking variants and dual-guide RNAs (35,36).
Nucleic Acids Research, 2014, Vol. 42, No. 13
Here, we propose an alternative strategy for gene disruption by coupling two different nuclease active sites into a single polypeptide. The MegaTev architecture is the fusion of a meganuclease (Mega) with the nuclease domain derived from the GIY-YIG homing endonuclease I-TevI (Tev). The two active sites are positioned ∼30 bp apart on DNA substrate, and generate two DSBs with non-compatible cohesive ends. The dual active MegaTev shows high gene disruption activity in HEK 293 cells without overexpression of DNA-end processing enzymes.

MATERIALS AND METHODS
Bacterial strains and plasmid construction
Escherichia coli DH5α (New England Biolabs) was used for plasmid amplification, ER2566 (New England Biolabs) for protein expression and BW25141 (DE3) for bacterial two-plasmid selections (Supplementary Table S1) (37). Tev-Onu and Tev-Ltr fusions were cloned into pACYCDuet-1 using 5′ NcoI and 3′ XhoI sites as previously described (27). For the yeast DNA repair assay (3,38), the Tev-Onu and Tev-Ltr genes were amplified using Phusion DNA polymerase (New England Biolabs) with a 3′ primer that introduced a C-terminal SV40 nuclear localization sequence (NLS) (primers are listed in Supplementary Table S2). Polymerase chain reaction (PCR) products were cloned into the NcoI/SalI sites of pGPD. The backbone target site plasmid for the yeast assay was created by amplifying a 300-bp fragment of the pTox plasmid, digesting the fragment with BglII/SpeI, and cloning the digested fragment into pCP5.1 digested with BglII/SpeI to create pCPTox. All target sites were subsequently cloned into pCPTox using in vivo homology-directed repair. For mammalian assays, human codon-optimized Tev-Onu fusions (synthesized by IDT-DNA) were PCR amplified and cloned into the PstI and RsrII sites of pExodus. Tev-Onu fusions were cloned in-frame with an mCherry gene linked by a T2A peptide sequence from Thosea asigna virus to separate the translated proteins. The TO15 target site was subcloned into the pMSCVpuro retroviral vector (Clontech) using BglII and XhoI sites to integrate into genomic DNA in HEK 293 cells. To generate target sites for episomal plasmid assays, substrates were cloned into the SacI/XhoI sites of the pcDNA3(+) vector. Constructs were confirmed by sequencing.

In vitro randomized substrate selection
A list of randomized target site oligonucleotides is found in Supplementary Table S2. The target site plasmid library for the randomized cleavage motif plus 3 bp of the spacer (N8) was constructed in the pSP72 backbone as described (39). The library complexity was estimated to be ∼6.4 × 10⁴ for the N8 library based on the number of independent transformants, and from analyses of next-generation sequencing data.
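For scale, the number of distinct sequences possible in a fully randomized 8-nt region can be worked out directly. This is a back-of-the-envelope comparison added here, not a calculation taken from the source:

```latex
4^{8} = 65\,536 \approx 6.6 \times 10^{4}
```

The estimated library complexity is therefore of the same order as the theoretical number of N8 variants.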
Cleavage assays were performed with 23 nM of Tev169-Onu E22Q and 10-nM N8 plasmid in NEBuffer 3 (50-mM Tris-HCl pH 7.9, 100-mM NaCl, 10-mM MgCl₂ and 1-mM dithiothreitol (DTT)) at 37°C for 5 min. The Tev169-Onu fusions were purified as described (27) (Supplementary Figure S1). Samples were prepared for Ion Torrent sequencing at the London Regional Genomics Centre by PCR amplification of the target site region from the input plasmid library and from the plasmids isolated after three rounds of selection using PWO DNA polymerase (Roche) with barcoded primers. The sequencing data were parsed with custom Perl scripts that checked for anchor sequences on either side of the randomized region, confirmed that the sequence between the anchors corresponding to the randomized region was 8 nt in length, and then extracted the randomized region for further analyses. For each round of selection, counts for each nucleotide j per position i were determined and then converted to proportions using the centered log-ratio transformation. Nucleotide selection was then determined by taking the difference in proportions for each nucleotide per position between the final round of selection and the input library. A positive value indicates selection or enrichment for a particular nucleotide relative to the input library, and a negative value indicates selection against a particular nucleotide relative to the input. The enrichment values were plotted in heat map format using R and ggplot2 (40,41), and enrichment values were considered significant if they were >2 standard deviations from the mean enrichment value for each data set.
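The equation for the centered log-ratio transformation does not appear in the text here. For reference, the standard textbook definition is sketched below, using the indices i (position) and j (nucleotide) from the paragraph above. The symbols clr, g and E, the choice of natural logarithm, and the labels "final" and "input" are assumptions made for illustration; the source does not state how zero counts were handled.

```latex
\operatorname{clr}(x_{ij}) = \ln\frac{x_{ij}}{g(x_i)}, \qquad
g(x_i) = \Bigl(\prod_{j \in \{A,C,G,T\}} x_{ij}\Bigr)^{1/4}

E_{ij} = \operatorname{clr}\bigl(x_{ij}^{\text{final}}\bigr) - \operatorname{clr}\bigl(x_{ij}^{\text{input}}\bigr)
```

Here x_ij is the count of nucleotide j at position i, g(x_i) is the geometric mean of the four counts at that position, and E_ij > 0 corresponds to enrichment relative to the input library.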

Cleavage assays on radiolabeled substrate
In vitro cleavage assays were performed on internally radiolabeled substrates that were PCR amplified with [α-³²P]dCTP. PCR products were loaded onto an 8% (w/v) polyacrylamide gel (29:1 acrylamide/bisacrylamide), run at 40 mA for ∼1.5 h, gel purified, eluted overnight at 42°C in 5 ml of TE pH 8.0 and concentrated into a 50-μl volume. Cleavage reactions were performed in 20-μl reaction volumes with NEBuffer 3 (50-mM Tris-HCl pH 7.9, 100-mM NaCl, 10-mM MgCl₂ and 1-mM DTT), 0.1 pmol of substrate and 2 pmol of Tev169-Onu fusion protein. Cleavage reactions were incubated at 37°C for 1, 5, 10 and 25 min before stopping the reaction with 6 μl of 100-mM ethylenediaminetetraacetic acid (EDTA) and 5 μl of loading dye containing 0.5% sodium dodecyl sulphate (SDS). Mutant Tev169-Onu fusions were incubated for 1 h at 37°C before stopping the reaction with 6 μl of 100-mM EDTA and 5 μl of loading dye containing 0.5% SDS. The entire reaction was loaded on a 15% (w/v) polyacrylamide gel (29:1 acrylamide/bisacrylamide) and electrophoresed at 40 mA for ∼1.5 h. The gel was removed from the apparatus and soaked in 10% glycerol plus 8% acetic acid before drying on Whatman paper and visualized using a phosphorimager (GE Healthcare).
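As a worked illustration of the reaction conditions above, the stated amounts in a 20-μl reaction correspond to the following concentrations. This is a simple unit conversion added for clarity, not a figure quoted from the source:

```latex
[\text{enzyme}] = \frac{2\ \text{pmol}}{20\ \mu\text{l}} = 100\ \text{nM}, \qquad
[\text{substrate}] = \frac{0.1\ \text{pmol}}{20\ \mu\text{l}} = 5\ \text{nM}, \qquad
\frac{[\text{enzyme}]}{[\text{substrate}]} = 20
```

The protein is thus present at a 20-fold molar excess over the substrate in these reactions.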

Modified two-plasmid target site screen
The 64 triplet variants for the CNNNG cleavage motif were screened using a modified two-plasmid selection (42). Transformants were gridded onto selective plates (Luria Broth (LB) plus 25-μg/ml chloramphenicol and 10-mM L-(+)-arabinose) and non-selective plates (LB plus 25-μg/ml chloramphenicol, 50-μg/ml kanamycin and 0.2% glucose) and incubated overnight at 37°C. Plasmids isolated from survivors and non-survivors were sequenced to identify the NNN variant of the CNNNG motif.

Yeast β-galactosidase repair assay
This assay was performed as described (27). Briefly, YPH499(a) containing target site constructs were mated in triplicate with YPH500(α) harboring the MegaTev constructs. After an overnight selection for diploids, cells were assayed for β-galactosidase activity using ortho-nitrophenyl-β-galactoside (ONPG). Activity was normalized to either a validated homodimeric zinc-finger nuclease (Zif268) or the wild-type TP15 substrate depending on the assay.

DdeI-resistance assays with plasmid substrates
HEK 293T cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum. Approximately 2.5 × 10⁶ cells were seeded 24 h before transfection on 6-cm plates. Cells were co-transfected with 3 μg of pExodus Tev169-Onu E22Q and 3 μg of pcDNA3(+) TO15 using calcium phosphate and incubated at 37°C with 5% CO₂ for 16 h before changing media. After 48 h, plasmid was isolated from HEK 293T cells using the BioBasic miniprep kit. Target sites were PCR amplified, separated on a 1% agarose gel and gel purified. After gel purification, 250 ng of PCR product was incubated with 2 U of DdeI (New England Biolabs) in NEBuffer2 for 1 h at 37°C. Digests were electrophoresed on a 1.5% agarose gel, stained with ethidium bromide and analyzed on an AlphaImager™ 3400 (Alpha Innotech).

Surveyor assays with integrated targets
Target site integration into HEK 293 cells was performed using the Phoenix Ampho retroviral packaging cell line. To accomplish this, 8 μg of pMSCV TO15 was transfected into Phoenix cells using calcium phosphate and incubated at 37°C with 5% CO₂ for 48 h. Media was removed and filtered through a 0.45-μm filter into a Falcon tube containing 6 μl of 4-mg/ml polybrene (hexadimethrine bromide), and 6 ml of virus solution was used to infect HEK 293 cells to create the integrated cell line (HEK 293-TO15). Approximately 24 h before transfections, ∼2.5 × 10⁶ HEK 293-TO15 cells were seeded on 6-cm plates and subsequently transfected with 6 μg of pExodus Tev169-Onu or pExodus I-SceI. After 48 h, the HEK 293-TO15 cells were harvested and total genomic DNA isolated. Two rounds of nested PCR were performed, and gel-purified PCR products were boiled at 95°C for 10 min, then cooled slowly to 50°C before flash freezing at −20°C for 2 min. To assay for indels, 200 ng of PCR product was incubated with 2 U of T7 endonuclease I (New England Biolabs) in NEBuffer2 for 1 h at 37°C, separated on a 1.5% agarose gel and analyzed using an AlphaImager™ 3400 (Alpha Innotech).

Western blots
Whole cell extracts were prepared 16 h post-transfection in lysis buffer (50-mM HEPES pH 7.4, 150-mM NaCl, 1-mM EDTA, 10% glycerol and 0.5% NP40) supplemented with protease and phosphatase inhibitors. After 30-min incubation on ice, lysates were centrifuged for 10 min, and the supernatant transferred to a new tube. One hundred micrograms of supernatant was resolved by 12% SDS-polyacrylamide gel electrophoresis, followed by a 1-h transfer to a polyvinylidene difluoride membrane and overnight hybridization at 4°C with an HA (HA-7, Sigma) antibody. Blots were developed using the Western Lightning® Enhanced Chemiluminescence Reagent (Perkin Elmer, Waltham, MA, USA). Quantifications were done using ImageJ software and Image Lab (BioRad, Hercules, CA, USA).

Illumina sequencing
Target sites were amplified with barcoded primers, pooled and sequenced on an Illumina MiSeq platform at the London Regional Genomics Centre at Western University. The biological replicates were sequenced on two independent runs. Reads were processed for valid barcodes, and for the presence of primer sequences that flanked the MegaTev binding site, using custom Perl scripts. Primer sequences were removed from parsed reads, the length and abundance of the remaining sequences were determined, and the data were analyzed and plotted in R (41) using the ggplot2 package (40).
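The read-processing steps described above (check for flanking primers, trim them, tabulate lengths) can be sketched as follows. This is a minimal Python illustration and not the authors' Perl scripts; the function names, the primer sequences and the toy reads are all hypothetical, and barcode handling is omitted.

```python
from collections import Counter

def trim_primers(read, fwd_primer, rev_primer):
    """Return the sequence between the two flanking primers, or None if either is missing."""
    start = read.find(fwd_primer)
    if start == -1:
        return None
    start += len(fwd_primer)
    end = read.find(rev_primer, start)
    if end == -1:
        return None
    return read[start:end]

def length_abundance(reads, fwd_primer, rev_primer):
    """Count how many primer-trimmed reads occur at each insert length."""
    counts = Counter()
    for read in reads:
        insert = trim_primers(read, fwd_primer, rev_primer)
        if insert is not None:
            counts[len(insert)] += 1
    return counts

# Toy data: primers and reads are made up for illustration only.
FWD, REV = "ACGTAC", "TTGGCC"
reads = [
    "ACGTAC" + "AAAACCCCGGGG" + "TTGGCC",  # 12-nt insert
    "ACGTAC" + "AAAAGGGG" + "TTGGCC",      # 8-nt insert (a deletion)
    "ACGTAC" + "AAAACCCCGGGG" + "TTGGCC",  # 12-nt insert
    "GGGGGGGGGGGG",                        # no primers, discarded
]
print(length_abundance(reads, FWD, REV))
```

Reads lacking either primer are dropped rather than rescued, which mirrors the filtering described in the text but is otherwise a design choice of this sketch.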

RESULTS
MegaTevs: chimeric fusions of GIY-YIG and meganuclease components
To determine if the I-TevI nuclease domain could function in the context of different meganucleases, we fused residues 1-169 of I-TevI (Tev169) to the native N-terminus of the catalytically inactive I-LtrI variant, I-LtrI E29Q, to create Tev169-Ltr E29Q (Tev169-xLtr) (Figure 1A). Along with the previously constructed Tev169-Onu E22Q (Tev169-xOnu) (27), we assayed the activity of both MegaTevs using a yeast recombination assay where a target site is positioned within a partially duplicated lacZ gene. Cleavage of the target site induces the single-strand annealing pathway to reconstitute a functional lacZ gene, resulting in β-galactosidase activity. We tested activity on hybrid target sites consisting of the native I-TevI CNNNG cleavage motif (5′-CAACG-3′) and DNA spacer derived from the phage T4 thymidylate synthase (td) gene fused to either the I-OnuI E2 or the I-LtrI binding site (TO or TL, respectively) (Figure 1A). The substrates differed in the length of the DNA spacer (from 11 to 21 bp) separating the I-TevI CNNNG cleavage motif from the I-OnuI E2 or the I-LtrI binding site. As shown in Figure 1B, Tev169-xOnu and Tev169-xLtr activity was highest with a DNA spacer length of 15 bp, agreeing with results from profiling DNA spacer length requirements of the Tev169-xOnu construct in a bacterial two-plasmid survival assay (27). Furthermore, mutating the critical CNNNG cleavage motif to ANNNA abolished activity for both Tev169-xOnu and Tev169-xLtr on the 15-bp spacer substrate [TO15CS(−) and TL15CS(−)], demonstrating that the I-TevI nuclease domain maintains cleavage specificity in the context of a meganuclease fusion. To demonstrate that the MegaTevs are directed to their target sites by the meganuclease and not the I-TevI nuclease domain and linker, we tested Tev169-xOnu against the TL15 target site, and tested Tev169-xLtr against the TO15 target site. No activity was observed for either fusion on the reciprocal substrates (Figure 1B), showing that the I-TevI nuclease domain does not direct targeting of the MegaTevs.
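The two substrate requirements established above, a CNNNG motif and a 15-bp spacer between the motif and the meganuclease binding site, can be expressed as a simple sequence check. The sketch below is illustrative only: `has_spaced_motif` is a hypothetical helper, `SITE` is a placeholder string and not the real I-OnuI E2 binding site, and the code assumes the motif lies 5′ of the binding site on the strand being searched.

```python
import re

def has_spaced_motif(seq, binding_site, spacer=15):
    """True if a CNNNG motif ends exactly `spacer` bp upstream of `binding_site`."""
    site_start = seq.find(binding_site)
    if site_start == -1:
        return False
    motif_end = site_start - spacer   # index just past the G of the motif
    motif_start = motif_end - 5       # CNNNG is 5 nt long
    if motif_start < 0:
        return False
    return re.fullmatch(r"C[ACGT]{3}G", seq[motif_start:motif_end]) is not None

SITE = "GGATCCATGGATCCATGGATCC"  # placeholder, NOT the real I-OnuI E2 site
PREFIX = "T" * 8

ok = PREFIX + "CAACG" + "A" * 15 + SITE             # motif present, 15-bp spacer
no_motif = PREFIX + "AAACA" + "A" * 15 + SITE       # CNNNG mutated to ANNNA
wrong_spacing = PREFIX + "CAACG" + "A" * 11 + SITE  # motif present, 11-bp spacer

for name, seq in [("ok", ok), ("no_motif", no_motif), ("wrong_spacing", wrong_spacing)]:
    print(name, has_spaced_motif(seq, SITE))
```

A check of this kind treats spacing as all-or-nothing, whereas the yeast data show a graded response across spacer lengths, so it is best read as a first-pass filter.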

Dual active site MegaTevs for highly efficient targeted deletions
A unique aspect of the MegaTevs is the fusion of two homing endonuclease active sites into a single polypeptide chain. Each active site is positioned such that the top-strand nicking sites are separated by ∼30 bp on the TO15 DNA substrate. When both active sites are functional, this arrangement presents the opportunity to introduce two DSBs with different cohesive ends at a single site in a highly efficient and concerted process. As a proof of concept, we constructed and purified a Tev169-Onu dual nuclease where both the I-TevI and I-OnuI E2 active sites are wild type (Tev169-Onu; Figure 1C and Supplementary Figure S1). Activity was tested in vitro utilizing an internally radiolabeled PCR product of 242 bp containing the TO15 target site. Cleavage by the dual nuclease would be evident by the release of a 29-bp product corresponding to the internal sequence between the I-TevI and I-OnuI E2 cleavage sites (Figure 1C). As shown in Figure 1D, the dual active MegaTev efficiently produced three products after 5 min of digestion, with the accumulation of the internal product (IP) after 25 min. Interestingly, cleavage by the I-TevI nuclease domain precedes cleavage by I-OnuI E2, as I-TevI-specific products are detected at 1 min (TP1 and TP2), whereas I-OnuI E2 products are detected after 5 min (OP1 and OP2). Cleavage assays where I-TevI is active and I-OnuI E2 is inactive (Tev169-xOnu) produced two products consistent with only I-TevI cleavage activity. Similarly, a catalytically inactive R27A I-TevI in the context of an active I-OnuI E2 fusion (xTev169-Onu) produced two products consistent with I-OnuI E2 cleavage. No cleavage was observed for the dual dead nuclease (inactive I-TevI R27A and I-OnuI E22Q, xTev-xOnu) after 1 h of incubation. We also constructed and purified an analogous Tev169-Ltr dual nuclease (Supplementary Figure S1) and tested for activity on an internally labeled PCR product containing the TL15 site (Supplementary Figure S2). I-TevI-specific cleavage products were observed before I-LtrI products, with the internal cleavage product visible after 5 min of incubation.

Expression and activity of MegaTevs in HEK 293 cells
To test expression and integrity of the MegaTevs in human cell lines, we constructed codon-optimized versions. The expression constructs included the MegaTev open-reading frame (ORF), followed by an in-frame T2A peptide and mCherry ORF, allowing us to monitor MegaTev expression by mCherry levels (Figure 2A). Robust expression was observed for all MegaTev constructs. We also tagged the C-terminus of the MegaTev fusion and the I-OnuI E22Q enzyme with a hemagglutinin (HA) epitope tag to monitor protein integrity by western blot and found that ∼72% of the MegaTev enzyme was full length 16 h post-transfection (Figure 2B). A smaller band of ∼45 kDa was present in western blots. The size of this product is consistent with proteolytic cleavage within the I-TevI linker region of the MegaTev construct. In contrast, MegaTevs with an HA tag at the N-terminus could not be detected by western blot. Proteolytic cleavage of the MegaTev enzyme to release N-terminal fragments consisting of the I-TevI nuclease domain could potentially generate nuclease domains with non-specific activity, although previous studies have shown that the GIY-YIG catalytic domains have very low affinity for DNA. To rule out potential toxic side effects associated with non-specific nuclease activity of the I-TevI catalytic domain, we overexpressed the Tev1-169 domain in HEK 293 cells. At 48 h post-transfection, we isolated cell extracts and monitored caspase-3 activity as an indicator of the DNA damage response. As a positive control, HEK 293 cells were treated with etoposide to induce DNA damage. No significant difference was observed in caspase-3 activity for HEK 293 cells transfected with the Tev169 or the eGFP expression plasmids, whereas significant caspase-3 activity was found in extracts of HEK 293 cells that were incubated with etoposide (Figure 2C). Collectively, these data demonstrate that MegaTevs are robustly expressed with low levels of toxicity in HEK 293 cells.

MegaTevs induce mutagenic repair in HEK 293 cells
To extend the utility of the MegaTev constructs to an in vivo context, we first co-transfected the single nuclease variant, Tev169-xOnu, and the TO15 target site cloned on a separate plasmid into HEK 293T cells. To monitor Tev169-xOnu activity, we took advantage of a DdeI site that lies immediately downstream of the I-TevI CNNNG motif (Figure 1A). Plasmid substrates cleaved in vivo by the MegaTev would be subject to mutagenic non-homologous end-joining repair, destroying the DdeI site. Subsequent resistance of target sites to DdeI digestion after PCR amplification from genomic DNA reflects Tev169-xOnu cleavage and mutagenic repair. Significantly, we observed 9% DdeI cleavage resistance for target sites amplified from HEK 293T cells co-transfected with Tev169-xOnu and the TO15 target site (Figure 3A). In contrast, no DdeI-resistant products were observed for cells transfected with the TO15 target site plasmid only.
To confirm that the DdeI resistance assay was detecting Tev169-xOnu events derived from cleavage and mutagenic repair of plasmid substrates, we transformed E. coli cells with total DNA isolated from HEK 293 cells to enrich for circular plasmids. Plasmid DNA was prepared in bulk, and the TO15 target was PCR amplified and subsequently digested with DdeI, revealing ∼9% cleavage resistance (Supplementary Figure S3). Cleavage-resistant products were cloned and sequenced, revealing deletions that spanned the I-TevI cleavage site (Supplementary Figure S3), demonstrating that MegaTevs function to induce mutagenic DNA repair in human cells. Next, we stably integrated the TO15 target site into the genome of HEK 293 cells and assayed for activity after independent transfections with the Tev169-xOnu and xTev169-Onu constructs. After 48 h of incubation, the TO15 target site was PCR amplified and activity was assessed in two ways. First, the PCR products were digested with T7 endonuclease I (T7EI). T7EI was used rather than DdeI digestion to allow us to detect I-OnuI E2-specific events with the xTev169-Onu construct. As shown in Figure 3B, no detectable cleavage was observed with either of the single active site Tev169-Onu fusions (xTev169-Onu and Tev169-xOnu), in spite of similar expression levels (Figure 2A). Second, the amplified TO15 target site from two independent transfections of the nucleases was deep sequenced using the Illumina MiSeq platform. Activity was determined as the number of reads that possessed indels relative to the wild-type TO15 sequence (Table 1). We observed an ∼4-6% indel rate with the Tev169-xOnu variant over two experimental replicates, consistent with the indel rate estimated from DdeI cleavage resistance assays. Deep sequencing revealed a spectrum of deletion phenotypes, most of which were centered on the I-TevI cleavage site (Figure 3C). In contrast, a <1% indel rate with the xTev169-Onu single active site variant was indistinguishable from the indel rate at the TO15 site sequenced from mock-transfected cells. Extremely low levels of cleavage by the xTev169-Onu variant, where I-OnuI E2 is active, are consistent with previous studies that required sorting for cells with high I-OnuI E2 expression levels and multiple rounds of PCR enrichment to visualize I-OnuI E2 cleavage (43).

Dual active MegaTevs efficiently induce deletions on integrated targets
In contrast to the single active site variants, higher activity was detected for the dual active MegaTev enzyme on the integrated TO15 target site (Figure 3B). Between 20 and 40% activity was inferred from T7EI assays on amplified PCR target sites from transfections of the Tev169-Onu dual nuclease, and from cells transfected with another MegaTev construct containing the I-TevI 1-184 fragment fused to I-OnuI E2 (Tev184-Onu). To confirm that MegaTev activity resulted in deletion of the sequence between the I-TevI and I-OnuI E2 cleavage sites, we sequenced individual cloned PCR products and found three sequence types (M1, M2 and M3) that each occurred multiple times (Supplementary Figure S3). In each case, the intervening sequence between the I-TevI and I-OnuI E2 cleavage sites was deleted.
To expand on these results, the PCR products from two independent transfections were analyzed by deep sequencing, revealing an indel rate of ∼12-15% (Table 1). Precise deletion of the intervening sequence between the I-TevI and I-OnuI E2 cleavage sites was confirmed by plotting the absolute length differences of the reads relative to the wild-type sequence (Figure 3D) and by examining individual reads from the deep sequencing data (Figure 3E). Smaller deletion lengths centered on the I-TevI cleavage site were also observed (Figure 3E), but at a lower frequency (Figure 3D). The difference in estimated MegaTev activity between the deep sequencing data and T7EI assays can be attributed to the fact that only length differences were counted as indels from the sequencing data, whereas the T7EI assay will detect nucleotide mismatches in addition to indels. Collectively, our data show that the fusion of two homing endonuclease active sites into a single polypeptide chain creates a dual nuclease that can introduce two DSBs at a single target site. In both an in vitro and in vivo context, the two DSBs excise a short internal fragment with high frequency. It is important to note that we did not co-express DNA end-processing enzymes, such as the 3′-5′ exonuclease Trex2, to enhance mutagenic repair at the TO15 cleavage site.
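Because indels were scored purely from read-length differences relative to the wild-type sequence, the calculation can be sketched in a few lines. The Python below is a minimal illustration with made-up read lengths; `indel_summary` is a hypothetical helper, and the numbers are not data from this study.

```python
from collections import Counter

def indel_summary(read_lengths, wild_type_length):
    """Return (indel_rate, Counter of length differences) for a list of read lengths."""
    diffs = Counter(length - wild_type_length for length in read_lengths)
    total = sum(diffs.values())
    indels = total - diffs.get(0, 0)  # any non-zero length difference counts as an indel
    rate = indels / total if total else 0.0
    return rate, diffs

# Toy data: 85 wild-type-length reads, ten 29-bp deletions,
# three 4-bp deletions and two 1-bp insertions.
lengths = [100] * 85 + [71] * 10 + [96] * 3 + [101] * 2
rate, diffs = indel_summary(lengths, wild_type_length=100)
print(f"indel rate = {rate:.1%}")
print(diffs.most_common(3))
```

Reads that carry substitutions but keep the wild-type length fall into the zero-difference bin, which is why a length-based count can be lower than an estimate from a mismatch-sensitive assay.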

Requirements of the I-TevI CNNNG cleavage motif in vivo
Efficient cleavage by the I-TevI nuclease domain requires that the 5′-CNNNG-3′ cleavage motif be spaced 15 bp from the meganuclease binding site (Figure 1B). Based on studies on the native enzyme, the C and G of the motif are critical for cleavage, while the central three bases (the NNN triplet) exhibit a substantial degree of tolerance to substitution (44,45). However, the tolerance of the central three bases to substitution in the context of the MegaTev fusion had not been assessed, so we used a variation of a two-plasmid bacterial selection to rapidly determine survival of all 64 variants of the central triplet cloned into the toxic plasmid. In this assay (42), the 64 toxic plasmids were transformed into cells harboring the Tev169-xOnu fusion, plated on non-selective plates, and then replica-gridded onto selective plates to induce expression of the ccdB gene on the toxic plasmid. Cells survive this challenge if Tev169-xOnu can cleave and promote elimination of the toxic plasmid. The Tev169-xOnu fusion was used for this experiment to ensure that survival was due to I-TevI activity and not I-OnuI activity. Three different morphologies were observed on the selective plates: no growth (dead), colonies that grew to the same diameter on both the selective and non-selective plates (strong survivors), and colonies that grew on selective plates but were smaller in diameter than on non-selective plates and often formed a cauliflower morphology (weak survivors) (Supplementary Figure S4). Weak survivors were determined to be target sites that promoted <1% survival when assayed individually. Survival was plotted in a heat map format (Figure 4A), revealing that C/G-rich triplets generally inhibited Tev169-xOnu survival.
We also assayed all 64 triplet variants in the quantitative yeast-based lacZ repair assay and plotted the activity for each triplet normalized to the activity of the wild-type AAC triplet on a log2 scale (Figure 4B). In general, more substitutions within the triplet resulted in lower activity; however, some triplets with two or three substitutions were as active as the wild-type AAC triplet (Figure 4C). In particular, triplets with A or T at the first position display activity on par with the wild-type sequence (for instance, AAT, ATA, TAT and TTT). As with the bacterial two-plasmid selection, triplets with a C and/or G in the first two positions supported lower β-galactosidase activity, and triplets with a G in the third position were less active than other nucleotides at this position (Figure 4D).
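The two descriptors used above to organize the triplet data, the number of substitutions relative to the wild-type AAC triplet and the G/C content, are easy to tabulate for all 64 variants. The following Python sketch does only that bookkeeping; the function name `triplet_table` is hypothetical and no activity values are included.

```python
from itertools import product

WILD_TYPE = "AAC"  # central triplet of the native CAACG motif

def triplet_table():
    """List (triplet, substitutions vs. AAC, number of G/C bases) for all 64 triplets."""
    rows = []
    for bases in product("ACGT", repeat=3):
        triplet = "".join(bases)
        subs = sum(a != b for a, b in zip(triplet, WILD_TYPE))
        gc = sum(base in "GC" for base in triplet)
        rows.append((triplet, subs, gc))
    return rows

rows = triplet_table()
lookup = {triplet: (subs, gc) for triplet, subs, gc in rows}
print(len(rows))  # number of NNN variants
for triplet in ("AAT", "ATA", "TAT", "TTT"):
    subs, gc = lookup[triplet]
    print(triplet, "substitutions:", subs, "G/C bases:", gc)
```

The four A/T-rich triplets named in the text carry one, two, two and three substitutions respectively and contain no G or C, consistent with the observation that substitution count alone does not predict activity.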

MegaTev selects for the appropriately spaced cleavage motif from a random substrate
To determine the optimal sequence and spacing of the cleavage motif, we generated a plasmid library where the CAACG cleavage motif and three downstream base pairs were completely randomized ( Figure 5A, the N8 library). A round of in vitro selection with the N8 plasmid library consisted of in vitro digestion with purified Tev169-xOnu, isolation of linearized plasmid, re-ligation and transformation into E. coli for amplification. After three rounds of selection, the input library and the final round of selection were sequenced using the Ion Torrent platform. After data processing, we first determined the proportion of all 16 possible dinucleotide combinations (ANNNA, ANNNC, ANNNG, etc.) regardless of position within the N8-randomized region to ascertain if the MegaTev displayed a preference for the CNNNG motif. As shown in Figure 5A, the CNNNG motif was greatly enriched by round 1 relative to the other dinucleotide combinations, and predominated by round 3, indicating that CNNNG is the preferred motif. Although minor enrichment relative to the input library was observed for other dinucleotide combinations (ANNNA and ANNNT), these combinations do not support activity in cell-based assays, and are not considered relevant.
We next analyzed the phasing of the CNNNG motif within the N8-randomized region for reads containing this motif. This analysis was undertaken as native I-TevI can cleave the wild-type CAACG motif that has been moved closer to the primary binding site (albeit with lower efficiency) (46). The statistical occurrence of the CNNNG motif is 1 in 16 bp, and would be expected to occur at the four possible positions within the input N8 library (C1:G5, C2:G6, C3:G7 and C4:G8). Indeed, we observed the CNNNG motif with approximately equal frequency at the four possible positions within the input N8 library (Figure 5B, red bars). However, when the sequencing reads for selection rounds 1 and 3 were analyzed, we found an overwhelming preference for C and G at positions 1 and 5, respectively (Figure 5B, blue and gray bars). This analysis confirms that the MegaTev is cleaving at correctly positioned CNNNG motifs within the N8 library, with the G of the motif positioned 15 bp from the I-OnuI E2 binding site.
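The expectation for an unbiased input library can be checked by brute force. The sketch below enumerates every possible 8-nt sequence and counts how often CNNNG occurs in each of the four phases; it models an ideal library only, not the sequenced input, and `motif_phases` is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

def motif_phases(n8):
    """Return the 1-based start positions (1-4) at which CNNNG occurs in an 8-nt sequence."""
    return [i + 1 for i in range(4) if n8[i] == "C" and n8[i + 4] == "G"]

# Exhaustively enumerate an ideal, unbiased N8 library (4**8 sequences).
phase_counts = Counter()
for bases in product("ACGT", repeat=8):
    for phase in motif_phases("".join(bases)):
        phase_counts[phase] += 1

total = 4 ** 8
for phase in sorted(phase_counts):
    print(f"C{phase}:G{phase + 4}", phase_counts[phase] / total)
```

Each phase is hit by 4⁶ of the 4⁸ sequences, a fraction of 1/16, so equal representation of the four phases is the null expectation against which the strong C1:G5 preference after selection stands out.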
Using sequencing reads that possessed the CNNNG motif at positions 1 and 5 of the randomized region, we determined the abundance of each NNN triplet within the motif and plotted the log2 abundance in a heat map format (Figure 5C). As anticipated, A/T-rich triplets were preferred over G/C-rich triplets. This analysis also facilitated a comparison to the activity of each NNN triplet in the yeast DNA repair assay (Figure 4B). Plotting the normalized abundance of each NNN triplet from the sequencing data versus the activity in the yeast-based assay showed that A/T-rich NNN triplets supported higher activity relative to G/C-rich sequences in both data sets (Figure 5D).
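A triplet tally of this kind could be computed along the following lines. This is a sketch under stated assumptions: the text does not specify how abundance was normalized before the log2 transform, so the example simply takes log2 of each triplet's fraction among reads carrying C at position 1 and G at position 5, and the function name is hypothetical.

```python
import math
from collections import Counter


def triplet_log2_abundance(reads):
    """log2 of the fractional abundance of each NNN triplet.

    Only 8-nt reads with C at position 1 and G at position 5 are used;
    triplets that were never observed are omitted from the result.
    """
    counts = Counter(
        read[1:4]
        for read in reads
        if len(read) == 8 and read[0] == "C" and read[4] == "G"
    )
    total = sum(counts.values())
    return {t: math.log2(n / total) for t, n in counts.items()}
```

The resulting values can be arranged in a 4 x 16 or 8 x 8 grid for a heat map, or plotted against an independent activity measurement for each triplet.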

Nucleotide preference within the CNNNG motif and at flanking positions
We next analyzed nucleotide preferences at each position in the N8 library after the third round of selection. Nucleotide preferences were determined by calculating the proportional abundance of each nucleotide at each position for both the input library and round 3 selection, and then plotting the difference (enrichment) between round 3 and input as a heat map (Figure 6A). One advantage of this analysis is that it corrects for nucleotide bias in the input library. As shown in Figure 6A, apart from the expected C and G preference at positions 1 and 5, the strongest preference was observed at position 7, where T or A was selected for while C or G was selected against. Interestingly, in four of the positions (3, 6, 7 and 8), the wild-type nucleotide was not preferred, implying that the native td target site of I-TevI is not the optimal substrate.
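The enrichment calculation described above (proportion in round 3 minus proportion in the input library, per nucleotide and per position) can be sketched as follows. The function names and the list-of-strings input are assumptions made for illustration.

```python
from collections import Counter


def position_proportions(reads, length=8):
    """Per-position proportion of A, C, G and T across reads of `length`."""
    counts = [Counter() for _ in range(length)]
    n = 0
    for read in reads:
        if len(read) != length:
            continue
        n += 1
        for i, base in enumerate(read):
            counts[i][base] += 1
    return [{b: (c[b] / n if n else 0.0) for b in "ACGT"} for c in counts]


def enrichment(selected_reads, input_reads, length=8):
    """Difference in proportion (selected minus input) at each position.

    Positive values indicate nucleotides selected for; negative values
    indicate nucleotides selected against.
    """
    sel = position_proportions(selected_reads, length)
    inp = position_proportions(input_reads, length)
    return [{b: sel[i][b] - inp[i][b] for b in "ACGT"}
            for i in range(length)]
```

Because the input proportions are subtracted, a nucleotide that is over-represented in the starting library does not appear enriched unless its share increases further during selection.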
To provide an in vivo context for the nucleotide preferences observed in the flanking DNA sequence, we independently made point substitutions in the TO15 substrate at positions 6 and 7 and tested their activity in the yeast β-galactosidase assay (Figure 6B). Substitutions to A or T at position 6 did not drastically reduce activity as compared to the TO15 substrate, while the C6G substitution showed a modest increase in activity. At position 7, the T7A substitution reduced activity by half, while the T7G substitution reduced activity to background levels, supporting the enrichment preferences seen in the in vitro data.
Collectively, these data show that the MegaTev strongly prefers to cleave CNNNG motifs spaced 15 bp from the meganuclease-binding site, in agreement with previous studies on spacing of the cleavage motif (27). We also found that the I-TevI nuclease domain is tolerant of multiple substitutions within the CNNNG motif, with many variants cleaved better than the wild-type AAC triplet, although A/T-rich sequences are generally preferred. Defining a strict consensus sequence within the motif is complicated by the observation of different levels of activity for each of the 64 possible CNNNG variants.
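To make the spacing requirement concrete, the sketch below scans the sequence upstream of a binding site for CNNNG motifs and reports how far each lies from the site. It rests on an assumption that should be checked against the substrate definitions: spacing is counted here as the number of base pairs between the G of the motif and the first base of the binding site. The function name and the coordinate convention are hypothetical.

```python
def cnnng_spacings(seq, site_start):
    """Find CNNNG motifs upstream of a binding site.

    seq        -- top-strand sequence as a string
    site_start -- 0-based index of the first base of the binding site
    Returns a list of (motif, spacing) tuples, where spacing is the number
    of bases between the G of the motif and the binding site (an assumed
    convention for this example).
    """
    hits = []
    # The G (index i + 4) must lie upstream of the binding site.
    for i in range(0, site_start - 4):
        if seq[i] == "C" and seq[i + 4] == "G":
            spacing = site_start - (i + 4) - 1
            hits.append((seq[i:i + 5], spacing))
    return hits
```

For example, a candidate site could be flagged as potentially permissive only if one of the returned spacings equals 15 and the central triplet of that motif is one that supports activity.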

Testing for off-target cleavage
Off-target cleavage is a significant concern for genome-editing endonucleases. To assess off-target cleavage of the MegaTev architecture, we took advantage of previously predicted off-targets for the I-OnuI E2 variant (the backbone meganuclease for our MegaTevs) (43). The I-OnuI E2 variant was optimized to cleave a site within the human MAO-B gene, and a CNNNG motif is positioned 11 bp upstream of the I-OnuI E2 binding site. A number of the top-ranked off-target sites also possess CNNNG motifs within 30 bp of the I-OnuI E2 binding site (Figure 7A), suggesting that they might be substrates for the I-TevI nuclease domain. These sites differ in the spacing of the CNNNG motif from the I-OnuI E2 binding site, and also in the NNN triplet of the cleavage motif. We tested for cleavage at these sites using T7EI digestion of PCR products amplified from HEK 293 cells that were transfected with the dual active Tev169-Onu MegaTev, and by deep sequencing of the PCR products (Table 1). As shown in Figure 7B and Table 1, we observed very low indel levels at the three off-target sites. We attribute undetectable (or extremely low) activity at these sites to the sub-optimal spacing of the CNNNG motif from the I-OnuI site, and to the presence of NNN triplets that are weakly active as judged by the yeast DNA repair assay. While the sites tested represent a small number of the potential off-target sites, our data suggest that very low levels of off-target cleavage will be observed with the MegaTev at off-target sites where the CNNNG motif is not optimally spaced or where the NNN central triplet does not support robust activity.

Boxplot of Tev169-xOnu activity on TO15 substrates with point mutants indicated by underlined and italicized bold-type font. Activity is normalized to Tev169-xOnu on the TO15 substrate. Each mutant was assayed at least three times. Sequences are displayed on the left of the graph with the mutations in bold type and underlined.

DISCUSSION
A number of characteristics of the I-TevI nuclease domain and meganucleases make them well suited for genome-editing applications. Of relevance to the current study is the fact that cleavage by the two enzymes generates non-cohesive overhangs of different lengths (2-nt 3′ overhangs for I-TevI and 4-nt 3′ overhangs for meganucleases) (47–49). I-TevI remains bound to its cleavage products and protects DNA ends from exonucleolytic resection, affecting the extent and directionality of DNA repair events (50). Kinetic studies have also shown that product release by meganucleases is rate limiting (51), and denaturation of meganuclease-DNA complexes is often required to resolve cleavage products in agarose gels (52). Our sequencing data are largely consistent with the MegaTev similarly sequestering DNA ends after cleavage, as many reads display no evidence of exonucleolytic processing up- or downstream of the I-TevI or I-OnuI E2 cleavage sites. End-sequestration by the single-chain MegaTev would also minimize DNA rearrangements such as translocations, inversions or duplications, which are observed with simultaneous expression of multiple genome-editing nucleases in multiplexing experiments. It is noteworthy that the fragments excised by the dual active MegaTev are similar in length to repair intermediates in the human nucleotide excision repair pathway (53). Thus, the MegaTevs may also be useful as reagents for the DNA repair field, particularly if MegaTev nicking variants that excise only a single strand can be developed.
The MegaTevs used in this study were targeted to model DNA substrates. To be generally useful, the MegaTev platform must be able to target a range of sequences. Engineering meganuclease specificity has been greatly accelerated by a detailed understanding of protein-DNA contacts through crystallographic, computational and biochemical studies, and by improvements in screening methodologies (43,54–57). Recent efforts suggest that 1 in 300 bp can be targeted by the current set of meganucleases for which detailed protein-DNA contact maps are available (58). The targeting range of the MegaTev platform can also be increased by fusing the I-TevI domain to different meganucleases, as shown with the I-LtrI fusions. An alternative approach to generating MegaTevs that can target sites with a variety of nucleotide compositions would be to fuse different GIY-YIG nuclease domains with distinct cleavage preferences to the same meganuclease backbone. Our preliminary studies in this regard have generated active enzymes by fusion of the GIY-YIG domain from the I-TuIa homing endonuclease to I-OnuI (Wolfs, J.M. and Edgell, D.R., unpublished results) (59).
Precise targeting and prediction of off-target sites will also require a detailed understanding of the nucleotide requirements of the GIY-YIG nuclease domain. In the case of I-TevI, past studies revealed that the nuclease domain required a 5′-CNNNG-3′ cleavage motif (45,46) and that the linker was tolerant of multiple substitutions within the DNA spacer that separates the cleavage motif from the binding site (44). More recently, the I-TevI cleavage motif was defined as CDDHGS (D = A, G or T; H = A, C or T; S = G or C) in the context of a monomeric TALEN architecture (cTALENs) (28). This consensus sequence differs from that observed for the Tev-mTALEN architecture (60) and from the nucleotide preferences observed for the MegaTevs. One reason may be that the cTALEN study examined cleavage preference within a CNNNG motif positioned 7 bp from the TALE binding site, a distance that we find is not permissive for efficient cleavage in either the MegaTev or Tev-mTALEN architecture. Our screen of DNA spacer length variation, in contrast, shows a clear spacing preference of 15 bp with MegaTevs derived from both I-OnuI and I-LtrI (Figure 1B). Our results also revealed a wide range of tolerance to substitution within the CNNNG motif, with some NNN triplets cleaved more efficiently than the wild-type AAC sequence. We anticipate that both the spacing and sequence requirement of the CNNNG motif will 'de-toxify' off-target cleavage, for the simple reason that not all off-target sites will have a permissive motif positioned appropriately from the meganuclease binding site. Indeed, our data, while representing a small number of potential off-target sites, revealed no detectable cleavage by T7EI assays, and an indel rate not significantly different from that observed for mock-transfected cells as judged by deep sequencing.
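The difference between the two motif definitions discussed above can be illustrated by expanding each consensus into a regular expression using the standard IUPAC nucleotide codes. This is a sketch for comparison only; the helper name `iupac_to_regex` is hypothetical and the code does not model cleavage activity.

```python
import re

# Standard IUPAC codes needed for the two motifs compared here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "D": "[AGT]",   # not C
    "H": "[ACT]",   # not G
    "S": "[CG]",    # strong (G or C)
    "N": "[ACGT]",  # any base
}


def iupac_to_regex(motif):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in motif))


cnnng = iupac_to_regex("CNNNG")
cddhgs = iupac_to_regex("CDDHGS")
```

With these patterns, a sequence such as CCCCG satisfies CNNNG but not the first five positions of CDDHGS, which shows how a stricter consensus excludes motifs that the broader definition admits.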
In summary, the MegaTevs represent a novel fusion of two different active sites to generate a dual nuclease with a high efficiency of gene disruption without the need to overexpress DNA end-processing enzymes. The compact size of the MegaTev, coupled with the high fidelity imparted by the specificity of the I-TevI and meganuclease domains, makes it suitable for genome-engineering applications where minimizing off-target effects is paramount.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.