Genome manipulation by guide-directed Argonaute cleavage

ABSTRACT
Many prokaryotic argonautes (pAgos) mediate DNA interference by using small DNA guides to cleave target DNA. A recent study shows that CbAgo, a pAgo from Clostridium butyricum, induces DNA interference between homologous sequences and generates double-stranded breaks (DSBs) in target DNAs. This mechanism enables the host to defend against invading DNAs such as plasmids and viruses. However, whether such CbAgo-mediated DNA cleavage is mutagenic remains unexplored. Here we demonstrate that CbAgo, directed by plasmid-encoded guide sequences, can cleave genomic target sites and induce chromosome recombination between downstream homologous sequences in Escherichia coli. The recombination rate correlates well with pAgo DNA cleavage activity, and the mechanistic study suggests that the recombination involves DSBs and RecBCD processing. In a RecA-deficient E. coli strain, guide-directed CbAgo cleavage of chromosomes severely impairs cell growth, which can be utilized as counter-selection to assist Lambda-Red recombineering. These findings demonstrate that guide-directed cleavage of the host genome by pAgo is mutagenic and can lead to different outcomes depending on the function of the host DNA repair machinery. We anticipate that this novel DNA-guided interference will be useful in broader genetic manipulation. Our study also provides an in vivo assay to characterize or engineer pAgo DNA cleavage activity.


INTRODUCTION
Prokaryotic argonaute proteins (pAgos) constitute a diverse protein family (1,2). Unlike their eukaryotic counterparts, which use small RNA guides to interfere with RNA targets in regulation and defense (3,4), many pAgos were reported to cleave DNA targets using small guide DNAs (gDNAs) in vitro (5-12). In vivo, several pAgos were shown to protect bacteria from foreign DNAs (13-17), but the defense mechanism, especially whether double-stranded DNA breaks (DSBs) are involved, remained elusive until recently. In-depth insights into the mechanism of pAgo-mediated defense were gained by analyzing CbAgo, a pAgo nuclease from a mesophilic bacterium Clostridium butyricum, in Escherichia coli as its expression host (18). In that study, a DNA interference pathway was revealed in CbAgo-mediated protection against invading DNAs. First, CbAgo generates and binds gDNAs from plasmids or other multicopy genetic elements. Next, gDNA-bound CbAgo introduces DSBs at the homologous sites, including chromosomal regions, and causes DNA degradation in collaboration with E. coli exonuclease RecBCD. Invader DNAs such as plasmids and phages can thus be targeted and eliminated efficiently through this mechanism.
It has been postulated that pAgo may have potential applications in genome manipulation ever since the discovery of its DNA nuclease activity, which could represent a novel DNA-guided genome editing tool that overcomes some of the limitations of CRISPR-based methods such as unintended secondary structures in guide RNA and difficulties in guide RNA delivery (5,19). However, to the best of our knowledge, mutations induced by guide-directed cleavage of pAgos in the host genome have never been firmly established. The observation that CbAgo can be directed to generate DSBs in chromosomes by plasmid-encoded guide sequences (GSs) motivated us to leverage such a mechanism to manipulate E. coli genomes. Here we demonstrate that guide-directed CbAgo cleavage can directly induce chromosome recombination between direct repeat sequences, or assist Lambda-Red recombineering in E. coli as a counterselection. The recombination system described here can also serve as an efficient in vivo assay to report or engineer pAgo DNA nuclease activity as we find the recombination rates from different pAgos correlate well with their reported in vitro cleavage activity. These findings demonstrate the potential of establishing a DNA-directed genome editing system using pAgo.

MATERIALS AND METHODS
Culture conditions
E. coli, cultured in Luria-Bertani (LB) medium and agar, was incubated at 37 °C or 30 °C. When appropriate, antibiotics were added to the medium at the following final concentrations: ampicillin, 100 µg/ml; chloramphenicol, 20 µg/ml; kanamycin, 35 µg/ml. Bacterial cell growth was monitored periodically by measuring the optical density of culture aliquots at 600 nm.

Strains and plasmids
E. coli strains used in this study are listed in Supplementary Table S1. Plasmids used in this study are listed in Supplementary Table S2. Oligonucleotides used in this study are listed in Supplementary Table S3. Procedures for the construction of strains and plasmids are described in Supplementary information. CbAgo-encoding plasmids are pBR322-derived and have a copy number of ∼15-20. Supplementing the growth media with 100 µg/ml ampicillin is sufficient to maintain the plasmids, regardless of potential CbAgo-mediated degradation.

Determination of recombination frequency
Cells were transformed with appropriate plasmids and plated on LB plates supplemented with ampicillin. The next day, 5 ml of LB medium supplemented with ampicillin was inoculated with a single colony and aerated at 37 °C until OD600 = 0.3-0.4. The temperature was then adjusted to 18 °C, and after 30 min protein expression was induced by adding anhydrotetracycline to 200 ng/ml for 16 h. Cultures were then cooled on ice for 10 min, washed with ice-cold PBS (pH 7.2), resuspended in 5 ml of LB medium supplemented with ampicillin, and recovered at 37 °C for 5 h. Serial dilutions of cells were plated on LB plates supplemented with appropriate antibiotics to determine cfu.

Fluctuation analysis
Cells were transformed with appropriate plasmids and plated on LB plates supplemented with ampicillin. The next day, 1 ml of LB medium supplemented with ampicillin and 200 ng/ml anhydrotetracycline was inoculated with a single colony and aerated at 37 °C for 12 h. Serial dilutions of the cultures were then plated on LB plates supplemented with appropriate antibiotics to determine cfu.
For the plasmid-free strains 3×ChikanS, 3×ChikanS pal246 sbcCD and 3×ChikanS pal246, 1 ml of LB medium was inoculated with a single colony and aerated at 37 °C for 12 h. Serial dilutions of the cultures were then plated on LB plates without antibiotic or supplemented with kanamycin to determine cfu.
The Ma-Sandri-Sarkar Maximum Likelihood Estimator (MSS-MLE) method or the Lea-Coulson method of the median, as implemented in the Fluctuation AnaLysis CalculatOR (FALCOR) (20), was used to calculate recombination rates and 95% confidence intervals. The online FALCOR tool is available at https://lianglab.brocku.ca/FALCOR/.
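As a rough illustration of the Lea-Coulson method of the median named above, the following sketch solves the standard median-estimator equation r/m - ln(m) - 1.24 = 0 for m, the expected number of recombination events per culture, where r is the median number of recombinants per culture. Dividing m by the final number of cells per culture gives a rate per cell per generation. This is not the FALCOR implementation: the function names, the bisection solver and the example counts are assumptions made for illustration, and no confidence intervals are computed.

```python
import math
from statistics import median


def lea_coulson_m(median_mutants: float) -> float:
    """Solve r/m - ln(m) - 1.24 = 0 for m by bisection (r = median mutants per culture)."""
    def f(m: float) -> float:
        return median_mutants / m - math.log(m) - 1.24

    # f is strictly decreasing in m; f(low) > 0 and f(high) < 0 for any r >= 0.
    low, high = 1e-12, max(median_mutants, 1.0)
    for _ in range(200):
        mid = (low + high) / 2.0
        if f(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def recombination_rate(mutant_counts, final_cells_per_culture: float) -> float:
    """Estimate events per cell per generation from parallel cultures."""
    m = lea_coulson_m(median(mutant_counts))
    return m / final_cells_per_culture


# Hypothetical data: kanamycin-resistant cfu in eight parallel cultures,
# each assumed to reach 2e9 cells in total.
kan_resistant_counts = [3, 7, 10, 12, 8, 15, 10, 40]
rate = recombination_rate(kan_resistant_counts, final_cells_per_culture=2e9)
print(f"estimated rate: {rate:.2e} per cell per generation")
```

Because the estimate depends only on the median, a single culture with an unusually high count (such as the 40 in the example) does not move it, which is the practical reason for preferring it to a simple mean.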

Flow cytometry analysis
Strain SMR6669 cells were transformed with appropriate plasmids and plated on LB plates supplemented with ampicillin. The next day, 1 ml of LB medium supplemented with ampicillin and 200 ng/ml anhydrotetracycline was inoculated with a single colony and aerated at 37 °C for 12 h. Cultures were then washed with ice-cold PBS (pH 7.2), diluted 1:500 into ice-cold PBS (pH 7.2), passed through 40 µm cell strainers, stained with 1 µg/ml propidium iodide to determine cell viability, and analyzed on a Beckman Coulter Cytoflex S Flow Cytometer. For each experiment, 10⁵ cells per culture and three cultures per strain were analyzed.
Flow cytometry data were analyzed using FlowJo software version 10.8.1. To comparatively quantify green cells, a green 'gate' was set arbitrarily as the window in which ∼0.9% of the cells of the control strain, SMR6669/pEmpty, fall, according to the spontaneous SOS induction level (21).
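The gating rule described above can be sketched as follows. This is a minimal illustration and not the FlowJo procedure: it assumes a one-dimensional gate on a single fluorescence value per cell, and the function names and example values are hypothetical.

```python
def gate_threshold(control_values, gated_fraction=0.009):
    """Return the fluorescence value above which ~gated_fraction of control events fall."""
    if not control_values or not 0 <= gated_fraction < 1:
        raise ValueError("need control events and a fraction in [0, 1)")
    ordered = sorted(control_values)
    # Number of control events that should land inside the gate.
    n_gated = min(round(len(ordered) * gated_fraction), len(ordered) - 1)
    # Largest control value that stays outside the gate; the gate is "value > threshold".
    return ordered[len(ordered) - n_gated - 1]


def percent_green(values, threshold):
    """Percentage of events whose fluorescence exceeds the gate threshold."""
    inside = sum(1 for v in values if v > threshold)
    return 100.0 * inside / len(values)


# Hypothetical fluorescence values (arbitrary units).
control = list(range(1000))        # stands in for the control strain
sample = [995] * 150 + [10] * 850  # stands in for an induced culture

threshold = gate_threshold(control)
print(percent_green(control, threshold))  # about 0.9 by construction
print(percent_green(sample, threshold))   # share of the sample inside the gate
```

Setting the gate from the control first and then applying the same threshold to every sample keeps the comparison between strains on a common scale.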

Lambda-Red recombineering
The kanamycin-resistance cassette was amplified from genomic DNA of SIJ488 lacZ via colony PCR with primers Lambda.Red.F/Lambda.Red.R. The resulting PCR product was gel-purified and used as the dsDNA donor. To calculate mutation efficiency for Lambda-Red recombineering (referred to as the standard recombineering procedure), 5 ml of LB medium was inoculated with a single colony of strain SIJ488 recA and aerated at 37 °C until OD600 = 0.3-0.4. The Lambda-Red genes were then induced with 15 mM L-arabinose for 45 min. The culture was used to prepare electrocompetent cells by washing twice with 10% glycerol and resuspending in 50 µl of 10% glycerol. A 2 µl mixture containing ∼300 ng of dsDNA was added to the cells, which were then subjected to electroporation and allowed to recover in 1 ml LB for 2 h at 37 °C. Serial dilutions of cells were plated on LB plates with no antibiotic or supplemented with kanamycin to determine cfu.
To calculate mutation efficiency for CbAgo-assisted Lambda-Red recombineering, cells of strain SIJ488 recA were transformed with appropriate plasmids and plated on LB plates supplemented with ampicillin. The next day, 5 ml of LB medium supplemented with ampicillin was inoculated with a single colony and aerated at 37 °C until OD600 = 0.3-0.4. The Lambda-Red genes were then induced with 15 mM L-arabinose for 45 min. The culture was used to prepare electrocompetent cells by washing twice with 10% glycerol and resuspending in 50 µl of 10% glycerol. A 2 µl mixture containing ∼300 ng of dsDNA was added to the cells, which were then subjected to electroporation and allowed to recover in 1 ml LB supplemented with ampicillin for 2 h at 37 °C. The recovered cells were diluted into 5 ml LB supplemented with ampicillin and 0.2% glucose and grown for a further 2 h at 37 °C. The temperature was then adjusted to 18 °C, and after 30 min protein expression was induced by adding anhydrotetracycline to 200 ng/ml for 16 h. After induction, cultures were cooled on ice for 10 min, washed with ice-cold PBS (pH 7.2), resuspended in 5 ml of LB medium supplemented with ampicillin, and recovered at 37 °C for 5 h. Serial dilutions of cells were plated on LB plates supplemented with appropriate antibiotics to determine cfu.

Statistical analyses
GraphPad Prism 9 was used to evaluate statistical significance. Student's t-test (two-tailed) was used for the statistical analysis of experiments. P values <0.05 were considered significant.

RESULTS
Creation of the recombination system
The previous observation that guide-directed CbAgo cleavage at E. coli chromosomes efficiently triggers RecBCD activity (18) led us to hypothesize that RecBCD-dependent chromosome recombination should be triggered by guide-directed CbAgo cleavage as well. It has been shown that DSBs introduced by SbcCD cleavage at a 246-bp chromosomal palindrome (pal246) stimulate RecBCD-dependent recombination between two downstream direct repeat sequences (Supplementary Figure S1) (22). To determine whether similar recombination can be induced by guide-directed CbAgo cleavage (Figure 1A, Supplementary Figure S2), we integrated a recombination cassette at the cynX locus of E. coli strain DL1777, which is ∼6 kb away from the target lacZ locus (Figure 1B). This recombination cassette contains an EM7 promoter and a kanamycin resistance gene whose function is abolished by the insertion of a stop codon array, which is flanked by two 270-bp direct repeat sequences. A recombination event between the two direct repeat sequences removes the insertion, restores the gene function, and confers kanamycin resistance on the host. It has been demonstrated that the presence of a Chi site, an 8-base 5′-GCTGGTGG-3′ motif recognized by RecBCD (23-26), near the direct repeat sequences stimulates recombination (22). Therefore, we incorporated varying numbers of Chi sites into the genome with their 5′ ends oriented towards the recombination cassette (Figure 1B).
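For readers who want to check where Chi sites fall in a construct, the sketch below scans a DNA sequence for the 5′-GCTGGTGG-3′ motif on both strands. It is an illustrative helper only: the function names and the example sequence are hypothetical, and the code reports position and strand without deciding which orientation is functional for a given direction of RecBCD travel.

```python
CHI = "GCTGGTGG"  # Chi motif written 5'->3', as given in the text


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_chi_sites(seq: str):
    """Return sorted (0-based start, strand) pairs for every Chi motif on either strand."""
    seq = seq.upper()
    hits = []
    for motif, strand in ((CHI, "+"), (reverse_complement(CHI), "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)  # step by 1 so overlapping hits are kept
    return sorted(hits)


# Hypothetical test sequence with one Chi site on each strand.
example = "AAAGCTGGTGGTTTCCACCAGCAAA"
print(find_chi_sites(example))  # [(3, '+'), (14, '-')]
```

A "-" hit means the motif reads 5′-GCTGGTGG-3′ on the complementary strand, so the strand label can be compared against the intended orientation relative to the recombination cassette.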
To target the lacZ locus, we created a targeting plasmid, pTet CbAgo/GS, encoding a CbAgo expression cassette under the control of a tetracycline-inducible promoter (pTet) and a 1000-bp GS homologous to the lacZ gene (Figure 1C). Importantly, the GS is the only sequence on the plasmid (except for an 80-bp rrnB T1 terminator sequence) that is homologous to the genome, ensuring that only the lacZ locus will be effectively targeted. As controls, plasmids with no GS, no CbAgo gene, or neither were created. To determine the dependence on CbAgo cleavage activity, we created plasmids encoding a CbAgo mutant (dCbAgo: CbAgo D541A-D611A) carrying mutations of two catalytic residues in its active site, which were previously shown to abolish its endonuclease activity in vitro (6,7) and DSB generation activity in vivo (18).
We then combined the obtained plasmids and strains, induced CbAgo expression, and recovered the induced cells to measure recombination frequencies. These were calculated as the fraction of ampicillin-resistant cells that became resistant to kanamycin (kanamycin-resistant and ampicillin-resistant colony-forming units (cfu)/ampicillin-resistant cfu), because only the recombinants have a restored, functional kanamycin resistance gene. When there are three or six Chi sites adjacent to the recombination cassette (corresponding strains 3×ChikanS and 6×ChikanS), recombination frequencies with CbAgo/GS are significantly higher than in the remaining control groups (Supplementary Figure S3A), suggesting a recombination pathway that is mediated by guide-directed CbAgo cleavage.
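The frequency defined above is a ratio of two titers obtained from serial dilutions, which can be sketched as below. The helper names, the plated volume and the colony counts are assumptions for illustration; the text does not state the plated volume or the dilutions used.

```python
def cfu_per_ml(colonies: int, fold_dilution: float, plated_volume_ml: float) -> float:
    """Titer of the undiluted culture from a colony count on one plate.

    fold_dilution is the total dilution of the plated sample, e.g. 1e6 for a 10^-6 dilution.
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * fold_dilution / plated_volume_ml


def recombination_frequency(kan_amp_cfu_per_ml: float, amp_cfu_per_ml: float) -> float:
    """(kanamycin- and ampicillin-resistant cfu) / (ampicillin-resistant cfu)."""
    if amp_cfu_per_ml <= 0:
        raise ValueError("ampicillin-resistant titer must be positive")
    return kan_amp_cfu_per_ml / amp_cfu_per_ml


# Hypothetical plate counts, assuming 0.1 ml plated per plate.
amp_titer = cfu_per_ml(colonies=120, fold_dilution=1e6, plated_volume_ml=0.1)
kan_amp_titer = cfu_per_ml(colonies=60, fold_dilution=1e2, plated_volume_ml=0.1)
print(f"{recombination_frequency(kan_amp_titer, amp_titer):.1e}")
```

Both titers are scaled back to the undiluted culture before the ratio is taken, so plates counted at different dilutions remain comparable.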
Interestingly, we observed substantial recombination frequencies in dCbAgo/GS groups under some conditions (Supplementary Figure S3A). When there is no Chi site adjacent to the recombination cassette (corresponding strain nonChikanS), cells bearing dCbAgo/GS had a recombination frequency ∼10-fold higher than cells bearing CbAgo/GS. These results suggest there is a recombination pathway that is mediated by the non-cleavage function of CbAgo and is outperformed by the cleavage-dependent pathway in the presence of the CbAgo active site. Although the exact mechanism remains unknown, this dCbAgo-mediated recombination pathway should be independent of DSBs and RecBCD, because dCbAgo/GS was previously shown to be unable to generate DSBs or trigger RecBCD activity in vivo (18).
We also examined the effects of GS length on the recombination frequency (Supplementary Figure S3B) and found that in the range of 50-500 bp, recombination frequency increases as GS length increases. We decided to use 1000 bp as the GS length and 3×ChikanS as the model strain to perform fluctuation analysis (20) to estimate recombination rates (Figure 1D). Fluctuation analysis, in general, provides an accurate estimate of mutation rate by calculating mutations per generation, whereas recombination frequency reflects the average number of mutants in the population, which may vary greatly because mutants that arise earlier during growth will expand more than those that arise later. Cells bearing CbAgo/GS had a recombination rate that is 5-fold higher than those with dCbAgo, with and without GS, 15-fold higher than those with CbAgo only, and 30-fold higher than the remaining control groups. The actual contribution of dCbAgo-mediated recombination to the total recombination events in 3×ChikanS CbAgo/GS should be much smaller than one-fifth, because it should be largely outperformed by the cleavage-dependent pathway, as the observations above suggest. Together, these findings reveal a novel form of chromosome recombination that is induced by guide-directed CbAgo cleavage. Its Chi site dependence implies DSB formation and RecBCD processing during recombination.

Validation of the recombination system
To demonstrate the reliability of our recombination system, we tested four additional pAgos in strain 3×ChikanS: CaAgo, CdAgo, CpAgo and IbAgo (refer to Supplementary Table S4 for a summary of the pAgos used in this study). In the presence of GS, the rates of recombination induced by different pAgos correlate with the rank order of their reported in vitro ssDNA cleavage activity (Figure 2A; see (27)). We also tested an engineered DSB in our system by integrating a pal246 into the lacZ locus on the genome of strain 3×ChikanS and its sbcCD knockout mutant and measuring their recombination rates (Figure 2B, Supplementary Figure S4). The sbcCD⁺, lacZ::pal246 strain yielded a ∼100-fold increase in recombination rate compared to the sbcCD⁺, lacZ⁺ strain and the sbcCD, lacZ::pal246 strain. The larger fold change produced by SbcCD/pal246 than by CbAgo/GS is consistent with the previous observation that SbcCD/pal246 is more efficient in DSB generation than CbAgo/GS in vivo (18). These findings indicate a strong correlation between pAgo DNA cleavage activity, DSB generation efficiency, and recombination rate in our system.

Recombination depends on DSB generation and RecBCD but not RecA
To gain more insight into our recombination system, we first sought to provide solid evidence that CbAgo can be directed to attack E. coli chromosomes and cause DNA damage (Figure 1A, steps i and ii). Since DNA damage induces the cellular SOS response, we used an E. coli strain carrying a chromosomally located gfp gene controlled by an SOS-inducible sulA promoter (21) and performed flow cytometry to quantify the single-cell fluorescence level (Figure 3A). We observed a 6-fold increase of fluorescence in cells expressing CbAgo and a 15-fold increase in cells containing CbAgo/GS, while the cells containing dCbAgo/GS exhibited no difference in cellular fluorescence level compared to the cells bearing empty plasmids (Figure 3B). The fluorescence increase in cells expressing CbAgo without GS can be explained by previous observations that CbAgo actively degrades plasmids (18) and that plasmid degradation triggers the SOS response (28). Alternatively, CbAgo molecules loaded with chromosome-derived gDNAs can attack chromosomes and trigger chromosome stress. Importantly, the significant increase of fluorescence in cells expressing CbAgo in the presence of GS confirms that CbAgo can be guided to attack chromosomes, while dCbAgo/GS cannot.
Figure caption fragment: ... Figure S4 for the genetic structure of engineered DSB. Recombination rates were determined by fluctuation analysis from eight independent cultures. Error bars represent 95% confidence intervals.
Then we sought to determine the involvement of the E. coli endogenous DNA repair machinery RecA and RecBCD in the recombination by creating and testing 3×ChikanS recA and 3×ChikanS recBCD mutant strains. Since the viabilities of the knockout strains varied a great deal after induction (Supplementary Figure S5), we determined that fluctuation analysis was no longer suitable and decided to analyze recombination frequencies directly. For the 3×ChikanS recBCD strain, there was no difference in the recombination frequency of the CbAgo-expressing cells in the presence or absence of GS. This is a significant change from the result using the recBCD⁺ strain (Figure 3C), indicating that the recombination induced by CbAgo/GS depends on RecBCD (Figure 1A, step iii). Since RecBCD acts at DSBs (29), this observation also suggests DSB generation in CbAgo/GS-induced recombination. An interesting discovery was that the 3×ChikanS recBCD strain bearing dCbAgo/GS showed ∼1000-fold decreased ampicillin-resistant cfu and only ∼20-fold decreased kanamycin-resistant and ampicillin-resistant cfu compared to its RecBCD⁺ counterpart (Supplementary Figure S5). These changes resulted in increased recombination frequency (Figure 3C), suggesting that dCbAgo-mediated recombination is RecBCD-independent. Moreover, this growth inhibition was reduced by ∼60-fold in the presence of the CbAgo active site, suggesting it is outperformed or inhibited when the CbAgo active site is present.
The CbAgo/GS-induced DSBs can further explain the extremely low viability of the 3×ChikanS recA strain bearing CbAgo/GS (Supplementary Figure S5): without the protection of RecA, continuously introduced DSBs trigger extensive DNA degradation by RecBCD, causing an enormous loss of chromosomal DNA and subsequent cell death (30-32). On the other hand, these cells had a high recombination frequency close to 0.1 (Figure 3C), indicating that RecA is not essential in the recombination induced by guide-directed CbAgo cleavage even though it actively repairs DSBs generated during the process. These findings motivated us to change the final step of our model (Figure 1A, step v) to be independent of RecA. We hypothesize that this step (the actual recombination between the two direct repeat sequences) may involve a mechanism similar to the RecA-independent, direct repeat-mediated DNA deletion during replication arrest (33,34).

CbAgo cleavage assists recombineering
The observation that the cfu of the 3×ChikanS recA strain was reduced by three orders of magnitude when its genome is targeted by CbAgo (Supplementary Figure S5) is very intriguing, as it supports a strategy to leverage CbAgo/GS targeting as a counter-selection to facilitate recombineering (Figure 4A). For comparison, a self-targeting CRISPR-Cas9 system was reported to reduce cfu by three orders of magnitude in E. coli (35). When the CRISPR-Cas9 system was co-expressed to eliminate unedited cells, Lambda-Red recombineering achieved an increase in efficiency of ∼10⁴-fold and a 65% overall mutation rate. We sought to combine the CbAgo targeting system with Lambda-Red recombineering by introducing CbAgo expression plasmids into strain SIJ488 recA, which is RecA-deficient and has arabinose-inducible Lambda-Red recombineering genes integrated into its genome. We first performed the standard recombineering procedure with a dsDNA donor encoding a kanamycin resistance cassette to replace the genomic lacZ gene. The recombineering efficiency was 1.2 × 10⁻⁴, calculated from the fraction of cells that became kanamycin resistant. Then we performed recombineering in cells containing CbAgo plasmids, induced CbAgo expression, and recovered the cells to characterize the proposed counter-selection effect. Cells transformed with pTet CbAgo/GS had a mutation efficiency of 2.3 × 10⁻², representing a ∼100-fold increase in efficiency over standard recombineering (Figure 4B). The other control groups did not yield improvement; therefore, the increased proportion of edited cells in the population depends on guide-directed CbAgo cleavage.
Figure caption fragment: Recombination frequencies in different genetic contexts. Error bars, mean ± s.d. from eight independent cultures. P values were calculated by two-tailed unpaired Student's t-test; n.s. P > 0.05, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.

DISCUSSION
Our study demonstrates that the combination of CbAgo and a plasmid-encoded GS can induce mutations in the E. coli chromosome via guide-directed CbAgo cleavage of target DNA, activation of DNA repair mechanisms, and subsequent chromosome recombination. This strategy may apply to other organisms if chromosomal DSBs can be introduced and the necessary cellular repair machinery can be triggered. We also demonstrate the potential of pAgo targeting to assist recombineering in RecA-deficient strains as another genome editing strategy. This method may extend to RecA-active strains if RecA activity can be efficiently inhibited by, for example, expressing a RecA inhibitor (36). A recent study also reported NgAgo-assisted recombineering (NgAgo, pAgo from Natronobacterium gregoryi), but the fold change was smaller than 2 and the enhancement of editing was not dependent on NgAgo endonuclease activity (37). The mechanism of guide-directed recombination by dCbAgo in our system remains unknown, although this pathway appears to be independent of RecBCD and DSBs. Since dCbAgo can load gDNAs from plasmids in vivo (18) (Supplementary Figure S2), it is possible that dCbAgo plays a role in target recognition and the subsequent recruitment of an E. coli nuclease or recombinase. This speculation is supported by findings in other pAgo research (14,38,39) and by the fact that many pAgo genes have been found associated with a variety of genes, including nucleases and helicases (1,13).
Figure caption fragment: After recombineering, the growth of unedited cells will be suppressed by CbAgo cleavage and subsequent DNA degradation, while successfully edited cells will be resistant to CbAgo cleavage and exhibit kanamycin resistance. Yellow, kanamycin resistance cassette. Blue, lacZ gene. (B) Mutation rates using strain SIJ488 recA in different genetic contexts. † Standard recombineering procedure was applied using plasmid-free cells. See materials and methods for details. Error bars, mean ± s.d. from three independent experiments. P values were calculated by two-tailed unpaired Student's t-test; *P < 0.05.
Although CbAgo is among the most active pAgo nucleases identified so far (27), its DNA cleavage activity is still ∼10-fold lower than that of restriction endonucleases (7). The recombination system presented here can potentially serve as a reporter and selection platform that links pAgo cleavage to the development of antibiotic resistance. Mutants with enhanced DNA cleavage activity are therefore likely to exhibit higher survival rates and be selected. A highly active pAgo nuclease, once obtained, should have greater potential in DNA-guided genome engineering in vivo and, alternatively, may serve as a versatile restriction enzyme in vitro capable of targeting theoretically any DNA sequence using small oligos as gDNAs, providing a unique advantage over commercial restriction enzymes (40).
An accompanying study by Esyunina et al. (41) has independently demonstrated that plasmid-guided CbAgo can induce homologous recombination in the target chromosomal loci, promote homologous recombination between plasmid and chromosomal DNA, and that this activity can be used for genome engineering.

DATA AVAILABILITY
Additional notes and data are available in the Supplemental materials.