Interrogating two extensively self-targeting Type I CRISPR-Cas systems in Xanthomonas albilineans reveals distinct anti-CRISPR proteins that block DNA degradation

Abstract
CRISPR-Cas systems store fragments of invader DNA as spacers to recognize and clear those same invaders in the future. Spacers can also be acquired from the host's genomic DNA, leading to lethal self-targeting. While self-targeting can be circumvented through different mechanisms, natural examples remain poorly explored. Here, we investigate extensive self-targeting by two CRISPR-Cas systems encoding 24 self-targeting spacers in the plant pathogen Xanthomonas albilineans. We show that the native I-C and I-F1 systems are actively expressed and that CRISPR RNAs are properly processed. When expressed in Escherichia coli, each Cascade complex binds its PAM-flanked DNA target to block transcription, while the addition of Cas3 paired with genome targeting induces cell killing. While exploring how X. albilineans survives self-targeting, we predicted putative anti-CRISPR proteins (Acrs) encoded within the bacterium's genome. Screening of identified candidates with cell-free transcription-translation systems and in E. coli revealed two Acrs, which we named AcrIC11 and AcrIF12Xal, that inhibit the activity of Cas3 but not Cascade of the respective system. While AcrIF12Xal is homologous to AcrIF12, AcrIC11 shares sequence and structural homology with the anti-restriction protein KlcA. These findings help explain tolerance of self-targeting through two CRISPR-Cas systems and expand the known suite of DNA degradation-inhibiting Acrs.


Introduction
Bacteria and archaea employ a variety of methods to defend against invaders (1). Of these, the only known defenses conferring adaptive immunity are CRISPR-Cas systems. These systems are incredibly diverse, with two classes, six types and >30 subtypes defined to date (2). Despite their diversity, all CRISPR-Cas systems utilize three general steps for adaptive immunity. In the first step (termed adaptation), CRISPR-Cas systems acquire short nucleic-acid fragments from invaders that are integrated as spacers in between conserved repeats within CRISPR arrays (3,4). In the second step (processing), the CRISPR arrays are transcribed as long precursor CRISPR RNAs (crRNAs) that are processed into mature crRNAs (5). Finally, in the third step (interference), mature crRNAs guide the CRISPR effector proteins to a DNA or RNA region complementary to the spacer portion of the crRNA. Targets flanked by a protospacer-adjacent motif (PAM) or targets lacking complementarity with the repeat portion of the crRNA activate the nuclease (6-9). Activation then leads to either cleavage of the target that clears the invader or widespread collateral RNA cleavage that induces cellular dormancy (10-14).
The incorporation of new spacers during the adaptation step is generally biased towards foreign nucleic acids, although accidental incorporation of genomic fragments can occur (15,16). These genomically acquired spacers would direct the CRISPR-Cas system against the host's own genome (i.e. autoimmunity), which should be lethal and therefore selected against (12,15,17); nevertheless, these spacers are quite common, with about 20% of bacteria with a CRISPR-Cas system harboring one or multiple self-targeting spacers (16). To date, several modes of escape have been identified that explain how bacteria can evade autoimmunity triggered by self-targeting spacers (18). One mode is mutating the cas genes to inhibit one or multiple steps of CRISPR-Cas targeting, although this outcome sacrifices the protective function of the CRISPR-Cas system (15,19-21). Another mode is mutating or deleting the target region or flanking PAM to avoid recognition by the CRISPR-Cas system (15,22,23). A third is to block cas expression, again sacrificing the protective function. A final mode is inhibiting targeting by the CRISPR-Cas systems through anti-CRISPR proteins (Acrs), small and diverse proteins often encoded in prophages (16), that subvert immune defense but can also prevent autoimmunity.
While the different escape modes avert self-targeting for the affected bacteria, it is hard to determine the exact mechanism in a given bacterium with a known self-targeting system. Sequence-based bioinformatic analyses can identify some escape modes such as mutation of cas genes or target regions. However, other modes can be difficult to identify based on sequence information alone. Nevertheless, exploration of bacteria with self-targeting spacers revealed new classes of Acrs and uncovered functions of CRISPR-Cas systems that extend beyond adaptive immunity (24-31). To date, few experimental investigations of self-targeting have been conducted, with most focusing on bacteria with a single CRISPR-Cas system and few self-targeting spacers (18). In this study, we investigated self-targeting by two type I CRISPR-Cas systems encoded in the plant pathogen Xanthomonas albilineans CFBP7063. We discovered two endogenous Acrs that we named AcrIC11 and AcrIF12Xal, which inhibit the respective system's nuclease activity but not DNA binding activity. Interestingly, AcrIC11 is homologous to the anti-restriction protein KlcA, suggesting that this Acr could also inhibit a distinct and common class of bacterial defenses. Our results uncover how X. albilineans likely escapes extensive self-targeting through two orthogonal CRISPR-Cas systems and expand the small set of Acrs known to inhibit nuclease activity of type I CRISPR-Cas systems.

Plasmid construction
Supplementary Table S1 lists all plasmids used in this work. pXalb_IC_Cascade_GG was produced by Gibson Assembly (GA) using pXalb_IC_Cascade (32) as backbone and adding two type I-C repeats interspaced by mrfp1 that can be excised with the restriction enzyme SapI. J23108 was used as a promoter driving array expression. pXalb_IC_Cascade_sp1-4 were produced with GoldenGate using pXalb_IC_Cascade_GG as backbone and SapI (NEB) as restriction enzyme. Inserts were ordered from IDT as single-stranded oligos, phosphorylated by T4 PNK (NEB) and annealed by heating to 95 °C for 5 min and gradually cooling to room temperature.
pXalb_IF1_Cascade_sp1 was created by GA using pXalb_IF1_Cascade (32) as backbone and adding two type I-F1 repeats interspaced by spacer 1. J23108 was used as a promoter driving array expression. pXalb_IF1_Cascade_sp2-4 were created by Site Directed Mutagenesis (SDM) on pXalb_IF1_Cascade_sp1.
pXalb_IC_Cascade_NT and pXalb_IF1_Cascade_NT were created by SDM on pXalb_IC_Cascade_GG and pXalb_IF1_Cascade_sp1, respectively. pXalb_IC_Cas3_J23105 and pXalb_IF1_Cas2-3_J23105 were created by GA using pXalb_IC_Cas3 and pXalb_IF1_Cas2-3 (32) for nuclease amplification and pCB705 (33) as backbone, and changing kanamycin resistance to ampicillin resistance. pXalb_noCas3 was produced with SDM on pXalb_IC_Cas3_J23105. p70a_deGFP_sc101 was created by changing the origin of replication (ori) of p70a_deGFP to sc101 with GA using pCB705 (33) as source for the ori. pAcr_1-17_T7, pAcrIF12_T7 and pAcrIF3_T7 were created by GA using pET28a as backbone and double-stranded DNA fragments containing E. coli codon-optimized Acr sequences ordered from IDT as inserts. pAcr_1/3/5/7_J23105 and pAcr_15_J23115 were created by SDM on pAcr_1/3/5/7/15_T7, respectively.
All constructed plasmids were verified with Sanger sequencing. Plasmids pXalb_IC_Cas3 and pXalb_IF1_Cas2-3 were previously deposited to Addgene and are available with the plasmid IDs 178766 and 178769, respectively.

RNA Sequencing
X. albilineans CFBP7063 (also named GPE PC73) was grown in TSB medium to an OD of 1.0 and 2 mL were pelleted. Total RNA was extracted with Direct-zol RNA Miniprep Plus (Zymo Research) including the in-column DNase I treatment according to the manufacturer's instructions. An additional DNase I treatment with TURBO DNA-free Kit (Thermo Fisher) was performed and the RNA was cleaned with RNA Clean & Concentrator (Zymo Research). The RNA sample was split into two parts, where one part was used for sequencing of total RNA and the second part was used to sequence shorter-length RNAs.
For sequencing of total RNA, ribosomal RNA was depleted and the cDNA library was prepared using NEBNext Ultra II Directional RNA Library Preparation Kit (NEB). Next-generation sequencing was performed with 50 bp paired-end reads and 25 million reads on an Illumina NovaSeq 6000 sequencer. Sequencing quality was assessed with FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and sequencing data were cleaned with Cutadapt (34). Reads were mapped to the X. albilineans CFBP7063 genome (FP565176.1) using RNA STAR (35) and visualized with Geneious Prime 2019.1.3 (https://www.geneious.com). Htseq-count (36) was used to determine the number of reads per gene for calculation of TPM.
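The conversion from per-gene read counts to TPM can be sketched as follows. This is a minimal illustration of the standard TPM definition, not the authors' script; the function name `tpm`, the gene names, the counts and the gene lengths are all hypothetical, and the text does not state which gene lengths were used for normalization.

```python
def tpm(counts, lengths_bp):
    """Convert per-gene read counts to transcripts per million (TPM).

    counts and lengths_bp are dicts keyed by gene name (hypothetical inputs).
    """
    # Step 1: reads per kilobase, which corrects for gene length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    # Step 2: scale so that all values sum to one million.
    scale = sum(rpk.values()) / 1e6
    return {gene: value / scale for gene, value in rpk.items()}


# Made-up example values, for illustration only.
example_counts = {"geneA": 100, "geneB": 300, "geneC": 50}
example_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 500}
example_tpm = tpm(example_counts, example_lengths)
```

In this made-up example all three genes have the same read density per kilobase, so they receive equal TPM values even though their raw counts differ.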
For RNA sequencing of shorter-length RNAs, the cleaned RNA was treated with 2 U/μl T4 PNK (NEB) in 1x T4 DNA Ligase Reaction Buffer (NEB) and 1 U/μl SUPERase•In RNase Inhibitor (Thermo Fisher) for 40 min at 37 °C. An additional clean-up with RNA Clean & Concentrator (Zymo Research) was added. RNAs with a length of 15-100 nts were selected and the library was prepared using NEBNext Small RNA Library Preparation Kit (NEB). Next-generation sequencing was performed with 150 bp paired-end reads and 30 million reads on an Illumina NovaSeq 6000 sequencer. Sequencing quality was assessed with FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and sequencing data were cleaned with Cutadapt (34). Bowtie2 (37,38) was used to align sequencing data to the X. albilineans CFBP7063 genome (FP565176.1) and Geneious Prime 2019.1.3 (https://www.geneious.com) was used to visualize the alignment.
Total RNA-Seq and small RNA-Seq were performed in biological duplicates. The sequencing data are publicly available through GEO accession number GSE229478. Significance between the expression of the target genes and other genes in major prophage regions was calculated using the average TPM values between replicates and using a one-sided Student's t-test with unequal variance. P = 0.05 was used as the cutoff for significance.

Cascade binding assay in E. coli
To assess the binding ability of the type I-C CRISPR-Cas system, E. coli MG1655 containing p70a_deGFP_sc101 and pXalb_IC_Cascade_s4 or pXalb_IC_Cascade_NT were used. E. coli MG1655 with pXalb_IC_Cascade_s4 only was used as the negative control. To determine the binding ability of the type I-F1 CRISPR-Cas system, E. coli MG1655 containing p70a_deGFP_sc101 and pXalb_IF1_Cascade_s4 or pXalb_IF1_Cascade_NT were used. E. coli MG1655 with pXalb_IF1_Cascade_s4 only was used as the negative control.
Cells were grown in appropriate selection medium at 37 °C for 16 h. After back-diluting cells to OD600 = 0.02, cells were grown at 37 °C to OD600 = 0.8. Cells were diluted 1:25 in 1x PBS and deGFP fluorescence was measured by flow cytometry using the Accuri C6 Plus analytical flow cytometer (BD Biosciences). Gating on living cells was applied and 30,000 events were measured. Final fluorescence values were calculated by subtracting fluorescence obtained from the negative control. Fold-reduction was calculated as the ratio of the no-array over the targeting final fluorescence values. Significance was calculated between the no-array and the targeting fluorescence values using Welch's t-test. P > 0.05 is shown as ns, P < 0.05 is shown as *, P < 0.01 is shown as ** and P < 0.001 is shown as ***.
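The background subtraction, fold-reduction and significance labels described above can be sketched as follows. This is an illustrative outline only: the function names and the fluorescence values are hypothetical, and averaging replicates before subtracting the negative control is an assumption, since the text does not specify the order of operations.

```python
from statistics import mean


def fold_reduction(no_array, targeting, negative_control):
    """Ratio of background-subtracted no-array over targeting fluorescence."""
    # Assumption: replicate values are averaged before background subtraction.
    background = mean(negative_control)
    final_no_array = mean(no_array) - background
    final_targeting = mean(targeting) - background
    return final_no_array / final_targeting


def significance_label(p_value):
    """Map a P value to the labels used in the figures (ns, *, **, ***)."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    # The text does not define P == 0.05 exactly; it is treated as ns here.
    return "ns"


# Made-up fluorescence values (arbitrary units), for illustration only.
example_fold = fold_reduction(
    no_array=[1010, 1010], targeting=[20, 20], negative_control=[10, 10]
)
```

With these made-up values the background is 10, so the final fluorescence values are 1000 and 10.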

Cas3 degradation assay in E. coli
To assess the degradation ability of the type I-C system, electrocompetent E. coli MG1655 containing type I-C Cascade and a targeting array (pXalb_IC_Cascade_sp1-3) or a no-array control (pXalb_IC_Cascade_NT) were prepared and electroporated with 50 ng pXalb_IC_Cas3_J23105. 50 ng pXalb_noCas3 was electroporated as a no-nuclease control. After a one-hour recovery in SOC medium at 29 °C, samples were diluted 1:100 in LB medium with 34 μg/mL chloramphenicol (Cm) and incubated at 29 °C for 16 h. Subsequently, 1:5 dilution series of the cultures were prepared and 5 μl spot dilutions were plated on LB plates with 34 μg/mL Cm and 100 μg/mL ampicillin (Amp). The plates were incubated at 29 °C for 24 h before calculation of colony forming unit (CFU) values.
Degradation ability of the type I-F1 system was studied with electrocompetent E. coli MG1655 containing the type I-F1 Cascade and a targeting array (pXalb_IF1_Cascade_sp1-3) or a no-array control (pXalb_IF1_Cascade_NT) that were electroporated with 50 ng pXalb_IF1_Cas2-3_J23105. 50 ng pXalb_noCas3 was electroporated as a no-nuclease control. After a one-hour recovery in SOC medium at 37 °C, samples were diluted 1:100 in LB medium with 34 μg/mL Cm and incubated at 37 °C for 16 h. Subsequently, 1:5 dilution series of the cultures were prepared and 5 μl spot dilutions were plated on LB plates with 34 μg/mL Cm and 100 μg/mL Amp. The plates were incubated at 37 °C for 16 h before calculation of CFU values.
Transformation fold-reduction was calculated as the ratio of no-array CFU values over targeting CFU values. Significance was calculated between the log10(CFU) values obtained from the no-array samples and the targeting samples using Welch's t-test. P > 0.05 is shown as ns, P < 0.05 is shown as *, P < 0.01 is shown as ** and P < 0.001 is shown as ***.
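As a rough sketch of the calculations above, the following code estimates CFU/mL from one countable spot of a 1:5 dilution series and computes the Welch t statistic with its Welch-Satterthwaite degrees of freedom on log10(CFU) values. The function names, the convention that dilution step 0 is the undiluted culture, and all numbers are hypothetical; converting the statistic to a P value would additionally require a t-distribution function from a statistics library.

```python
import math
from statistics import mean, variance


def cfu_per_ml(colonies, dilution_step, spot_volume_ul=5.0, dilution_factor=5.0):
    """Estimate CFU/mL from colonies counted in one spot of a dilution series.

    Assumption: dilution_step 0 is the undiluted culture, 1 the first 1:5
    dilution, and so on.
    """
    return colonies * dilution_factor ** dilution_step * 1000.0 / spot_volume_ul


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Made-up values, for illustration only.
example_cfu = cfu_per_ml(colonies=10, dilution_step=3)
log_no_array = [8.0, 8.2, 8.4]    # hypothetical log10(CFU) of no-array samples
log_targeting = [6.0, 6.2, 6.4]   # hypothetical log10(CFU) of targeting samples
example_t, example_dof = welch_t(log_no_array, log_targeting)
# Fold-reduction is the ratio of no-array over targeting CFU values.
example_fold_reduction = 10 ** mean(log_no_array) / 10 ** mean(log_targeting)
```

Testing on log10(CFU) rather than raw CFU, as the text describes, keeps samples that differ by orders of magnitude on a comparable scale.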

Acr prediction in X. albilineans
To identify Acr candidates in X. albilineans CFBP7063, we adopted a two-pronged approach looking for novel Acrs or homologs of known Acrs. In our first approach, we performed a guilt-by-association search for HTH motif-containing proteins that are typically found flanking Acr proteins (39). We then focused on candidates contained in predicted prophage regions on the chromosome based on VirSorter v1.0.3 (53), Prophage Hunter (54) and PHASTER (55,56) as well as on any of the three plasmids. The predicted prophage regions are listed in Supplementary Table S2. In our second approach, we began by assembling a database of Acr protein sequences, host genomes and virus genomes. The final database contains 400 Acr protein sequences, which were collected from published papers (25,39-47). To define new and extended Acr protein families, we performed an all-against-all sequence similarity comparison on the Acr protein sequences using the Fasta tool (48). Subsequently, we clustered the Acrs using the Markov Cluster Algorithm (MCL) (49) based on previously published similarity criteria (50). Next, we used the MUSCLE tool (51) to generate a multiple sequence alignment (MSA) for each protein cluster. We then created Hidden Markov models (HMMs) for each cluster based on the generated MSA using Hmmbuild (51,52). The HMM profile models were then run against all genes within the X. albilineans CFBP7063 genome using the Hmmsearch tool. All hits with an e-value below the cut-off of 0.001 were selected. The final list of putative Acrs can be found in Supplementary Table S3, which includes how the Acrs were predicted.
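The final filtering step of the homology search can be sketched as below, assuming Hmmsearch was run with its tabular per-target output (`--tblout`), in which whitespace-separated columns 1, 3 and 5 hold the target name, the query profile name and the full-sequence E-value. The text does not state which output format was parsed, so this is an assumption; the function name and the sample table (protein and cluster names, scores) are made up for illustration.

```python
def significant_hits(tblout_text, evalue_cutoff=0.001):
    """Return (target, query, e-value) tuples below the cutoff, best first."""
    hits = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' comment/header lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query, evalue = fields[0], fields[2], float(fields[4])
        if evalue < evalue_cutoff:
            hits.append((target, query, evalue))
    return sorted(hits, key=lambda hit: hit[2])


# Made-up table content, for illustration only (trailing columns omitted).
example_tblout = """\
# target name  accession  query name  accession  E-value  score  bias
protein_0420   -          cluster_02  -          0.5      8.2    0.0
protein_0001   -          cluster_07  -          2.1e-12  45.3   0.1
protein_0137   -          cluster_11  -          3.0e-05  22.9   0.0
"""
example_hits = significant_hits(example_tblout)
```

Only the two hypothetical hits with E-values below 0.001 are retained, ordered from most to least significant.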
To find protein sequences homologous to AcrIC11, we performed a comprehensive search using four iterations of PSI-BLAST against the metagenomic and NCBI non-redundant protein databases. Next, multiple sequence alignments of the identified proteins and AcrIC11 were generated with the MUSCLE tool. Finally, a phylogenetic tree was constructed.

Acr activity in TXTL Cascade binding assay
The Cas proteins required for Cascade formation that were used in TXTL experiments were encoded on separate plasmids. Therefore, a MasterMix with the required Cas protein-encoding plasmids in their stoichiometric amounts was prepared beforehand. For the type I-C system, we used a Cas5:Cas8c:Cas7 stoichiometry of 1:1:7 and for the type I-F1 system, we used a Cas8f1:Cas5f1:Cas7f1:Cas6f stoichiometry of 1:1:6:1.
To test if and to what extent predicted Acrs lead to inhibition of binding activity in TXTL, we further developed our previously used TXTL deGFP repression assays (32). To this end, we prepared 3 μl TXTL reactions containing the following: 2.25 μl myTXTL Sigma 70 Master Mix, 0.2 nM p70a_T7RNAP, 0.5 mM IPTG, 1 nM pXalb_IC/IF1_gRNA1/nt, 0.5 nM I-C or I-F1 Cascade MasterMix and 1 nM or 0.125 nM pAcr_X_T7 (1 nM: Acr_1-14 and Acr_16; 0.125 nM: Acr_15 and Acr_17). Acr_15 and Acr_17 were added at lower concentrations to avoid the unspecific deGFP inhibition that we observed at a concentration of 1 nM. Reactions without Acr-containing plasmids were used as '-Acr' controls. The TXTL reactions were incubated in a 96-well V-bottom plate at 29 °C for 4 h to ensure the formation of a ribonucleoprotein complex. Furthermore, the incubation time allows expression of the Acrs and inhibition of the first steps of CRISPR-Cas activity. After the incubation time, 1 nM p70a_deGFP reporter plasmid was added to the TXTL mix, the reaction was incubated at 29 °C for an additional 16 h and fluorescence endpoints were measured with a BioTek Synergy H1 plate reader (BioTek) at 485/528 nm excitation/emission (53). The crRNAs encoded in pXalb_IC/IF1_gRNA1 are designed to target within the degfp promoter region 3′ of a TTC or a CC PAM for the type I-C or the type I-F1 system, respectively, to ensure that active targeting leads to inhibition of deGFP expression. All reactions were prepared with the liquid handling machine Echo525 (Beckman Coulter). Inhibition was calculated with the following equation:

Acr activity in TXTL Cas3 degradation assay
To test Acrs for their inhibitory activity on type I-C or type I-F1 degradation in TXTL, we extended our previously used degradation assay (32) in the same way as the Cascade binding inhibition assay described above. We shifted the target region from the degfp promoter to an upstream sequence (flanked by a 5′ TTC or 5′ CC PAM for the type I-C and the type I-F1 system, respectively). Cas3 was added to the TXTL reaction to enable degradation of the reporter plasmid and thereby reduce deGFP production, while Cascade binding without degradation would not impair deGFP expression. Inhibition of a CRISPR-Cas system by an Acr in the degradation test but not in the binding test indicates specific inhibition of DNA degradation by the Acr.
Reactions comparing the inhibitory activity of Acr_1 (AcrIF12Xal), AcrIF12 and AcrIF3 were performed in 5 μl TXTL reactions as described above. pAcr_1_T7, pAcrIF12_T7 or pAcrIF3_T7 was added at final concentrations of 1-4 nM.

Acr activity in E. coli Cascade binding assay
To test the inhibition of Cascade binding by Acrs in E. coli, we adapted our flow cytometry assay assessing binding ability. E. coli MG1655 containing the reporter plasmid p70a_deGFP_sc101, pAcr_1/3/5/7_J23105, pAcr_15_J23115 or pET28a ('-Acr' control) and pXalb_IC_Cascade_s4 or pXalb_IC_Cascade_NT were used to investigate the type I-C system. E. coli MG1655 with pXalb_IC_Cascade_s4 only was used as the negative control.
To determine the binding ability of the type I-F1 CRISPR-Cas system, E. coli MG1655 containing p70a_deGFP_sc101, pAcr_1/3/5/7_J23105 or pAcr_15_J23115 and pXalb_IF1_Cascade_s4 or pXalb_IF1_Cascade_NT were used. E. coli MG1655 with pXalb_IF1_Cascade_s4 only was used as the negative control.
Cells were grown in appropriate selection medium at 37 °C for 16 h. After back-diluting cells to OD600 = 0.02, cells were grown at 37 °C to OD600 = 0.8. After cells were diluted 1:25 in 1x PBS, deGFP fluorescence was measured by flow cytometry using the Accuri C6 Plus analytical flow cytometer (BD Biosciences). Gating on living cells was applied and 30,000 events were measured. Final fluorescence values were calculated by subtracting fluorescence obtained from the negative control. deGFP fold-repression was calculated as the ratio of the no-array over the targeting final fluorescence values. Significance was calculated between the -Acr samples and the Acr-containing samples using Welch's t-test. P > 0.05 is shown as ns, P < 0.05 is shown as *, P < 0.01 is shown as ** and P < 0.001 is shown as ***. Inhibition was calculated with the following equation:

Acr activity in E. coli Cas3 degradation assay
To test the activity of Acrs in degradation inhibition in E. coli, we adapted our transformation assay assessing degradation ability. For the type I-C system, electrocompetent E. coli MG1655 containing type I-C Cascade, a targeting array (pXalb_IC_Cascade_sp2) or a no-array control (pXalb_IC_Cascade_NT), and pAcr_1/3/5/7_J23105, pAcr_15_J23115 or pET28a ('-Acr' control) were prepared and electroporated with 50 ng pXalb_IC_Cas3_J23105. After a one-hour recovery in SOC medium at 29 °C, samples were diluted 1:100 in LB medium with 34 μg/mL Cm and 50 μg/mL kanamycin (Kan) and incubated at 29 °C for 16 h. Subsequently, 1:5 dilution series of the cultures were prepared and 5 μl spot dilutions were plated on LB plates with 34 μg/mL Cm, 50 μg/mL Kan and 100 μg/mL Amp. The plates were incubated at 29 °C for 24 h before calculation of CFU values.
Degradation ability of the type I-F1 system was studied with electrocompetent E. coli MG1655 containing the type I-F1 Cascade, a targeting array (pXalb_IF1_Cascade_sp3) or a no-array control (pXalb_IF1_Cascade_NT), and pAcr_1/3/5/7_J23105, pAcr_15_J23115 or pET28a ('-Acr' control) that were electroporated with 50 ng pXalb_IF1_Cas2-3_J23105. After a one-hour recovery in SOC medium at 37 °C, samples were diluted 1:100 in LB medium with 34 μg/mL Cm and 50 μg/mL Kan and incubated at 37 °C for 16 h. Subsequently, 1:5 dilution series of the cultures were prepared and 5 μl spot dilutions were plated on LB plates with 34 μg/mL Cm, 50 μg/mL Kan and 100 μg/mL Amp. The plates were incubated at 37 °C for 16 h before calculation of CFU values.
Transformation fold-reduction was calculated as the ratio of the no-array over the targeting CFU values. Significance was calculated between the values obtained from the -Acr samples and the Acr-containing samples using Welch's t-test. P > 0.05 is shown as ns, P < 0.05 is shown as *, P < 0.01 is shown as ** and P < 0.001 is shown as ***. Inhibition was calculated with the following equation: 'NT' represents no-array values and 'T' represents targeting final values.

AlphaFold prediction and structural comparison
Protein structures of AcrIC11 in Supplementary Figure S3 were predicted using AlphaFold2 available through ColabFold (56,57) with default settings and subsequently visualized by Molstar (58). The structure of KlcA is available through the PDB database (2KMG) (www.RCSB.org) (42). The overlay of the highest-ranked predicted protein structure of AcrIC11 with KlcA was performed using the PyMOL Molecular Graphics System (Version 2.5.5, Schrödinger, LLC). For this, both protein structures were opened in PyMOL, and then KlcA was aligned to AcrIC11, which also calculates the RMSD value. Structural similarity was calculated using TMalign (version 20190922) (59). For comparison of amino acid sequences of AcrIC11 and KlcA, ClustalW multiple sequence alignment by MUSCLE (version 3.8) was performed (51,60).

Results
The two self-targeting CRISPR-Cas systems in Xanthomonas albilineans are actively expressed
The Xanthomonas albilineans strain CFBP7063 encodes two CRISPR-Cas systems, a type I-C system and a type I-F1 system, along with six CRISPR arrays. Of these arrays, one is associated with the type I-C system and five are associated with the type I-F1 system (Figure 1A). In total, four of the six CRISPR arrays (one type I-C array and three type I-F1 arrays) encode 24 self-targeting spacers directed mainly towards predicted prophage regions in the chromosome or towards one of the three plasmids of X. albilineans (Supplementary Table S4). Spacers guide their associated effector complex to complementary targets, resulting in target degradation during the interference step of CRISPR-Cas immunity. As part of interference, the effector complex Cascade (CRISPR-associated complex for antiviral defense), consisting of three to five Cas proteins and the mature crRNA, binds to the target DNA (61). The endonuclease Cas3 is then recruited to the target bound by Cascade (62) to nick the non-target strand and degrade the DNA in a 3′-to-5′ direction (63-65). Our previous work showed that both systems from X. albilineans efficiently carried out both steps of type I interference (32).
If cas genes are functionally encoded, one mode to evade lethal self-targeting is preventing expression of all or some cas genes. Therefore, we performed RNA sequencing (RNA-Seq) analysis on X. albilineans under different growth conditions. We could detect transcripts for all 13 cas genes (Figure 1B and Supplementary Figure S1), with expression levels ranging between 7 TPM (transcripts per million) for type I-F1 cas2-3 and 910 TPM for type I-C cas5. To compare these values to genes that should be functionally expressed, we depicted ten genes that were found to be essential in a member of the Xanthomonas species and that exhibit TPM levels comparable to the I-C and I-F1 cas genes (66) (Supplementary Figure S1). Thus, X. albilineans does not appear to protect against lethal self-targeting by actively suppressing transcription of the cas genes.
Expected processing of pre-crRNAs to mature crRNAs would be another indication of functional expression of the Cas proteins, as Cascade proteins are required for processing of crRNAs and the stability of mature crRNAs (67,68). To examine crRNA processing, we performed RNA-Seq analysis on shorter-length RNAs (Figure 1C). Most spacers in the CRISPR arrays gave rise to the expected mature crRNAs for either system, while the short arrays 1 and 5 yielded atypical processing products (Figure 1C and Supplementary Figure S1) (5). Array 4, the sole type I-C associated array, generally yielded the expected 11-nt 5′ handle (Figure 1C and Supplementary Figure S1), while arrays 2, 3 and 6 yielded the expected processing pattern of type I-F1 systems (8-nt 5′ handle) (Figure 1C and Supplementary Figure S1) (5). Interestingly, in arrays 2, 4 and 6, the most abundant crRNAs were self-targeting crRNAs (Figure 1C), excluding the possibility that autoimmunity is prevented by expressing only crRNAs that target foreign DNA. Therefore, we conclude that mature crRNAs as well as the necessary Cas proteins are produced to elicit self-targeting in X. albilineans.

Both CRISPR-Cas systems bind and degrade target DNA in E. coli
Beyond cas expression and crRNA processing, we investigated interference as the last step of CRISPR-Cas immunity. While interference could not be assessed in X. albilineans due to technical issues with plasmid transformation, our prior testing of Cascade and Cas3 with cell-free transcription-translation (TXTL) systems suggested that the I-C and I-F1 CRISPR-Cas systems each could enact interference in isolation (32). To assess if interference activity could lead to lethal chromosomal degradation, we assessed DNA targeting by either system in E. coli.
As Cascade must bind its DNA target before recruiting the nuclease Cas3 to induce target degradation (62), we first investigated target binding by Cascade in the absence of Cas3 based on transcriptional repression (69,70). We encoded the associated genes forming Cascade for the I-C system (cas5, cas8c and cas7) or the I-F1 system (cas8f1, cas5f1, cas7f1 and cas6f) as an operon on a plasmid under a constitutive promoter. The same plasmid also encoded a constitutively expressed single-spacer array we used in a previous study to target the deGFP reporter plasmid (32). The targets in the promoter of degfp were flanked by a 5′ TTC (I-C system) or 5′ CC (I-F1 system) PAM, which we previously identified and validated as preferred PAMs in vitro (32). Finally, the targeted deGFP reporter plasmid was added, and deGFP production was measured (Figure 2A). Cascade of both CRISPR-Cas systems repressed deGFP expression by ∼700-fold (I-C system) and ∼25-fold (I-F1 system) compared to the non-targeting control (Figure 2B). Therefore, either system's Cascade can bind DNA targets in vivo.
As target binding was successfully performed by the I-C and the I-F1 Cascade, we proceeded to test targeted DNA degradation by Cas3. We exchanged the deGFP reporter plasmid with a plasmid encoding the I-C cas3 or the I-F1 cas2-3. We then tested three different spacers targeting the promoter or the coding region of the chromosomal gene lacZ with a flanking 5′ TTC (I-C system) or 5′ CC (I-F1 system) PAM (Figure 2C and D). Both CRISPR-Cas systems significantly reduced plasmid transformation compared to the no-array control, indicating chromosomal degradation and cell death (Figure 2C and D). All three spacers of the type I-C system similarly reduced plasmid transformation, whereas spacer 2 and spacer 3 exhibited an ∼80-100 times higher fold change than spacer 1 in the type I-F1 system (Figure 2D). As expected, the absence of Cas3 negligibly reduced plasmid transformation in both systems (Figure 2D). Given the lethality of chromosomal targeting with Cascade and Cas3 from either system, additional factors separate from Cascade and Cas3 likely exist that protect X. albilineans from lethal self-targeting.

Predicted anti-CRISPR proteins inhibit both CRISPR-Cas systems in TXTL
We hypothesized that lethal self-targeting by both CRISPR-Cas systems is inhibited by the presence of Acrs encoded within the X. albilineans CFBP7063 genome (71,72). To identify potential Acrs, we performed a two-pronged approach: searching for homologs of known Acrs, and applying guilt-by-association (39), which evaluates genes that flank HTH-containing genes and that are found within predicted prophage regions (73-76) or on any of the three plasmids (see Figure 3A, Supplementary Table S2 and methods). This search produced 17 Acr candidates (initially named Acr_1 through Acr_17) (Supplementary Table S3). Our RNA-Seq analyses indicated that a subset of the predicted Acrs is expressed in X. albilineans (Supplementary Table S2), suggesting that at least some of the Acr candidates might actively inhibit one or both CRISPR-Cas systems.
We first subjected the predicted Acrs to TXTL assays we used previously (33,77) to assess their inhibitory activity. TXTL assays involve adding DNA constructs, resulting in the production of the encoded RNAs and proteins whose activity can be evaluated in the same reaction. We specifically developed two assays to evaluate the extent to which the inhibitory activity of each predicted Acr acted on or upstream of DNA binding, or on or upstream of DNA degradation (Figure 3B). The first assay is an extension of the previously used DNA binding assay and assesses inhibition of Cascade-mediated transcriptional repression of deGFP expression. Active Acrs prevent binding of Cascade to a target in the degfp promoter, resulting in unhindered deGFP expression. The second assay is an extension of the previous degradation assay and assesses inhibition of DNA degradation by Cas3 recruited by Cascade. Here, a target upstream of the degfp promoter is chosen such that active Acrs prevent plasmid degradation. Inhibitory activity in both assays would indicate an inhibitory mechanism at or upstream of DNA binding, while inhibitory activity in only the second assay would indicate a degradation-inhibiting mechanism.
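The interpretation logic of the two assays can be summarized in a short sketch. The function name, the fractional inhibition inputs and the 0.2 threshold for calling an Acr "active" are all hypothetical and chosen only for illustration; the text does not define a numerical cutoff.

```python
def classify_acr(binding_inhibition, degradation_inhibition, threshold=0.2):
    """Classify a candidate from its fractional inhibition in both assays.

    The threshold is an arbitrary illustrative value, not one from the study.
    """
    inhibits_binding = binding_inhibition >= threshold
    inhibits_degradation = degradation_inhibition >= threshold
    if inhibits_binding and inhibits_degradation:
        return "acts at or upstream of DNA binding"
    if inhibits_degradation:
        return "degradation-inhibiting"
    if inhibits_binding:
        # This outcome is not covered by the two-assay logic described above.
        return "binding assay only"
    return "no inhibition detected"


# Made-up inhibition values, for illustration only.
example_classes = [
    classify_acr(0.05, 0.60),
    classify_acr(0.70, 0.80),
    classify_acr(0.02, 0.03),
]
```

A candidate that scores only in the binding assay does not fit either mechanism described above and would need separate follow-up.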
We tested all 17 putative Acrs with both assays for their activity against the type I-C and the I-F1 CRISPR-Cas systems (Figure 3C). Transcriptional repression of degfp by the I-C Cascade was not substantially inhibited by any tested Acr candidate, at least not with an inhibitory activity higher than 11%. Type I-C degradation, on the other hand, was inhibited by multiple Acr candidates, with Acr_3 exhibiting the highest inhibitory activity (∼57%) followed by three other Acrs (Acr_1, Acr_5, Acr_7) exhibiting lower but measurable inhibitory activity. Acr_1 fully inhibited degradation by the I-F1 Cas3 but not binding by the I-F1 Cascade, suggesting that this candidate functions as a DNA degradation-inhibiting Acr. Acr_15 partially inhibited repression of deGFP expression in the type I-F1 binding assay by ∼30%, although no inhibition was observed in the degradation assay. No appreciable inhibition was observed for the other Acr candidates, suggesting that they are not inhibitors of either system, although we cannot rule out the possibility that the corresponding proteins were not functionally expressed in TXTL.
[...] expected for Acr_1, Acr_3, Acr_5 and Acr_7 given our prior TXTL results (Figure 3C).
As Cascade bound to target DNA recruits Cas3 to induce DNA degradation, we measured the inhibitory activity of each Acr candidate in the E. coli DNA degradation assay (Figure 4C). In this setup, inhibition of Cas3-mediated chromosomal DNA degradation would result in elevated colony numbers. As in our TXTL experiments, an Acr that inhibited a CRISPR-Cas system in the DNA degradation assay without restoring deGFP expression in the binding assay was categorized as a degradation-inhibiting Acr. Acr_3 and Acr_1 significantly lowered the fold-reduction in transformation for the type I-C and type I-F1 system, respectively, compared to a no-Acr control (Figure 4D). Mirroring our TXTL results, Acr_3 inhibited DNA degradation by 60%, while Acr_1 inhibited DNA degradation by 70% (Figure 4D). Furthermore, Acr_15 modestly but significantly relieved the reduction in plasmid transformation caused by the I-C system (17-fold reduction compared to 71-fold reduction in the no-Acr control), leaving open the question of whether Acr_15 represents a bona fide Acr. None of the other tested Acr candidates substantially suppressed degradation by either CRISPR-Cas system. Acr_1 and Acr_3, the two candidates that emerged as validated CRISPR inhibitors, were the highest and third-highest expressed, respectively, among the 17 candidates in X. albilineans (Supplementary Table S3), in line with active inhibition of self-targeting by both CRISPR-Cas systems. The lack of robust inhibition by Acr_5, Acr_7 and Acr_15 in E. coli reflects some disconnect between the experimental setups and underscores how TXTL results require follow-on validation in cellular systems.
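As a rough illustration of how the fold-reduction values above can be read, the sketch below computes fold-reduction as the ratio of colony counts between a no-array control and a targeting sample, following the description in the legend of Figure 2, and then compares the two fold-reductions reported for Acr_15. The function names and the colony counts are hypothetical, and the relief ratio is a descriptive quantity introduced here for illustration; it is not the inhibition metric used in the study.

```python
def fold_reduction(cfu_no_array: float, cfu_targeting: float) -> float:
    """Fold-reduction in transformation relative to a no-array control."""
    if cfu_no_array <= 0 or cfu_targeting <= 0:
        raise ValueError("colony counts must be positive")
    return cfu_no_array / cfu_targeting


def relief_ratio(fold_without_acr: float, fold_with_acr: float) -> float:
    """How many times smaller the fold-reduction becomes when an Acr is present.

    Hypothetical descriptive ratio, assumed here for illustration only.
    """
    return fold_without_acr / fold_with_acr


# Hypothetical colony counts giving a 71-fold reduction.
print(fold_reduction(cfu_no_array=7100, cfu_targeting=100))  # 71.0

# Fold-reductions reported for Acr_15 with the I-C system: 17 vs. 71 (no Acr).
print(round(relief_ratio(71.0, 17.0), 1))  # 4.2
```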
With the inhibitory activity of Acr_3 and Acr_1 validated in TXTL and E. coli, we asked how both Acrs are related to previously identified Acrs. Acr_3 does not share high amino-acid similarity to any previously characterized Acr but has numerous closely and distantly related homologs found in Xanthomonads and other bacterial pathogens (Figures 5A and S2). Thus, we renamed Acr_3, following the common nomenclature, to AcrIC11 (78). Interestingly, the identified homologs are annotated either as anti-restriction proteins or specifically as KlcA, a known inhibitor of Type I restriction-modification systems (79). To further explore the relationship between AcrIC11 and KlcA, we predicted the structure of AcrIC11 using AlphaFold (56, 57) and compared it to the published crystal structure of KlcA (79). Based on this comparison, the two proteins adopt similar structures (RMSD = 1.467 Å, TM-score = 0.79111) (59). These findings offer the intriguing possibility that AcrIC11 evolved from an inhibitor of a distinct defense system and may even inhibit both defenses. Beyond Acr_3, Acr_1 shares 44.8% amino-acid identity with the previously published AcrIF12 (25) (Figure 5B); we therefore renamed Acr_1 to AcrIF12 Xal. AcrIF12 was discovered next to an anti-CRISPR-associated gene 4 (aca4) by the 'guilt-by-association' method in Pseudomonas aeruginosa (25). The exact mechanism of AcrIF12 is unknown, although it was reported that this Acr does not strongly bind to Cascade or Cas3 in isolation (80). To test whether AcrIF12 is also active against the X. albilineans type I-F1 system, we subjected AcrIF12 to our degradation assay in TXTL (Figures 3B and 5C). AcrIF12 yielded an inhibitory activity of ∼30%. Interestingly, the inhibitory activity of AcrIF12 and AcrIF12 Xal was maintained when decreasing the Acr plasmid concentration by a factor of four (Figure 5C), and decreasing the plasmid concentration of AcrIF12 Xal by 500-fold lowered the inhibitory activity only from ∼100% to 80% (Figure 5D). In contrast, the inhibitory activity of AcrIC11 dropped from 57% to 11% when using only half the amount of Acr plasmid (Figure 5D). Sustained inhibitory activity over a wide range of Acr concentrations has been associated with catalytic Acrs (80-82). We also tested AcrIF3, a separate Acr shown to specifically inhibit Cas3 from the I-F1 system in Pseudomonas aeruginosa, as a control (83, 84), although no inhibition was observed, possibly due to limited host range (Figure 5C). Overall, these results show that X. albilineans encodes two Acrs that can actively inhibit DNA degradation but not DNA binding by their respective CRISPR-Cas systems, likely explaining how this bacterium survives self-targeting by two distinct CRISPR-Cas systems.
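The titration ranges are easiest to compare as fold-dilutions. The sketch below assumes two-fold dilution steps between the endpoints given in the legend of Figure 5 (2^0 to 2^-9 nM for AcrIF12 Xal and 2^0 to 2^-2 nM for AcrIC11); the step size and the helper name are assumptions.

```python
def twofold_series(max_exponent: int, min_exponent: int) -> list[float]:
    """Concentrations in nM from 2**max_exponent down to 2**min_exponent."""
    return [2.0 ** e for e in range(max_exponent, min_exponent - 1, -1)]


acrif12_xal = twofold_series(0, -9)  # 1 nM down to 2**-9 nM
acric11 = twofold_series(0, -2)      # 1 nM down to 0.25 nM

# Number of concentrations and total fold-dilution across each series.
print(len(acrif12_xal), acrif12_xal[0] / acrif12_xal[-1])  # 10 512.0
print(len(acric11), acric11[0] / acric11[-1])              # 3 4.0
```

Under this assumption, the full AcrIF12 Xal series spans a 512-fold dilution, which corresponds to the roughly 500-fold decrease described in the text, while the AcrIC11 series spans only a four-fold dilution.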

Discussion
In this study, we identified two degradation-inhibiting Acrs endogenous to X. albilineans, which we named AcrIC11 and AcrIF12 Xal. By blocking DNA degradation by Cas3, both Acrs are expected to prevent lethal self-targeting by the two CRISPR-Cas systems in X. albilineans. The possibility also remains that additional Cascade-inhibiting Acrs are encoded in the genome of X. albilineans. AcrIC11 and AcrIF12 Xal add to a growing number of Acrs that inhibit Cas3 but not Cascade by two general mechanisms (25, 40, 44, 47, 83-89). AcrIF3 and AcrIE1 directly bind Cas3, while AcrIC3 is suggested to do the same (47, 83-85, 87). In contrast, AcrIE2 and AcrIF5 bind Cascade and likely block Cas3 recruitment while preserving DNA binding by Cascade (88, 89). The mechanisms employed by AcrIC1, AcrIF16 and AcrIF17 to block DNA degradation remain unknown.
A search for AcrIC11 homologs revealed a large set of proteins annotated as the anti-restriction protein KlcA. KlcA was previously identified as an inhibitor of the four main families of Type I restriction-modification systems in vivo (79). However, KlcA was unable to inhibit restriction by an archetypal Type I restriction endonuclease in vitro and did not resemble standard anti-restriction proteins functioning as DNA mimics (79), indicating that KlcA operates through a distinct mode of action. We further showed that AcrIC11 is predicted to fold into a structure strongly resembling that of KlcA, suggesting overlapping functions (Supplementary Figure S3). What remains to be explored is whether AcrIC11 can inhibit Type I restriction-modification systems and/or whether KlcA proteins can inhibit Type I-C CRISPR-Cas systems. If so, AcrIC11 would be added to a small but growing list of anti-defense proteins that inhibit multiple bacterial defenses. These include the T7 phage protein Ocr, which acts through DNA mimicry to inhibit restriction-modification systems and BREX systems (90, 91), as well as phage-encoded nucleotidases that sequester or degrade cyclic nucleotide signaling molecules to inhibit diverse bacterial defenses (92-95). In these examples, though, the anti-defense protein acts through an obvious shared mechanism. In contrast, the most obvious shared property of DNA degradation by the I-C CRISPR-Cas system and DNA restriction by Type I restriction-modification systems is DNA binding, although this is an unlikely mode of inhibition given that AcrIC11 inhibits DNA degradation but not binding and that KlcA fails to block DNA restriction in vitro. Thus, further exploring the mechanism of inhibition of AcrIC11 and the extent to which it can also inhibit restriction-modification systems could reveal new means by which individual proteins circumvent multiple defenses posed by a bacterial host.
Elucidating the exact mechanisms by which AcrIC11 and AcrIF12 Xal inhibit DNA degradation could reveal new mechanisms of action. In particular, the inhibitory mechanism of AcrIF12 Xal and its homolog AcrIF12 likely differs from already known type I degradation-inhibiting mechanisms based on two observations. First, AcrIF12 did not co-elute with Cascade or Cas3 in vitro in a previous study (80), arguing against stable direct binding to either. Second, we showed that AcrIF12 Xal maintained its inhibitory activity even when its expression plasmid was diluted by 500-fold (Figure 5D), suggesting that AcrIF12 Xal and AcrIF12 could function as multi-turnover proteins. Elucidating the inhibitory mechanism of AcrIF12 Xal and AcrIF12 therefore could reveal unique means by which Acrs inhibit Cas3-mediated DNA degradation.
Inhibition of DNA degradation by AcrIC11 and AcrIF12 Xal still allows for DNA binding and bears the potential to transform each respective CRISPR-Cas system into a gene regulator. By silencing deGFP expression, we demonstrated that Cascade-mediated gene repression is possible even when AcrIC11 and AcrIF12 Xal are present (Figure 2B). Gene regulation by self-targeting spacers can be beneficial, as was shown previously in Francisella novicida, which utilizes scaRNAs (small CRISPR/Cas-associated RNAs) to facilitate immune escape during host invasion (29), as well as in Haloarcula hispanica, which utilizes a separately encoded crRNA to repress expression of a toxin (26). Interestingly, of the six most highly expressed self-targeting crRNAs (array 2: spacer 1; array 4: spacer 1, spacers 28-30; array 6: spacer 4), five are complementary to regions within the first predicted prophage (Figures 1C and 3A, Supplementary Table S4). Furthermore, only one of the spacers, array 4: spacer 29, would not be expected to yield target DNA binding, as the target region possesses 9 mismatches and a PAM (GGG) not recognized by the I-C system (32). When we explored the potential for gene repression using the RNA-seq dataset, target genes within the main prophage regions were expressed at significantly lower levels than the other genes in these regions (respective average TPM = 23 and 268, P = 0.000197) (Supplementary Tables S4-S6). However, more exploration is needed to elucidate the extent to which the self-targets are silenced by either CRISPR-Cas system.
The genomic locations of AcrIC11 and AcrIF12 Xal provide hints about the history of X. albilineans. AcrIF12 Xal is encoded in the first predicted prophage that also harbors many self-targets (16 in total); this would suggest that AcrIF12 Xal facilitated prophage integration by hindering DNA degradation by the type I-F1 system. In contrast, AcrIC11 is encoded on plasmid II that does not harbor any self-targets. We suspect that plasmid II was present in X. albilineans before integration of the AcrIF12 Xal-bearing prophage, as the prophage contains multiple targets of the I-C system that would be blocked by the action of AcrIC11. Self-targeting spacers could also be acquired after prophage integration, although this seems unlikely given that many of the self-targeting spacers are located at the older end of their respective CRISPR arrays (96). Overall, elucidating the order of events could shed light on how prokaryotes come to possess self-targeting spacers and the impact on the evolutionary trajectory of each microorganism.

Figure 1. RNA-Seq analysis reveals transcription of cas genes and crRNA biogenesis for the two CRISPR-Cas systems in Xanthomonas albilineans. (A) Overview of the type I-C and type I-F1 CRISPR-Cas systems endogenous to X. albilineans. cas genes associated with the I-C system and the I-F1 system are shown in different shades of blue and pink, respectively. Spacers complementary to a region in the chromosome of X. albilineans or one of its plasmids are shown in yellow, and spacers without complementarity are depicted in black. (B) Mapped reads of the type I-C and I-F1 cas genes following total RNA-Seq. (C) Mapped reads of the mature crRNAs following small RNA-Seq of shorter-length RNAs. As the coverage is highly variable, below each coverage plot we show separate coverage plots for the regions with lower coverage. Expected processing patterns are indicated with dashed lines.

Figure 2. The type I-C and I-F1 CRISPR-Cas systems from X. albilineans bind and degrade target DNA in E. coli. (A) Overview of the DNA binding assay in E. coli. Cascade (orange) is guided by its crRNA to the target region (blue) on the deGFP-reporter plasmid complementary to the spacer (blue). Cascade binding to its target covering the promoter of degfp inhibits deGFP expression that can be measured by flow cytometry. The experimental setup lacking a CRISPR array (no array) serves as a negative control. (B) DNA binding by Cascade from both X. albilineans systems in E. coli. Without Cas3, Cascade can only bind target DNA, repressing deGFP expression but without target cleavage. (C) Overview of the DNA degradation assay in E. coli. Cascade (orange) is guided by its crRNA to the target region (blue) within the promoter or the coding region of lacZ on the E. coli chromosome (target locations are shown in D) and recruits Cas3 (red). CRISPR-Cas interference causes DNA degradation which reduces the colony count on agar plates (gray). The experimental setups lacking a CRISPR array (no array) or lacking Cas3 serve as negative controls. (D) DNA degradation by Cascade and Cas3 from both X. albilineans systems in E. coli. Fold-reduction in B and D is calculated based on a no-array control that is missing a spacer complementary to the E. coli genome or the reporter plasmid. The no-array control is the reference for statistical analyses. Bars indicate the mean of triplicate independent experiments. *** P < 0.001. ** P < 0.01. * P < 0.05. ns: P > 0.05.

Figure 3. Putative Acrs inhibit DNA binding or DNA degradation via either X. albilineans CRISPR-Cas system in TXTL. (A) Overview of the genomic organization of CRISPR-Cas systems, putative Acrs and predicted prophage regions in X. albilineans. The numbering of the arrays corresponds to that in Figure 1. Placement of the arrays, Acr candidates and self-targets indicates whether they are encoded on the top or bottom strand of the chromosome or plasmid. Prophage regions are predicted with VirSorter v1.0.3 (73), Prophage Hunter (74) and PHASTER (75, 76). Amino-acid sequences of all Acr candidates and their genomic location in X. albilineans can be found in Supplementary Table S3. (B) Overview of testing Acr candidates for their binding and degradation inhibition in TXTL. On the left side, inhibition of binding activity is tested. Inhibition of Cascade-mediated transcriptional repression of deGFP expression indicates a functional Acr. On the right side, inhibition of degradation activity is assessed. Inhibition of DNA degradation by Cas3 recruited by Cascade indicates a functional Acr. Inhibition of DNA degradation while allowing Cascade-mediated DNA binding classifies an Acr as a degradation-inhibiting Acr. (C) Inhibitory activity of putative Acrs in TXTL. Inhibitory activity of Acr candidates was tested in triplicate and the mean inhibitory activity is depicted.

Figure 4. Acr_1 and Acr_3 inhibit DNA degradation but not DNA binding via either X. albilineans CRISPR-Cas system in E. coli. (A) Overview of testing Acrs for inhibition of transcriptional repression by Cascade in E. coli. deGFP expression is restored when an Acr actively inhibits at or upstream of Cascade binding to its DNA target. See Figure 2A for more details. (B) Inhibitory activity of putative Acrs on Cascade binding in E. coli. deGFP repression was measured with flow cytometry. Bars represent the average of three biological replicates. (C) Overview of testing Acrs for inhibition of DNA degradation in E. coli. Acrs actively inhibiting any step upstream of and including Cas3-mediated DNA degradation restore transformation efficiency. See Figure 2C for more details. Type I-C spacer 2 and type I-F1 spacer 3 were used here. (D) Inhibitory activity of putative Acrs on Cas3-mediated DNA degradation in E. coli. Fold-reduction in B and D is calculated based on a no-array control that is missing a spacer complementary to the E. coli genome or the reporter plasmid. The no-Acr control is the reference for statistical analyses. Bars in B indicate the mean of biological triplicates, while bars in D indicate the mean of biological triplicates carried out with technical triplicates. Data points in B represent biologically independent experiments and data points in D represent the mean of technical triplicates of a biologically independent sample. *** P < 0.001. ** P < 0.01. * P < 0.05. ns: P > 0.05.

Figure 5. Acr_1 (AcrIF12 Xal) and Acr_3 (AcrIC11) are homologous to AcrIF12 and the anti-restriction protein KlcA. (A) Phylogenetic distribution of identified homologs of the Acr candidate Acr_3. Tree branches show close and distant homologs of the Acr candidate. (B) Sequence alignment of Acr_1 and AcrIF12. Amino-acid sequences of Acr_1 and AcrIF12 were aligned with Clustal-Omega 1.2.4 (55). (C) Inhibitory activity of AcrIF12 Xal, AcrIF12 and AcrIF3 on Cas3-mediated DNA degradation in TXTL. Degradation assays were conducted with Acr plasmid concentrations ranging from 4 nM to 1 nM. For more details see Figure 3B. (D) Inhibitory activity of AcrIC11 and AcrIF12 Xal on Cas3-mediated DNA degradation in TXTL with different Acr concentrations. Degradation assays were conducted with Acr plasmid concentrations ranging from 2^0 to 2^-2 nM or 2^0 to 2^-9 nM for AcrIC11 or AcrIF12 Xal, respectively. Bars in C and D indicate the mean inhibitory activity of triplicate or duplicate independent experiments.