Depositing centromere repeats induces heritable intragenic heterochromatin establishment and spreading in Arabidopsis

Abstract Stable transmission of non-DNA-sequence-based epigenetic information contributes to heritable phenotypic variants and thus to biological diversity. While studies on spontaneous natural epigenome variants have revealed an association of epialleles with a wide range of biological traits in both plants and animals, the function, transmission mechanism, and stability of an epiallele over generations in a locus-specific manner remain poorly investigated. Here, we developed a DNA sequence deposition strategy to generate a locus-specific epiallele by depositing CEN180 satellite repeats into a euchromatic target locus in Arabidopsis. Using a CRISPR/Cas9-mediated knock-in system, we demonstrated that depositing CEN180 repeats can induce heterochromatin nucleation accompanied by DNA methylation, H3K9me2, and changes in nucleosome occupancy at the insertion sites. Interestingly, both DNA methylation and H3K9me2 are restricted to the deposition sites, and depletion of the H3K9me2 demethylase IBM1 enables outward heterochromatin propagation into the neighboring regions, leading to heritable target gene silencing that persists for at least five generations. Together, these results demonstrate the promise of employing a cis-engineering system for the creation of stable and site-specific epialleles and provide important insights into functional epigenome studies and locus-specific transgenerational epigenetic inheritance.


INTRODUCTION
Genetic mutations have long been thought to be the driver of biological complexity and diversity. Emerging evidence indicates that biological complexity continues to evolve even in the absence of genetic variations (1, 2). This is largely attributed to epigenetic modifications, which do not affect primary DNA sequences and yet govern the expression of genes and play critical roles in diverse aspects of biological processes, ranging from genome stability to developmental and environmental responses (3-5). Epigenetic variants, referred to as epialleles, are associated with a wide range of biological traits such as behavioral adaptation in birds (6), metastable epialleles in mice (7, 8), sex determination in fish (9), flowering regulation in plants (10, 11), caffeine- and oxidative stress-induced epimutations in fission yeast (12), and eye color in Drosophila (13).
The transgenerational epigenetic inheritance (TEI) of biological traits in epialleles has been well documented in plants and also in some animals (14-16). As with genetic mutations, TEI provides an important path for organisms to develop adaptive responses to environmental stresses and thus has potential implications for breeding and evolution (17). With few exceptions, most well-known epialleles are spontaneous natural epimutations often, if not always, associated with transposable elements and repetitive DNA sequences (7, 17, 18). The concept that tandem repeat insertion can induce heterochromatin formation and transcriptional silencing has been well documented in Drosophila and other systems (19-21). The mechanisms by which repetitive DNA elements are targeted for DNA methylation have also been well characterized using transgene systems in Arabidopsis (22, 23). However, these studies mainly focus on the insertion sites. It is largely unknown whether the repeat insertion-induced heterochromatic marks can spread and silence the adjacent regions. The spreading mechanism, transmission tempo, and stability over generations also remain to be explored.
The silencing and spreading mechanisms of transposons and other repeats with DNA methylation in promoter and intergenic regions have been well studied (24). Gene promoter methylation represses transcription by directly regulating the binding of transcription factors and/or by indirectly associating with changes in histone modifications (23). In Arabidopsis, approximately 5% of genes contain promoter DNA methylation, whereas the bodies of over one-third of genes are methylated with unclear function (23, 25). Gene body methylation (gbM), consisting mostly of CG methylation, is often found in constitutively transcribed genes (26). Another type of intragenic methylation, with both CG and non-CG methylation in active genes, is relatively less studied. While intragenic DNA methylation has been commonly found in diverse plants (27), the function of intragenic DNA methylation in transcriptional regulation and how intragenic heterochromatin is established, maintained, inherited, and erased over time remain largely unknown.
The centromere not only functions to ensure chromosome segregation during cell division but also plays important roles in genome architecture and chromatin regulation (28). In Arabidopsis, the centromere is largely composed of CEN180 repeats (29, 30) and is highly organized through epigenetic silencing marks such as DNA methylation and histone H3K9me2 (31). Centromeric CEN180 repeats, first found as highly repetitive DNA sequences in the Arabidopsis genome (32), appear to be the main components of the heterochromatin in the centromere (33). Recent studies showed that plant centromere satellite repeats undergo dynamic amplification, expansion, inversion, and adaptation during evolution (34, 35). Although they are the most abundant DNA repeats in the genome, it remains largely unknown whether depositing centromere satellite repeats into active euchromatic regions can induce chromatin state transition and heterochromatin formation.
Recently developed epigenome editing technology offers promising tools to engineer DNA methylation states in a locus-specific manner (36). Currently, the fusion of epigenetic modifiers with editing machinery such as zinc finger nucleases, transcription activator-like effector nucleases, or deactivated Cas9 has been developed to create an epiallele at a specific site in plants (37-41). However, these methods rely heavily on the enzyme binding efficiency and catalytic activities in the targeted chromatin environment and thus have several limitations such as low efficiency, low specificity, and unstable transmission across generations (38-41). The development of powerful tools to engineer stable epialleles site-specifically by introducing functional DNA elements will be highly valuable.
In this study, we developed a cis-regulatory sequence-directed, locus-specific epigenome editing strategy with CRISPR/Cas9-mediated deposition of repeat sequences in Arabidopsis thaliana. We inserted centromeric 180-bp repeats (CEN180) into the first intron of the euchromatic ABI5 locus using a CRISPR/Cas9-mediated knock-in system and demonstrated that the deposition of two copies of CEN180 repeats is necessary and sufficient to induce heterochromatin nucleation at the insertion site. DNA methylation and H3K9me2 were positively correlated with the inserted repeat number and were restricted to the insertion sites by the H3K9me2 demethylase IBM1. Knocking out IBM1 can induce outward heterochromatin spreading, leading to ABI5 silencing that is stable and transgenerationally inherited for at least five generations. The degree of methylation varies dramatically among individual F5 ABI5 epialleles, causing a wide variation in ABA sensitivity. Collectively, these results demonstrate the promise of employing a cis-engineering system for the creation of stable epialleles in a locus-specific manner. Our findings further dissect the mechanism of CEN180 insertion-induced heterochromatin nucleation and spreading in the intragenic region.

MATERIALS AND METHODS
Seeds were stratified for 2 days at 4 °C on 1/2 MS plates containing 1% sucrose and 0.8% agar and then transferred to long-day light cycles (16 hours light / 8 hours dark) at 22 °C for germination. After ∼7-10 days of growth on plates, seedlings were collected for experiments or transferred to soil and grown at 22 °C under long-day conditions. For germination assays and scoring ABA sensitivity, 20-100 seeds were plated on 1/2 MS plates with 1 μM ABA.

Plasmid construction
The CAS9 vector backbone driven by the EC1.2 enhancer and EC1.1 promoter (42) is from Addgene (#71288). The sequences of the Pol III promoters U6-26p and U6-29p and the sgRNA scaffolds were replaced by the sequence AAGCTTGGTACCGGGCCCCCATGG. The AtU6 promoter-driven gRNA and the donor sequence, consisting of the 804-bp left homology arm of ABI5, CEN180, and the 942-bp right homology arm of ABI5, were constructed in pRI909 (Clontech #3260) as the donor construct. Various copy numbers of CEN180 repeats were generated by PCR amplification from Arabidopsis Col-0 genomic DNA. The primers used in plasmid construction are shown in Supplementary Table 2.

CRISPR/CAS9-mediated knock-in by sequential plant transformation
The fragment knock-in process was performed as previously described (43). Arabidopsis Col-0 plants were transformed with plasmids containing the CAS9 enzyme driven by an egg cell-specific enhancer and promoter via Agrobacterium-mediated floral dipping. After hygromycin selection of T1 plants, immunoblots were used to select a homozygous T2 line with stable expression of CAS9 as the parental line. The donor construct, carrying an sgRNA targeting ABI5 intron 1, CEN180 repeats, and homology arms, was then transformed into this CAS9-containing parental line.
Positive T1 lines were selected on MS plates containing 100 mg/l kanamycin for 7-12 days and transplanted into soil. PCR-based methods were used to identify positive knock-in plants. The siliques of kanamycin-resistant T1 plants (one silique per T1 plant) were pooled together for genotyping. Positive pools were then separated to genotype individual positive T1 plants. The homozygous knock-in plants were back-crossed with wild-type Col-0 to remove the donor and CAS9 transgenes.

Site-specific DNA methylation analysis
Genomic DNA was isolated from one-week-old plants using CTAB buffer (100 mM Tris-HCl pH 8.0, 20 mM EDTA, 1.4 M NaCl, 2% CTAB, 1% PVP). For chop-PCR, 200 ng of genomic DNA was digested with McrBC enzyme (New England Biolabs, M0272L) at 37 °C for 2 h, followed by heat inactivation of the enzyme at 65 °C for 20 min. Both digested and undigested DNA were amplified with locus-specific primers.

Quantitative RT-PCR analysis
Total RNAs were isolated from whole 7-day-old seedlings with or without ABA treatment using TRIzol reagent (Invitrogen, #15596026). For mRNA expression analysis, after RNase-free DNase I (NEB, M0303S) treatment, 300 ng of total RNA was used for reverse transcription with the Oligo-dT18VN primer and ProtoScript II reverse transcriptase (NEB, #M0368L) according to the manufacturer's instructions. Quantitative PCR was performed using the CFX96 Real-Time System (Bio-Rad) and SYBR Green Master Mix (Bio-Rad). At least two biological replicates were used for each sample. Gene transcript levels were normalized against wild-type Col-0 and the internal control gene UBQ10. The primers for RT-qPCR are listed in Supplementary Table 2.
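The normalization described above (against the internal control UBQ10 and against wild-type Col-0) can be illustrated with a minimal sketch. It assumes the widely used 2^-ΔΔCt calculation, which the text does not explicitly name; the function name and the Ct values are hypothetical and serve only as an illustration.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target gene versus a control sample (2^-ddCt).

    Assumption: roughly 100% amplification efficiency for both primer pairs.
    ct_target / ct_reference: Ct of the target gene (e.g. ABI5) and of the
    internal control (e.g. UBQ10) in the sample of interest.
    ct_target_ctrl / ct_reference_ctrl: the same two Ct values in the
    control sample (e.g. wild-type Col-0).
    """
    d_ct_sample = ct_target - ct_reference             # normalize to internal control
    d_ct_control = ct_target_ctrl - ct_reference_ctrl  # same for the control sample
    dd_ct = d_ct_sample - d_ct_control                 # normalize to the control sample
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles later in the
# sample than in the control, while the internal control is unchanged.
fold_change = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold_change)
```

With these illustrative numbers the target transcript is at one quarter of the control level; a sample identical to the control returns 1.0.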

Small RNA-seq library preparation, sequencing, and analysis
Total RNA was extracted from 7-day-old seedlings using TRIzol reagent (Invitrogen, #15596026) and dissolved in DEPC-treated H2O. Total RNAs were used for library preparation with the RealSeq Biosciences RealSeq®-AC Kit (500-00012). The final library products were further purified using 6% polyacrylamide gels (Novex™ TBE Gels, EC6265BOX). The 145-160 nt products were excised from the gel for sequencing (single-end 50 bp) on a NextSeq 2000 machine (Illumina). The small RNA sequencing reads were trimmed using Trimmomatic (v.0.39) and then mapped to a pseudo-genome sequence consisting of TAIR10 with the 2 × 180 or 5 × 180 insertion, and small RNAs were called using ShortStack version 3.8.5 (44) with the parameter setting "-mincov 1rpm -pad 75 -mismatches 0 -nohp".
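The pseudo-genome used for mapping can be pictured as the reference chromosome with the repeat array spliced in at the knock-in site. The sketch below is an illustration under stated assumptions, not the authors' script: the helper names, the toy sequences, and the insertion coordinate are all hypothetical, and a real workflow would read and write FASTA files.

```python
def insert_sequence(chrom_seq, position, insert):
    """Return chrom_seq with `insert` spliced in after the first `position` bases."""
    if not 0 <= position <= len(chrom_seq):
        raise ValueError("insertion position lies outside the chromosome")
    return chrom_seq[:position] + insert + chrom_seq[position:]


def shift_coordinate(coord, position, insert_len):
    """Map a 0-based coordinate on the original sequence onto the pseudo-genome."""
    return coord + insert_len if coord >= position else coord


# Toy example: a 4-bp unit stands in for the 180-bp CEN180 monomer.
repeat_unit = "ACGT"
copies = 2                      # e.g. the 2 x 180 allele
chromosome = "TTTTGGGG"         # stand-in for the reference chromosome
cut_site = 4                    # hypothetical knock-in position

pseudo = insert_sequence(chromosome, cut_site, repeat_unit * copies)
print(pseudo)
```

Coordinates downstream of the insertion shift by the insert length, so annotations for the flanking gene need the same adjustment (here via `shift_coordinate`) before they are compared with mapped reads.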

Whole-genome bisulfite sequencing library construction and analysis
Genomic DNA was extracted from 1-week-old seedlings with CTAB buffer (100 mM Tris-HCl pH 8.0, 20 mM EDTA, 1.4 M NaCl, 2% CTAB, 1% PVP) and fragmented to an average size of ∼200-400 bp with a Covaris S220 sonicator. Sequencing reads were trimmed using Trimmomatic (v.0.39) (45) and mapped to a pseudo-genome sequence consisting of TAIR10 with the 2 × 180 or 5 × 180 insertion using BSMAP (v.2.90, parameters: -q 20 -v 5 -w 10 -n 1) (46). The software samtools (v.1.9) (47) was then used to remove duplicate reads and keep uniquely mapped reads. Methratio.py in BSMAP was used to quantify the DNA methylation of cytosines. Only cytosines covered by more than 4 reads were kept for further analysis. Both the MethylKit package (48) and Fisher's exact test were used to call DMRs, and the DMRs identified by both methods were used for subsequent analysis. DeepTools (v.3.3.1) (49) was used to generate data for metaplots. The snapshots of track data were made with the IGV (2.8.2) browser (50). Whole-genome bisulfite sequencing data of Col-0 used for snapshots were downloaded from NCBI GEO under accession number GSM4955650 (51).
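Two steps of this analysis lend themselves to a short illustration: the coverage filter (cytosines covered by more than 4 reads) and the per-region Fisher's exact test. The sketch below uses only the Python standard library. The read-weighted region summary, the function names, and the example counts are assumptions made for illustration; the actual pipeline relies on methratio.py and MethylKit.

```python
from math import comb

MIN_COVERAGE = 5  # "covered by more than 4 reads"


def methylation_level(sites, min_coverage=MIN_COVERAGE):
    """Read-weighted methylation level over (methylated_reads, total_reads) pairs.

    Sites below the coverage cutoff are discarded; returns None if nothing is left.
    """
    kept = [(m, n) for m, n in sites if n >= min_coverage]
    total = sum(n for _, n in kept)
    if total == 0:
        return None
    return sum(m for m, _ in kept) / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Rows: two samples; columns: methylated and unmethylated read counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated reads in the first sample
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical cytosines in one region: (methylated reads, total reads)
region = [(4, 5), (0, 3), (6, 10)]
print(methylation_level(region))           # the 3-read site is filtered out
print(fisher_exact_two_sided(3, 0, 0, 3))  # fully vs. not methylated, 3 reads each
```

Requiring agreement between two DMR callers, as described above, trades some sensitivity for fewer false positives; the exact test shown here is only the simpler of the two criteria.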

Micrococcal nuclease digestion assay
Nuclei purification and cross-linking followed the same protocol as the ChIP method. The purified nuclei were resuspended in 0.5 ml of MNase buffer and divided into two separate 1.5-ml Eppendorf tubes (250 μl per aliquot). The aliquoted nuclei were digested at 37 °C for 10 min using 1.5 μl of MNase enzyme (N3755-200UN, Sigma). MNase digestion was stopped by adding 2.5 μl of EDTA (0.5 M) and 2.5 μl of EGTA (0.5 M). After reverse cross-linking and proteinase K / RNase treatment, DNA was purified with CTAB buffer.

Quantification, statistical analysis and reproducibility
Statistical analyses were carried out using Excel and GraphPad Prism 8. Data are presented as mean ± s.d. as indicated. All statistical tests used were two-sided. For the immunoblots and micrographs, at least two independent experiments were repeated with similar results.

RESULTS
Depositing CEN180 repeats induces heterochromatin nucleation at ABI5 loci
CEN180 satellite repeats, tandemly arrayed with each unit 180 bp in length (29, 30), are the key functional components of the Arabidopsis centromere (33). We chose CEN180 repeats as our inserted sequences because they are well organized and highly enriched with silent epigenetic marks such as DNA methylation and H3K9me2 (31). We inserted CEN180 repeats amplified from Arabidopsis genomic DNA into the euchromatic ABA-insensitive 5 (ABI5) gene using the CRISPR-Cas9 knock-in system (detailed in Materials and Methods, Supplementary Figure 1A-E). ABI5 was chosen because of its important function in the ABA signaling pathway (53, 54) and the easy visualization and detection of the ABA-insensitive phenotype caused by abi5 mutation (55). To determine the minimum repeat unit that confers function, we generated ABI5 knock-in plants carrying 1, 2, 5 and 13 copies of CEN180 repeats (referred to as ABI5 1x180, ABI5 2x180, ABI5 5x180, and ABI5 13x180) (Figure 1A, Supplementary Figure 1C). Sequence alignment showed that the CEN180 repeat sequences in ABI5 knock-in plants are shared by and common to all five Arabidopsis chromosomes (Supplementary Data 1, Supplementary Table 1). Sequencing of knock-in plants also confirmed that the insertion points for ABI5 1x180, ABI5 2x180, ABI5 5x180, and ABI5 13x180 are the same as the designed CRISPR target (Supplementary Figure 1D). Site-specific PCR and Sanger sequencing were used to verify the CEN180 copy numbers and sequences in knock-in plants (Supplementary Figure 1C, D). We observed homology-directed repair events at the ABI5 target with efficiencies ranging from 0.25% to 1.16% in T1 knock-in plants (Supplementary Table 3).
Further investigation of the T3 homozygous lines revealed that all CEN180 knock-in plants exhibited an ABA-insensitive phenotype (Supplementary Figure 2A). Consistently, we noted that DNA in all sequence contexts was methylated and small RNAs accumulated at and around the CEN180 deposition regions in ABI5 2x180 and ABI5 5x180 (Supplementary Figure 2E-J). As a control, we generated an ABI5 knock-in plant carrying a ∼1-kb non-repeat scrambled sequence and noted only subtle DNA methylation downstream of the insertion site (Supplementary Figure 2F). Global DNA methylation in ABI5 2x180 and ABI5 5x180 plants is similar to that of the non-repeat control (Supplementary Figure 2K, Supplementary Table 4), suggesting the specificity of DNA methylation establishment at the ABI5 locus. To exclude the possible effect of donor DNA, we back-crossed the T3 knock-in plants with wild type to remove the CAS9 and donor DNA (Supplementary Figure 1E). Surprisingly, all knock-in plants without donor DNA were sensitive to ABA (Figure 1B), although a slight decrease in ABI5 transcript and protein levels was noted in ABI5 2x180, ABI5 5x180, and ABI5 13x180 plants (Figure 1C-E). These results indicate that the donor DNA induced transcriptional silencing of ABI5. To rule out any possible effect of donor DNA, we backcrossed ABI5 nx180-Donor plants with Col-0 and obtained ABI5 nx180 knock-in plants without donor DNA, which were used for all subsequent studies (Supplementary Figure 1E).
Next, we performed an McrBC-based methylation assay and found increased DNA methylation within the CEN180 insertion sites in ABI5 1x180, ABI5 2x180, ABI5 5x180, and ABI5 13x180 knock-in plants, and the increase was repeat dosage-dependent (Figure 1F). Bisulfite sequencing results further confirmed that DNA methylation, particularly CHG and CHH methylation, was mostly restricted to the deposited CEN180 sequences in ABI5 2x180 and ABI5 5x180, whereas CG methylation spread along the intragenic region (Figure 1G). Consistently, the small RNAs were also mainly located within the deposited CEN180 (Figure 1G, Supplementary Table 4), suggesting that small RNAs were involved in the establishment of intragenic DNA methylation. Similarly, we observed a significant, repeat dosage-dependent enrichment of H3K9me2 at the CEN180 insertion sites in ABI5 2x180, ABI5 5x180 and ABI5 13x180 (Figure 1H). We further performed the MNase assay to determine chromatin accessibility and found that depositing more than two copies of CEN180 repeats can induce chromatin condensation at the regions adjacent to the insertion (i.e. promoter regions, Figure 1I). This is consistent with the decreased ABI5 transcript and protein levels in ABI5 2x180, ABI5 5x180, and ABI5 13x180 plants (Figure 1D, E).
In Arabidopsis, the histone H3 variant H3.1 is associated with heterochromatic regions of the genome, whereas H3.3 is associated with transcriptionally active regions (56) and CENH3 is mainly co-located with CEN180 repeats in the centromere region (57). The original ABI5 gene region is occupied by H3.3 (56). To investigate whether depositing CEN180 at the ABI5 locus can switch the histone H3 variant loading, we performed H3.1/H3.2 and CENH3 ChIP-qPCR assays and found no enrichment of H3.1/H3.2 or CENH3 at the deposited CEN180 regions at the ABI5 loci (Supplementary Figure 3A, B). This result suggests that 13 copies of CEN180 repeats are insufficient to recruit CENH3 into the euchromatin, consistent with previous research on mini-chromosomes and ring chromosomes showing that large CEN180-repeat clusters (approximately 500 kb or longer) are needed for normal centromere function in Arabidopsis (58, 59).
Together, these results demonstrated that the deposition of CEN180 repeats can induce heterochromatin nucleation (i.e. DNA methylation and H3K9me2) at the ABI5 insertion region and that the efficiency is repeat dosage-dependent. Small RNA-directed DNA methylation at the CEN180 insertion sites had a subtle effect on ABI5 transcription. Deposition of 13 or fewer CEN180 repeats is insufficient to incorporate CENH3 at the euchromatic loci.

CEN180 insertion-induced heterochromatin nucleation requires both CG and non-CG methylation
We next investigated the factors involved in the heterochromatin nucleation of CEN180 insertion sites by crossing ABI5 2x180 knock-in plants with various DNA methylation mutants. We found a complete loss of CHG and CHH methylation at the CEN180 insertion site in drm1drm2cmt2cmt3 (ddcc, a quadruple knockout of all four non-CG methyltransferases) (60) and a strong decrease of CG methylation accompanied by a moderate CHG/CHH reduction in the met1 mutant (CG methyltransferase) (Figure 2A-D, Supplementary Figure 4A). Consistently, we noted that CHG and CHH methylation were greatly reduced in two small RNA biogenesis-deficient mutants, nrpd1 and rdr2 (Figure 2B, Supplementary Figure 4B). Small RNA sequencing data further confirmed that RNA polymerase IV is responsible for small RNA biogenesis at the deposited CEN180 site (Supplementary Figure 4C). Interestingly, we found that 24-nt or >24-nt species are the main small RNAs that function in the establishment of intragenic DNA methylation (Supplementary Figure 4D, E). Examination of histone marks at the CEN180 insertion site showed nearly no enrichment of H3K9me2 in met1 and nrpd1 mutants (Figure 2E) despite an appreciable amount of CHG methylation (Figure 2A-C, Supplementary Figure 4A). This is distinct from the previously established H3K9me2-CHG methylation feedback loop (61), suggesting that CG methylation and small RNAs may also be involved in the establishment of H3K9me2 at the CEN180 deposition site. We then examined nucleosome positioning and found that the CEN180 insertion-induced nucleosome occupancy was significantly reduced in met1 and ddcc mutants at both the insertion and adjacent regions (Figure 2F).
Interestingly, the mutation in the chromatin remodeler DDM1, which caused a significant decrease of DNA methylation and H3K9me2 at the CEN180 insertion site (Figure 2A, B and E, Supplementary Figure 4B), only showed nucleosome impairment at the adjacent regions (Figure 2F). This is consistent with a recent report that CG and CHG methylation have a larger impact on chromatin accessibility than small RNA-mediated CHH methylation (62), whereas DDM1 remodels nucleosomes inefficiently at euchromatic loci. Despite the loss of DNA methylation and H3K9me2, we surprisingly found no significant change in ABI5 transcript levels in met1, ddcc, and nrpd1 mutants (Figure 2G), indicating that DNA methylation and H3K9me2 over CEN180 repeats alone are insufficient to regulate ABI5 transcription in ABI5 2x180 plants.
Together, these results demonstrated that CEN180 insertion-induced DNA methylation, H3K9me2, and changes in nucleosome occupancy depend on both CG and non-CG DNA methyltransferases.

H3K9 demethylase IBM1 blocks heterochromatin spreading at CEN180 insertion sites
Heterochromatin can spread along chromosomes from the nucleation site in a DNA sequence-independent manner (63). In the ABI5 2x180 plants, both DNA methylation and H3K9me2 were restricted to the CEN180 insertion sites and unable to spread to the adjacent regions to fully silence ABI5 (Figure 1D-G), suggesting that certain factors may block heterochromatin propagation at the ABI5 loci. The yeast JmjC domain protein Epe1 is a putative histone H3K9me demethylase and is required for centromeric heterochromatin integrity by preventing heterochromatin spreading at sites lacking known boundary elements (64, 65). In Arabidopsis, IBM1 (Increase in BONSAI methylation) is an H3K9me2 demethylase targeting ectopic H3K9me2 regions in the genome, and its loss of function induces gene-body DNA hypermethylation and severe developmental defects (66, 67).
To test the function of IBM1, we crossed the ibm1 mutant with CEN180 knock-in plants. While ABI5 1x180 ibm1 showed an ABA-sensitive phenotype similar to wild-type Col-0, ABI5 2x180, ABI5 5x180 and ABI5 13x180 plants with the ibm1 mutation exhibited ABA insensitivity (Figure 3A). We also found notably decreased ABI5 transcript and protein levels in ABI5 2x180 ibm1, ABI5 5x180 ibm1 and ABI5 13x180 ibm1, but not in ABI5 1x180 ibm1 plants (Figure 3B, C). This is consistent with the observation that a single copy of the CEN180 repeat was insufficient to induce a high level of DNA methylation and H3K9me2 at the insertion sites (Figure 1F-H), suggesting that heterochromatin spreading depends on preexisting nucleated chromatin. We next examined DNA methylation and found high DNA methylation levels at the adjacent regions in ABI5 2x180 ibm1, ABI5 5x180 ibm1, and ABI5 13x180 ibm1, but not in ABI5 1x180 ibm1 plants (Figure 3D).
To further explore whether small RNAs are involved in the ibm1-induced heterochromatin spreading, we performed small RNA sequencing in ABI5 2x180 ibm1 plants and found that small RNAs propagated into the adjacent regions, as did DNA methylation (Supplementary Figure 5A). Intriguingly, while the left border and the CEN180 insertion site are mainly enriched with 24-nt or >24-nt small RNAs, the right border accumulates 22-nt, 23-nt, 24-nt and >24-nt small RNAs (Supplementary Figure 5B-E). This observation indicates that there may be unexplored small RNA biogenesis mechanisms in intragenic heterochromatin spreading.
To identify the DNA methyltransferase(s) responsible for the methylation spreading, we introduced various DNA methyltransferase mutants into the ABI5 2x180 ibm1 background and found greatly reduced DNA methylation and H3K9me2 levels coupled with ABI5 transcriptional restoration in ABI5 2x180 ibm1ddcc (Figure 3E-H). As a further confirmation, our ABA phenotypic analysis showed that only ∼34% of ABI5 2x180 ibm1ddcc plants exhibited ABA insensitivity, significantly lower than the ∼98% in ABI5 2x180 ibm1 plants (Figure 3I, J), suggesting that non-CG methylation is involved in heterochromatin spreading in the ABI5 intragenic region.
Altogether, these results demonstrated that IBM1 blocks the spreading of CEN180 insertion-induced DNA methylation and H3K9me2 at ABI5 loci.

Heterochromatin spreading is transgenerationally inherited at CEN180 deposition sites
In Arabidopsis, transgenerational inheritance has been documented for several TEs and their neighboring genes, such as BONSAI (68) and FWA (69). Given that depletion of IBM1 can induce the spreading of DNA methylation and H3K9me2 at ABI5 (Figure 3), we examined whether these heterochromatic marks can be stably inherited upon the reintroduction of IBM1. We crossed ABI5 2x180 ibm1 with wild-type Col-0 and found that the majority of F1 progenies were insensitive to ABA treatment, accompanied by partially restored ABI5 transcript and DNA methylation levels compared with ABI5 2x180 ibm1 (Figure 4A-E), suggesting that ibm1-induced heterochromatin spreading can be inherited. The inheritance is biparental because F1 progenies from the reciprocal cross between ABI5 2x180 ibm1 and Col-0 showed a similar phenotype (Figure 4A-E). Interestingly, 78.8%, 78.3%, and 82.5% of the F2 progeny of the ABI5 2x180 ibm1 x Col-0, ABI5 5x180 ibm1 x Col-0, and ABI5 13x180 ibm1 x Col-0 crosses, respectively, exhibited the ABI5 silencing-related ABA-insensitive phenotype (Figure 4F, G), suggesting that CEN180 insertion-induced ABI5 silencing in the ibm1 mutant likely follows non-Mendelian inheritance.
Next, we focused on the F3 progenies containing homozygous ABI5 2x180 and wild-type IBM1 (ABI5 2x180 IBM1) (Figure 5A). We found that 4 out of 9 lines (ABI5 2x180 IBM1 #1-F3, #5-F3, #6-F3 and #7-F3) showed an ABA-insensitive phenotype and low ABI5 transcript levels (Figure 5B-D) accompanied by moderate DNA methylation levels at the regions adjacent to the insertion, similar to ABI5 2x180 ibm1 (Supplementary Figure 6A). This observation suggests that the ibm1 mutation-induced DNA methylation and ABI5 silencing state can be transgenerationally inherited even after the re-introduction of IBM1. As a further confirmation, we determined the genome-wide DNA methylation of two representative epialleles (ABI5 2x180 IBM1 #5-F3 and #7-F3) and found that CG, CHG and CHH methylation was maintained at the ABI5 loci to a similar extent as in ABI5 2x180 ibm1 (Figure 5E). Besides ABI5, we found 266 and 354 CHG hypermethylated regions in ABI5 2x180 IBM1 #5-F3 and #7-F3, respectively, compared to the ABI5 2x180 parental plants, despite the similar global CHG methylation (Supplementary Figure 6B, C). This observation suggests that while the majority of ibm1-induced DNA hypermethylation spreading is unstable, DNA methylation at certain loci (i.e. ABI5) can be transgenerationally inherited.
Further investigation of the F5 generation revealed that while all 8 tested ABI5 2x180 IBM1 #5-F5 progenies maintained the DNA methylation and ABI5 silencing state, the ABI5 2x180 IBM1 #7-F5 allele showed phenotypic segregation, with some progenies having decreased DNA methylation and increased ABI5 transcription (Supplementary Figure 7A-D). Similarly, H3K9me2 was efficiently maintained in F3 progenies of ABI5 2x180 IBM1 #5 and #7 (Supplementary Figure 5E). Surprisingly, ABI5 2x180 IBM1 #7-F5-2 maintained high H3K9me2 levels at ABI5 loci despite the loss of DNA methylation (Supplementary Figure 7F), suggesting that the inheritance of DNA methylation and H3K9me2 might be mediated through different mechanisms. Since non-CG methylation is responsible for heterochromatin spreading (Figure 3E-J), we crossed ABI5 2x180 ibm1 plants with the ddcc mutant and found that 46.33% of the F2 plants were insensitive to ABA, significantly fewer than in the F2 progeny of the ABI5 2x180 ibm1 backcross with Col-0 (Figure 5F). Together, these results showed that the CEN180 insertion-induced heterochromatin silencing state at ABI5 can be inherited for at least five generations.

CEN180 insertion-induced ABI5 epiallele is transgenerationally inherited
To further understand the stability of these epialleles, we removed the CEN180 repeats by backcrossing ABI5 2x180 ibm1 with wild-type Col-0 and investigated F3 progenies without the CEN180 insertion and with wild-type IBM1 (Figure 6A). We named these lines ABI5 epi; they are genetically identical to Col-0. We noted that 3 out of 7 ABI5 epi lines (ABI5 epi-4, ABI5 epi-5, and ABI5 epi-6) showed a mild ABA-insensitive phenotype (Figure 6B, C) coupled with moderate DNA methylation levels and ABI5 transcriptional repression (Figure 6D-F). However, the phenotype is not as strong as [...] Figure 8A-D). This result suggests that the DNA methylation of the ABI5 epiallele, although heritable, is less stable in the absence of CEN180 repeats.

DISCUSSION
Much of the attention on transgenerational epigenetic inheritance has been given to natural epigenetic variation, which is mostly attributed to nearby repeat DNA sequences and transposons. However, the fundamental questions regarding the engineering of locus-specific epigenetic inheritance and the impact of the induced epialleles on organisms remain largely unanswered. Here, we developed an epigenome editing approach to engineer a locus-specific epiallele by depositing CEN180 tandem repeats in a euchromatic locus (Figure 7). Using the CRISPR/Cas9-mediated knock-in system, we demonstrated that DNA repeats with various copy numbers can induce heterochromatin nucleation in the intragenic region and that two CEN180 repeats are necessary and sufficient to induce DNA methylation and H3K9me2 at the insertion sites. The heterochromatin state is maintained by both CG and non-CG methylation but is restricted to the deposition regions by the H3K9me2 demethylase IBM1. Depletion of IBM1 enables outward heterochromatin propagation that is transgenerationally inherited even in the absence of CEN180 repeats (Figure 7).
One outstanding question regarding the CEN180 insertion-induced heterochromatin nucleation is how the deposited CEN180 repeat sequences are first recognized and then recruit the epigenetic machinery to direct de novo DNA methylation and H3K9me2 at the ABI5 locus. Small RNA-deficient plants showed sharply decreased non-CG methylation and H3K9me2 at the CEN180 insertion site (Figure 2B-D, Supplementary Figure 4B). Small RNA sequencing data further confirmed that Pol IV-dependent 24-nt small RNAs are involved in the establishment of DNA methylation at the deposited CEN180 site (Figure 1G and Supplementary Figure 4C-E). This suggests that the recognition of CEN180 sequences relies at least partially on small RNAs, consistent with the well-established concept that small RNAs have a recognized role in defense mechanisms that silence RNA viruses and transposable elements (70).
Our results revealed that a minimum of two CEN180 repeat copies is required to initiate nucleation and the establishment of DNA methylation and H3K9me2 at the insertion site (Figure 1F-I). In Arabidopsis, the DNA methyltransferases CMT2 and CMT3 have been shown to preferentially methylate dinucleosomal DNA substrates, and the maintenance of the heterochromatin state involves a self-reinforcing feedback loop between H3K9me2 and DNA methylation (71). This is consistent with our finding that the establishment and maintenance of DNA methylation and H3K9me2 at the ABI5 locus require at least two CEN180 copies (∼360 bp), equivalent to two nucleosomes in length.
The incorporation of multiple histone variants, together with well-known epigenetic mechanisms such as histone and DNA modification, plays important roles in the diversity of nucleosome structure and function (72). There are three major histone H3 variants in both plants and animals: the DNA replication-coupled canonical H3.1/H3.2, the replacement variant H3.3, and the centromere-specific CENH3 (72, 73). Only ∼15% of the CEN180 repeats are bound by CENH3 in the centromere (57), suggesting that large parts of the CEN180 repeats are associated with H3.1/H3.2 or H3.3 in the genome. Our results showed that the deposition of 13× or fewer CEN180 repeat copies in H3.3-bound euchromatin can induce the establishment of DNA methylation and H3K9me2 (Figure 1F, G) [...] Figure 3A, B). This indicates that 13× CEN180 arrays are too small to exchange the histone H3 variants and form a functional centromere structure in the genome, which is consistent with previous research on Arabidopsis mini-chromosomes and ring chromosomes (58, 59).
[Figure legend] All bars represent mean + s.d. from three biological replicates. Different letters represent significant differences (P < 0.05 by a two-tailed t-test) between samples.
The H3.3 variant, which is incorporated in euchromatic regions and associated with actively expressed genes, is involved in regulating gene body DNA methylation in Arabidopsis (74). Depositing CEN180 repeats induced heterochromatin formation in the H3.3-bound region, but the heterochromatin could not spread to adjacent regions and fully silence the target ABI5 gene (Figure 1B-E), suggesting that there are unexplored mechanisms that restrict DNA repeat-directed gene silencing in the gene coding region. In S. pombe, the H3K9 demethylase Epe1 counteracts RNAi- and H3K9 methyltransferase-mediated heterochromatin maintenance and inheritance (75), revealing a read-and-write mechanism that propagates epigenetic information independently of DNA sequence (76, 77). Interestingly, a recent study identified an important role of DNA elements in the epigenetic memory of pre-existing H3K9 methylation in heterochromatin (78), demonstrating a DNA-sequence-dependent epigenetic propagation mechanism. In this study, we showed that depositing CEN180 repeats can induce nucleation with a high level of DNA methylation and H3K9me2 only at the insertion site, but not in the surrounding regions (Figure 1B-G). Deletion of the histone H3K9 demethylase IBM1 enables outward heterochromatin propagation and allows the maintenance of DNA methylation and H3K9me2 for several generations (Figure 3A-D). The DNA methylation spreading includes all CG, CHG and CHH sequence contexts (Figure 3G), suggesting that this heterochromatin propagation is DNA sequence independent. Thus, our results support the reader-writer coupling model of heterochromatin maintenance and spreading through cell division (63).
Nucleic Acids Research, 2023, Vol. 51, No. 12 6051
Figure 7. Model of CEN180 insertion-induced intragenic heterochromatin formation, propagation, spreading, and inheritance. Deposition of CEN180 satellite repeats into euchromatic gene body regions induces nucleation at the insertion site, accompanied by the establishment of DNA methylation and H3K9me2. The heterochromatin state is maintained by the CG methyltransferase MET1 and the self-reinforcing loop between non-CG methylation and H3K9me2. The histone demethylase IBM1 blocks outward heterochromatin propagation, and depletion of IBM1 enables heterochromatin spreading from the deposited sites to adjacent regions, leading to transcriptional silencing. This silencing state is transgenerationally inheritable for at least five generations even in the absence of CEN180 repeats. This CEN180 repeat depositing system proves that gene body methylation is functional in heterochromatin formation and transcriptional gene repression and provides new opportunities for epigenetic-based crop improvement. Created with BioRender.com.
In Arabidopsis, the establishment of H3K9me2 and non-CG DNA methylation depends on a self-reinforcing reader-writer loop for stable heterochromatin maintenance (60, 79), while CG methylation is established and maintained by Variant In Methylation (VIM) proteins and MET1 (80). Of the three methylation contexts, methylation at CG dinucleotides is considered the most prone to transgenerational inheritance (81). Here, we showed that non-CG methylation can also be transgenerationally inherited. We further revealed an important function of IBM1 in restricting both DNA and H3K9 methylation within transcribed gene regions (Figure 3E-J). An intriguing observation is that ibm1-induced heterochromatin spreading terminates at the transcription start site (Figure 3G, H), suggesting the existence of other factors or boundary elements that prevent further heterochromatin propagation. It will be interesting to investigate whether there are spreading boundaries mediated by other histone marks, transcription factors, or chromatin remodelers that moderate heterochromatin propagation along the chromosome.
Paramutation describes trans-homolog interactions that lead to inheritable epigenetic changes in gene regulation (82). Depositing CEN180 repeats in the ABI5 intron region induces cis CEN180 small RNA-dependent epigenetic silencing marks, which cannot spread to adjacent regions and silence ABI5 (Figure 1). Depletion of the H3K9 demethylase IBM1 induces locus-specific paramutation behaviors (Figures 5 and 6). The IBM1 mutation enables the outward propagation of small RNAs, DNA methylation, and H3K9me2, which leads to the formation of paramutation at the CEN180-deposited ABI5 loci (Supplementary Figure 5 and [...]). This suggests that the programmable locus-specific paramutation is closely associated with the self-reinforcing loop that depends on both small RNAs and epigenetic silencing marks. The inheritance of the ABI5 repression phenotype is much more stable in descendant plants with the deposited CEN180 than in those without CEN180 (Figures 5 and 6), suggesting that DNA repeats and the associated epigenetic silencing marks play a crucial role in transgenerational inheritance.
Natural paramutation and epialleles are well known to use the read-write DNA and H3K9 methylation system to maintain epigenetic memory across generations (63). However, the key step of epigenome engineering at a euchromatic locus is the initiation of heterochromatin establishment. Currently, fusions of epigenetic modifiers with DNA recognition domains (i.e. zinc finger, TAL effector, or deactivated CRISPR) have been developed for locus-specific epigenome engineering (38-41).
However, these methods have the limitations of low efficiency and low specificity, with high off-target rates. Our cis insertion-directed epigenome editing strategy reported in this study is efficient and specific. We have demonstrated that depositing two CEN180 repeats is sufficient to induce heterochromatin nucleation specifically at the insertion regions. The identification of IBM1 in preventing heterochromatin spreading into the neighboring regions further supports the editing specificity.
In summary, this study developed an innovative epigenome editing strategy to engineer a locus-specific, inheritable epiallele by depositing functional DNA elements into the non-coding region of targeted genes. This strategy expands the current genome and epigenome editing scope from gene coding regions to whole genomic loci, including promoters and introns, and provides new opportunities for crop improvement and clinical applications.

DATA AVAILABILITY
All WGBS and small RNA-seq data produced in this study were deposited into Gene Expression Omnibus under accession number GSE201629.