A genetic screen identifies BEND3 as a regulator of bivalent gene expression and global DNA methylation

Abstract Epigenetic mechanisms are essential to establish and safeguard cellular identities in mammals. They dynamically regulate the expression of genes, transposable elements and higher-order chromatin structures. Consequently, these chromatin marks are indispensable for mammalian development and alterations often lead to disease, such as cancer. Bivalent promoters are especially important during differentiation and development. Here we used a genetic screen to identify new regulators of a bivalent repressed gene. We identify BEND3 as a regulator of hundreds of bivalent promoters, some of which it represses, and some of which it activates. We show that BEND3 is recruited to a CpG-containg consensus site that is present in multiple copies in many bivalent promoters. Besides having direct effect on the promoters it binds, the loss of BEND3 leads to genome-wide gains of DNA methylation, which are especially marked at regions normally protected by the TET enzymes. DNA hydroxymethylation is reduced in Bend3 mutant cells, possibly as consequence of altered gene expression leading to diminished alpha-ketoglutarate production, thus lowering TET activity. Our results clarify the direct and indirect roles of an important chromatin regulator, BEND3, and, more broadly, they shed light on the regulation of bivalent promoters.


INTRODUCTION
The de v elopment and life of organisms depend on the e xecution of precise tr anscriptional progr ams, which allow the different cell types to respond to stimuli and fulfill their biological function.Part of the information underpinning these programs is hardwired in the genome: promoters and enhancers dri v e the conte xt-dependent e xpression of genes, in response to transcriptional regula tors tha t recognize specific sequences ( 1 ).In addition to this sequence-based informa tion, chroma tin constitutes an additional regulatory layer that is vital for proper gene expression.Nucleosomes ar e omnipr esent in the genome, and the modifications borne by their histone tails profoundly influence gene expression.This regulation occurs through modulating chromatin compaction and directly recruiting or repelling nucleosome remodelers, as well as transcriptional activators and r epr essors ( 2 ).These chromatin marks can be remodeled in response to certain cues and can be (at least partially) carried through mitosis, thereby constituting an epigenetic 'memory' ( 3 ).
Chromatin marks do not just turn genes 'On' or 'Off', at least one other state exists, the 'Poised' situation.Poised genes sim ultaneousl y harbor activating and r epr essi v e histone marks on their promoter, rendering it 'bivalent' ( 4 ).Bivalency is especially prevalent on key de v elopmental genes, where it may facilitate their rapid induction during de v elopmental transitions when they resolve into acti v e or r epr essed states.Bi valency e xists in different cell types, yet a particularly useful model to study bivalency is mouse ES cells, in which poised promoters were in fact initially discovered (5)(6)(7).Work with this system over the past 15 years has led to se v er al key insights, y et the molecular foundations are not well understood.One important question that remains to be elucidated is: which factors maintain bivalent promoters in a poised state?
This paper aims to identify regulators of bivalent genes using minimally biased discovery tools ( 8 ).To accomplish this, we built a reporter cell line in mouse ES cells and carried out a primary CRISPR KO screen, followed by a custom secondary screen using Fluidigm screening.Through this approach, we discover that BEND3, a transcription factor previously known to act at ribosomal DNA ( 9 ), G4 quadruplexes ( 10 ) and pericentric heter ochr omatin ( 11 , 12 ), binds and regulates hundreds of bivalent promoters, via a consensus site that contains a CpG and is only bound when unmethylated.This confirms a recent report, which arri v ed at similar conclusions by independent approaches ( 13 ).Beyond confirming these data, we describe a new role of BEND3, as we show that Bend3 loss leads to a decrease of DNA h ydroxymeth ylation and a global increase of DN A methylation, w hich indirectl y affect hundreds of non-bivalent genes.

Cloning of sgRNA, transfection and transduction in mES cells
Single guide RNAs (sgRNAs) were designed using Benchling / CRISPOR software.The sgRNAs were cloned into either vector PX459 (Addgene #62988) or vector lentiCRISPRv2 (Addgene #52961).For the transfection of ESCs, we used an Amaxa 4D nucleofector (Lonza).Lentiviral particles were produced by calcium phosphate transfection of HEK293T with psPAX2 and pMD2.G plasmids in a BSL3 tissue culture system.Oligonucleotide sequences are listed in Supplementary Table S1.

Generation of DASH ( da zl-m S carlet-H ygromycin R ) reporter cell line
A P2A-mScarlet-T2A-HygroR reporter cassette (Gen-Script) was inserted in-frame into exon 6 of the mouse Dazl gene.The cassette was flanked by Dazl homology arms corresponding to the endogenous intron 5-exon 6 and intron 6 sequences, respecti v ely.To pre v ent re-cutting by Cas9 after cassette insertion by homologous recombination, the Protospacer Adjacent Motif (PAM) sites of the two sgRNAs targeting Dazl exon 6 were mutated within the homology arms.The synthesized cassette was cloned into pUC57-Simple.Two sgRNAs targeting Dazl exon 6 were cloned into the pSpCas9(BB)-2A-GFP backbone (Addgene #48138).Homologous incorporation of the reporter cassette into one of the Dazl alleles was confirmed by PCR and sequencing.Oligonucleotide sequences are listed in Supplementary Table S1.

Uhrf1 KO: validation of DASH mES cells
Two sgRNAs targeting Uhrf1 exon 2 were designed, cloned in the PX459 vector and transfected into DASH cells to test the reporter, followed by a short pulse of 1 g / ml puromycin for 24 h.Knockout efficiency was assessed by western blotting.Oligonucleotide sequences are listed in Supplementary Table S1.

CRISPR KO screen and g ener ation of clonal bend3 KO lines
The screen and the CRISPR KOs were performed as detailed in Gupta et al. ( 14 ).Screen results (top 100 hits) are listed in Supplementary Table S2.

Microfluidic RT-qPCR (fluidigm): cDN A synthesis , RT-qPCR and analysis
Total RNA was extracted and quantified as described above.RNA quality and integrity were analyzed by capillary electrophoresis with the Fragment Analyzer (Agilent Technologies) to calculate the RNA quality number (RQN) for each sample.Defined on a scale ranging from 1 to 10, the mean RQN of the 52 samples was 9.9, indicating very good RNA quality.The cDNA synthesis was performed using Re v erse Transcription Master Mix (Fluidigm) accor ding to the manufacturer's protocol in a final volume of 5 l containing 200 ng total RNA.Re v erse transcription was performed using a Ne xus Thermocy cler (Eppendorf) with the following parameters: 5 min at 25 • C, 30 min at 42 • C followed b y heat-inactiv ation of re v erse transcriptase for 5 min at 85 • C and immediately cooled to 4 • C. cDNA samples were diluted 5 × by adding 20 l of low TE buffer (10 mM Tris, 0.1 mM EDTA, pH 8.0) (TEKNOVA) and stor ed at -20 • C befor e specific target pr e-amplification.Diluted cDNA was used for multiplex pre-amplification in a total volume of 5 l containing 1 l of 5 × PreAmp Master Mix (Fluidigm), 1.25 l of cDNA, 1.25 l of pooled assay (Exiqon) with an original concentration of each assay of 10 M and 1.5 l of nuclease-free water.The cDNA samples were subjected to pre-amplification with the following conditions: 95 • C for 2 min, followed by 10 cycles at 95 • C for 15 s and 60 • C for 4 min.The pre-amplified cDNA was diluted 5 × by adding 20 l of low TE buffer and stored at -20 • C before RT-qPCR.The expression of 48 target genes was quantified in 52 samples by quantitati v e PCR using the highthroughput platform BioMark HD System on two 48 × 48 GE Dynamic Arrays (Fluidigm).Pre-amplified cDNA was used in a total volume of 15 l containing 7.5 l of 2 × Gene Expression PCR Master Mix (Thermo Fisher Scientific), 3 l of pre-amplified cDNA, 0.75 l of 20 × EvaGreen (Biotium), 1.5 l of each primer (Exiqon) and 2.25 l nucleasefree water.Thermal conditions for qPCR were: 50 • C for 2 min before initial hea t-activa tion of DN A pol ymerase at 95 • C f or 10 min, f ollowed by 40 cycles at 95 • C for 15 s and 60 • C for 1 min.Melt curve analysis was performed with an increase of 0.2 • C / s, starting from 65 • C for 5 s and ending a t 95 • C .The parameters of the thermocycler were set with ROX as a passi v e r efer ence and a single probe EvaGreen as a fluorescent detector.To determine the quantification cycle Cq, data were processed by an automatic threshold f or each assa y, with linear deri vati v e baseline correction using BioMark Real-Time PCR Analysis Software 4.0.1 (Fluidigm).The quality threshold was set at the default setting of 0.65.Second data analysis was performed using R software to measure the relati v e gene expression using the comparati v e Cq method with the efficiency corrected method of Pfaf fl, after normaliza tion with r efer ence genes (Cq mean of Actinb , Ppia and Rplp0 ).Fold change between experimental and contr ol gr oups was calculated for each sample as the difference of Cq between reference genes and the gene of interest (GOI) in control and experimental conditions.

Flow cytometry
The number of cells expressing mScarlet was determined in the ImagoSeine Core Facility (Institut Jacques Monod) using the Influx or FACSAria Fusion cell sorter (BD Biosciences) with yellow laser (561 nm).The percentage of positi v e cells was determined by flow cytometry using the mScarlet signal intensity threshold (minus background fluorescence from wild-type mESCs).Data were analyzed using FlowJo software (BD Biosciences).

Isolation of genomic DNA
Genomic DNA was isolated from cells by a 200 g / ml proteinase K treatment at 55 • C overnight, followed by 20 g / ml RNase A treatment at 37 • C for 1 h, and extraction using the standard phenol / chloroform / alcohol method.Genomic DNA was suspended in water and quantified with the Qubit dsDNA BR Assay kit on a Qubit 2.0 Fluorometer (Thermo Fisher Scientific) system (Agilent), and only samples with DNA Integrity Number > 9 were used for subsequent analysis.

Identification of CRISPR-induced mutation
Genomic DNA was isolated as described above.sgRNA target region amplicons were generated by PCR using 100 ng genomic DNA, 100 nM primers and Platinum Taq polymerase (Thermo Fisher Scientific).The amplicons were then sequenced.Oligonucleotide sequences are listed in Supplementary Table S1.

Rescue experiments, piggyBac system
In the rescue experiments, the Bend3 coding sequence was synthesized (GenScript), with silent mutations incorporated within the PAM to pre v ent cutting by the sgRNAs that may still be expressed in the KO clone cell lines.This CDS sequence was cloned into a piggyBac vector and cotransfected with PB transposase for stable insertion ( 15 ).Empty piggyBac v ectors serv ed as controls.Transfected cells were selected with 5 g / ml of blasticidin for 5 days and processed for phenotypic and molecular assays.

RT-qPCR
Total RNA was extracted from cells using the RNeasy Plus Mini kit (Qiagen) and quantified using the Qubit RNA BR Assay kit on a Qubit 2.0 Fluorometer (Thermo Fisher Scientific).One microgram of total RNA was reverse transcribed using SuperScript IV Reverse Transcriptase (Thermo Fisher Scientific) and Oligo dT primers (Promega).Actinb , Ppia and Rplp0 were used for normalization.Oligonucleotide sequences are listed in Supplementary Table S1.

DNA methylation analysis: luminometry methylation assay (LUMA)
To assess global CpG methylation, 500 ng of genomic DNA was digested in parallel reactions with MspI + EcoRI and HpaII + EcoRI (NEB), with EcoRI as an internal r efer ence.The rate of CpG methylation is defined as the HpaII / MspI ratio.Samples were analyzed using a PyroMark Q24 Advanced Pyrosequencer (Qiagen).

DNA methylation analysis: LC-MS / MS
Genomic DNA was extracted with RNase A as described above, plus an additional digestion step: 1 g of DNA was treated with 10U DNA Degradase Plus (ZymoResearch) at 37 • C for 4 h, followed by inactivation of the enzyme at 70 • C for 20 min and then Amicon Ultra-0.5 ml 10 K centrifugal filters (Merck Millipore) were used to filter the solution.The reaction mixture retained on the centrifuge filter was processed for LC-MS / MS anal ysis; anal ysis of total 5-mdC and 5-hmdC concentrations was performed using a Q exacti v e mass spectrometer (Thermo Fisher Scientific).The instrument was equipped with an electrospray ionization source (H-ESI II Probe) coupled to an Ultimate 3000 RS HPLC (Ther mo Fisher Scientific).A Ther moFisher Hypersil Gold aQ chromato gra phy column (100 mm × 2.1 mm, 1.9 m particle size) heated to 30 • C was injected with digested DNA.The flow rate was set to 0.3 ml / min and the column was run for 10 min in isocratic eluent consisting of 1% acetonitrile in water containing 0.1% formic acid.Parent ions were fragmented in positi v e ion mode, parallel reaction monitoring (PRM) mode at 10% normalized collision energy; MS2 resolution was 17 500, AGC target was 2e5, maximum injection time was 50 ms, and separation window was 1.0 m / z .The inclusion list contained the following masses: dC (228.1),5-mdC (242.1) and 5-hmdC (258.1).Extracted ion chromatograms ( ±5 ppm) of basic fragments were used for detection and quantification (dC: 112.0506Da; 5-mdC: 126.0662Da; 5-hmdC: 142.0609).Calibration curves were previously generated using synthetic standards in the ranges of 0.2-10 pmol injected for dC and 0.02 to 10 pmol for 5mdC and 5hmdC.Results were expressed as % of total dC.

RNA-sequencing: library preparation
A total amount of 1 g total RNA per sample was used as input material for the RNA sample preparations.RNA samples were spiked with ERCC RNA Spike-In Mix (Thermo Fisher Scientific).Sequencing libraries were generated using NEBNext UltraTM RNA Library Prep Kit for Illumina (NEB) following the manufacturer's recommendations.Briefly, mRN A was purified from total RNA using poly-T oligo-attached magnetic beads.Fragmentation was carried out using divalent cations under elevated temperature in NEBNext First Strand Synthesis Reaction Buffer (5 ×).First-strand cDNA was synthesized using a random hexamer primer and M-MuLV Reverse Transcriptase (RNase H-).Second strand cDNA synthesis was subsequently performed using DNA Polymerase I and RNase H.In the reaction buffer, dNTPs with dTTP wer e r eplaced by dUTP.The remaining overhangs were converted into blunt ends via exonuclease / polymerase activities.After adenylation of 3 ends of DN A fragments, NEBNext Ada ptor with hairpin loop structure was ligated to prepare for hybridization.To select cDNA fragments of pr efer entially 250-300 bp in length, the libr ary fr agments were purified with the AMPure XP system (Beckman Coulter).Then 3 l USER Enzyme (NEB) was used with size-selected, adaptor-ligated cDNA at 37 • C for 15 min followed by 5 min at 95 • C before PCR.Then PCR was performed with Phusion High-Fidelity DN A pol ymerase, Uni v ersal PCR primers and Index (X) Primer.Lastly, products were purified (AMPure XP system) and library quality was assessed on the Agilent Bioanalyzer 2100 system.

RNA-seq: transposable element quantification
Repea tMasker (last upda ted 2012-02-06) was downloaded from the UCSC Table Browser.To measure the expression of transposable element families, PCR duplicates were removed and all reads, including uniquely mapped and multimapped r eads, wer e enumerated using VisRseq.Multimapped r eads wer e counted once, and all individual elements wer e aggr ega ted to calcula te family-wide expression in read count (for differential expression analysis) and RPKM values.

Whole-genome-bisulfite sequencing (WGBS)
Genomic DNA was extracted as described for LC-MS / MS; libr ary prepar ation f or WGBS was perf ormed using the previously described tPBAT protocol ( 17 , 18 ).For library preparation, 100 ng of genomic DNA spiked with 1% (w / w) unmethylated lambda DNA (Promega) was used.Sequencing was performed by MacroGen Japan, Inc. using a HiSeq X Ten system; 8 lanes were allocated for analysis of 20 samples.Sequenced reads were mapped in BMap and compiled in an in-house pipeline as previously described, and custom scripts wer e ar chi v ed using GitHub ( https://github.com/FumihitoMiura/Project-2).Basic metrics for the methylome data are described in Supplementary Table S4.DNA methylation levels on CpGs covered by at least fiv e reads were averaged over the target region of 2 kb bins genome-wide, over CpG island (cpgIslandExt, n = 16 023, last updated 2012-02-09), enhancer elements (Ensembl Regula tory fea tur es r elease 81, n = 73 796), promoters (NCBI Refseq, TSS ± 1 kb, n = 24 371), gene bodies and transposable elements (RepeatMaster, n = 5 147 736).

Principle of the genetic screen
Epigenetic silencing is dynamic in mouse ES cells and responds to culture conditions.In particular, a number of genes ar e expr essed in 2i condition, but ar e r epr essed when the cells are switched to serum (Figure 1 A).The principle of our screen was to select one such gene and replace it with selectable markers to carry out a CRISPR KO screen in serum, and to positi v ely select the cells in which the KO caused a loss of silencing (Figure 1 A).To select the gene in question, we consider ed thr ee criteria: mechanism(s) of epigenetic r epr ession, biological function(s) of the gene and expression le v el in r epr essed and induced condition (to ensur e high enough expression of the selection markers).These criteria led us to select Dazl as our target of choice.Dazl encodes an RNA-binding protein which is critical during gametogenesis ( 19 ), but it is also expressed during other dev elopmental e v ents, as in the 2-cell-embryo ( 20 ).The promoter of Dazl overlaps a large CpG island (1.1 kb); in serum-grown ES cells, the 5 half of the CpG island is heavily methylated on CpGs, while the 3 half is mostly void of DNA methylation.In addition, the promoter of Dazl contains both the activating H3K4me3 and the r epr essi v e H3K27me3 marks and can ther efor e be classified as a bivalent promoter (Figures 1 B and S1A).Dazl expression is low in serum, howe v er genetic e xperiments hav e shown that this r epr ession can be lifted either by impeding DNA methylation ( 21 , 22 ), or by interfering with the ncPRC1.6complex ( 23 , 24 ).One of our goals was to identify new actors in this process (Figure 1 C).
Having chosen Dazl as our reporter gene of interest, we then used CRISPR / Cas9-guided homologous recombination to insert two selectable markers in the gene: mScarlet, encoding a bright red fluorescent protein ( 25 ), and Hygro R , a bacterial enzyme that renders cells resistant to Hygromycin.The two markers were inserted in-frame in the Dazl exon 6 and were separated by self-cleaving peptides (Supplementary Figure S1B).Recombinant clones were picked and characterized by genomic PCR and sequencing.One heterozygous integrant was further characterized; it contains one wild-type Dazl allele, and the mScarlet-Hygro R insertion in the other allele (Figure 1 D).We will refer to this line as the DASH ( Dazl -Scarlet-Hygro) reporter line.
We next carried out a proof-of-concept experiment to verify that the selectable markers could be induced in serum upon the KO of an epigenetic r epr essor.For this, we used CRISPR to knock out a key factor in DNA methylation maintenance, UHRF1 (Supplementary Figures S1C and  S1E).As expected ( 26 , 27 ), the Uhrf1 KO caused a global loss of DNA methylation as seen by LUMA (Figure 1 F).The DASH cells grown in serum became mScarlet-positi v e and Hygromycin-resistant upon Uhrf1 KO (Figures 1 G, H,  and S1D), confirming the feasibility of a genetic screen.

Primary screen
We grew the DASH cells in serum and infected them with a lentiviral library of ∼80 000 vectors co-expressing Cas9 and sgRNAs targeting ∼20 000 genes ( 28 ).The multiplicity of infection was ∼0.1, the coverage ∼150 infected cells per guide and two independent screens were carried out in parallel.After an initial Puromycin selection to eliminate non-infected cells, Hygromycin selection was applied and the fluorescent cells were purified by FACS (Figure 2 A).The sgRNAs present in these cells were amplified by PCR and sequenced by NGS, and the sequencing data statistically analyzed using MAGeCK ( 29 ).This procedure yielded an ordered list of candidates, ranked b y P -v alue, and a highly stringent cutoff ( P -v alue lower than 5 × 10 −6 and FDR lower than 0.01) yielded a list of 12 hits (Figure 2 B).Four of the top 12 hits (ranked #2, 4, 5 and 11 and shown in light blue in Figure 2 B) correspond to factors already known to r epr ess Dazl : UHRF1, E2F6, MGA and DNMT1.Their presence on the list was expected and supports the validity of the screen.Fi v e of the hits (in light green in Figure 2 B) correspond to genes involved in pluripotency or in Heparan Sulfate synthesis.

Secondary screen on the top hits and focus on BEND3
We next sought to validate the results of the primary screen while also prioritizing the hits for further analysis.For this,  we carried out a secondary screen based on gene expression analysis.
We first generated, by lentiviral infection with CRISPR / Cas9 v ectors, ne w KO populations for ten of the 12 top hits (and 3 different negati v e control populations), as depicted in Supplementary Figure S2A.For technical reasons, Grb2 was excluded from the analysis and, instead of knocking out B4galt7 , we targeted B3galt6 , a gene in the same complex and with the same function.Rather than carrying out a whole-genome RNA-seq on each hit, we examined the expression of 21 genes of interest (along with 3 normalizers) in the aforementioned knockouts using the Fluidigm technology.
Unsupervised analysis of the Fluidigm data re v ealed that the KOs were segregated into 4 clusters.Cluster number 1 contained both the Dnmt1 and Uhrf1 KOs, which expectedly had very similar expression profiles.In accordance with published data ( 30 ), these two KOs reactivated the expression of all germline genes, as well as certain repeats, in particular GLN.The Ptpn11 KO was the third and final member of this cluster.
Cluster number 2 included Slc35b2 and B3galt6 KOs, along with Mga and E2f6 , which behaved similarly, also as predicted.Unsurprisingly ( 23 , 24 ), all germline genes were reactivated in this cluster.
The third cluster contained the Bend3 , Spop and Tcf7l1 KOs.Contrary to cluster #2, the germline genes were either unaffected or onl y slightl y upregulated in this cluster.The fourth cluster contained all three negati v e controls, along with the Emc1 KO.Dazl was not reactivated in this cluster, and the other genes were similarly unaffected.
This analysis leads to se v eral useful conclusions.First, 10 out of 11 new KO populations (91%) displayed the induction of Dazl that was also observed in the primary scr een, ther efor e validating the hits.The only false positi v e was Emc1 .Secondly, hits affecting DNA methylation or ncPRC1.6form separate clusters.Third, three hits generate a gene expression pattern that resembles neither DNA methylation mutants nor the ncPRC1.6mutants.These potentially novel hits are Bend3 , Spop and Tcf7l1 .Of those three hits, Spop encodes an adaptor protein for an E3 ubiquitin ligase complex ( 31 , 32 ) and as it seemed least likely to act directly at the Dazl promoter we did not pursue it further in this study.Tcf7l1 is a transcription factor that inhibits the nai v e state in mouse ES cells ( 33 ); as a consequence, cells lacking TCF7L1 tend to remain in a 2i-like state, e v en when grown in serum ( 34 ).Because cells in 2i express Dazl , it seemed possible that the upregulated expression of Dazl in the Tcf7l1 KO cells was merely an indirect reflection of a change of cellular identity towards a more nai v e state.The third hit within the cluster was Bend3 , which encodes a chromatin protein ( 11 ) known to r epr ess transcription at the rDNA ( 9 ).As BEND3 had suitable characteristics for being a new direct repressor of Dazl , we decided to investigate it further.

Isolation of bend3 mutant clones and rescue experiments
To validate the results of the CRISPR screen, we generated new Bend3 mutant populations (Figure 3 A).For this, we infected DASH cells with a mix of two distinct lentiviruses, each expressing Cas9 and a specific sgRNA targeting Bend3 (Supplementary Figure S3A).As Bend3 produces two protein isoforms, we targeted the last exon of Bend3 , which is included in both isoforms.
Two independent Bend3 -mutant populations showed a roughly 10-fold increase of mScarlet-expressing cells relati v e to control, similar to the magnitude observed for Uhrf1 mutant cells (Supplementary Figure S3B).To carry out mor e pr ecise e xperiments, we isolated indi vidual Bend3 mutant clones from these populations (see scheme in Figure 3 A).Three clones were isolated, each of them containing a high proportion of mScarlet-expressing cells (25-35%, relati v e to ∼3% in control cells, Figures 3 B and S3C).These clones also expressed more Dazl mRNA (Figure 3 C) and DAZL protein (Figure 3 D) from the non-targeted allele.Upon sequencing their genomic DNA, we found that two clones had non-sense or frameshift mutations at AA 308, removing about two thirds of the protein.The last clone had a small homozygous in-frame mutation removing four amino acids in the first BEN domain (Supplementary Figure S3D).The BEND3 protein was not detectable in the KO clones (Supplementary Figure S3E), suggesting that the mutant forms are unstable.
We ne xt v erified that the phenotypes could be rescued by reintroducing a WT cDNA (see scheme in Figure 3 A); for this, we used a PiggyBac transposon-based system ( 30 ) driving constituti v e e xpr ession of V5-tagged BEND3.Expr ession of the functional Bend3 cDNA in the Bend3 KO cells decreased the number of mScarlet-positi v e cells to nearbackground le v els (Figures 3 E and S3F), r ender ed the cells sensiti v e to Hygromycin (Figure 3 F), and silenced the expression of the DAZL protein (Figure 3 G).
Together these experiments support the conclusion that BEND3 is a bona fide r epr essor of Dazl expression.

Transcriptome analysis in bend3 mutant cells reveals up-and do wn-r egulated genes
To examine more globally the effects of Bend3 deletion, we performed RNA-seq on the 3 independent mutant clones, grown in serum, which we compared to WT cells (grown in serum or 2i).As shown in Figure 4 A, the mutant cells clustered away from the WT cells, and they were highly distinct from 2i grown cells.This supports the idea that the Bend3 KOs do not reactivate Dazl because they enter a 2i-like state.
Mor e genes wer e r epr essed than induced in the Bend3 KO cells (635 versus 325 r espectively, Figur e 4 B and S4A).As expected, Dazl was among the induced genes (Figure 4 B).A recent publication independently examined the transcriptome of Bend3 -/ -ES cells ( 31 ); we compared our respecti v e RNA-seq results and found that the overlap was significant yet small, with only ∼10-20% upregulated or downregulated genes in common (Supplementary Figure S4B).We speculate this dissimilarity results from different procedures for the mutant cell derivation, combined with different cell culture conditions.
We also examined the behavior of transposable elements in the RNA-seq data (Supplementary Figure S4C).The mutation of Bend3 (left panel) had very little effect on the expression of mouse endogenous retroviruses (ERVs).As a control (right panel), we compared WT cells grown in 2i and serum and predictably found a large induction of ERVs such as IAPs, which validates our analysis pipelines.
Returning to protein-coding genes, we manually inspected some of the differentially expressed (DE) genes and noted that many of them had bivalent marks on their promoter, in line with the r ecent r esults of Zhang et al. ( 13 ).To determine if this observation could be generalized, we used a list of very stringently curated bivalent genes ( 35 ).We cross-r efer enced this list with our RNA-seq data and found that ∼35% of differentially genes in the Bend3 KO relati v e to WT are bivalent; this is a highly significant enrichment ( P = 8 × 10 −54 , Figure 4 C).The enrichment is stronger for genes that are upregulated in the Bend3 mutants (50% of those are bivalent) than for genes that are downregulated in the Bend3 mutant (27% being bivalent, Figure 4 C).Correspondingly, genes induced in the Bend3 mutant were more likely to have a bivalent than a r epr essed status in the WT cells (Figure 4 D).Bivalent chromatin marks often mark developmental genes in mouse ES cells, and in line with this we found that the genes differentially expressed in Bend3 KO cells were markedly enriched for de v elopmental regulators (Supplementary Figure S4D).
Lastly, we performed RT-qPCR in Bend3 KO cells that had been rescued with the V5-BEND3 construct (Figure 4 E).Genes that had been downregulated in the KO cells ( Cpt1a , Tead4 , Klf15 , Col4a1 , Tbx3 ) wer e upr egulated in the rescue cells; conversely, genes that had been induced in the KO cells ( Dazl , Bend3 (measured with primers that do not amplify the r escue construct), Pax6 , Ccdc106 ) wer e r epressed in the rescue cells (Figure 4 E).All of the genes examined in this experiment are directly bound by BEND3, as determined by the ChIP-seq that will be presented later (Figure 5 ).Altogether, these data show that removing BEND3 from serum-grown ES cells alters the expression of hundreds of genes, with a high proportion of bivalent genes.Nevertheless, of the ∼3600 bivalent genes, only about 10% are regulated by BEND3 (Figure 4 C).To understand this situation, we next sought to identify the direct targets of BEND3.

ChIP-sequencing identifies the direct tar g ets of BEND3
We then used ChIP-seq to identify genes directly regulated by BEND3.Using an antibody directed against endogenous BEND3, we identified ∼7400 binding peaks in the ES cell genome (Figure 5 A).This dataset is highly similar to the dataset of Zhang et al. ( 13 ), who used the same antibody as we did (Supplementary Figure S5A).We also   performed ChIP-seq with an antibody against the V5 tag, in Bend3 KO cells expressing V5-BEND3.This data obtained ov erlapped e xtremely well with the ChIP-seq data obtained on the endogenous protein (Supplementary Figure S5B-C).These two lines of evidence strongly support the quality of our ChIP-seq dataset.
Of the ∼7400 peaks detected in the BEND3 ChIP, 47% occur in promoters, 6% in enhancers, 19% in gene bod-ies and 28% in other intergenic regions (Figure 5 A).A de novo search for motifs enriched in the peaks re v ealed a highly enriched motif with a strong selection for the core sequence CCCACG (Figure 5 B).This motif occurred in 70% of all BEND3 peaks, often in multiple instances (Supplementary Figure S5D), hinting that BEND3 binds it directly, which is supported by recent in vitro work ( 13 , 36 ).The overwhelming majority (94%) of promoters bound by BEND3 contain a CpG island, which is a significant enrichment (Figure 5 C).BEND3 binding is prevalent at bivalent promoters; in contrast it is extr emely rar e at promoters bearing H3K9me3 (Supplementary Figure S5E).
We next intersected our RNA-seq data with our BEND3 binding data (Figure 5 D).Only a minority of promoters (5.5%) bound by BEND3 become significantly activated or r epr essed upon KO.Conversel y, onl y about 23% of genes deregulated upon Bend3 deletion have BEND3 binding at their promoter.The genes bound by BEND3 and differentially expressed in the KO, which are likely to be direct targets, are split equally between upregulated genes and downr egulated genes (Figur e 5 E).Such genes include Dazl , which is r epr essed by BEND3 (Figur e 5 F), but also Col4a1 and Col4a2 , which are activated by BEND3 (Figure 5 G).Again, most direct targets of BEND3 (65%) have a bivalent promoter in WT cells (Figure 5 E, Supplementary Figure S5E-F); they have a variable number of BEND3 consensus motifs, with 10% having no motif, to 20% having four motifs or more (Figure S5G).

Global DNA h ypermeth ylation and reduced 5-hmC in bend3 mutant cells
The results above establish that many of the genes that are up-or down-regulated when BEND3 is absent are not direct targets of the protein.To investigate what might underpin these indirect effects of BEND3, we examined DNA methylation.
We first measured global DNA methylation with a method based on restriction enzymes, LUMA ( 37 ).The WT cells grown in serum and 2i media served as controls and, as expected, global DNA methylation was low in 2i and high in serum (Figure 6 A).Additionally, the le v el of DNA methylation was higher in Bend3 mutant cells than in WT cells (both grown in serum).Importantly, rescuing the loss-of-function mutations with a Bend3 -expressing construct brought the le v el of DNA methylation back to WT le v els (Figure 6 A).
To validate these observations with an orthogonal approach, we turned to LC-MS / MS: this experiment confirmed that 5mC was more abundant in the Bend3 KO cells (Figure 6 B).
We then performed Whole-Genome Bisulfite Sequencing (WGBS) on WT cells, and on each individual Bend3 mutant clone (Supplementary Figure S6A).As anticipated, the 2i-grown WT cells were vastly undermethyla ted rela ti v e to serum-grown WT cells (Figure 6 C), and the methylation of genomic tiles in serum-grown WT cells displayed a typical bimodal profile: a minority of tiles unmethylated and most tiles methylated, with a peak around 80% methylation (Figure 6 C), lending confidence to our WGBS experiment and analysis.Upon comparison to WT cells, we saw that the Bend3 mutants had a methylation distribution that was still bimodal but in which the methylated compartment was shifted towards higher methylation levels (Figures 6 C and  S6B).As the three clones behaved similarly, we aggregated their WGBS reads in subsequent analyses.
To carry out a more precise statistical analysis, we narro wed do wn our calculations to 250-bp tiles containing 5 or more CpGs, for a total of 2.1 million tiles.Of these, 91.3% do not significantly gain or lose methylation upon Bend3 m utation, 8.2% gain DN A methylation and 0.5% lose DNA methylation (Figures 6 D, S6C, D).
We then investigated the genomic distribution of the DNA methylation gains.The increase was global and affected promoters , enhancers , gene bodies and transposable elements (Figure 6 E).Over the whole genome, 8.2% of the tiles gain DNA methylation, as stated abov e, howe v er the le v el of incr ease differ ed between genomic elements: enhancers had a marked tendency to gain more DNA methylation, while promoters were slightly under average (Figure 6 F).Looking at CpG island, we saw that the CpG islands themselves were rarely h ypermeth ylated, the CpG shores gained slightly less methylation, whereas the CpG shelves gained more methylation than average (Figure 6 G).In other words, the gains of DNA methylation occur at a distance from the CpG islands, which is exemplified by the gene Tead4 (Supplementary Figure S6E).
We then crossed our WGBS data with ChIP-seq data reflecting the chromatin state in WT cells (Supplementary Figure S6F).We find that regions that were bivalent or marked by H3K4me3 gain more methylation than average.Regions marked by H3K27Ac, which is typical of enhancers, are also much more likely to gain DNA methylation than the genome avera ge, a gain suggesting that enhancers are targets for DNA h ypermeth ylation in Bend3 KO cells.
Next, we intersected our WGBS and RNA-seq data.We find that half of transcriptionally downregulated genes in Bend3 KO cells gain DNA methylation in their promoter (Figure 6 H).The BEND3-bound peaks themselves, or the genes upregulated in Bend3 KO, gain methylation less than the rest of the genome (Supplementary Figure S6G).
We investigated mechanisms that could possibly underpin the elevated DNA methylation seen in Bend3 mutant cells.Neither our RNA-seq data (Supplementary Figure S6H), nor our western blotting experiments (Supplementary Figure S6I) suggested that BEND3 removal upregulated Uhrf1 , Dnmt1 , Dnmt3a , or Dnmt3b , which are key actors of DNA methylation.Ther efor e we specula ted tha t DNA h ydroxymeth ylation might be modified in the Bend3 KOs.To test this idea, we again used LC-MS / MS, and we indeed saw a significant decrease of 5-hmC le v els in Bend3 mutant cells (Figure 6 I).This effect is not due to transcriptional downregulation of the TET enzymes in the absence of BEND3 (Supplementary Figure S6H), so we considered additional mechanisms that could potentially lead to lower TET activity in Bend3 mutant cells.The TET enzymes are critically dependent on alpha-ketoglutarate, so we tested whether the transcriptional defects occurring in Bend3 KO cells might impinge on the pathways producing this metabolite.A GSEA analysis of our RNA-seq data showed that there was indeed a significant downregulation of the group of genes involved in alpha-ketoglutarate production (Figure 6 J).To investigate this finding in more depth, we examined the expression of individual genes belonging to this GO term group, and intersected the data with our BEND3 ChIP-seq results (Figure 6 K).We find that se v eral genes of the group are bound by BEND3 and downregulated in its absence.In particular, Idh1 and Idh2 , encoding Isocitrate Dehydro genase, w hich produces alpha-keto glutarate, are both directly activated by BEND3.We hypothesize that the lower Idh1 / Idh2 expression causes lower alphaketoglutarate availability, reducing TET activity, and increasing DNA methylation in cells lacking BEND3.

Integrating the local and global effects of BEND3 on chromatin and transcription
Lastly, we set out to integrate the changes occurring at the chroma tin (DNA methyla tion and histone marks) and transcriptional le v els in the Bend3 mutant cells.
For this, we mapped the key histone modification H3K27me3 by CUT&RUN in ES cells with or without BEND3 (Figure 7 A).Bivalent promoters wer e extr emely likely to experience changes in H3K27me3 abundance upon BEND3 removal (Figure 7 A, left panel), reinforcing the idea that BEND3 has a key role in their homeostasis.Of the promoters directly bound by BEND3, 46% experienced changes in H3K27me3 occupancy, again a significant enrichment (Figure 7 A, middle panel).Finally, of the 960 genes differ entially expr essed upon BEND3 loss, mor e than half experienced changes in H3K27me3 presence on their promoter (Figure 7 A, right panel).The correlations between BEND3 binding, H3K27me3 presence, changes in gene expression and changes in DNA methylation, are explored statistically in Supplementary Figure S7A, and the examples of Dazl and Col4a1 / a2 are shown in Supplementary Figure S7B-C.
We also examined some of the interactions more precisely.For instance, we listed the promoters that lose H3K27me3 when BEND3 is absent (Figure 7 B).Fifty-four percent of the upregulated genes were in that category, versus only 25% of the downregulated genes.Therefore, transcriptional induction upon Bend3 mutation often involves H3K27me3 loss, in line with many of the target genes having bivalent promoters.The opposite was true for promoters that gain H3K27me3 upon BEND3 loss: they contain mor e downr egulated than upr egulated genes (Figur e 7 C).
Lastly, we crossed our CUT&RUN and WGBS data (Figure 7 D).The regions that were occupied by H3K27me3 in WT cells were significantly more likely to gain DNA methylation than the genome aver age.Similar ly, the regions that have lost H327me3 in the Bend3 KO cells were significantly more h ypermeth ylated (Figure 7 D).Further examples are gi v en in Supplementary Figure S7D-E.
Altogether these data lead us to propose an integrated model for the local and global effects on BEND3 on chromatin and gene expression (Figure 7 E).

DISCUSSION
In this work, we set out to identify new regulators of Dazl using a CRISPR screen in mouse ES cells.Our primary screen identified a shortlist of 12 candidates, of which 4 were known regulators of Dazl .A secondary screen then led us to focus on one particular hit, BEND3.We combined ES cell genetics with various genomic approaches to show that BEND3 regulates not only Dazl , but hundreds of bivalent promoters.
A recent article by Rui-Ming Xu, Bing Zhu and their cow ork ers, also identified BEND3 as a key regulator of bivalent genes ( 13 ).Zhang et al. reached these conclusions by a w holl y different a pproach, w hich was the generation of Bend3-/ -mice and the derivation of ES cells from these mice.We note that, although our conclusions are parallel, some of the specifics ar e differ ent: for instance, only about 10% of the differ entially expr essed genes in Bend3 mutant cells relati v e to WT are the same between our study and Zhang et al .Technical reasons (including genetic background and culture conditions) likely play a role, but a more fundamental difference also exists: Zhang et al. derived their ES cells from Bend3 -/ -embry os, theref ore BEND3 was ne v present in the ES cells.In contrast, we deleted Bend3 in ES cells that previously expressed the gene.Therefor e, the r esults of Zhang et al. could not discriminate whether BEND3 is involved in bivalent domain establishment, maintenance, or both.Our results establish unequivocally that BEND3 is r equir ed for bivalent genes to maintain their poised state, yet they do not rule out an additional role in establishing these domains.
Our ChIP-seq data show that a major determinant of BEND3 distribution on genomic DNA is the presence of consensus sites, with the sequence CCCACGCG.About 70% of all BEND3 peaks contain one or more such sites, and 20% of all BEND3 peaks contain three or more.Our findings are fully consistent with two recent papers showing that the BEN domain 4 of BEND3 recognizes the sequence CCACGC ( 13), but they differ from another study in which BEND3 was proposed to bind G-quadruplexes ( 10 ).The fact that only 10% of all bivalent promoters are regulated by BEND3 is likely directly linked to the presence or absence of the BEND3 consensus sites, but other contributing factors cannot be ruled out.The consensus BEND3 binding motif contains two CpG sites that can potentially be subjected to cytosine DNA methylation.Our data show that bound motifs are unmethylated, while methylated motifs are not bound, similarly to the binding behavior of BANP, a protein related to BEND3 ( 38 ).Structurally, CpG methylation has been shown to impair BEND3 binding by blocking an arginine-to-cytosine contact ( 13 , 36 ), which is consistent with our ChIP-seq data.
DNA methylation, ther efor e, influences BEND3 binding and is conversely regulated by BEND3.Indeed, a striking finding we report is that mouse ES cells lacking BEND3 have decreased DNA h ydroxymeth ylation, combined with increased DNA methylation, suggesting that TET activity is lower in the absence of BEND3.In agreement with this possibility, while the DNA h ypermeth ylation of Bend3 KO cells is global across the genome, it is especially marked at certain regions such as enhancers and CpG shelves, which are normally protected against de novo methylation by the TET enzymes (39)(40)(41)(42)(43)(44).We have ruled out a direct transcriptional downregulation of TET1, TET2, or TET3 as the cause for decreased TET acti vity.Howe v er, the TET enzymes are critically dependent on alpha-ketoglutarate (alpha-KG) for catalysis, and the promoters of the two key genes for alpha-KG production, Idh1 and Idh2 , are bound and activ ated b y BEND3, ther efor e we speculate that lower le v els of alpha-KG contribute to decreased TET activity in the mutant cells.
Many genes are differentially expressed upon BEND3 removal, e v en though they are not directly bound.It is likely that some of these indirect targets become r epr essed

Figure 1 .
Figure 1.Principle of the reactivation screen, generation and validation of the DASH reporter cell line.( A ) Mouse ES cells can be grown in 2i medium or in serum.A number of genes acquire DNA methylation (black lollipops) and repressi v e histone marks (large gray circle) in serum, which inhibits their expression.Performing a CRISPR loss-of-function screen in serum should lead to the identification of r epr essors.( B ) The Dazl gene promoter contains a methylated CpG island, the activating histone mark H3K4me3 and the r epr essi v e histone mar k H3K27me3: it is bivalent.( C ) DNA methylation and the ncPRC1.6complex are already known to repress Dazl expression in serum.We searched for new repressors.( D ) We built the DASH reporter cell line, in which two positi v ely selectab le mar kers (mScarlet and HygroR) are inserted into e xon 6 of Dazl.The other allele of Dazl in DASH cells is WT. ( E ) CRISPR KO of Uhrf1 in DASH cells is confirmed by western blotting.( F ) LUMA confirms that Uhrf1 KO in DASH cells leads to a global loss of DNA methylation.( G ) Uhrf1 KO in DASH cells r esults in mScarlet r eactivation.( H ) Uhrf1 KO in DASH cells r esults in Hygromycin r esistance; surviving cells stained with crystal violet.

Figure 2 .
Figure 2. Primary screening approach and results.( A ) Timeline of the genetic screen .( B ) Results of the MAGeCK analysis.The top 12 hits are shown, color-coded according to class.The hits in blue wer e alr eady known and were expected in the list.The hits in green influence cellular pluripotency.Pink, red and yellow: other hits.

4 Figure 3 .
Figure 3. BEND3 r epr esses Dazl expr ession.( A ) Procedur e for validation of Bend3 as a bona fide hit .( B ) FACS analysis on serum-grown cells.( C ) RT-qPCR on serum-grown cells.( D ) Western blotting on serum-grown cells.( E ) FACS analysis on serum-grown Bend3 mutant cells, rescued with empty vector or with a Bend3 -expressing vector.( F ) Survival assay on rescued cells.Only one of the three clones is shown, for clarity.( G ) Western blotting shows expression of the V5-tagged BEND3 protein in rescue cells, and concomitant repression of DAZL protein expression.The three independent clones are shown.

Figure 4 .
Figure 4. BEND3 regulates the transcription of bivalent promoters.( A ) Principal Component Analysis on RNA-seq data from the indicated cells (three biological replicates for each condition) .( B ) Volcano plot showing genes that ar e downr egulated (blue) or upregulated (red) in the Bend3 KO cells.( C ) Venn diagrams showing the overlap between genes that are differ entially expr essed (DE, either up or down-r egulated), upr egulated, or downregulated, betweenBend3 KO and WT, and genes that have a bivalent promoter( 35 ).P -value: hypergeometric tests.( D ) Proportion of genes with the indicated promoter marks within up-or down-regulated genes.( E ) RT-qPCR following reintroduction of V5-BEND3 in Bend3 KO cells.Genes that wer e downr egulated upon KO (left panel) regain expression upon rescue, while genes that were upregulated in the KO become repressed upon rescue (right panel).P -value: Student's t -test, corrected by two-stage Benjamini, Krieger, & Yekutieli FDR procedure.* P < 0.05; ** P < 0.01.

Figure 5 .
Figure 5. BEND3 binds bivalent promoters via a CpG-containing consensus site.( A ) Number and distribution of BEND3 peaks detected by ChIPseq.( B ) Top motif identified in a de novo motif search on all BEND3 peaks.( C ) Almost all promoters bound by BEND3 contain a CpG island.Pvalue: hypergeometric tests.( D ) Intersection between promoters bound by BEND3, and genes Differentially Expressed (DE) in the Bend3 KO cells.( E ) Characteristics of the genes directly bound and regulated by BEND3.( F ) Dazl is directly bound and r epr essed by BEND3.( G ) Illustration of two bivalent genes directly bound and activated by BEND3: Col4a1 and Col4a2 .

Figure
Figure Cells lacking BEND3 have increased DNA methylation and decreased DNA h ydroxymeth yla tion.( A ) Evalua tion by a restriction enzyme-based method (LUMA) of the global le v el of DNA methylation in the indicated cellular contexts.Bend3 KO cells are h ypermeth ylated, and the expression of a BEND3 rescue construct brings DNA methylation back to WT le v els.P -v alue: ANOVA followed b y Tukey's post-hoc tests.** P < 0.01; *** P < 0.001; **** P < 0.0001.( B ) Quantitation of 5-mC abundance in the cells by LC-MS / MS.P -value: ANOVA followed by Dunnett's post-hoc tests.* P < 0.05; **** P < 0.0001.( C ) Distribution of methylation le v els on 250-bp tiles containing 5 or more CpGs ( n = 2.1 million).The three individual Bend3 KO clones are shown on the left, together with WT cells.( D ) Dif ferential methyla tion analysis on 250-bp tiles containing fiv e or more CpGs ( n = 2.1 million).( E ) DNA methylation values for the indicated genomic elements in WT cells grown in 2i (pink) or serum (orange), and Bend3 KO cells grown in serum (blue) ( F , G ) Percentage of statistically significant h ypermeth ylated tiles in the indicated genomic compartments, when comparing Bend3 KO to WT cells.The red dashed line shows the value in intergenic regions, for comparison.See Methods for details. ( H ) Venn diagram showing the overlap between genes that ar e downr egulated in Bend3 KO and genes whose promoter gains DNA methylation in Bend3 KO and compared to WT. P -value: hypergeometric tests.( I ) Quantitation of 5-hmC abundance in the cells by LC-MS / MS. p-value: ANOVA followed by Dunnett's post-hoc tests.** P < 0.01; **** P < 0.0001.( J ) GSEA analysis on the RNA-seq data (Bend3 KO vs. WT).( K ) RNA-seq data for the genes of 'Oxoglutarate metabolic process' GO term.Red: higher e xpression, b lue: lower e xpr ession.The genes with promoters dir ectly bound by BEND3 ar e indicated with arrows.