Dual functions of TET1 in germ layer lineage bifurcation distinguished by genomic context and dependence on 5-methylcytosine oxidation

Abstract Gastrulation begins when the epiblast forms the primitive streak or becomes definitive ectoderm. During this lineage bifurcation, the DNA dioxygenase TET1 has bipartite functions in transcriptional activation and repression, but the mechanisms remain unclear. By converting mouse embryonic stem cells (ESCs) into neuroprogenitors, we defined how Tet1–/– cells switch from neuroectoderm fate to form mesoderm and endoderm. We identified the Wnt repressor Tcf7l1 as a TET1 target that suppresses Wnt/β-catenin and Nodal signalling. ESCs expressing catalytic dead TET1 retain neural potential but activate Nodal and subsequently Wnt/β-catenin pathways to generate also mesoderm and endoderm. At CpG-poor distal enhancers, TET1 maintains accessible chromatin at neuroectodermal loci independently of DNA demethylation. At CpG-rich promoters, DNA demethylation by TET1 affects the expression of bivalent genes. In ESCs, a non-catalytic TET1 cooperation with Polycomb represses primitive streak genes; post-lineage priming, the interaction becomes antagonistic at neuronal genes, when TET1’s catalytic activity is further involved by repressing Wnt signalling. The convergence of repressive DNA and histone methylation does not inhibit neural induction in Tet1-deficient cells, but some DNA hypermethylated loci persist at genes with brain-specific functions. Our results reveal versatile switching of non-catalytic and catalytic TET1 activities based on genomic context, lineage and developmental stage.


INTRODUCTION
The study of mammalian cell fate acquisition in the early stages of embryonic development provides fundamental insights into the processes governing cellular plasticity. Recent single-cell atlases have comprehensively charted embryonic transcriptional changes to infer lineage segregation dynamics during mouse gastrulation at high resolution (1, 2). In the gestational period between E6.5 and E7.5, the mouse epiblast differentiates dynamically and largely in a biphasic manner, either entering primitive streak fate, which subsequently gives rise to endoderm and mesoderm lineages, or continuing development to become definitive ectoderm. Whether this rapid lineage bifurcation process is regulated by stochastic and/or deterministic factors has been difficult to resolve. Classical models based on mouse development postulated the extra-embryonic ectoderm as the origin of a bone morphogenetic protein (BMP)-Wingless/Integrated (Wnt)-Nodal growth signalling cascade that drives epiblast ingression into the primitive streak, suggesting that the developmental cues are cell-extrinsic (3). However, recent insights into the morphological events of primate and human gastrulation, together with in vitro differentiation models using embryonic stem cells (ESCs), suggest that germ layer lineage segregation can occur independently of extra-embryonic cells (4-6). A cell-intrinsic mechanism for controlling early cell fate choice may involve dynamic changes in the chromatin landscape (the epigenome comprising DNA methylation, histone modifications and non-coding RNAs) as the epiblast differentiates into a 'formative' pluripotent state, which confers developmental competence to respond to developmental signals at the onset of gastrulation (7, 8). The extent to which DNA and histone methylation changes affect the earliest germ layer lineage bifurcation decision remains unclear (9, 10).
Ten-Eleven-Translocation (TET) proteins are Fe2+- and α-ketoglutarate-dependent dioxygenases that actively remove DNA methylation by reiterative oxidation of 5-methylcytosine (5mC) at CpGs to 5-hydroxymethylcytosine (5hmC) and further oxidized products (11-13). We previously demonstrated that TET1 is the dominant TET protein expressed between E5.5 and E6.5 in the pre-streak stage mouse embryo, when it regulates germ layer lineage bifurcation by promoting definitive ectoderm and repressing primitive streak fate entry (14-16). Others have also reported that loss of TET1 in mouse ESCs leads to both repression and activation of target genes (17-19). In mouse ESCs, TET1 and its product 5hmC are found associated not only with actively transcribed genes, but also with bivalent promoters marked by both active histone H3 lysine 4 trimethylation (H3K4me3) and repressive H3K27me3 chromatin states that keep most developmental lineage genes at low basal expression (18, 20). Moreover, TET1 has been shown to co-occupy or interact with Sin3a/histone deacetylase (HDAC) and the Polycomb Group Repressive Complex 2 (PRC2) components Ezh2 and Suz12 (17-21). Recently, more attention has been paid to dissecting the catalytic versus non-catalytic functions of TET1 in mouse ESCs. The interaction between TET1 and PRC2 has been validated to regulate embryonic development independently of TET1's catalytic activity (22). Moreover, TET1 represses endogenous retroviruses independently of DNA demethylation (23). However, those studies have limited their analyses to naive ESC cultures. It remains unresolved how TET1's dual function in transcriptional activation and repression is coordinated during lineage priming along bifurcating trajectories.
In this study, we used complementary bulk and single-cell sequencing methods to chart temporal changes in gene expression, chromatin states and DNA methylation caused by Tet1 loss of function in an in vitro differentiation model of neural induction. Our analysis unravelled a cell-intrinsic mechanism by which TET1 represses premature primitive streak-inducing developmental signalling and dissected the nuanced contributions of TET1's catalytic and non-catalytic activities in germ layer lineage bifurcation.
For this study we used two littermate pairs of mouse blastocyst-derived Tet1 knockout (KO) and wild-type (WT) (B6 × 129S6)F1-Tet1tm1Koh male ESC lines (16), and one blastocyst-derived Tcf7l1 KO line and R1 ESCs as a control with a similar genetic background (24). The genetic backgrounds of the (B6 × 129S6)F1 ESC lines have been characterized using a 384 SNP panel (Charles River lab), of which 196 SNPs are informative in discriminating B6 from 129S6 (also known as 129SvEvTac) allelic variants, as follows: WT7, 189 SNPs called, 60

Creation of Tet1 mutant cell lines
ESC WT7 was used as a parental line for generating a mutated version of Tet1 (MUT) that is catalytically dead. We generated three Tet1 MUT and three isogenic Tet1 mock-transfected clonal lines using Cas9 homology-directed repair (HDR) as described in a previous study (25). Briefly, a single guide RNA (sgRNA) targeting the catalytic domain of TET1 in exon 11 was combined with Alt-R® S.p. HiFi Cas9 Nuclease V3 (IDT, 1081061) to form a Cas9 ribonucleoprotein (RNP) complex according to the manufacturer's instructions. Cells were transfected with the assembled RNP complex (1.7 µM Cas9 nuclease, 2 µM sgRNA), 4 µM electroporation enhancer (IDT, 1075915) and 4.2 µM single-stranded donor oligonucleotide (ssODN), using a mouse nucleofector kit (Lonza, VAPH-1001) on an Amaxa Nucleofector 2b (Lonza). Mock control cells were transfected without the RNP complex. After nucleofection, cells were seeded on feeders in standard ESC medium with 20 µM Alt-R® HDR Enhancer (IDT, 1081072). After 12 h, the medium was replaced with medium without HDR enhancer. Colonies were picked 72 h after transfection and screened for the edit by PCR amplification of the locus (345 bp), followed by restriction enzyme digestion with HaeIII (Bioké/NEB, R0108S). Clones with biallelic HDR edits were chosen for further analysis. For confirmation of HDR editing, gel-purified PCR products and 8 subcloned PCR fragments per clonal line were Sanger sequenced. All oligonucleotides used are listed in Supplementary Table S1.
In brief, 7 × 10^5 HEK293T cells were seeded per well in 6-well plates and transfected the following day with 0.75 µg pCMV-dR8.91, 0.25 µg pCMV-VSV-G and 1 µg of the specific lentiviral expression constructs using FugeneHD (Promega, E2311) in Opti-MEM (Invitrogen, 31985070). One day after transfection, the culture medium was replaced with ESC medium supplemented with 0.01 ng/ml recombinant murine LIF (Peprotech, 250-02). The same day, ESCs were plated in 6-well plates on gelatin at a density of 10^5 per well in the same medium. Lentivirus-containing medium was collected from HEK293T cells 48 h and 72 h after transfection, filtered, and added in a 1:1 ratio with fresh medium to recipient ESCs. Two days after infection, ESCs were washed thoroughly with PBS, the medium was refreshed, and appropriate selection antibiotics were applied for 10 days. First, parental lines were infected with the pLVX-tet-On-Advanced construct and selected with 200 µg/ml G418 on neomycin-resistant inactivated MEFs. After expansion of the lines, the ESCs containing the rtTA construct were infected with the pLVX-Tcf7l1 construct and selected with 2 µg/ml puromycin on SNLP 76/7-4 feeders. After selection, colonies were picked and screened by RT-qPCR for Tcf7l1 induction after 48 h of treatment with 2 µg/ml DOX. Overexpression of Tcf7l1 after 48 h of 2 µg/ml DOX treatment in serum-cultured ESCs on feeders was confirmed using western blots.

Dot blotting
Genomic DNA (gDNA) was extracted using the PureGene genomic DNA extraction kit (Invitrogen, K182001) according to the manufacturer's instructions. 200-250 ng of gDNA was serially diluted two-fold in nuclease-free water, denatured in 0.4 M NaOH/10 mM EDTA at 95 °C for 10 min, neutralized in an equal volume of ice-cold 2 M ammonium acetate and kept on ice for 10 min. Samples were then spotted onto a Zeta-Probe nylon membrane (Bio-Rad, Cat #162-0165) with a 96-well Bio-Rad Bio-Dot apparatus. The spotted membrane was subsequently washed excessively with 0.4 M NaOH, air-dried for 5-10 min and UV cross-linked twice at 120 000 µJ/cm² on a UVP HL-2000 HybriLinker. The membrane was then blocked in 5% non-fat dry milk in Tris-buffered saline with 0.1% Tween-20 (TBS-T) for 1 h and incubated overnight at 4 °C with primary antibodies diluted in 5% non-fat dry milk in TBS-T. Antibodies used were: anti-5hmC (Active Motif, 39769; 1:10 000) and anti-dsDNA (Abcam, ab27156; 1:1000; which cross-reacts with both single- and double-stranded DNA). Subsequently, membranes were incubated with secondary antibodies diluted 1:5000 in 5% non-fat dry milk in TBS-T (HRP-conjugated anti-rabbit, Dako, P0217; or HRP-conjugated anti-mouse, Dako, P0447). The signal was detected by enhanced chemiluminescence using Clarity Western ECL substrate (Bio-Rad, 1705060), captured on light-sensitive film and developed using an AGFA Curix 60 Film Processor (Bio-Rad). On every membrane, a serial dilution of 0.1 ng DNA of 5hmC PCR product was spotted. To assess the quantities of 5hmC, signal intensities of 5hmC were measured using FIJI (ImageJ 1.53c) software and calibrated against the linear range of the standard curves.
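The calibration step can be illustrated with a minimal sketch, assuming a simple least-squares line through the linear range of the standard dilution series. The standard amounts and intensities below are hypothetical values chosen for illustration; they are not data from this study, and the function names are not part of FIJI.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(intensity, slope, intercept):
    """Convert a spot intensity into standard-equivalent ng via the fitted line."""
    return (intensity - intercept) / slope


# Hypothetical two-fold dilution series of the 5hmC standard (ng) and intensities.
standard_ng = [0.1, 0.05, 0.025, 0.0125]
standard_signal = [4000.0, 2000.0, 1000.0, 500.0]

slope, intercept = fit_line(standard_ng, standard_signal)
sample_equiv_ng = calibrate(1500.0, slope, intercept)
```

Only spots whose intensities fall inside the range spanned by the standards should be converted this way; values outside that range would be extrapolations.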

Immunofluorescence
Cells were cultured on sterilized coverslips or regular tissue culture plates. For staining, cells were fixed in 4% PFA solution for 10 min at room temperature and permeabilized with 0.5% Triton X-100 in PBS, followed by two washes with 0.2% Tween in PBS (PBS-T). Blocking was performed for 1 h at room temperature with blocking solution (10% normal donkey serum, 0.2% Tween-20, 2% fish gelatin in PBS), followed by overnight incubation in a moist chamber at 4 °C with primary antibodies (anti-T, Santa Cruz Biotechnology, sc17743, 1:200; anti-MEF2C, Abcam, ab211493, 1:500) diluted in antibody solution (1% normal donkey serum, 0.2% Tween-20, 0.2% fish gelatin in PBS). Corresponding secondary antibodies (1:500) diluted in antibody solution were incubated for 1 h at room temperature, followed by a wash with PBS-T, counterstaining of the nuclei with 0.1 µg/ml DAPI in PBS-T for 5 min, and a wash with PBS-T and PBS. Coverslips were mounted with Prolong Gold and imaged using a Nikon Eclipse Ti-2 microscope. Images were processed using FIJI (ImageJ v1.53c).

TOP-flash dual-luciferase assay
On day 1 of differentiation, cells were transiently transfected with a mix of Opti-MEM I Reduced-Serum Medium (Thermo Fisher, 31985047), 2.5 µg of plasmid DNA and 6 µl of TransIT-LT1 Transfection Reagent per reaction in a standard 6-well plate. The DNA was a mix of 2.25 µg TOP/FOP-Flash reporter and 0.25 µg pRL-TK (30). The Dual-Luciferase Reporter Assay system (Promega, E1910) was used for collection and luminescence measurement, according to the manufacturer's instructions. In short, on day 3 or 4 of differentiation the cells were washed with PBS, and 500 µl Passive Lysis Buffer (provided in the kit) was added and incubated for 15 min on an orbital shaker at room temperature. Cells were scraped and lysates were passed through a 26G needle to aid lysis and homogenization. 20 µl of each lysate was used to measure both firefly luciferase and Renilla luciferase activity in a white 96-well plate (Perkin Elmer, 6005500) using the Biotek Synergy HTX multi-mode reader with a 10 s measurement period for each sample per assay (firefly or Renilla). Every sample was measured in technical triplicates.
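The text does not state how the luminescence readings were combined. One common way to summarize a TOP/FOP-Flash experiment is to normalize firefly to Renilla signal per well and then take the TOP over FOP ratio. The sketch below assumes that convention; the function names and readings are hypothetical.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Mean firefly/Renilla ratio across technical replicates of one sample."""
    return mean(f / r for f, r in zip(firefly, renilla))


def top_fop_ratio(top_firefly, top_renilla, fop_firefly, fop_renilla):
    """Renilla-normalized TOP-Flash signal relative to the FOP-Flash control."""
    top = normalized_activity(top_firefly, top_renilla)
    fop = normalized_activity(fop_firefly, fop_renilla)
    return top / fop


# Hypothetical technical triplicates (arbitrary luminescence units).
ratio = top_fop_ratio(
    top_firefly=[1200.0, 1000.0, 800.0],
    top_renilla=[100.0, 100.0, 100.0],
    fop_firefly=[200.0, 200.0, 200.0],
    fop_renilla=[100.0, 100.0, 100.0],
)
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number, and dividing by FOP-Flash removes reporter activity that does not depend on TCF binding sites.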

Quantitative reverse transcription-polymerase chain reaction (RT-qPCR)
Total RNA was extracted using Trizol (Invitrogen, 15596018) or the RNeasy plus mini kit (Qiagen, 74136) according to the manufacturer's instructions. 200 ng-1 µg of RNA was used for reverse transcription reactions using the Superscript III cDNA synthesis kit (Thermo Fisher, 11752-050) followed by RNase treatment, according to the manufacturer's instructions. Quantitative real-time PCR reactions were set up in technical triplicates using cDNA diluted 1:2-1:10 (depending on RNA input), SYBR-green PCR master mix (Thermo Fisher, 11733-046) and 5 nM primers (Supplementary Table S1) on a 384-well ViiA7 real-time PCR system (Applied Biosystems). Expression levels of target genes were calculated according to the 2^(-ΔΔCt) method using QuantStudio Real-Time PCR software (v1.3). Gapdh was used as a reference gene for normalization, and fold induction was calculated relative to expression in control ESCs.
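As an illustration of the relative quantification described here, the following sketch assumes the standard ΔΔCt formulation: normalization to the reference gene (Gapdh) and expression relative to control ESCs. The Ct values are hypothetical and the function is not part of the QuantStudio software.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition.
    ddCt = dCt(sample) - dCt(control).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical mean Ct values: target gene vs. Gapdh, differentiated cells vs. control ESCs.
induction = fold_change(
    ct_target_sample=22.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
```

The formula assumes roughly 100% amplification efficiency for both primer pairs, so that one cycle corresponds to a two-fold difference in template.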

Embryo isolation
Tet1tm1Koh heterozygous males and females were naturally mated to generate embryos (15). E8.5 and E11.5 embryos were collected from timed-pregnant females. The morning on which a copulation plug was found was considered E0.5. Individual decidua were collected in a dish with cold PBS, and uterine tissue was removed with a fine pair of dissection scissors, followed by extraction of embryos using a pair of sharp Dumont #5 forceps. Embryonic brain or anterior neural tissues were dissected as previously described (15, 31). Dissected tissues were snap frozen in liquid nitrogen and stored at -80 °C for later processing. For E8.5 samples, the remainder of the embryo was used for genotyping, while for E11.5, the yolk sac was used. Primers for genotyping of both Tet1 and sex are listed in Supplementary Table S1.

Targeted amplicon bisulfite seq
Genomic DNA (gDNA) was extracted from at least 10^6 cells or E11.5 brains using the Purelink genomic DNA Mini Kit (Invitrogen, K182001), according to the manufacturer's instructions. The quality of the gDNA was assessed using Nanodrop, and the absence of RNA contamination was checked by running the samples on a 0.8% agarose gel stained with SYBR Safe. gDNA was extracted from E8.5 anterior neural (headfold) tissues (roughly 10^4 cells) by incubating the tissues with lysis buffer (10 mM Tris-HCl pH 8.3, 50 mM KCl, 2.5 mM MgCl2, 0.5% NP-40) and 0.4 mg/ml Proteinase K (Thermo Fisher Scientific, AM2546) for 1 h at 56 °C and subsequently for 30 min with 2 mg/ml RNase A (Qiagen, 19101) at 37 °C. After a 1× clean-up using AMPure XP beads (Beckman Coulter, A63881), gDNA was eluted in 20 µl (100-200 ng yield) and used entirely for bisulfite conversion. For other samples, 1.5 µg of gDNA was used for bisulfite conversion, using the EpiTect Fast DNA Bisulfite kit (Qiagen, 59824), and eluted in 15 µl of the elution buffer provided in the kit, according to the manufacturer's specifications. A separate 20 µl PCR reaction was used for each amplicon with 0.5 µl of bisulfite-converted gDNA, 300 nM each of a forward and a reverse primer containing P7 and P5 tails, and Platinum™ Taq DNA polymerase High Fidelity (Invitrogen, 11304-011) with the provided buffers. PCR cycling parameters are listed in Supplementary Table S1. Amplicons were loaded on a 1.5% agarose gel and individual bands were gel-extracted using the PureLink Quick Gel Extraction kit (Invitrogen, K210012). The concentration of each amplicon was measured using the Qubit™ dsDNA HS Assay kit (Invitrogen, Q32854) and diluted to 15 nM. The amplicons were pooled in equimolar amounts per sample and used for a secondary PCR reaction to generate libraries.
The quality of the pooled amplicons was assessed using a fragment analyzer (Agilent) and the Qubit™ dsDNA HS Assay kit (Invitrogen, Q32854). The amplicon pools were diluted to a maximum of 5 ng/µl and combined as follows: 9 µl DNA, 0.5 µl custom p7 primer (125 nM), 0.5 µl custom p5 primer (125 nM) and 1× (10 µl) Phusion® High Fidelity PCR Master Mix with HF buffer (New England Biolabs, M0531S). The following program was used in a thermocycler: 94 °C for 30 s; 15 cycles of 94 °C for 10 s, 51 °C for 30 s and 72 °C for 30 s; 72 °C for 1 min. The custom primers carry unique dual indexes to label the samples. The resulting library was purified with a 1× clean-up using AMPure XP beads following the manufacturer's protocol. The final quality of the libraries was analysed using a fragment analyzer (Agilent), and libraries were pooled in equimolar amounts. The concentration of the final pool was measured using qPCR (Kapa SYBR Fast, Roche, KK4600) and loaded on a NovaSeq for PE150 sequencing, with a minimum of 200 000 reads per amplicon and 350 000 on average.
Using Trim Galore! (v0.6.7), reads were trimmed based on quality (PHRED < 20), adapters were removed and only reads with a minimum length of 20 bp were kept. Using Bismark (v0.23.1), the trimmed reads were aligned to GENCODE mm10 (GRCm38.p6) with a maximal insert size of 500 bp, followed by methylation extraction. Only CpGs with a minimal coverage of 1000× were kept and plotted over the seven different assayed regions using a custom script in R (v4.0.3).
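The coverage filter can be sketched as follows. This is an illustration in Python rather than the authors' R script, and the input layout (methylated and unmethylated read counts keyed by CpG position) is an assumption about how the extracted calls might be held in memory.

```python
MIN_COVERAGE = 1000  # minimal per-CpG read coverage stated in the text


def methylation_levels(cpg_counts, min_coverage=MIN_COVERAGE):
    """Return percent methylation for CpGs that pass the coverage filter.

    cpg_counts maps (chromosome, position) to (methylated, unmethylated) read counts.
    """
    levels = {}
    for site, (methylated, unmethylated) in cpg_counts.items():
        total = methylated + unmethylated
        if total >= min_coverage:
            levels[site] = 100.0 * methylated / total
    return levels


# Hypothetical counts for three CpGs in one amplicon.
example = {
    ("chr1", 100): (1500, 500),  # 2000 reads, kept
    ("chr1", 120): (400, 400),   # 800 reads, dropped
    ("chr1", 140): (0, 1000),    # 1000 reads, kept
}
levels = methylation_levels(example)
```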

RNA-seq library preparation
Total RNA was extracted from cells using Trizol following the manufacturer's instructions. RNA sequencing (RNA-seq) libraries were prepared from 4 µg of total RNA using the KAPA stranded mRNA-seq kit (Roche, KK8421) according to the manufacturer's specifications. 100 nM KAPA single-index adapters (Roche, KK8700) were added to A-tailed cDNA, and libraries were amplified for 10 cycles. Finally, a 1× library clean-up was performed using Agencourt AMPure XP beads (Beckman Coulter, A63881). Library fragment size was assessed using an Agilent Bioanalyzer 2100 with the High Sensitivity DNA analysis kit (Agilent, 5067-4626), and concentration was determined using the Qubit™ dsDNA HS Assay kit (Invitrogen, Q32854). Each library was diluted to 4 nM and pooled for sequencing on an Illumina HiSeq 4000, aiming at 15-20 million SE50 reads per sample (19 million reads on average).

10xGenomics Single-cell (sc)RNA-seq library preparation
For single-cell RNA-seq library preparations, cells were washed with PBS and incubated for 5 min with TrypLE, which was subsequently quenched with advanced DMEM/F12. Single cells were washed 3× with 0.4% BSA in PBS. Cells were counted using the LUNA counter and an AO/PI fluorescent stain. Cells were loaded to target 5000 cells following the 10xGenomics protocol for the single cell 3' reagent kit v3.1 (single indexes). Libraries were quality-controlled and pooled according to instructions, and sequenced following the 10xGenomics guidelines at 28-8-0-91 on a NovaSeq 6000 instrument. First, a shallow sequencing run was performed for quality control of the libraries and to estimate the number of captured cells, followed by a deeper sequencing run to reach a mean of 20 000 reads per cell. In total we captured 54 708 high-quality cells after filtering, with on average 5471 cells per sample.

oxWGBS library preparation
Genomic DNA was extracted from at least 10^6 cells using the Purelink genomic DNA Mini Kit (Invitrogen, K182001), according to the manufacturer's instructions. The quality of the gDNA was assessed using Nanodrop, and the absence of RNA contamination was checked by running the samples on a 0.8% agarose gel stained with SYBR Safe. Per sample, 1.5 µg of gDNA in a total volume of 50 µl was sheared using a COVARIS M220 for 2 × 60 s at an intensity of 5 at 7 °C, followed by a 1.8× clean-up using AMPure XP beads (Beckman Coulter, A63881) to concentrate the fragmented DNA. Fragmentation was assessed using an Agilent Bioanalyzer 2100 with the High Sensitivity DNA analysis kit (Agilent, 5067-4626), and concentration was determined using the Qubit™ dsDNA HS Assay kit (Invitrogen, Q32854).
Libraries were made and (oxidative) bisulfite converted using the Ultralow Methyl-Seq kit (Tecan/Nugen), according to the manufacturer's instructions. For BS and oxBS, libraries were made in parallel starting from the same fragmented gDNA per sample. For each library, 300 ng of fragmented gDNA was used as input, and fragments were end repaired and ligated to single-indexed sequencing adapters provided in the kit, followed by final repair. Before oxidation and bisulfite conversion, libraries were purified and washed 3× with 80% acetonitrile to remove any residual ethanol, followed by incubation at 37 °C for 5 min with the denaturing buffer provided with the kit to denature the DNA. oxBS libraries were oxidized by addition of TrueMethyl oxidation solution, while BS libraries were mock treated with ultra-pure water, during an incubation at 40 °C for 10 min. For each library, the optimal number of amplification cycles was determined by qPCR: 1/6th of the library was added to amplification master mix containing SYBR Green and run on an Applied Biosystems StepOnePlus Real-Time PCR system for 30 cycles. Relative log-fluorescence versus amplification cycle was plotted to manually determine the appropriate number of amplification cycles, selected within the middle to late exponential phase of amplification. The BS libraries were amplified for between 7 and 12 cycles, while oxBS libraries were amplified for between 10 and 15 cycles. Following amplification, bisulfite-converted libraries were purified with a final 1× clean-up using AMPure XP beads. The quality of the libraries was assessed using an Agilent Bioanalyzer 2100 with the High Sensitivity DNA analysis kit (Agilent, 5067-4626), and concentration was determined using the Qubit™ dsDNA HS Assay kit (Invitrogen, Q32854).
The libraries were first sequenced at shallow depth to estimate library quality, bisulfite conversion efficiency and duplicate rate, followed by multiple deep sequencing PE150 runs to obtain an average coverage of 5× per library. For sequencing, the custom sequencing primer MetSeq Primer 1 was used, as per the manufacturer's specification. The bisulfite conversion rate was > 99% for each library.

ATAC-seq library preparation
We used the Omni-ATAC protocol to prepare ATAC-seq libraries as previously described (32, 33). Although viability of the cells was high (> 90-95%), we nonetheless treated them by adding 0.1 mg/ml DNase I (Worthington, LS002005) directly to the growth medium supplemented with 0.5 mM CaCl2 and 2.5 mM MgCl2, and incubating for 30 min at 37 °C. Cells were then washed with PBS and incubated for 5 min with TrypLE, which was subsequently quenched with medium. After two washes with ice-cold PBS, 50 000 cells were pelleted in a pre-cooled 1.5 ml tube by centrifugation for 5 min at 600 × g at 4 °C in a microcentrifuge. The supernatant was removed and 50 µl of ATAC-lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% NP40, 0.1% Tween-20, 0.01% digitonin) was added and incubated on ice for 3 min, followed by addition of 1 ml of ATAC-wash buffer (ATAC-lysis buffer without NP40 and digitonin), after which the tube was inverted 3 times. The nuclei were pelleted for 10 min at 600 rcf at 4 °C in a microcentrifuge. The supernatant was removed, and nuclei were resuspended in 50 µl transposition reaction buffer (5 µl H2O, 15 µl PBS, 0.5 µl 10% Tween-20, 0.5 µl 1% digitonin, 25 µl TD buffer, 2.5 µl Tagment DNA Enzyme 1) (Tagment DNA Enzyme and Buffer Small Kit, Illumina, 20034197) and incubated for 30 min at 37 °C at 1000 rpm in a thermomixer. The transposed DNA was cleaned up immediately following transposition using the Zymo DNA Clean & Concentrator-5 kit (Zymo, D4014) following the manufacturer's instructions.
The transposed DNA was amplified in one pre-amplification step (5 cycles) and a final amplification (5-7 cycles) in the same 50 µl reaction using 1× NEBNext High-Fidelity PCR master mix (Bioké/NEB, M0541S) and 125 nM each of Ad1 noMX primer and Ad2.index primer with the following program: 72 °C for 5 min, 98 °C for 30 s; 5 to 7 cycles of 98 °C for 10 s, 63 °C for 30 s and 72 °C for 1 min; 72 °C for 1 min. For each library, the optimal number of amplification cycles was determined by qPCR: 1/5th of the pre-amplified library was added to amplification master mix containing SYBR Green and run on an Applied Biosystems StepOnePlus Real-Time PCR system for an additional 20 cycles. Relative log-fluorescence versus amplification cycle was plotted to manually determine the appropriate number of amplification cycles. The number of cycles was selected to reach 1/3 of the final relative fluorescence unit value.
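The cycle-selection rule was applied manually from the plotted curves. As a hedged illustration of the same rule, the sketch below picks the first qPCR cycle at which the signal reaches one third of the final fluorescence value. The curve is hypothetical, and reading the signal as background-subtracted relative fluorescence units is an assumption.

```python
def additional_cycles(fluorescence, fraction_denominator=3):
    """Return the first cycle whose signal reaches 1/fraction_denominator of the final value.

    fluorescence[i] is the relative fluorescence measured after cycle i + 1.
    """
    threshold = fluorescence[-1] / fraction_denominator
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    raise ValueError("signal never reaches the threshold")


# Hypothetical 10-cycle amplification curve approaching a plateau near 90 RFU.
curve = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 50.0, 70.0, 85.0, 90.0]
extra = additional_cycles(curve)
```

Stopping well before the plateau keeps the library in the exponential phase, which limits PCR duplicates and size bias.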
Following amplification, the libraries were purified using the Zymo DNA Clean & Concentrator-5 kit (Zymo, D4014) followed by 0.55×-1.75× dual size selection using AMPure XP beads. The quality of the libraries was assessed using an Agilent Bioanalyzer 2100 with the High Sensitivity DNA analysis kit (Agilent, 5067-4626), and concentration was determined using the Qubit™ dsDNA HS Assay kit (Invitrogen, Q32854). The libraries were diluted to 4 nM and pooled for SE50 or PE50 sequencing on an Illumina HiSeq 4000, aiming for at least 25 million mapped and unique reads.

Cleavage Under Target & Release Using Nuclease (CUT&RUN) library preparation
CUT&RUN was performed using the CUTANA™ kit (Epicypher, 14-1048) according to the manufacturer's instructions with minor modifications (34, 35). Briefly, 5 × 10^5 cells were collected as a single-cell suspension and washed three times with 100 µl/sample wash buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine and 1 Roche complete tablet/50 ml). Cells were then bound to activated Concanavalin A beads and incubated for 10 min at RT, followed by incubation with 0.5 µg anti-H3K27me3 antibody (Invitrogen, MA5-11198) or IgG (Epicypher, 18-1401) diluted in antibody buffer on a nutator overnight at 4 °C. After three washes of magnet-bound beads with digitonin buffer, pAG/MNase was added and incubated for 10 min at RT, and the beads were washed and resuspended in digestion buffer for a 30 min incubation on ice. The reaction was stopped by addition of STOP buffer and incubated for 10 min at 37 °C. The supernatant containing enriched DNA was then extracted using the supplied DNA Clean-up Columns. Libraries were prepared using the NEBNext® Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, E7645L) according to Epicypher's instructions, and amplified by PCR using NEBNext Dual Index Primers (New England Biolabs, E6440S) with the following settings: 98 °C for 45 s; 14 cycles of 98 °C for 15 s and 60 °C for 10 s; 72 °C for 1 min. Library concentration was quantified using the Qubit HS dsDNA Quant kit (ThermoFisher, Q32851). Quality and size distribution were further assessed on an Agilent 2100 Bioanalyzer using a High Sensitivity DNA Analysis Kit (Agilent, 5067-4626). Libraries were diluted to 4 nM and pooled for PE50 sequencing on a NextSeq 500 or NextSeq 2000 for at least 5 million reads per sample.

10X scRNA-seq analysis
Reads were processed and aligned using Cell Ranger (v3.0.2) against mm10. The count matrix was imported into Seurat (v4.0.1) (40). Cells with 2000 to 8000 detected features and a mitochondrial content between 1 and 10% were retained as good-quality cells. The cells were normalized per sample using the NormalizeData function in Seurat, with the 'LogNormalize' method and a scaling factor of 1000. The samples were integrated using the 'fastMNN' method. The first 50 dimensions from the MNN reduction were used to compute the UMAP reduction and find clusters. These clusters were used in the 'FindAllMarkers' function to find genes that are differentially expressed between clusters. SingleR (v1.4.1) was used to annotate cell types using the mouse gastrulation dataset as reference (2, 42). pySCENIC (v0.11.2) was used to determine regulon activity per cell from the normalized count matrix (43). The Seurat function 'FindAllMarkers' with the 'wilcox' test was used to determine differential regulon activity.
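For readers less familiar with Seurat, the cell filter and the 'LogNormalize' transform can be written out explicitly. This is a plain-Python sketch of the arithmetic, not Seurat code: the thresholds come from the text, 'LogNormalize' divides each count by the cell's total, multiplies by the scale factor and applies log1p, and the function names and example counts are hypothetical.

```python
import math


def passes_qc(n_features, percent_mito):
    """Keep cells with 2000-8000 detected features and 1-10% mitochondrial reads."""
    return 2000 <= n_features <= 8000 and 1.0 <= percent_mito <= 10.0


def log_normalize(counts, scale_factor=1000):
    """Per-cell normalization: log1p(count / total_counts * scale_factor)."""
    total = sum(counts)
    return [math.log1p(c / total * scale_factor) for c in counts]


# Hypothetical cell with three genes.
normalized = log_normalize([0, 5, 5])
keep = passes_qc(n_features=3000, percent_mito=5.0)
```

Note that the scale factor of 1000 used here is lower than Seurat's default of 10 000, which shifts all non-zero normalized values downward without changing their rank order within a cell.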

oxWGBS analysis
Using Cutadapt (v3.7), reads were first trimmed based on quality (PHRED < 20). Custom adapters were removed in paired-end mode ('-a AGATCGGAAGAGC -A AAATCAAAAAAAC') and only reads with a minimum length of 15 bp were kept. Using Bismark (v0.23.1), the trimmed reads were aligned to GENCODE mm10 (GRCm38.p6) with a maximal insert size of 500 bp, followed by deduplication and methylation extraction. Using a custom script, the CpG counts were merged to one strand and converted into a file format that can be used by the methPipe software (44). For DMR analysis, methylation calls for replicates were merged. On average we detected 19 047 561 CpGs with a coverage > 5× in merged replicates. To find differentially methylated regions (DMRs), lowly methylated regions were first determined for each sample using the hmr program, which uses a hidden Markov model approach, and a differential methylation score for each CpG between samples was calculated using the methdiff program. Using the dmr program, hypomethylated regions and differential CpGs were combined to form DMRs with at least 2 differential CpGs per DMR.
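Merging CpG counts to one strand can be sketched as below. The custom script itself is not shown in the text, so this is only an illustration: it assumes each call is reported at the 1-based position of the cytosine on its own strand, so that the minus-strand cytosine of a CpG dyad sits one base after the plus-strand cytosine. The tuple layout is hypothetical.

```python
def merge_strands(calls):
    """Collapse per-strand CpG calls onto the plus-strand cytosine position.

    calls: iterable of (chromosome, position, strand, methylated_reads, total_reads).
    Returns {(chromosome, plus_strand_position): (methylated_reads, total_reads)}.
    """
    merged = {}
    for chrom, pos, strand, methylated, total in calls:
        # A minus-strand C pairs with the plus-strand C one base upstream.
        key = (chrom, pos if strand == "+" else pos - 1)
        prev_meth, prev_total = merged.get(key, (0, 0))
        merged[key] = (prev_meth + methylated, prev_total + total)
    return merged


# Hypothetical calls: one CpG dyad covered on both strands, one on the plus strand only.
calls = [
    ("chr1", 100, "+", 3, 4),
    ("chr1", 101, "-", 2, 3),
    ("chr1", 250, "+", 0, 6),
]
merged = merge_strands(calls)
```

Because CpG methylation is largely symmetric, pooling both strands roughly doubles the effective coverage per site at the modest sequencing depth used here.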
DMRs were annotated using the annotatr (v1.16.0) and ChIPseeker (v1.26.2) packages in R (v4.0.5) using UCSC gene features and CpG island (CpGi) locations. A DMR was considered to be a 'CpG island associated DMR' if it was within 4 kb (CpGi shore + shelf) of a CpGi. DMR subsets were made using bedtools intersect (v2.30.0). Genes were associated with DMRs using the rGREAT (v1.22.0) package. A DMR was considered to be associated with a bivalent domain (which contains both an H3K27me3 and an H3K4me3 peak in mouse ESCs) if it was within 3 kb. GO term enrichment was performed using clusterProfiler (v3.18.1). The 5hmC rate was calculated by subtracting the oxBS methylation rate (true 5mC) from the regular BS methylation rate (5mC + 5hmC) for CpGs with a coverage of at least 5× in both datasets, while for 5mC, CpGs with a 5× coverage in the oxBS dataset were used. Profile plots were made using DeepTools (v3.4.3).
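The subtraction used to estimate 5hmC can be made concrete with a short sketch. The formula and the 5× coverage requirement are from the text; the function name, tuple layout and read counts are hypothetical.

```python
def hmc_rate(bs, oxbs, min_coverage=5):
    """Estimate the 5hmC rate at one CpG as BS rate minus oxBS rate.

    bs and oxbs are (methylated_reads, total_reads) for the same CpG.
    BS reads report 5mC + 5hmC; oxBS reads report 5mC only.
    Returns None when either dataset has fewer than min_coverage reads.
    """
    bs_meth, bs_total = bs
    ox_meth, ox_total = oxbs
    if bs_total < min_coverage or ox_total < min_coverage:
        return None
    return bs_meth / bs_total - ox_meth / ox_total


# Hypothetical CpG: 80% methylated in BS, 60% in oxBS.
rate = hmc_rate(bs=(8, 10), oxbs=(6, 10))
low_cov = hmc_rate(bs=(3, 4), oxbs=(6, 10))
```

At low coverage the two rates are each noisy, so individual CpGs can yield small negative estimates; aggregating over regions, as in profile plots, reduces this effect.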

ATAC-seq analysis
ATAC-seq reads were analysed using the ENCODE ATAC-seq pipeline ( http://doi.org/10.5281/zenodo.156534 ) developed by Anshul Kundaje's laboratory, which performs quality and adapter trimming, alignment, deduplication, peak calling and quality control in a fully automated manner. The resulting peak files and alignment files were used by DiffBind (v3.0.15) to find differentially accessible regions (DARs) using settings specific for ATAC-seq: 'summits = 150' in peak definition and 'background = true' in read normalization. Transcription factor motif enrichment in DARs was determined using the HOMER (v2) package with default settings. Profile plots and heatmaps were made using DeepTools (v3.4.3) from bigwig files generated from merged replicate reads with the ATAC-seq pipeline.

ChIP-seq analysis
All public ChIP-seq datasets were analysed in the same way. Khoueiry (26, 46). ENCODE H3K27ac ChIP-seq datasets were downloaded directly from https://www.encodeproject.org/ as bigwig files without any further processing.
Gene promoters were defined by merging TSSs from all transcripts in ENSEMBL whose ± 1.5 kb windows overlapped (a total of 3 kb around one TSS), resulting in 74840 promoters, of which 17930 (24%) are associated with a CpGi, as determined using AnnotatR (v1.16.0). Based on overlap with H3K4me3 and H3K27me3 peaks from ESC ChIP-seq datasets from Cruz-Molina et al. (using the same parameters to call peaks) (26), we defined 3594 bivalent promoters, of which 3272 (91%) are associated with a CpGi. Of all CpGi promoters, 3272 are bivalent (18%).
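The promoter definition above amounts to an interval merge. The sketch below is a hedged illustration for a single chromosome; the function name `merge_promoters`, the treatment of touching windows as overlapping, and the toy coordinates are assumptions and not the authors' code.

```python
def merge_promoters(tss_positions, flank=1500):
    """Merge TSS +/- flank windows that overlap into single promoter intervals.

    tss_positions: TSS coordinates on one chromosome (hypothetical input).
    Returns a sorted list of (start, end) tuples.
    """
    windows = sorted((max(0, tss - flank), tss + flank) for tss in tss_positions)
    merged = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Window overlaps (or touches) the previous promoter: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


if __name__ == "__main__":
    # Two nearby alternative TSSs collapse into one promoter; the third stays separate.
    print(merge_promoters([10000, 11000, 20000]))
```

The reported proportions follow directly from the counts in the text: 17930/74840 ≈ 24%, 3272/3594 ≈ 91% and 3272/17930 ≈ 18%.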

Divergent lineage trajectories of Tet1+/+ and Tet1−/− cells over a differentiation time-course
As an in vitro model for neural induction, we used a well-established serum-free protocol to convert mouse ESCs into monolayer cultures of anterior neural progenitor cells (antNPCs) in 5 days (26, 27) (Figure 1 A). This 'reductionist' monolayer differentiation model was chosen (over cellular aggregation into embryoid bodies or gastruloids) to achieve better reproducibility and homogeneity of neural cell generation with anterior-dorsal identity, which can enhance the robustness of downstream epigenome analyses. We previously characterized the differentiation of 4 independent littermate pairs of Tet1+/+ (wild type, WT) and Tet1−/− (knockout, KO) ESC lines (all male) derived from (B6 × 129S6)F1-Tet1 tm1Koh mouse blastocysts under 'neurobasal differentiation' conditions, in which basic fibroblast growth factor (bFGF) is supplemented during the first  (16). In that previous study, we observed efficient induction of antNPC gene markers (Pax6, Sox1 and Lhx5) in WT cells on day 5 and significant loss of expression in Tet1 KO cells across all replicates (16). In this study, we examined 2 pairs of WT and Tet1 KO ESC lines (see Methods for strain characterization) further by collecting cells at daily intervals over the 5-day neurobasal differentiation time-course for RNA-sequencing (RNA-seq). Expression of Tet1 and Tet2 in WT cells diminished rapidly within the first 2 days (Figure 1 B, Supplementary Figure S1A, B). Tet3 expression increased marginally by day 5 but remained low, and 5hmC diminished to undetectable levels by the third day, suggesting a dearth of TET activity during the last 3 days of differentiation (Figure 1 Table S2).
Using a stringent threshold (FDR-adjusted P-value < 0.05 and |log2 fold change| > 1) to find biologically significant differentially expressed genes (DEGs) between Tet1 KO and WT cells on each day, we observed roughly equal numbers of up-regulated (1359) and down-regulated (1074) DEGs starting on day 3 (Figure 1 E). The majority of DEGs detected on days 4 (> 80%) and 5 (> 60%) were de novo, suggesting highly dynamic transcriptomic changes as differentiation progressed. By combining TET1-dependent DEGs with temporal DEGs defined over multiple time points (total of 4967 unique genes) and then performing k-means clustering, we defined seven clusters based on the composite DEGs (Figure 1 F). In addition to classic marker genes, we used several datasets, consisting of single-cell RNA-seq (scRNA-seq) obtained from gastrulating E6.5-E8.5 mouse embryos (2), bulk RNA-seq from E5.0, E5.5 and E6.0 mouse epiblasts (7), and mouse ESCs and antNPCs (26), to annotate each cluster (Figure 1 G, Supplementary Figure S1D). Clusters 5, 6 and 7, consisting of genes associated with the naive (Esrrb, Zfp42), formative (Pou2f2) and primed (Dnmt3b) pluripotency states respectively (8), were sequentially downregulated over the first 3 days with comparable dynamics in both Tet1 KO and WT cells (Figure 1 F, G). The transcriptome profiles positioned formative epiblast-like cells (EpiLCs, see Materials and Methods), an in vitro correlate of the E5.5-6.0 pre-streak stage epiblast (28, 29), and post-implantation epiblast-derived stem cells (EpiSCs), resembling the E7.5 anterior primitive streak (47), at day 1 (Supplementary Figure S1E) and day 2-3, respectively, in the differentiation trajectory.
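As an illustration of the DEG threshold used here, the following sketch filters a table of per-gene statistics. The input layout, the function name `call_degs` and the toy values are hypothetical; the statistics themselves would come from whatever differential expression tool was used upstream.

```python
def call_degs(results, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Split genes into up- and down-regulated DEGs (KO versus WT).

    results: dict mapping gene name to (log2_fold_change, fdr_adjusted_p).
    A gene is a DEG if FDR < fdr_cutoff and |log2 fold change| > lfc_cutoff.
    """
    up, down = [], []
    for gene, (log2fc, fdr) in results.items():
        if fdr < fdr_cutoff and abs(log2fc) > lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


if __name__ == "__main__":
    toy = {
        "geneA": (2.3, 0.001),   # up-regulated DEG
        "geneB": (-1.8, 0.01),   # down-regulated DEG
        "geneC": (0.4, 0.0001),  # significant but below the fold-change cutoff
        "geneD": (3.0, 0.2),     # large change but not significant
    }
    print(call_degs(toy))
```

Requiring both criteria keeps genes that change by more than two-fold in either direction and excludes small but statistically significant shifts.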
Cluster 1 (enriched for early ectoderm markers including Otx2, Ncam1, Zic2) and Cluster 3 (enriched for antNPC markers Pax6, Sox1, Lhx5) were activated sequentially on day 2 and day 4, respectively, in WT but induction was severely compromised in Tet1 KO cells. Conversely, cluster 2 (enriched for mesoderm/endoderm markers Nkx2-5, Gata4, Sox17, Pdgfra) and cluster 4 (primitive streak, e.g. Eomes, Mixl1) genes were lowly expressed in WT cells but ectopically induced in Tet1 KO cells between day 3 and 4 (Figure 1 F, G).
Further, we benchmarked our bulk gene expression time-course data against marker gene sets obtained from collated cell types and lineages in the reference mouse gastrulation scRNA-seq dataset (2) (Supplementary Table S3), confirming a transient upregulation of primitive streak genes on day 3, endoderm and mesoderm genes on day 4, and downregulation of ectoderm genes from day 2 in Tet1 KO relative to WT cells (Supplementary Figure S1F). These drastic differences in germ layer cell proportions between Tet1 KO and WT cells during in vitro differentiation were also recapitulated when we deconvoluted our bulk RNA-seq time-course data into different lineages based on the mouse gastrulation reference (2) (Supplementary Figure S1G, H).
To address definitively the extent of cellular heterogeneity in our differentiation cultures, we performed scRNA-seq of single cells collected on day 3 and day 5, which our bulk RNA-seq data suggest to be the time-points of gastrulation onset and completion, respectively. On day 3, Tet1 KO cells clustered together in one major cluster with WT cells, which  Figure S1I, J). By day 5, however, Tet1 KO cells aggregated as two major clusters expressing lineage markers for mesoderm (Twist1, Mef2c, Myl7) and endoderm (Foxa2, Spink1, Sox17); these were radically distinct from a single major cluster constituted by WT counterparts expressing ectoderm (Pax6, Sox1, Sox2) and neural (Tubb3, Onecut2) genes (Figure 1 H-J, Supplementary Figure S1I, J). A small cluster (< 1%) common to both genotypes was identified as visceral endoderm (Lama1, Sox17) (Supplementary Figure S1I, K). Otherwise, only < 4% of day 5 Tet1 KO cells were found in the ectoderm cluster and vice versa, < 3% of day 5 WT cells in the mesoderm and endoderm clusters, indicating a nearly complete lineage identity switch in the absence of TET1. The cluster identities were verified in a UMAP projection based on transcription factor (TF)-based gene-regulatory network-modules, also known as regulons, defined using pySCENIC (43). Here again, Tet1 KO cells at day 5 showed mainly mesoderm and endodermal transcriptional programs, whereas WT cells showed mainly an ectodermal program ( When we annotated our in vitro scRNA-seq datasets using the mouse-gastrulation reference, we confirmed that cells at day 5 scored as post-gastrulation E8.5 cell types (Figure 1 L, M, Supplementary Figure S1N). The ectoderm cluster (consisting of WT cells) scored mainly as 'neuroprogenitor' cell type, while the endoderm and mesoderm cluster (Tet1 KO cells) scored as expected 'endoderm' and 'mesoderm' (Figure 1 L).
Although day 3 WT and Tet1 KO clustered together as epiblast cells on UMAP plots, the more precise annotation using the gastrulation reference distinguished a more mature day 3 WT subpopulation as 'ectoderm', whereas a more mature day 3 Tet1 KO subpopulation scored as 'primitive streak' (Figure 1 L, M). In agreement, expression of core primitive streak genes (Eomes, T, Foxa2) was already elevated in a significant fraction of Tet1 KO cells in the Epiblast cluster by day 3 (Figure 1 N).
We note that pre-existing Tet2 expression in ESCs and low basal Tet3 expression in this differentiation model may compensate for Tet1 loss-of-function, since the complete absence of TET proteins in triple KO ESCs will severely compromise differentiation (9, 48). Nonetheless, our study reveals a dominant role of TET1 at the first lineage bifurcation event of germ layer segregation that determines either neuroectoderm or primitive streak fate.

Activation of Wnt/β-catenin signalling by de-repression of the Tcf7l1 gene-regulatory network in Tet1−/− cells
Core regulatory networks driving transcriptomic changes may be identified through an interrogation of TF binding motifs in promoters of DEGs using the TRRUST database (39). From this analysis of Tet1 KO versus WT cells at day 3, TFs associated with the top scoring motifs included Ctnnb1, which encodes β-catenin in canonical Wnt signalling, and Smad2, a key TGF-β/Nodal signalling transducer (Figure 2 A). Other factors identified were lineage-specific TFs, which were either lowly expressed (Hnf4a, Runx1), expressed later during differentiation (Snai1, Foxh1, Tcf4), or were not differentially expressed (Pou5f1, Rbpj, Sp1) (data not shown). In line with an activation of Wnt/β-catenin and Nodal signalling, Wnt positive regulators (Axin2, Sp5, Tcf7, Lgr4, Wnt9b) and Nodal signalling targets (Nodal, Tdgf1, Lefty1, Lefty2) were collectively activated in bulk Tet1 KO cells (Figure 2 B) and in single Tet1 KO epiblast cells (Figure 2 C) on day 3. By performing transient transfection of a Wnt reporter construct, we confirmed a 2-3-fold increase in Wnt signalling activity in Tet1 KO cells on day 3, which was completely suppressed by the Wnt inhibitor XAV939, but not by the Nodal/TGF-β receptor inhibitor SB431542 (Supplementary Figure S2A). Using phosphorylation of SMAD2 as an indicator of Nodal activity, we observed that both XAV939 and SB431542 effectively blocked ectopic Nodal activation in Tet1 KO cells on day 3, confirming that Wnt/β-catenin acts upstream of Nodal in the signalling cascade (Supplementary Figure S2B).
In our previous study using an embryoid body differentiation model, we have shown that treating Tet1 KO cells with Wnt and Nodal signalling inhibitors can rescue neuroectodermal fate (16). To assess cellular heterogeneity and anterior-posterior identity, we performed scRNA-seq in Tet1 KO and WT cells on day 5 after treatment with XAV and SB from day 2 (Figure 2 D). In this directed differentiation protocol, both Wnt and Nodal inhibition reverted Tet1 KO cells from mesoderm and endoderm identity towards neuroectoderm, insomuch that > 81% of treated  Figure S2A). Inhibiting bone morphogenetic protein (BMP) signalling by LDN193189 treatment from day 2 did not affect differentiation (Supplementary Figure S2H), as previously shown in murine cells (49), indicating that hyperactive Nodal signalling via phosphorylation of SMAD2 causes the loss of neuroectoderm in Tet1 KO, but not hyperactive BMP signalling via SMAD1/5/8.
To identify regulators of the Wnt pathway that are targets of TET1, we examined differential regulon activities between Tet1 KO versus WT specifically in the Epiblast cell cluster. We selected the top 40 up- and 40 down-regulated regulons, and then ordered them by the number of overlapping DEGs (Figure 2 H, showing the top 10). Among lineage TFs, we observed as expected up-regulation of classic primitive streak regulons (Foxa2, Nanog, T) and down-regulation of classic ectoderm regulons (Sox2, Sox1) in KO compared to WT (Supplementary Figure S2I). The majority of lineage TFs were differentially expressed later in differentiation (after day 3), drawing our attention to Tcf7l1, which was expressed early during differentiation, and showed differential expression and a significantly down-regulated regulon in the KO Epiblast single-cell cluster (Figure 2 H). Tcf7l1, also known as Tcf3 and a canonical direct Wnt repressor in mouse ESCs, has a role in the naive to primed pluripotency transition but is also required for lineage specification and mesoderm differentiation (50, 51). Consistent with a role in repressing Wnt signalling in WT cells, Tcf7l1 was downregulated early within days 1-3 in bulk Tet1 KO cells and expressed at lower levels in single epiblast cells on day 3 (Figure 2 I, Supplementary Figure S2J). Suggesting direct gene regulation, the promoter of Tcf7l1 was bound by TET1 in ESCs, in EpiLCs, and until day 2 in this neurobasal differentiation assay (Figure 2 J, Supplementary Figure S2K). To demonstrate that WT cells can overcome cell-intrinsic repression of the Wnt/β-catenin pathway, we treated WT cells with the canonical Wnt signalling activator CHIR99021 (CHIR) and observed induction of TOPflash activity on day 3 by up to +30-fold; similarly, Tet1 KO cells responded to CHIR with a further +5-fold elevation (+20-fold compared to WT without CHIR) in TOPflash activity (Supplementary Figure S2L).
The sensitivity of both WT and KO cells to exogenous Wnt activators suggests that a cell-intrinsic de-repression of signalling transduction contributes to the 2-3 fold elevated basal Wnt/β-catenin activity in KO cells on day 3, rather than ectopic hyperactivation.
Next, we asked whether loss of TCF7L1 in mouse ESCs will result in a lineage switch similar to that observed with the loss of TET1. We used established mouse blastocyst-derived Tcf7l1 KO cells (24) and subjected them to the neurobasal differentiation assay (with signalling inhibition). As reported previously (24), we observed a delay in exit from pluripotency and differentiation in Tcf7l1 KO cells, reflected by a persistence in expression of the naive pluripotency marker Zfp42 on day 1 and lack of expression of differentiation markers on day 3 as detected by quantitative RT-PCR (qPCR) (Figure 2 K, Supplementary Figure S2M). However, by day 5 primitive streak markers T and Foxa2 were induced in Tcf7l1 KO cells at similar levels as in Tet1 KO cells (Figure 2 K). Furthermore, we observed similar extents of mesoderm (Twist1, Myl7) and endoderm (Sox17, Spink1, Cpm) gene induction and neuroectoderm (Sox1, Lhx5, Pax6) loss in Tcf7l1 KO and Tet1 KO cells by day 5 of differentiation (Figure 2 L). These results suggest that a TET1-TCF7L1-WNT transcriptional axis may regulate germ layer lineage bifurcation.
To demonstrate whether modulating TCF7L1 levels can affect lineage switching in Tet1 KO cells, we generated doxycycline (DOX)-inducible Tcf7l1 over-expression (OE) ESC lines on both WT and Tet1 KO genotypes. Clonal replicate lines were selected based on inducibility of TCF7L1 protein expression in ESCs upon 48-hour DOX treatment (Figure 2 M, Supplementary Figure S2N). We tested addition of DOX on either day 1 or day 2 of neurobasal differentiation and analysed cells at day 3, to determine whether enhancing Tcf7l1 expression upon pluripotency exit can prevent ectopic primitive streak gene expression in Tet1 KO cells. Indeed, a 2-4 fold increase in Tcf7l1 expression by DOX treatment starting on either day 1 or day 2 significantly reduced expression of T and Foxa2 in Tet1 KO cells, whereby reduction was greater when DOX was added on day 1, while not affecting expression of primed markers Otx2 and Dnmt3b (Figure 2 N, Supplementary Figure S2O). In line KO15.7 the induction of TCF7L1 was more variable compared to other lines in differentiation (Supplementary Figure S2O, top line), which correlated with a variable rescuing effect observed in this line (Figure 2 N, outliers). Treatment with DOX at day 1 or day 2 of parental Tet1 KO lines without the TCF7L1 OE construct did not affect primitive streak gene induction (data not shown). Moreover, Tcf7l1 OE from day 1 significantly reduced the activity of the Wnt reporter in Tet1 KO cells (Figure 2 O), validating TCF7L1 as a canonical Wnt repressor regulated by TET1 to safeguard against precocious primitive streak fate entry.

Contribution of 5mC oxidation by TET1 in the repression of primitive streak fate
Since TET1 has an N-terminal domain with distinct regulatory functions from its C-terminal catalytic domain (52), we asked how lineage bifurcation choice is dependent on its catalytic or non-catalytic function. We used CRISPR/Cas9 and homology-directed repair to introduce H1620Y and D1622A substitutions within the Fe2+ chelating active site coding sequence in endogenous Tet1 to disrupt its 5mC oxidation (i.e. catalytic) function (11) (Figure 3 A). Three independent pairs of Tet1 catalytic mutant (MUT) and isogenic mock transfected (MOCK) ESC lines were generated in the (B6 × 129S6)F1 strain (Supplementary Figure S3A). We validated complete loss of 5hmC in Tet1 MUT lines upon in vitro conversion to EpiLCs, a state when Tet1 is expressed in the absence of Tet2 and Tet3, and during the neurobasal differentiation time course (Supplementary Figure S3B, C). All three Tet1 MUT ESC lines expressed full-length TET1 protein and down-regulated expression with similar kinetics as MOCK cells during antNPC differentiation (Supplementary Figure S3D-E). ChIP-qPCR analysis at lineage-specific enhancers and promoters also indicated similar binding affinities and dynamics between WT and MUT at the early stages of differentiation (Supplementary Figure S3F).
We and others previously identified the Lefty1 and Lefty2 loci, encoding the Nodal antagonist LEFTY, to be direct targets of DNA demethylation by TET1 in mouse ESCs (14, 15, 53). Sustaining Lefty expression prevents premature primitive streak differentiation (54). Here, we verified that a specific loss of TET1 catalytic activity was sufficient to induce an increased and sustained phosphorylation of SMAD2 in Tet1 MUT cells from day 3 onwards, although the hyperactivation was weaker than in Tet1 KO cells, suggesting that Nodal signalling is at least partially repressed by TET1's catalytic activity (Figure 3 B). The western blot analysis also indicated higher levels of active β-catenin protein at day 5 in Tet1 MUT and KO cells compared to WT controls (Figure 3 B); however, differential levels of active β-catenin protein were not detectable at day 3. By performing the Wnt reporter activity assay, we detected elevated Wnt/β-catenin activity in Tet1 KO cells on day 3 and day 4, but no significant increase in signalling activity in Tet1 MUT cells above background at the same time-points (Figure 3 C). (Limitations of transient transfection after 2 days of differentiation precluded a readout of the Wnt reporter on day 5.) Thus, Wnt/β-catenin signalling initiation upon TET1 dysfunction appears to require complete loss of TET1 protein. In this differentiation model, stimulating WT cells on day 2 exogenously with the Nodal agonist Activin A can activate Wnt/β-catenin, verifying a cross-talk between Nodal and Wnt/β-catenin signalling (Supplementary Figure S3G). These observations suggest that the specific loss of TET1 catalytic activity first triggers hyperactivation of Nodal/Smad signalling, which subsequently activates Wnt/β-catenin, resulting in a positive feedback loop of Wnt and Nodal pathways amplifying each other.
To understand transcriptomic differences between Tet1 MUT and KO cells during differentiation, we performed bulk RNA-seq on day 3 and day 5. In PCA plots, Tet1 MUT cells clustered together with WT cells and away from Tet1 KO cells, on both day 3 and day 5 (Supplementary Figure S3H). Using the same stringent criteria (FDR-adjusted P-value < 0.05 and |log2 fold change| > 1) to define DEGs on day 3 and day 5 between Tet1 MUT versus WT, we observed only about 50 DEGs on day 3, mostly downregulated in Tet1 MUT but with no functional enrichment. On day 5, genes up-regulated in Tet1 MUT were enriched in GO terms associated with mesoderm development, while downregulated genes showed no functional enrichment (Supplementary Figure S3I, J, Supplementary Table S2). These results are similar to those of a recent study that also examined Tet1 catalytic mutant ESCs and observed normal differentiation towards neuroectoderm, much like WT ESCs (22). Therefore, TET1's non-catalytic function is dominant in driving neural fate induction.
However, by performing qPCR, we could reliably detect expression of markers of all three lineages in Tet1 MUT cells on day 5 (ectoderm: Pax6, Sox1, Lhx5, Wnt1, En1; endoderm: Sox17, Foxa2, Spink1; mesoderm: Myl7, Mef2c, Tnnt2, Hand2, Twist1, Gata4) (Figure 3 D). Although primitive streak markers such as T were not detectable in Tet1 MUT cells on day 3, the mesoderm TF MEF2C was clearly detected in both Tet1 MUT and Tet1 KO cells by immunofluorescence on day 5 (Figure 3 E, F), reflecting a delayed activation of Wnt/β-catenin and Nodal/Smad signalling in Tet1 MUT cells during differentiation. These results suggest that Tet1 MUT cells may have gained trilineage potential, losing the restriction that directs WT cells unilaterally towards neuroectoderm and Tet1 KO cells towards primitive streak fate. To verify the lineage heterogeneity of Tet1 MUT cells on day 5, we performed scRNA-seq of Tet1 MUT and MOCK cells. MOCK cells aggregated as one major cluster overlapping fully with WT ectoderm and neuronal cells (expressing Pax6, Sox1, Sox2, Tubb3); only < 1% clustered together with endoderm (expressing Foxa2, Sox17) and mesoderm cells (expressing These results suggest that a bipartite mode of TET1 activities at an early branch point of germ layer lineage bifurcation involves (i) a non-catalytic regulation that initially represses Wnt/β-catenin signalling to promote epiblast transition into neuroectoderm, and (ii) a 5hmC-dependent catalytic regulation that is repressive of Nodal activation and subsequently Wnt signalling, preventing premature entry into primitive streak fate (Figure 3 L).

Preferential engagement of ectodermal enhancers by TET1
Since the distinct regulatory modes of TET1 appear associated with lineage-specific outcomes, we examined whether the genomic occupancy patterns of TET1 may identify lineage-specific determinants. We have previously reported a dynamic repatterning of TET1 genomic occupancy from mainly promoters in naive pluripotency to both promoters and poised enhancers in formative pluripotency (16). By further incorporating a published H3K27ac ChIP-seq dataset that defined active enhancers and promoters in  (46), we associated TET1-bound loci to genes classified by their individual germ layer tissue-specificity of expression (Supplementary Figure S4A). By this analysis, we observed that TET1 preferentially occupied ectodermal distal enhancers in formative pluripotency (mimicked in vitro by EpiLCs), prior to the expression of the associated genes at E7.5 (Figure 4 A, B). However, at gene proximal promoter regions uniquely linked to every lineage-specific enhancer, TET1 occupancy was equally distributed among the three lineages (Figure 4 A, right panel). Enhancers are often CpG poor, whereas 60-70% of mammalian protein-coding gene promoters are CpG-rich (55, 56) and feature CpG islands (defined as stretches of DNA on average 1000 bp long with an observed over expected CpG ratio ≥ 0.6) (57). Therefore, we speculated that these differential CpG densities between gene proximal and distal loci may be intricately linked with the dependence on TET1's 5mC oxidation activity.
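For readers unfamiliar with the observed over expected CpG ratio used in the CpG island definition above, a minimal sketch of the commonly used formula, (number of CpG × length) / (number of C × number of G), is given below. The function name `cpg_obs_exp` and the toy sequences are illustrative assumptions; the source does not state which implementation was used to call CpG islands.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Computed as (count of CG dinucleotides * sequence length) divided by
    (count of C * count of G). Returns 0.0 when the sequence has no C or no G.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    print(cpg_obs_exp("CGCGCGCG"))      # CpG-dense toy sequence
    print(cpg_obs_exp("CATGCATGCATG"))  # contains C and G but no CpG
```

A ratio at or above 0.6 over a sufficiently long window is the criterion quoted in the text; CpG-poor enhancers would fall well below it.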

TET1 demethylates CpGi shores and enhancers in primed pluripotency
Based on the kinetics of TET expression and genomic 5hmC content (Figure 1 Figure S4D), suggesting that they may be de novo sites of active DNA demethylation in the complete absence of TET1 protein.
As also described in a recent study by Chrysanthou et al., the catalytic dead full-length TET1 protein may be protecting these Tet1 KO-specific hypo DMRs from compensatory DNA demethylation by TET2 or TET3 (22).
Hyper DMRs in either Tet1 KO or MUT cells were enriched in CpG island (CpGi) shores (2-kb long regions flanking CpGi) (59) and gene promoters (Supplementary Figure S4E, F). To focus on DMRs that can be directly attributed to loss of TET1 catalytic function, we selected 3422 hyper DMRs common in KO and MUT cells (i.e. the overlap of hyper DMRs), which recapitulated a roughly equal distribution among CpGi shores and distal regions, for further analysis (Figure 4 D-F). These common hyper DMRs exhibited on average a 40% gain in CpG methylation levels compared to WT cells and a width spanning 200-500 bp (Supplementary Figure S4G, and data not shown). We divided these hyper DMRs into two classes: DMRs within 4 kb of a CpGi as 'CpGi-associated' (n = 1694), and those further away as 'distal' (n = 1728). While 61% of CpGi-associated DMRs were within 1 kb of a CpGi, our threshold distance of 4 kb was chosen to encompass also CpGi shores and CpGi shelves, defined to be within 0-2 kb and 2-4 kb, respectively, of a CpGi (59). Using publicly available H3K4me3, H3K27me3, H3K4me1, and H3K27ac ChIP-seq datasets profiled in mouse ESCs (26), we noted that CpGi-associated DMRs displayed hallmarks of promoters marked by active H3K4me3 and repressive H3K27me3 signatures, known as bivalent domains (60) Figure S4I). By this classification, 82% of all genes (total n = 1701) associated with CpGi-associated DMRs are within 10 kb of their transcription start sites (TSSs), while only 25% of all genes (total n = 1728) associated with CpGi-poor distal DMRs are within 10 kb (Supplementary Figure S4J, K).
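The two DMR classes can be reproduced in principle with a simple distance test. The sketch below is illustrative only: the function names, the single-chromosome interval representation and the inclusive 4 kb boundary are assumptions, and the authors performed this step with bedtools and R annotation packages rather than with this code.

```python
def interval_distance(a, b):
    """Gap in bp between two (start, end) intervals; 0 if they overlap."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return b_start - a_end
    if b_end < a_start:
        return a_start - b_end
    return 0


def classify_dmr(dmr, cpg_islands, max_dist=4000):
    """Label a DMR 'CpGi-associated' if any CpGi lies within max_dist bp.

    dmr: (start, end); cpg_islands: list of (start, end) on the same
    chromosome (hypothetical inputs). Otherwise the DMR is 'distal'.
    """
    nearest = min(
        (interval_distance(dmr, cpgi) for cpgi in cpg_islands), default=None
    )
    if nearest is not None and nearest <= max_dist:
        return "CpGi-associated"
    return "distal"


if __name__ == "__main__":
    islands = [(10000, 11000)]
    print(classify_dmr((11500, 11800), islands))  # within the shore
    print(classify_dmr((16000, 16300), islands))  # beyond the shelf
```

With the 4 kb cutoff, DMRs in CpGi shores (0-2 kb) and shelves (2-4 kb) both fall into the CpGi-associated class, which is the rationale stated in the text.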
When CpGi-associated DMRs were centred on the CpGis, we observed that CpGs within CpGi centres were constitutively unmethylated (< 10% methylation) in all three genotypes, as expected (Figure 4 G, left panel). The methylation gains in CpGi-associated hyper DMRs were found  6, Lhx5, Wnt1, En1, Sox1), endoderm (Sox17, Foxa2, Spink1), and mesoderm (Twist1, Hand2, Myl7, Mef2c, Tnnt2, Gata4)   More striking methylation differences between WT and Tet1-deficient cells were seen at the centre of distal DMRs (Figure 4 G, right panel). As evidence for TET-mediated DNA demethylation in both DMR classes, 5hmC was detectable only in WT cells at the regions corresponding with methylation gains in KO and MUT cells (Figure 4 G). Moreover, both DMR classes centred at TET1 binding peaks in EpiLCs (Figure 4 G, bottom panel). They are highly enriched in GO terms associated with pattern specification and embryonic morphogenesis across all germ layers, although distal DMRs appeared to be more specifically enriched for GO terms associated with central nervous system development (Figure 4 H). Different CpGi distance thresholds from 1-4 kb did not affect these results (Figure 4 H, heatmap).
Next, we asked whether hyper DMRs induced by loss of TET1 early at day 2 would persist to affect neuronal genes later in differentiation. For this, Tet1 KO and WT ESCs were converted to day 5 antNPCs in the presence of the Wnt inhibitor XAV added at day 2, a treatment that rescues the lineage identity switch in Tet1 KO and allows both WT and KO cells to form homogeneous antNPCs of predominantly E8.5 forebrain identity (Figure 2 E-G). Since 5hmC is undetectable by dot blot analysis in day 5 antNPCs (Figure 1 C), we focused on profiling true 5mC changes caused by loss of TET1 by oxWGBS.
We detected 17665 DMRs between Tet1 KO and WT antNPCs on day 5, of which > 98% are hyper DMRs (Figure 4 C). Similar to day 2 hyper DMRs, day 5 DMRs were 150-450 bp wide, showed 40% higher median methylation levels in Tet1 KO relative to WT, and were enriched at promoter proximal CpGi shores with an equivalent number also found at distal regions (Supplementary Figure S4E, F, H). Of all 17 279 hyper DMRs, 15 338 were de novo DMR sites that did not overlap with hyper DMRs on day 2 (Figure 4 D). They were distributed among promoter proximal CpGi shores (n = 6991) and distal enhancer sites (n = 8347) marked by H3K4me3 and H3K4me1, respectively, and displayed moderate levels of H3K27ac in ESCs to indicate that they were active regulatory sites prior to differentiation (Figure 4 E, F, Supplementary Figure S4L-N). Methylation differences were more apparent at distal sites, where further DNA demethylation (> 40% reduction in 5mC levels) occurred in WT but not in KO antNPCs on day 5 compared to day 2 (Figure 4 I, right). At CpGi-associated regions, de novo DMRs showed a 10-15% reduction in 5mC levels at CpGi flanking regions in WT antNPCs on day 5, but hypermethylation in KO at CpGi centres (Figure 4 I, left). Of note, TET1 protein was barely detectable in WT cells between day 3 and day 5 of differentiation (Supplementary Figure S1B) but was engaged at de novo DMRs (both CpGi-associated and distal) in EpiLCs (day 1) prior to neural fate induction (Figure 4 I, bottom panel), suggesting that the presence of TET1 at pre-gastrulation safeguards against ectopic DNA hypermethylation post-gastrulation. Day 5 de novo hyper DMRs were associated with genes that are enriched in GO terms associated with organogenesis including 'pattern specification process', 'axonogenesis', and 'forebrain development' (Figure 4 J).
Since many developmental genes are described to be located within long (> 5 kb) hypomethylated regions called DNA methylation valleys (DMVs) or canyons (61-63), we asked whether TET1 activity regulates these regions. Using the WT methylome, we defined 566 DMVs on day 2 of differentiation, and 698 DMVs on day 5 (+XAV treatment). In congruence with the profiles of CpGi-associated DMRs, DMVs were affected (10%) in both Tet1 KO and MUT cells on day 2 mainly at their boundaries where 5hmC was detectable in WT cells (Supplementary Figure S4O). On day 5, the increase in methylation was as much as 10-15% at both DMV boundaries and valleys (Supplementary Figure S4P). These results suggest a DMV-protective role for TET1.

Non-catalytic regulation of chromatin accessibility by TET1 at distal enhancers at gastrulation onset
For further insights into the impact of TET1's catalytic and non-catalytic activities on the chromatin state, we investigated the chromatin accessibility changes in WT, Tet1 MUT and KO cells during differentiation using the assay for transposase accessible chromatin-sequencing (ATAC-seq). On day 2, we did not find any significant differentially accessible regions (DARs) between Tet1 KO or MUT versus WT, suggesting that TET1-dependent DNA methylation changes at day 2 precede any opening and closing of chromatin that would drive lineage segregation (Supplementary Figure  accessible on day 2 in all three genotypes; interestingly, these regions still exhibited DNA hypermethylation in Tet1 MUT cells on day 2 despite sustaining an open state, illustrating a discordance between DNA methylation changes and chromatin accessibility (Figure 5 A, B). In contrast, regions that opened (open-in-KO) on day 3 were de novo accessible regions previously highly methylated in all genotypes on day 2 (Figure 5 A, B). Only close-in-KO DARs were bound by TET1 in EpiLCs (Figure 5 C); thus, the engagement of TET1 protein at these regions, independently of its catalytic function, is required to sustain an open chromatin state. On the other hand, DARs that opened were bound very weakly or not at all by TET1, suggesting that chromatin opening at these regions is an indirect effect of loss of TET1 (Figure 5 C).
Both open-in-KO and close-in-KO DARs were devoid of CpGi and CpG poor (Figure 5 C, Supplementary Figure S5C); open-in-KO DARs even showed near-zero CpG frequencies, consistent with their inability to engage TET1, whose CXXC domain confers an affinity for CpG-rich loci (64). Only < 25% of close-in-KO and < 10% of open-in-KO DARs can be annotated as promoters, while the vast majority were annotated as distal (Figure 5 D, E), marked by H3K4me1 instead of H3K4me3 (Supplementary Figure S5D). From these data, we classified the DARs as distal regulatory regions.
To determine how chromatin accessibility at DARs is regulated by lineage TFs, we performed motif enrichment analysis on DARs that were classified either as distal (close-in-KO d3, n = 625, 68%; open-in-KO d3, n = 902, 84%) or proximal (close-in-KO d3, n = 295, 32%; open-in-KO d3, n = 175, 16%). (A DAR was considered to be distal if the distance to TSS was > 3 kb.) To find biologically significant TFs, we filtered our list based on gene expression, including only TFs that were expressed at a minimal TPM of 5 at one point during antNPC differentiation in either Tet1 KO or WT cells (Supplementary Figure S5E, unfiltered list). Close-in-KO DARs were enriched for motifs belonging to ectoderm-specific TFs that harbour high mobility group (HMG) ( Sox2 , Sox3 ), homeobox ( Pou5f1 ) and TEA domains ( Tead1 ). In contrast, open-in-KO DARs were enriched for mesoderm- and endoderm-specific TFs with forkhead ( Foxa2 , Foxo3 ) and T-box protein motifs ( T , Eomes ) (Figure 5 F). Proximal regions were enriched for similar motifs as distal regions, although distal regions had a higher enrichment score correlating with their relatively higher abundance. Interestingly, the Wnt signalling factor Lef1 was highly enriched in opened DARs, but not in closed DARs (Figure 5 F). By utilizing H3K27ac ChIP-seq and ATAC-seq datasets profiled in E7.5 germ layer tissues ( 46 ), we verified that close-in-KO and open-in-KO DARs are respectively ectoderm- and mesoderm/endoderm-specific lineage enhancers (Figure 5 G, H, Supplementary Figure S5F).
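The two filtering rules stated above (distal if the distance to the TSS exceeds 3 kb; a TF is retained if it reaches a TPM of 5 at any point) can be illustrated with a minimal Python sketch. This is not the analysis pipeline used in the study: the function names, the toy distances and the toy TPM values (including the placeholder `ExampleTf`) are hypothetical, and treating "minimal TPM of 5" as an inclusive threshold is an assumption. Only the cut-offs and the DAR counts come from the text.

```python
from typing import Dict, List

DISTAL_CUTOFF_BP = 3_000  # from the text: distal if distance to TSS > 3 kb
MIN_TPM = 5.0             # from the text: minimal TPM of 5 (inclusive bound assumed)


def classify_dar(distance_to_tss_bp: int) -> str:
    """Label a DAR 'distal' if it lies more than 3 kb from the nearest TSS."""
    return "distal" if abs(distance_to_tss_bp) > DISTAL_CUTOFF_BP else "proximal"


def expressed_tfs(tpm_by_tf: Dict[str, List[float]]) -> List[str]:
    """Keep TFs whose TPM reaches MIN_TPM in at least one sample."""
    return sorted(tf for tf, tpms in tpm_by_tf.items() if max(tpms) >= MIN_TPM)


def percent(part: int, total: int) -> int:
    """Share of `part` in `total`, rounded to a whole percentage."""
    return round(100 * part / total)


# Hypothetical signed distances (bp) from DAR centre to nearest TSS.
toy_distances = {"dar_a": 12_500, "dar_b": -800, "dar_c": 3_000, "dar_d": -45_000}
labels = {name: classify_dar(d) for name, d in toy_distances.items()}

# Hypothetical TPM values across time points and genotypes.
toy_tpm = {"Sox2": [40.0, 55.2], "Foxa2": [0.1, 7.9], "ExampleTf": [0.0, 4.9]}
kept = expressed_tfs(toy_tpm)

# Distal shares recomputed from the DAR counts reported in the text.
close_distal_pct = percent(625, 625 + 295)
open_distal_pct = percent(902, 902 + 175)
```

Recomputing the distal shares from the reported counts reproduces the 68% and 84% given in the text, which is a quick consistency check on the classification.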
Since modulation of Wnt signalling can alter lineage choice, we asked whether the chromatin state of Tet1-dependent DARs would also be responsive to Wnt/β-catenin signalling activity. To answer this, we performed ATAC-seq on day 3, after a 24-hour treatment of either WT cells with the Wnt signalling activator CHIR or Tet1 KO cells with the Wnt inhibitor XAV. Simulating loss of TET1 but with more extensive chromatin accessibility changes, CHIR-treated WT cells diverged from WT and Tet1 MUT in the same direction as Tet1 KO cells on the PC1 axis, whereas XAV-treated Tet1 KO cells reverted to cluster together with WT and Tet1 MUT (Supplementary Figure S5G). We observed that the extent of chromatin accessibility changes at both close-in-KO and open-in-KO DARs can be fully mimicked by treatment with the Wnt agonist in WT cells, whereas all changes can be reverted to the WT state in KO cells by Wnt inhibition (Figure 5 I, Supplementary Figure S5H), suggesting that the DARs are target loci of Wnt/β-catenin signalling.
To further address whether aberrant methylation in post-gastrulation forebrain cells affects the chromatin state, we performed ATAC-seq of day 5 antNPCs treated with XAV from day 2. We detected 2341 DARs that showed loss of accessibility in Tet1 KO on day 5 (+XAV) compared to WT (loss-in-KO d5 + X), and 4754 regions that gained accessibility (gain-in-KO d5 + X). Loss-in-KO d5 + X DARs were de novo accessible regions, where accessibility was still very low on day 2 or day 3; in contrast, gain-in-KO d5 + X DARs were already accessible on day 2 (Figure 5 J). Interestingly, loss-in-KO d5 + X DARs, which showed high methylation levels in both WT and Tet1 KO on day 2, showed loss of methylation in WT cells but not in Tet1 KO on day 5, suggesting that the persistence of high DNA methylation levels at these loci on day 5 is associated with reduced chromatin opening in KO (Figure 5 J, K). Both types of DARs were classified as distal enhancers based on histone marks, CpG frequencies, and occupancy by TET1 in EpiLCs (Supplementary Figure S5I-K). Gain-in-KO d5 + X DARs were enriched in H3K27ac in E7.5 ectoderm tissue relative to mesoderm/endoderm ( 46 ), consistent with these being 'neural default' loci that are already accessible in the epiblast (Figure 5 L). Loss-in-KO d5 + X regions were not strongly enriched in any E7.5 tissues; however, by utilizing ENCODE tissue-specific H3K27ac ChIP-seq datasets, we found that these DARs were highly enriched for H3K27ac in more differentiated E10.5 fetal brain tissues (Figure 5 L, M) ( 31 , 46 ). The latter observations suggest that the absence of TET1 causes de novo accessible fetal brain enhancer loci to exhibit a persistently high DNA methylation status coupled with compromised chromatin accessibility post-gastrulation, prior to gene expression later in development.
These results suggest that a dominant role of TET1 in ESCs is to maintain open chromatin at ectoderm-specific distal enhancers until gastrulation onset (on day 3 of differentiation), which promotes 'neural default' differentiation and indirectly keeps mesoderm and endoderm enhancer loci closed from Wnt/β-catenin signals driving alternative fates. This early function at distal loci does not require its catalytic activity and can occur independently of DNA demethylation. However, the opening of chromatin post-gastrulation at fetal brain distal enhancers is coupled with DNA demethylation.

Functional interactions between TET1 and Polycomb at bivalent gene promoters
Several studies have described co-localization of 5hmC with histone bivalent domains, where H3K27me3 marks are mediated by PRC2, in mouse and human ESCs ( 18 , 20 , 21 , 65 , 66 ). However, the functional impact of 5hmC in the context of histone bivalency on gene expression remains unclear. Because CpG methylation can reduce the binding affinity of PRC2 components to nucleosomes ( 67 ), we asked whether loss of 5mC oxidation may disrupt Polycomb-mediated developmental gene repression. Interestingly, a significant portion (48%) of day 2 hyper DMRs overlapped with 'bivalent' gene promoters (Figure 6 A). (The expected rate would be approximately 18%, the percentage of all CpGi promoters that are bivalent in the mouse genome.) In contrast, only 6% of distal hyper DMRs overlapped. By identifying genes within 10 kb of a CpGi-associated hyper DMR (total n = 1259) that were DEGs of KO versus WT from day 3 onwards during neurobasal differentiation (without signalling inhibitors, which revealed the full extent of lineage segregation, Figure 1 D), we observed that approximately 66% of up-regulated DEGs (130 of 197) were genes with bivalent promoters in ESCs (Figure 6 B).
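To make the observed-versus-expected comparison above concrete, the following Python sketch computes the fold enrichment of bivalent promoters among hyper DMRs (48% observed against roughly 18% expected) and the share of up-regulated DEGs with bivalent promoters (130 of 197). The exact binomial tail is included only as an illustration of how such an enrichment could be tested: the text does not state which test, if any, was applied, and the counts passed to it (48 of 100) are hypothetical because the number of hyper DMRs is not given in this passage.

```python
from math import comb


def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio of the observed overlap fraction to the background fraction."""
    return observed_fraction / expected_fraction


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Fractions reported in the text.
fold = fold_enrichment(0.48, 0.18)

# Share of up-regulated DEGs with bivalent promoters (130 of 197).
bivalent_up_pct = 100 * 130 / 197

# Hypothetical counts chosen only to mirror 48% of a sample of 100 DMRs.
p_toy = binomial_upper_tail(48, 100, 0.18)
```

With these inputs the fold enrichment is about 2.7, and 130 of 197 corresponds to just under 66%.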
To determine the functional effect of DNA hypermethylation at bivalent domains on gene expression, we stratified all up- and down-regulated DEGs in Tet1 KO or MUT versus WT based on whether they were associated with a CpGi hyper DMR and/or a bivalent promoter (Figure 6 C), then compared the fold changes (FC) in expression of both Tet1 KO and MUT over WT cells on day 3 and day 5 of neurobasal differentiation. On day 3, we observed a significant activation among bivalent DEGs (with or without CpGi-associated hyper DMRs) already in Tet1 KO cells, although no significant gene expression changes were detectable in MUT cells. However, by day 5, bivalent DEGs with CpGi-associated hyper DMRs showed a significantly greater fold increase in gene expression compared to other DEG classes in both KO and MUT cells (average log 2 FC > 2 and 1, respectively) (Figure 6 C, top panels). Such positive correlations between DNA methylation gains at promoter-proximal CpGi shores and gene expression were evident at primitive streak bivalent genes such as T and Sox17 (Supplementary Figure S6A). On the other hand, bivalent DEGs with CpGi-associated hyper DMRs also showed a significantly greater fold decrease in expression at day 5, but only in KO cells since far fewer DEGs were down-regulated in MUT cells (Figure 6 C, bottom panels and S3J). Thus, a functional impact of TET1's catalytic activity on lineage segregation appears restricted to CpGi-associated promoter regions, where a synergy with its non-catalytic activity suppresses alternative non-neural fates.
To determine how loss of TET1 affects Polycomb function before and after lineage segregation, we performed Cleavage Under Targets and Release Using Nuclease (CUT&RUN) to profile genome-scale changes in H3K27me3 in WT, MUT and KO cells during the differentiation time-course. On the PCA, Tet1 KO cells clustered away from MUT and WT cells on day 0, while on day 2 and day 3 all three genotypes clustered together. However, both MUT and KO cells clustered separately from WT cells by day 5 without inhibitor treatment (Supplementary Figure S6B). These dynamics were reflected by the differential peak analysis. On day 0, we observed a loss of H3K27me3 in KO, but not in MUT cells, relative to WT; however, the loss in KO recovered by day 2, after which both KO and MUT made dramatic gains of H3K27me3 by day 5 (Figure 6 D). The majority of H3K27me3 gain-in-KO on day 5 overlapped with the gain-in-MUT regions, even though only > 15% of cells in MUT were mesoderm or endoderm compared to > 95% in KO, suggesting that the gain of H3K27me3 occurred independently of cell lineage identity (Figure 6 E, 3 J). Further, the majority (87%) of differential H3K27me3 regions lost in KO occurred on day 0, while most regions that gained H3K27me3 on day 2 and day 3 retained the gain relative to WT until day 5 (Supplementary Figure S6C, D). To assess whether the gain in H3K27me3 is dependent on the signalling modulation, we treated cells from day 2 onwards with XAV to obtain a homogeneous forebrain population in all three genotypes. Here, KO cells still gained H3K27me3 on day 5, despite the complete conversion to antNPCs. In contrast, MUT cells no longer gained H3K27me3 (Figure 6 D), suggesting that the post-gastrulation H3K27me3 gain resulting from loss of TET1 catalytic activity is dependent on Wnt/β-catenin (and downstream Nodal) signalling.
To understand the functional gene categories affected by the dynamic loss and gain of H3K27me3, we classified the differential regions into three groups (Figure 6 F, G): (i) regions that lost H3K27me3 in KO on day 0, and recovered by day 2 without subsequent gain (loss-in-KO, n = 792); (ii) regions that lost H3K27me3 in KO on day 0 and then subsequently gained the mark by day 5 (loss-gain-in-KO, n = 748); and (iii) regions that were not significantly differential on day 0 but gained H3K27me3 after (gain-in-KO, n = 1443). All three groups showed bivalent histone marks in ESCs ( 26 ), occupancy by TET1 in EpiLCs ( 15 )  To confirm that the gain of H3K27me3 observed is associated with increased PRC2 engagement, we performed ChIP-qPCR of SUZ12, a PRC2 core component, at 5 ectoderm- and 5 brain-specific genes that showed increased H3K27me3 in both Tet1 KO and MUT at day 5 (without signalling inhibition). This verified a strong increase of SUZ12 binding in KO compared to WT cells, and also a weaker but detectable increase in MUT cells (Figure 6 L). Altogether, these data suggest that a predominantly non-catalytic function of TET1 in ESCs facilitates co-operation with PRC2 to repress primitive streak genes until gastrulation onset, consistent with previous reports of TET1 recruiting PRC2 to these genes in ESCs independently of DNA demethylation ( 22 , 23 ). However, the cooperative interaction is rapidly superseded by an antagonistic interaction at neuroectodermal promoters post-lineage priming (after day 2). Though still predominantly an effect of non-catalytic TET1 activity, the repression of developmental signalling by TET1 catalytic function also contributes toward repelling Polycomb from neuronal targets.
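The three-group classification of differential H3K27me3 regions described above can be written as a simple decision rule. The Python sketch below is an illustration under stated assumptions, not the procedure used in the study: each region is reduced to a hypothetical day 0 call (`"loss"` or `"ns"` for not significant, KO versus WT) and a hypothetical flag for whether it gained H3K27me3 at a later time point. The group names and region counts are taken from the text; the region identifiers are invented.

```python
from collections import Counter


def classify_region(day0_call: str, gained_later: bool) -> str:
    """Assign a region to one of the three groups named in the text.

    day0_call: 'loss' or 'ns' (hypothetical encoding of the day 0 KO-vs-WT call).
    gained_later: True if H3K27me3 was gained in KO after day 0 (by day 5).
    """
    if day0_call == "loss":
        return "loss-gain-in-KO" if gained_later else "loss-in-KO"
    if day0_call == "ns" and gained_later:
        return "gain-in-KO"
    return "unclassified"


# Hypothetical regions: (day 0 call, gained later?)
toy_regions = {
    "region_1": ("loss", False),
    "region_2": ("loss", True),
    "region_3": ("ns", True),
    "region_4": ("ns", False),
}
groups = {name: classify_region(*calls) for name, calls in toy_regions.items()}
group_sizes = Counter(groups.values())

# Total number of classified regions reported in the text.
reported_total = 792 + 748 + 1443
```

Regions that are neither lost on day 0 nor gained later fall outside the three groups and are labelled `unclassified` here; that label is an addition of this sketch.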

Early developmental origin of disease-associated DMRs in
Treatment with the Wnt inhibitor XAV from day 2 onwards resulted in a homogeneous population of forebrain cells in both WT and Tet1 KO cells, despite the accumulation of many hyper DMRs over regulatory regions in KO, suggesting that DNA hypermethylation caused by loss of TET1 per se does not inhibit neural induction in the absence of alternative fate-inducing signalling cues. Further, many of the genes associated with the DMRs were lowly expressed in day 5 WT antNPCs (Supplementary Figure S7A), suggesting a latent effect of promoter DNA hypermethylation on gene expression. To address whether these genes would be expressed later during development, we used the ENCODE mouse development dataset, which contains bulk RNA-seq data of 16 different tissues at eight different time-points during mouse embryo development ranging from E10.5 to P0 ( 31 ). Of all 6193 genes associated with de novo hyper DMRs, we kept 4980 genes that had an expression level of minimally 10 TPM in at least one tissue at one time-point to select genes that are expressed and biologically relevant during development. By this analysis, we observed that expression of these genes increased specifically in the ectoderm tissues such as forebrain, midbrain, hindbrain, and neural tube after E15.5, but either did not change or decreased in endodermal and mesodermal tissues (Supplementary Figure S7B).
Previous work by the groups of Xu, Dawlaty and Jiang ( 71-73 ) demonstrated that adult Tet1 KO mice that survived embryonic development were impaired in hippocampal neurogenesis, cognition and memory function, synaptic plasticity, and social and maternal care behaviours. Because TET1 expression is detectable in the adult mouse brain, these phenotypes have been attributed to the loss of active DNA demethylation by TET1 at neuronal gene loci in postmitotic neurons in the adult brain, supporting an important role for TET1's catalytic function in neurological disorders. To determine whether aberrant DNA methylation at neuronal genes affected in adult Tet1 KO mice can already be detected during gestation, we selected 5 neuronal TET1 target genes ( Cspg4 , Npas4 , Oxtr , Gal and Ngb ) that showed DNA hypermethylation in the adult KO mouse brain in the aforementioned studies. The CpGi promoter regions of these five genes showed strong TET1 binding peaks in EpiLCs, low methylation in both WT and Tet1 KO cells at day 2 of differentiation, but a pronounced gain in 5mC to high levels in day 5 Tet1 KO antNPCs (XAV treated) relative to WT controls (Figure 7 A). Interestingly, these genes were not expressed during different stages of pluripotency, neural induction or NPCs (TPM < 1 in ESCs or antNPCs at day 5 + XAV, data not shown). In the ENCODE mouse development datasets, they were only expressed later in development in WT mice ( > E12.5) and in the post-natal mouse brain ( > P0) (Figure 7 B), in agreement with their suggested role in neurological functions later in life ( 31 , 71-73 ).
Using targeted bisulfite sequencing, we confirmed that these loci were hypermethylated in both Tet1 KO and MUT cells (Figure 7 Figure S7F), suggesting that the DMRs persist after gastrulation and are not erased by TET2 and TET3 later in development. These results support the notion that hyper DMRs affecting these brain-specific genes, which are associated with a neurodegenerative phenotype in adult mice, may originate as early as post-gastrulation as a result of catalytic dysfunction of TET1.

DISCUSSION
A dual function of TET1 in gene regulation has been previously described, by which 5mC oxidation activity at promoters and enhancers is thought to promote transcriptional activation while the repressive function is a result of 5mC oxidation-independent interactions with co-repressors ( 15 , 17-19 ).  enhancers is proposed to be preceded by DNA demethylation, although causal evidence in embryonic development is lacking ( 9 ). Here, our results provide fresh insights into the dual function by demonstrating (i) a dynamic switch from a cooperative TET1 interaction with PRC2 at primitive streak genes pre-gastrulation to an antagonistic effect during differentiation, when TET1's catalytic activity participates in repelling PRC2 from neuronal genes via the control of developmental signalling; (ii) an uncoupling between DNA demethylation and the maintenance of chromatin accessibility at neuroectoderm enhancers at gastrulation onset (day 3). Despite a preferential engagement by TET1 at primed neuroectodermal enhancers, 5mC oxidation at CpG-poor distal sites, or the lack thereof, has barely any impact on enhancer function. Rather, the engagement of full-length catalytically dead TET1 suffices to maintain chromatin accessibility to promote neuroectodermal fate entry. A functional impact of TET1's catalytic activity becomes more apparent post-lineage priming, when a synergy with its non-catalytic activity restricted to CpGi-associated promoter regions serves primarily to suppress alternative fates. We identified the Wnt repressor TCF7L1 as an early target of TET1, as well as longer-term effects of TET1's catalytic activity that sustains demethylation at several loci associated with adult-onset neurodegeneration phenotypes.
Our earlier work has alluded to a dominant role of non-catalytic TET1 function during peri-gastrulation development in the mouse ( 15 ). Recently, two studies were published on distinguishing the catalytic and non-catalytic functions of TET1 in mouse ESCs ( 22 , 23 ). Much in agreement with our findings, both studies showed that catalytic mutation of Tet1 alone did not recapitulate the transcriptomic phenotype observed in Tet1 KO ESCs, despite observing common hypermethylation signatures in both MUT and KO cells. Chrysanthou et al. showed that loss of TET1 resulted in dysregulation of bivalent genes by disrupting recruitment of EZH2 and SIN3A and H3K27me3 deposition, but the effects were not dependent on the catalytic activity of TET1 because the bivalent genes were not associated with DMRs. Stolz et al. focused on the non-catalytic role of TET1 in the repression of endogenous retroviral elements via the establishment of H3K9me3 and H4K20me3, showing also a decrease in H3K27me3 on a global scale in Tet1 KO but not in MUT ESCs.
However, we found a subset of hyper DMRs that occurred in both MUT and KO cells at CpGi shores, of which approximately 50% were in the proximity of bivalent domains and associated with dysregulated genes. Further, the loss of H3K27me3 in Tet1 KO ESCs was restored within 2 days of differentiation. Thereafter, H3K27me3 increased at ectodermal targets by day 5, independently of lineage-directing signalling modulation in KO, but by a Wnt-signalling-sensitive mechanism in MUT. These differences between our study and those prior can be readily attributed to the distinct developmental time points assessed. Both Chrysanthou et al. and Stolz et al. limited their analyses to mouse ESCs cultured under serum and LIF conditions, a naive metastable state that sustains Tet2 expression. We have performed oxWGBS at day 2 of differentiation in cells that have reached a primed state, when Tet1 is the only TET gene with detectable expression, and a time-course analysis of H3K27me3 occupancy throughout differentiation to a post-gastrulation stage. The developmental time-points are critical because we have observed highly dynamic changes in TET1 genome occupancy and 5mC/5hmC patterns when converting naive mouse ESCs into formative EpiLCs ( 15 , 16 ). Thus, we expect the analysis of DMRs in day 2 cultures (equivalent to a stage of lineage priming), together with the impact on differential enrichment of H3K27me3 until day 5 (post-gastrulation) in this study, to reveal de novo TET1-regulated sites that have a functionally relevant impact on developmental gene expression.
Histone bivalency, first observed at ESC promoters marked by both active H3K4me3 and repressive H3K27me3 marks, is a hallmark of stem and progenitor cells, but its exact function remains a matter of debate ( 60 ). It is widely accepted that bivalency 'primes' developmental genes for faster activation or silencing via resolution into monovalent states during lineage segregation. This concept also extends to distal regulatory elements, where the co-presence of H3K4me1 and H3K27me3 histone modifications at 'poised' enhancers marks their associated genes for activation later in development ( 26 ). However, a recent study proposed that bivalent domains may function instead to prevent the premature activation of genes during germ layer differentiation, thus having a 'reining' effect as opposed to 'priming' ( 74 ). In that study, loss of the 5mC-sensitive CpGi-binding protein BEND3 led to a reduction of H3K27me3 in ESCs and a premature activation of developmental genes later during differentiation. Here, our results appear consistent with an interaction of TET1 with a 'reining' function of Polycomb-mediated bivalency, at least during pluripotency and early differentiation. Intriguingly, bivalent hyper DMRs in Tet1 MUT cells are associated with up-regulated expression (predominantly mesoderm genes) by day 5, suggesting that active DNA demethylation by TET1 may also contribute to Polycomb repression of non-neural genes during neural induction. In contrast, both catalytic and non-catalytic TET1 activities contribute to preventing an ectopic gain of Polycomb repressive histone methylation at ectodermal genes post-gastrulation. In this functional antagonism between TET1 and Polycomb during neural induction, TET1 may well be regulating a 'priming' effect of histone bivalency.
How an ectopic H3K27me3 gain affects subsequent activation of gene targets, the presence of H3K4me3 and occupancy by Trithorax group proteins ( 75-77 ), as well as interactions with other epigenetic priming factors such as DPPA2 and DPPA4 ( 78 , 79 ), will be interesting questions to resolve to discern clearly between 'reining' versus 'priming' effects of TET1's interaction with histone bivalency, and the possibility of their dependence on genomic and lineage context. Nonetheless, both catalytic and non-catalytic regulation by TET1 are involved in the timely resolution of histone bivalency during lineage segregation, in agreement with the colocalization of TET1 and 5hmC with bivalent gene promoters in ESCs ( 18 , 20 , 21 , 65 , 66 , 80 ).
The subset of hyper DMRs associated with CpGi shores also reflects TET1 function at DMV boundaries ( 61-63 ). DMVs are associated with a majority of developmental genes and are enriched for the Polycomb-deposited H3K27me3. Interestingly, they can display a gain in DNA methylation upon gene activation in a similar fashion as we observed at bivalent genes associated with CpGi-associated hyper DMRs ( 61 ). The Polycomb complex has a strong affinity for unmethylated CpGs and may regulate hypomethylation of DMVs through recruitment of TET1 ( 20 , 67 , 81 ). Conversely, CpG methylation reduces the binding affinity of PRC2 components to nucleosomes and can therefore disrupt Polycomb function at bivalent domains ( 67 ). Indeed, it was recently revealed that QSER1 interacts with TET1 in DMVs to safeguard developmental programs from de novo methylation by DNMT3 ( 82 ). Although collective evidence, including this study, suggests that functional interactions between TET1 and PRC2 are predominantly independent of TET1 catalytic activity in ESCs ( 17 , 20 , 22 , 80 ), a contribution of 5mC oxidation by TET1 at bivalent domains in DMVs to stabilize PRC2 association is not mutually exclusive with non-catalytic interactions.
A direct effect of TET1 catalytic activity in repressing premature primitive streak gene expression reconciles with previous findings by us and others demonstrating the Nodal antagonist Lefty genes as direct targets of TET ( 14-16 , 53 ). Although Tet1 MUT ESCs differentiate predominantly into antNPCs much like WT ESCs, we could detect enhanced SMAD2 phosphorylation in both Tet1 MUT and KO bulk differentiation cultures. Collectively, these observations causally link TET1 catalytic activity at Lefty loci with antagonism of Nodal signalling. Yet, we observed a dominance of Wnt/β-catenin over Nodal signalling in driving cell fate changes in Tet1 KO cells. In solid cancer lines, loss of TET enzymes and 5hmC has been linked to reduced expression of Wnt repressors due to promoter DNA hypermethylation, resulting in augmented Wnt/β-catenin signalling and enhanced epithelial-mesenchymal transition ( 83 , 84 ). We discovered loss of the Wnt repressor Tcf7l1 and its regulatory network activity early during differentiation of Tet1 KO cells, as a cell-intrinsic trigger of mesoderm skewing which can be rescued by overexpression of TCF7L1. However, we did not find TET1-dependent DMRs at direct Wnt pathway regulators. Nor did we detect early Wnt/β-catenin signalling activity, Wnt-associated target gene expression or chromatin changes in Tet1 MUT cells at day 3 as observed in KO cells, suggesting that the triggering mechanism for elevated Wnt/β-catenin signalling by the loss of TET1 is non-catalytic. Nonetheless, we observed that hyperactivated Nodal/Smad signalling can eventually activate Wnt/β-catenin through a signalling cross-talk ( 85 ).
As further evidence of delayed Wnt activity induced by the loss of TET1 catalytic activity, Wnt pathway inhibition using XAV in Tet1 MUT cells can completely repress expression of mesoderm and endoderm fate genes, and abolish ectopic H3K27me3 deposition, suggesting that 5mC oxidation by TET1 sustains the repression of Wnt signalling during late gastrulation. Though not as potent as in Tet1 KO cells, the 'leaky' Nodal signals activating Wnt/β-catenin are sufficient to convert a sub-population to adopt mesoderm and endoderm fate in Tet1 MUT.
In our previous ATAC-seq analysis for TET1-dependent DARs during the naive (ESC) to formative (EpiLC) pluripotency transition, we detected mostly loss-in-KO regions in EpiLCs that were TET1-bound sites that gained accessibility in WT EpiLCs but failed to do so in Tet1 KO, of which only a subset were associated with neural lineage enhancers ( 16 ). The minority of gain-in-KO sites were predominantly indirect targets that were already differentially accessible in naive ESCs. In this study, we examined cells further along their lineage bifurcation trajectory and found roughly equal numbers of close-in-KO and open-in-KO sites appearing by day 3, which are distal regulatory regions harbouring respectively ectoderm-specific and primitive streak-specific TF motifs. This dichotomy is strengthened by the preferential occupancy of TET1 at the ectoderm enhancers in the close-in-KO regions and the absence of TET1 binding at open-in-KO regions. Furthermore, the chromatin state of these regions can be modulated by either inhibiting or activating Wnt signalling. These results suggest that TET1 sustains the accessibility of neuroectoderm enhancers during lineage priming and gastrulation onset. Consistent with a 'neural default' model, a recent single-cell multi-omics analysis during mouse gastrulation showed that the methylation and accessibility landscapes of ectodermal cells are already established in the early epiblast ( 9 , 86 ). Our temporal analysis suggests that loss of chromatin accessibility at 'neural default' enhancers then allows opening of alternative enhancers that promote primitive streak fates in response to signalling cues, suggesting that the ingression of epiblast cells into the primitive streak fundamentally requires the loss of full-length TET1 expression. The precise non-catalytic mechanism by which TET1 maintains accessible chromatin at distal regulatory elements remains to be elucidated.
Previous studies suggest the involvement of neural fate-specific TFs such as ZIC2 and EGR1 that recruit TET1 to enhancer sites ( 16 , 87 ), where TET1 may function as a binding partner of other chromatin regulators such as the O-linked N-acetylglucosamine (O-GlcNAc) transferase ( 88-90 ), or regulate promoter-enhancer chromatin looping.
It has been postulated that enhancer demethylation by TET enzymes and subsequent enhancer activation drive cellular fate in gastrulation ( 9 ). However, a causal link between enhancer demethylation and gene expression has been difficult to prove experimentally. In this study, selective loss of TET1 catalytic activity in Tet1 MUT cells resulted in DNA hypermethylation as early as day 2 of differentiation at distal close-in-KO regions without inducing cell fate-altering chromatin accessibility changes at day 3. Presumably, the presence of catalytic dead TET1 at neural fate enhancers is compatible with de novo DNA methylation ( 80 , 91 ). The uncoupling of DNA methylation and chromatin accessibility in Tet1 MUT cells suggests that 5mC oxidation at enhancers may only be a passive by-product of the dominant non-catalytic regulatory function of TET1. In agreement, a study that simultaneously profiled chromatin accessibility and DNA methylation from single DNA fragments in a monocyte-to-macrophage differentiation model observed prolonged DNA methylation states at nascent open chromatin regions associated with transcriptional activation, suggesting that an uncoupling of DNA methylation from chromatin accessibility and gene activity may be a general mechanism ( 92 ). However, our analysis at a later time-point (day 5, post-gastrulation) revealed a class of de novo accessible enhancer loci, where DNA hypermethylation in KO was coupled with reduced chromatin opening, in line with an inverse correlation between DNA methylation and chromatin accessibility. To reconcile these findings, we propose that the persistence of DNA methylation at distal enhancers at lineage priming is not sufficient to induce chromatin closure of loci that were previously opened, whereas the loss of DNA methylation is more tightly linked with chromatin opening of de novo accessible enhancers at post-gastrulation.
Like in its interaction with PRC2, TET1 may again be switching from a non-catalytic to a catalytic regulation of chromatin accessibility at distal enhancers during peri-gastrulation. Thus, the functional impact of DNA methylation changes on gene expression depends on CpG densities, which are generally much lower at distal enhancers than at proximal promoters, and also on the cellular context and developmental stage.
The signatures of DNA hypermethylation induced by TET1 dysfunction, when appearing latent in early progenitor cells, may have an impact on cellular functions later in development. We observed that de novo DNA hypermethylated regions increased in number as cells differentiated in vitro from day 2 to day 5 upon exogenous Wnt inhibition, largely due to DNA hypomethylation at intergenic loci in WT cells but persistence of methylation in KO post-gastrulation. Because there is a lack of detectable TET activity at the later time-points, these results suggest that the engagement of TET1 in WT ESCs, particularly at neuroectodermal enhancers, marks a class of genomic loci for a subsequent loss of DNA methylation during differentiation. Perhaps low basal levels of TET1 or TET3 during gastrulation may suffice to mediate DNA demethylation at these DMRs, or the early presence of TET1 may occlude DNMT and facilitate passive DNA demethylation during subsequent cell divisions. Moreover, the hypermethylation in KO observed at distal enhancer regions regulating fetal brain-specific genes at day 5 is associated with reduced chromatin accessibility, suggesting that the epigenetic defects may affect subsequent brain development. Supporting the notion of an early developmental origin of disease, we showed that promoters of brain-specific genes whose hypermethylation is associated with postnatal neurological dysfunction in Tet1-deficient mice already exhibited gains in methylation in day 5 Tet1 KO antNPCs, and in E8.5 anterior neural tissues and E11.5 brains ( 18 , 71 , 72 ). Therefore, previous phenotypes associated with TET1 dysfunction in adult mice may have an aetiology occurring as early as during peri-gastrulation stages.
Overall, our study resolves the seemingly paradoxical functions of TET1 in cell fate decision at early germ layer lineage bifurcation. We could partition the relative contributions of TET1's catalytic and non-catalytic activities to distinct CpGi-associated and distal genomic contexts, where each mode of activity exerts different functional impacts and dominance depending on lineage and developmental stage (Figure 8 ). These results present a more nuanced view of the role of TET1 as a chromatin regulator in embryonic development. Rather than strictly acting as a 'turn on' switch at regulatory regions to activate gene expression, the catalytic and non-catalytic dual modes confer on TET1 the versatility to switch dynamically between 'activator' and 'repressor' roles in different genomic contexts and at different stages of development.