Periphilin self-association underpins epigenetic silencing by the HUSH complex

Abstract Transcription of integrated DNA from viruses or transposable elements is tightly regulated to prevent pathogenesis. The Human Silencing Hub (HUSH), composed of Periphilin, TASOR and MPP8, silences transcriptionally active viral and endogenous transgenes. HUSH recruits effectors that alter the epigenetic landscape and chromatin structure, but how HUSH recognizes target loci and represses their expression remains unclear. We identify the physicochemical properties of Periphilin necessary for HUSH assembly and silencing. A disordered N-terminal domain (NTD) and structured C-terminal domain are essential for silencing. A crystal structure of the Periphilin-TASOR minimal core complex shows Periphilin forms an α-helical homodimer, bound by a single TASOR molecule. The NTD forms insoluble aggregates through an arginine/tyrosine-rich sequence reminiscent of low-complexity regions from self-associating RNA-binding proteins. Residues required for TASOR binding and aggregation were required for HUSH-dependent silencing and genome-wide deposition of repressive mark H3K9me3. The NTD was functionally complemented by low-complexity regions from certain RNA-binding proteins and proteins that form condensates or fibrils. Our work suggests the associative properties of Periphilin promote HUSH aggregation at target loci.


INTRODUCTION
More than half of the human genome consists of transposable elements (TEs). TEs have evolved to fulfill important cellular functions. TEs drive the evolution of transcriptional networks by spreading transcription factor binding sites, promoters and other regulatory elements (1,2). TE-derived regulatory elements are particularly important in embryogenesis, when global hypomethylation promotes transcription. Key pluripotency-associated transcription factors involved in cell fate determination bind to sites within TEs (1). TE genes also serve as a genetic reservoir that can be coopted by the host. For example, TE-derived proteins catalyze V(D)J recombination (3) and syncytiotrophoblast fusion in placental development (1,4).
A subset of TEs can autonomously replicate through an RNA intermediate and reintegrate into the genome like retroviruses. Some of these TEs are endogenous retrovirus (ERV) genomes inherited from ancestral infections of the germline. The other type of autonomously replicating TE in humans is the non-viral LINE-1 (long interspersed nuclear element-1) retroelement. Active ERVs and LINE-1s are transcribed and encode reverse transcriptase and integrase enzymes, which convert the transcripts into DNA and reintegrate it into the host genome (1). This amplifying retrotransposition mechanism has allowed ERVs and LINE-1s to accumulate in the human genome. Approximately 100 human LINEs are replication-competent and cause new integration events in 2-5% of the population (5,6).
Transcription and retrotransposition of TEs must be tightly regulated to prevent harmful gene expression and genome damage. Accumulation of TE transcripts is associated with autoimmune diseases including geographic atrophy, lupus and Sjögren's syndrome (5,7). Aberrant expression of proteins from the human ERV HERV-K is associated with cancer and neurodegeneration (8). Reactivation of ERVs and LINE-1s in somatic cells is also associated with cancer, through disruption of tumor suppressor genes or enhanced transcription of oncogenes (9,10). Disruption of protein coding sequences by transposition events is additionally linked to genetic disorders such as hemophilia and cystic fibrosis (9,10).
A central mechanism cells have evolved to control potentially pathogenic expression and transposition of TEs and infectious viruses alike is epigenetic silencing. Among the most important sources of epigenetic silencing in humans is the Human Silencing Hub (HUSH) complex, consisting of three proteins: Periphilin, TASOR and MPP8 (11). HUSH silences the genomes of newly integrated lentiviruses (11) as well as unintegrated retroviruses via the DNA-binding protein NP220 (12). Vpr and Vpx proteins from lentiviruses including HIV target HUSH for proteasomal degradation, demonstrating the importance of HUSH-dependent silencing in controlling lentiviral infection (13-15). HUSH also silences hundreds of transcriptionally active or recently integrated genomic sequences, with a degree of selectivity for full-length LINE-1s located in euchromatic environments, often within introns of actively transcribed genes (16,17). In the current model of HUSH-dependent repression, HUSH spreads histone H3 lysine 9 trimethylation (H3K9me3), a transcriptionally repressive mark, by recruiting the H3K9 methyltransferase SETDB1 and its stabilizing factor ATF7IP to existing H3K9me3 marks via MPP8, which binds both H3K9me3 and ATF7IP (11,18). This read-write mechanism for H3K9me3 spreading by MPP8 and SETDB1 alone is insufficient for repression, however, as TASOR, Periphilin and portions of MPP8 each have essential functions other than binding H3K9me3 (11). HUSH silencing also requires MORC2, a DNA-binding ATPase thought to be a chromatin remodeler (19,20). The specific contributions of the three HUSH subunits in recognizing target loci and repressing their expression therefore remain unclear.
The biochemical and structural properties of the three HUSH subunits remain mostly unknown. In this study, we delineate the key structural and physicochemical attributes of Periphilin and how they contribute to HUSH function. Periphilin was originally identified as a highly insoluble nuclear protein cleaved by caspase-5 (21). N-terminal sequences contain the determinants for insolubility and the C-terminal region contains predicted α-helical heptad repeats proposed to form dimers based on a yeast two-hybrid assay (21). Periphilin is indispensable for development. In mice, homozygous deficiency of Periphilin is lethal early in embryogenesis and heterozygous deficiency is compensated by increased expression from the wild-type allele (22). Overexpression of Periphilin transcriptionally represses certain proteins, causing cell cycle arrest (23,24). Isoform 2 of Periphilin, one of at least 8 isoforms, was identified in a gene-trap mutagenesis screen as a component of the HUSH complex that binds TASOR but not MPP8 (11,17). Curiously, some of the isoform diversity is driven by TE insertion into Periphilin coding sequences (25). Periphilin was also identified as an mRNA-binding protein in a screen of the protein-mRNA interactome in proliferating human cells (26). Here, we report the crystal structure of a Periphilin-TASOR minimal core complex and identify the key physicochemical properties of Periphilin necessary for HUSH complex assembly and epigenetic silencing. The Periphilin C-terminal region directs HUSH complex assembly by dimerizing and binding a single TASOR molecule through α-helical coiled-coil interactions. A disordered N-terminal domain (NTD) mediates self-aggregation through a sequence enriched in arginine and tyrosine residues. The sequence of the NTD is reminiscent of, and functionally complemented by, low-complexity regions from RNA-binding proteins, and from certain proteins that self-associate to form biomolecular condensates or phase separations.
Our findings suggest Periphilin may contribute to the recognition and co- or post-transcriptional repression of HUSH target loci by binding and sequestering nascent transcripts. This work provides a foundation to design strategies to control HUSH activity, with important potential therapeutic applications.

Protein sequence analysis
We used CIDER software developed for the analysis of intrinsically disordered proteins (27) to extract Periphilin sequence parameters including predicted structural disorder, charge and hydropathy.
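As a rough illustration of the kind of sequence parameters referred to here, the sketch below computes the fraction of charged residues, the net charge per residue and the mean Kyte-Doolittle hydropathy of a sequence in plain Python. It is not the CIDER implementation: the function name `sequence_parameters` and the toy sequence are hypothetical, and histidine is treated as uncharged by assumption.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_parameters(seq):
    """Return simple charge and hydropathy descriptors for a protein sequence.

    Assumption: K and R count as +1, D and E as -1, all other residues
    (including H) as neutral.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return {
        "fraction_charged": (positive + negative) / n,
        "net_charge_per_residue": (positive - negative) / n,
        "mean_hydropathy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / n,
    }


# Hypothetical toy sequence, not taken from Periphilin.
toy_sequence = "RSYDRSYERS"
params = sequence_parameters(toy_sequence)
```

Per-residue disorder prediction, which CIDER also reports, is not reproduced in this sketch.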

Cell lines and lentivirus production
HeLa cells carrying an integrated GFP reporter, with or without Periphilin KO (11), and HEK 293T cells were maintained in RPMI supplemented with 10% fetal calf serum, 50 U/ml penicillin, 50 µg/ml streptomycin. Lentiviruses were produced by cotransfection of HEK 293T cells at 95% confluence in six-well plates with 3 µg of the following plasmids at a 1:2:2 molar ratio: pMD2.G carrying glycoprotein VSV-G, pCMVΔ8.91 carrying replicative genes, and the pHRSIN-based lentiviral backbone containing hygromycin resistance and Periphilin. The plasmids were mixed with 200 µl serum-free medium and 15 µl PEI, incubated at room temperature for 20 min and applied to cells. Media was exchanged 4 h post-transfection and supernatants containing lentiviruses were harvested 48 h post-transfection by filtering through a 0.45 µm filter and stored at −80 °C.

Reporter silencing assay
Reporter silencing activity of WT and mutant Periphilin was measured by infecting the Periphilin KO HeLa reporter cell line with lentiviruses carrying Periphilin variants and monitoring GFP fluorescence for 21 days post-transduction (11). Periphilin KO HeLa cells in 24-well plates were overlaid with 150 µl of lentiviral supernatants and 8 µg/ml polybrene and centrifuged for 90 min at room temperature at 1000 × g. After 24 h incubation cells were trypsinized and seeded into flasks with selection media containing 400 µg/ml hygromycin. Fresh media was added every other day. Hygromycin was removed from the media after 7 days in culture. For flow cytometry cells were trypsinized, washed in PBS, counted, and resuspended at 1 × 10^6 cells per ml in PBS supplemented with 2% fetal calf serum. GFP fluorescence was recorded with an Eclipse flow cytometer (iCYT) from >1 × 10^5 cells per sample. The cells were gated on the live single-cell population based on forward and side scatter in FlowJo (BD Life Sciences). The geometric mean of the GFP fluorescence of the whole live population was determined without further gating and values exported to Excel (see Supplementary Data Set S1). Since gene expression data are log-normal, we converted GFP fluorescence to percent repression with the formula: %GFP Repression = m × log10(GeoMeanPopulation) + b, where m = 100%/[log10(GeoMeanWT) − log10(GeoMeanKO)] and b = −m × log10(GeoMeanKO). This transformation assigned the Periphilin KO population a value of 0% repression and WT HeLa reporter cells (11) 100% repression. The values of m and b used were −59.7 and 196.1, respectively, for all experiments except those in Figure 6C, where they were −59.5 and 210.0. The difference in vertical offset was due to a laser upgrade on the instrument.
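The log-linear conversion from geometric mean fluorescence to percent repression can be sketched as follows. This is a minimal illustration, not the spreadsheet used for the analysis: the function names and the example geometric means are hypothetical, chosen only so that the slope m and intercept b fall in a range similar to the values quoted above.

```python
import math


def repression_params(geomean_wt, geomean_ko):
    """Slope m and intercept b that map KO to 0% and WT to 100% repression."""
    m = 100.0 / (math.log10(geomean_wt) - math.log10(geomean_ko))
    b = -m * math.log10(geomean_ko)
    return m, b


def percent_repression(geomean_population, m, b):
    """Percent repression = m * log10(geometric mean) + b."""
    return m * math.log10(geomean_population) + b


# Hypothetical geometric means (arbitrary fluorescence units).
GEOMEAN_WT = 40.0    # repressed reporter, low GFP
GEOMEAN_KO = 2000.0  # derepressed reporter, high GFP

m, b = repression_params(GEOMEAN_WT, GEOMEAN_KO)
ko_value = percent_repression(GEOMEAN_KO, m, b)
wt_value = percent_repression(GEOMEAN_WT, m, b)
# A population at the geometric midpoint of WT and KO sits at 50% repression.
mid_value = percent_repression(math.sqrt(GEOMEAN_WT * GEOMEAN_KO), m, b)
```

Because WT cells have lower GFP fluorescence than KO cells, m is negative, which matches the sign of the reported slopes.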

Co-immunoprecipitation and Western immunoblotting
For co-immunoprecipitation (co-IP), cells were lysed in 1% NP-40 or IGEPAL CA-630 in TBS plus 10 mM iodoacetamide and protease inhibitors: 0.5 mM phenylmethylsulfonyl fluoride (PMSF) and benzonase (Sigma-Aldrich) or Complete Protease Inhibitor Cocktail (Roche), for 30 min. The cell lysate was centrifuged at 14 000 × g for 10 min and the supernatant (Input) mixed with Protein A and IgG-sepharose resin along with primary antibody. The suspension was incubated for 2 h at 4 °C and the resin was washed three times in lysis buffer. For western blotting, cells were lysed with lysis buffer containing 1% SDS for 30 min at room temperature. For SDS-PAGE analysis, Input and resins from co-IP or cell lysates for westerns were heated to 70 °C in SDS sample buffer for 10 min and run on a polyacrylamide gel. Gels were blotted onto PVDF membranes (Millipore). Blots were blocked in 5% milk in PBS, 0.2% Tween-20 and incubated overnight with primary antibody diluted 1:5000 in blocking solution. As the Periphilin antibody was unable to detect its epitope under NP-40 lysis conditions, we used a mouse antibody against the V5 tag (Abcam, ab27671) as the primary antibody for Periphilin. For TASOR, the primary antibody was rabbit α-TASOR (Atlas, HPA006735). The primary antibody for loading controls was rabbit anti-actin (Abcam, ab219733). Blots were imaged with West Pico or West Dura (Thermo Fisher Scientific), or with the near-infrared system of a LI-COR Odyssey fluorescent scanner after incubation with DyLight 680- or 800-conjugated secondary antibodies (Thermo Fisher Scientific) at 1:10 000 dilution for 30 min at room temperature.

Protein expression and purification
Escherichia coli BL21 (DE3) cells (New England BioLabs) were transformed with pRSF-DUET constructs expressing TASOR-Periphilin complex and selected on kanamycin plates. For native protein expression, overnight cultures were diluted 1:200 into 2 l of LB. Cultures were induced with 100 µM IPTG at OD600 0.6, incubated at 37 °C for 2 h, harvested, resuspended in 50 ml Buffer A (20 mM HEPES pH 7.4, 0.5 M NaCl, 0.5 mM TCEP) and frozen in liquid nitrogen. The cells were thawed at room temperature, supplemented with 1 µl benzonase (Sigma) and lysed by sonication. The lysates were clarified by centrifugation and filtering over a 0.45 µm filter. The clarified lysates were applied to 1-ml HisTrap columns (GE Healthcare), using one column per liter culture, washed with 20 column volumes of Buffer A and eluted with 0.5 M imidazole in Buffer A. To purify non-cleavable His6-tagged Periphilin residues 285-374, the eluate was desalted into Buffer QA (20 mM NaCl, 20 mM HEPES pH 7.4, 0.5 mM TCEP) using a HiPrep desalting column (GE Healthcare), bound to a MonoQ ion-exchange column (GE Healthcare) and eluted in a gradient of Buffer QA and Buffer QB (1 M NaCl, 20 mM HEPES pH 7.4, 0.5 mM TCEP). Size-exclusion chromatography (SEC) on a Superdex 200 10/300 column (GE Healthcare) in PBS supplemented with 1 mM TCEP and 0.05% sodium azide completed purification. For 15N- and 15N/13C-labeled protein expression, the overnight starter culture was grown in complete unlabeled minimal medium (M9) and used to inoculate (1:100 v/v) 800 ml complete labeled M9 media. Labeled proteins were purified as described above. For the construct expressing the TEV-cleavable His6-tagged Periphilin residues 292-367, Ni-affinity purification was followed by overnight digestion with TEV protease at 22

X-ray crystallography
Crystals were grown at 18 °C by sitting drop vapor diffusion. Purified Periphilin-TASOR complex was mixed with an equal volume of reservoir solution: 0.1 M citrate pH 4.5, 1 M ammonium sulfate. Crystals were harvested into a 70:30 mix of mother liquor to protein buffer supplemented with 20% DMSO with or without 1 M NaBr. Crystals were frozen in liquid nitrogen. X-ray diffraction data were collected at 100 K at Diamond Light Source (DLS) beamline I04-1. Automatic experimental phasing pipelines implemented at DLS including CRANK2 (28) determined phases with the single anomalous dispersion (SAD) method using bromine as the heavy atom. A polyalanine model built with CRANK2 was used as a molecular replacement search model for the native dataset (without NaBr) in PHENIX (29). The atomic model was built with COOT (30) and iteratively refined with PHENIX (29) at 2.5 Å resolution. See Table 1 for data collection and refinement statistics.

Size-exclusion chromatography and multi-angle light scattering (SEC-MALS) analysis
100 µl of protein sample was subjected to SEC at 293 K using a Superdex 200 10/300 column (GE Healthcare) pre-equilibrated in PBS at a flow rate of 0.5 ml min−1. The SEC system was coupled to a multi-angle light scattering (MALS) module (DAWN-8+, Wyatt Technology). Molar masses of peaks in the elution profile were calculated from the light scattering and protein concentration, quantified using the differential refractive index of the peak assuming a dn/dc of 0.186, using ASTRA6 (Wyatt Technology).

Immunofluorescence microscopy
Cells were grown on glass cover slips and then fixed with 4% formaldehyde in PBS for 15 min. Cells were permeabilized with 0.1% Triton X-100 in PBS and then blocked with 5% BSA in PBS. Samples were stained with primary anti-Periphilin antibody (Atlas, HPA038902) at 1:500 dilution for 1 h and, after washing with blocking buffer, with secondary anti-rabbit AlexaFluor 568 antibody at 1:500 dilution for 1 h. Cover slips were mounted on microscopy glasses with ProLong Gold anti-fade reagent with DAPI (Invitrogen). Imaging was performed using a Nikon Ti microscope equipped with a CSU-X1 spinning disc confocal head (Yokogawa) and with a Zeiss 780 system.

CUT&RUN H3K9me3 profiling
We followed the protocol detailed by Henikoff and colleagues (31). Briefly, 250 000 cells (per antibody/cell line combination) were washed twice (20 mM HEPES pH 7.5, 0.15 M NaCl, 0.5 mM spermidine, 1× Roche complete protease inhibitors) and attached to ConA-coated magnetic beads (Bangs Laboratories) pre-activated in binding buffer (20 mM HEPES pH 7.9, 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2). Cells bound to the beads were resuspended in 50 µl buffer (20 mM HEPES pH 7.5, 0.15 M NaCl, 0.5 mM spermidine, 1× Roche complete protease inhibitors, 0.02% w/v digitonin, 2 mM EDTA) containing primary antibody (1:100 dilution). Incubation proceeded at 4 °C overnight with gentle shaking. Tubes were placed on a magnet stand to remove unbound antibody and washed three times with 1 ml digitonin buffer (20 mM HEPES pH 7.5, 0.15 M NaCl, 0.5 mM spermidine, 1× Roche complete protease inhibitors, 0.02% digitonin). pA-MNase (35 ng per tube, a generous gift from Steve Henikoff) was added in 50 µl digitonin buffer and incubated with the bead-bound cells at 4 °C for 1 h. Beads were washed twice, resuspended in 100 µl digitonin buffer and chilled to 0-2 °C. Genome cleavage was stimulated by addition of 2 mM CaCl2 (final); samples were briefly vortexed and incubated at 0 °C for 30 min. The reaction was quenched by addition of 100 µl 2× stop buffer (0.35 M NaCl, 20 mM EDTA, 4 mM EGTA, 0.02% digitonin, 50 ng/µl glycogen, 50 ng/µl RNase A, 10 fg/µl yeast spike-in DNA (a generous gift from Steve Henikoff)) and vortexing. After 10 min incubation at 37 °C to release genomic fragments, cells and beads were pelleted by centrifugation (16 000 × g, 5 min, 4 °C) and fragments from the supernatant purified with a Nucleospin PCR clean-up kit (Macherey-Nagel). Illumina sequencing libraries were prepared using the Hyperprep kit (KAPA) with unique dual-indexed adapters (KAPA), pooled and sequenced on a NovaSeq 6000 instrument.
Paired-end reads (2 × 150) were aligned to the human and yeast genomes (hg38 and R64-1-1 respectively) using Bowtie2 (--local --very-sensitive-local --no-mixed --no-discordant --phred33 -I 10 -X 700) and converted to bam files with samtools. Conversion to bedgraph format and normalization was performed with bedtools genomecov (-bg -scale), where the scale factor was the inverse of the number of reads mapping to the yeast spike-in genome. CUT&RUN experiments to assess H3K9me3 regulation by Periphilin variants were done in two independent replicate experiments. Peaks defined as HUSH-regulated were reported elsewhere (17). Normalized bigwig files were generated (UCSC), displayed in IGV (Broad Institute) and
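The spike-in normalization described above, in which coverage is scaled by the inverse of the yeast read count, can be sketched as follows. This is an illustration only: the function names, read counts and coverage rows are hypothetical, and in the actual pipeline the factor is passed to bedtools genomecov through its -scale option rather than applied in Python.

```python
def spike_in_scale(yeast_read_count):
    """Scale factor = 1 / (number of reads mapped to the yeast spike-in genome)."""
    if yeast_read_count <= 0:
        raise ValueError("spike-in read count must be positive")
    return 1.0 / yeast_read_count


def scale_bedgraph(rows, scale):
    """Multiply the coverage value of each (chrom, start, end, value) row by scale."""
    return [(chrom, start, end, value * scale) for chrom, start, end, value in rows]


# Hypothetical spike-in read counts for two samples.
yeast_reads = {"sample_a": 20000, "sample_b": 40000}

# Hypothetical raw coverage over the same interval in both samples.
raw = {
    "sample_a": [("chr1", 1000, 1100, 50.0)],
    "sample_b": [("chr1", 1000, 1100, 100.0)],
}

normalized = {
    name: scale_bedgraph(rows, spike_in_scale(yeast_reads[name]))
    for name, rows in raw.items()
}
```

In this toy case the second sample has twice the raw coverage but also twice the spike-in reads, so both samples end up with the same normalized signal.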

Statistics
No statistical methods were used to predetermine sample size, experiments were not randomized, and the investigators were not blinded to experimental outcomes. Reporter silencing assays were performed at least three times in independent experiments. Repression activity data are represented as the mean ± standard error of the mean (s.e.m.), calculated with PRISM 8 (GraphPad), with three biological replicates for all experiments except WT with ten replicates.
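For reference, the mean ± s.e.m. summary can be reproduced with the Python standard library. The sketch assumes the sample standard deviation (n − 1 denominator) divided by the square root of n, which is the usual definition; the function name and the replicate values are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an s.e.m.")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical percent-repression values from three biological replicates.
replicates = [98.0, 101.0, 104.0]
mean_value, sem_value = mean_sem(replicates)
```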

Both N- and C-terminal regions of Periphilin are required for HUSH function
To identify the subdomains of Periphilin required for HUSH function, we generated various Periphilin constructs with N- or C-terminal truncations and assessed their silencing activity as part of the HUSH complex (Figure 1A). We used the 374-amino acid isoform 2 of Periphilin (UniProt Q8NEY8-2) as the reference sequence in this study rather than the longer isoform 1 (UniProt Q8NEY8-1), as isoform 2 fully restores HUSH function in Periphilin-deficient cells (11). Repression of a lentiviral GFP reporter in Periphilin knockout (Periphilin KO) HeLa cells was used as a measure of silencing activity. As reported previously (11), GFP reporter expression was repressed in wild-type cells and derepressed in Periphilin KO cells (Figure 1B). The loss of HUSH function with the Δ350-374 and Δ1-127 Periphilin mutants could be due to loss of an intrinsic activity of Periphilin or failure of Periphilin to be recruited to the HUSH complex. To distinguish between these, we measured coprecipitation of Periphilin and TASOR in pulldown assays. Periphilin and TASOR were purified on an immunoaffinity resin (immunoprecipitated) from lysates of Periphilin KO cells complemented with Periphilin deletion mutants. TASOR coeluted with all N-terminal deletion mutants tested, up to Δ1-297 (Figure 1C). Conversely, wild-type and Δ1-70 Periphilin both coeluted with TASOR. In contrast, TASOR did not associate with immunoprecipitated Δ350-374 Periphilin and Δ350-374 Periphilin did not associate with TASOR. Hence, only the C-terminal region of Periphilin is required for binding to TASOR, and the N-terminal region of Periphilin must have other properties necessary for HUSH function.

Structure of the core Periphilin-TASOR complex identifies interfaces required for HUSH function
Having established that the C-terminal region of Periphilin (residues 297-374) is required for binding to TASOR, we sought to identify the structural determinants of Periphilin-TASOR assembly. We recently mapped the Periphilin binding region in TASOR to a small region within residues 1000-1085 (17). Initial attempts to crystallize Periphilin-TASOR complexes failed until we determined that residues 285-291 and 368-374 of Periphilin were disordered, exploiting partial assignment of NMR spectra with 15N- and 13C-labeled Periphilin (Supplementary Figure S1). Thus, a crystal structure of a Periphilin fragment spanning residues 292-367 bound to a TASOR fragment spanning residues 1014-1095 was determined using the single anomalous dispersion (SAD) phasing method, with bromine as the anomalous scatterer (Table 1). The structure contains two Periphilin molecules and a single TASOR molecule (Figure 2A). The Periphilin fragments form helical hairpins with a mixture of α-helix and 3_10-helix secondary structure. The two Periphilin hairpins pack against each other via a 118 Å^2 hydrophobic interface formed by the hydrophobic side chains of Leu326, Leu333 and Ile337. The resulting Periphilin homodimer has twofold symmetry. The TASOR molecule forms two α-helices that wrap around the outer surfaces of the Periphilin dimer. The TASOR helices add a third helix to each Periphilin helical hairpin to form two three-helix coiled-coils. Each TASOR helix forms leucine zipper-type hydrophobic contacts, which typify helical coiled-coils. Unusually, however, each Periphilin subunit binds to a different TASOR sequence (residues 1014-1052 and 1072-1093, respectively) with an identical binding surface (Figure 2A). Notably residues 1055-1071 of TASOR, between the two Periphilin-binding segments, are disordered, but these 17 residues could easily span the 35-40 Å trajectory needed to connect residues 1054 and 1072 in the Periphilin-TASOR complex.
Binding of TASOR to Periphilin buries a total of 428 Å^2. The 2:1 stoichiometry of the Periphilin-TASOR core complex was confirmed in solution by size-exclusion chromatography coupled with multiangle light scattering (SEC-MALS; Figure 2B) and nondenaturing mass spectrometry (Supplementary Figure S2).
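The statement that the disordered TASOR linker can bridge the gap between residues 1054 and 1072 can be checked with back-of-the-envelope arithmetic. The sketch below assumes a Cα-Cα spacing of 3.8 Å for consecutive residues joined by trans peptide bonds and treats the fully extended chain as an upper bound; a real disordered linker would on average be more compact, so this is a feasibility check rather than a structural model. The function name is hypothetical.

```python
# Distance between consecutive C-alpha atoms joined by a trans peptide bond.
CA_CA_DISTANCE_ANGSTROM = 3.8


def max_span(first_residue, last_residue, per_residue=CA_CA_DISTANCE_ANGSTROM):
    """Upper bound on the C-alpha to C-alpha distance a chain segment can span."""
    n_bonds = last_residue - first_residue
    if n_bonds < 0:
        raise ValueError("last_residue must not precede first_residue")
    return n_bonds * per_residue


# Residues 1054 and 1072 flank the disordered segment 1055-1071 (17 residues),
# so they are separated by 18 virtual C-alpha to C-alpha bonds.
linker_span = max_span(1054, 1072)
REQUIRED_SPAN_ANGSTROM = 40.0  # upper end of the 35-40 Å trajectory
linker_is_long_enough = linker_span > REQUIRED_SPAN_ANGSTROM
```

Under these assumptions the extended linker could reach roughly 68 Å, comfortably more than the 35-40 Å required.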
To determine whether the binding interfaces observed in the Periphilin-TASOR complex are required for HUSH function, we used our structure to design point mutations in Periphilin predicted to interfere with Periphilin-TASOR complex assembly and measured the silencing activity of the mutants in the GFP reporter assay described above. Variants L326A and L333A/I337A were generated to target the Periphilin dimer interface; Periphilin L356R was generated to target both Periphilin-TASOR interfaces (Figure 2A). The L356R variant failed to rescue reporter repression in Periphilin KO cells, whereas the L326A and L333A/I337A variants each had approximately half of the repression activity of wild-type Periphilin (Figure 2C). Immunofluorescence microscopy with an anti-Periphilin antibody confirmed that all variants were expressed with the same nuclear localization as wild-type Periphilin (Figure 2D). Immunoblots against the V5 tag on Periphilin showed that the variants were expressed at higher levels than wild-type Periphilin (Figure 2E).
To determine whether the engineered mutations inhibited Periphilin-TASOR complex formation, we measured coprecipitation of Periphilin and TASOR in a pulldown assay. TASOR was purified on an immunoaffinity resin from lysates of Periphilin KO cells complemented with the Periphilin mutants. All three variants failed to bind TASOR despite being present at higher levels than wild-type Periphilin in the input cell lysate supernatant (Figure 2F). The partial repression activity of L326A and L333A/I337A suggests that these variants may have residual TASOR binding affinity inside the cell, where conditions are more conducive to binding than in the pulldown assay. We conclude that the leucine zipper interactions at the Periphilin dimer and Periphilin-TASOR interfaces are required for HUSH function and that the minimal core Periphilin:TASOR complex has a 2:1 stoichiometry. Whether fully active HUSH complex with full-length subunits contains a Periphilin homodimer and a single TASOR molecule or forms higher-order assemblies in the nucleus remains to be determined.

Sequences with predicted disorder in the Periphilin N-terminal region required for HUSH function
The requirement of Periphilin residues 1-127 for HUSH-dependent silencing but not HUSH complex assembly (Figure 1) raises the question of how this N-terminal domain (NTD) contributes to HUSH function. Residues 20-291 of Periphilin are predicted to be unstructured (Figure 3A). The sequence is more polar than hydrophobic, with clusters of alternating positive and negative net charge but an approximately neutral overall net charge. The NTD of Periphilin has a greater than average number of serine, arginine, tyrosine and negatively charged residues (see Figure 5A below). Residues 147-222 (140-215 in isoform 1) are a serine-rich domain with six candidate serine phosphorylation sites, and a further three candidate sites at nearby residues 117, 121 and 140 (32-34). To shed light on the role of these elements in HUSH-dependent silencing we generated Periphilin variants with various deletions in the NTD and measured their reporter repression activity over 21 days. Consistent with the reporter repression data shown in Figure 1A, the Δ1-70 mutant repressed reporter expression to the same extent and at the same rate as wild-type Periphilin, whereas the Δ1-127 variant had no repression activity (Figure 3B). Unexpectedly, however, addition of residues 1-70 to the Δ1-127 variant restored repression activity to 70% of wild-type activity. Hence, deletion of Periphilin residues 1-70 does not affect HUSH activity but these residues restore activity if residues 71-127 are deleted. Western blots confirmed that all variants were expressed at similar levels (Figure 3C). We conclude that the presence of either residues 1-70 or 71-127 is sufficient to confer significant HUSH-dependent silencing activity, but neither segment is essential.
Together, the amino acid sequence and partially-redundant activities of the Periphilin NTD suggest that it is intrinsically disordered and hence that its contribution to HUSH activity stems from primary sequence attributes rather than tertiary structure.

The NTD and TASOR-binding site are required for H3K9 methylation at HUSH-regulated loci
Deposition of the repressive epigenetic mark H3K9me3 by SETDB1 is an essential component of HUSH-dependent silencing (11). To assess the importance of the Periphilin N- and C-terminal regions in H3K9 trimethylation, we measured the genome-wide distribution of H3K9me3 in cells expressing different Periphilin variants with the CUT&RUN (Cleavage Under Targets and Release Using Nuclease) epigenomic profiling method (31). We found that in Periphilin KO cells H3K9 methylation was lost or markedly reduced at hundreds of loci, representing ∼1-2% of global H3K9me3 loci (Figure 4 and Supplementary Figure S3). The sites of H3K9me3 loss recapitulate those seen in previous ChIP-seq (chromatin immunoprecipitation followed by sequencing) studies on cells in which TASOR, MPP8 or Periphilin were knocked out, that is, a subset of 'host' gene exons and young intronic LINE-1 retrotransposons (11,16,17). Complementation of Periphilin KO cells with full-length wild-type Periphilin robustly restored H3K9me3 levels. However, neither the L356R TASOR-binding point mutant nor the Δ1-127 NTD deletion mutant restored H3K9 methylation at HUSH-regulated loci (Figure 4). This effect was specific: H3K9me3 levels were unaffected at HUSH-independent loci (Supplementary Figure S3). Taken together, the data indicate that both the disordered NTD and the folded TASOR-binding domain of Periphilin are required for HUSH-dependent H3K9 methylation.

Arginine and tyrosine residues in the NTD contribute to HUSH function
The physicochemical properties of the Periphilin NTD are reminiscent of the properties that govern the self-assembly of proteins into biomolecular condensates, in particular the Fused in Sarcoma (FUS) family of RNA-binding scaffold proteins. FUS family proteins contain N- and C-terminal intrinsically disordered regions with low sequence complexity resulting from a preponderance of specific subsets of amino acids (35). The N-terminal disordered region, known as the prion-like domain for its genetic association with prion-like inheritance in yeast and age-related neurodegenerative diseases in humans, is enriched in serine, glycine, tyrosine, glutamine, asparagine and proline (35,36). The C-terminal region comprises one or more folded RNA recognition motifs (RRMs) interspersed with low-complexity sequences enriched in arginine and glycine. Arginine-tyrosine interactions and π-stacking of tyrosine-containing strands into kinked β-sheet fibrils in these disordered regions can non-covalently crosslink the polypeptide chains into liquid- or gel-like condensates, which manifest in the cell as phase separations or membraneless compartments (37-40). The arginine-tyrosine interactions that promote phase separation of FUS family proteins are stabilized by complementary negative electrostatic charges from aspartate and glutamate residues in the prion-like domain (38). An excess of negative charge in FUS from multiple serine phosphorylation (or phosphomimetic mutations) decreases phase separation (41). The NTD of Periphilin contains a sequence bias similar to that of the disordered C-terminal regions of FUS family proteins, with a marked enrichment of serine, arginine, tyrosine, aspartate and glutamate residues (Figures 5A and 6A). Moreover, Periphilin has approximately the same number of positively and negatively charged residues, and has 10 potential serine phosphorylation sites, a number similar to that of FUS.
To assess the potential of Periphilin to form condensates we expressed various Periphilin recombinant protein constructs in E. coli. Full-length Periphilin and a construct spanning the NTD alone (residues 1-127) were both insoluble and could not be purified from cell lysates under native conditions (Figure 5B). The Δ1-127 variant lacking the NTD was soluble but lacked repression activity as noted above. In contrast to FUS family proteins, which undergo phase separation and form hydrogels reversibly at low salt concentrations, Periphilin constructs containing the NTD remained insoluble even at higher than physiological salt concentrations (0.3 M NaCl). Full-length Periphilin could be solubilized and purified in the presence of 8 M urea but upon dilution of the urea to below 1 M the protein came out of solution, reversibly, forming solid aggregates detectable by absorbance in the visible light spectrum and by differential interference contrast microscopy (Figure 5C, D). Hence, the NTD, which is required for HUSH activity, induces Periphilin to self-aggregate without undergoing phase separation or hydrogel formation as seen in FUS family proteins.
[Figure caption fragment recovered from the text: ... (17), centered on each peak, with a ±30 kb window. Both replicates are shown for WT, L356R and Δ1-127 Periphilin variants. The mean binned signal is shown above each heatmap. H3K9me3 is lost specifically over HUSH-regulated peaks, but is unaffected in HUSH-independent peaks (see Supplementary Figure S3C).]
Among the amino acids enriched in the disordered regions of Periphilin and FUS family proteins, tyrosine and arginine residues govern the phase separation properties of FUS family proteins (38). Mutation of tyrosine residues to serine, or arginine to alanine, in FUS disordered regions diminishes or abrogates phase separation and hydrogel formation (38,39). To determine whether arginine and tyrosine residues contribute to Periphilin self-aggregation, we generated Periphilin variants with all 24 arginine residues in the NTD mutated to lysine, NTD(R>K), or with all 13 tyrosine residues mutated to serine, NTD(Y>S), and measured the silencing activity of the mutants in our GFP reporter assay. Both variants showed reduced HUSH-dependent repression (Figure 5E). The NTD(R>K) variant, despite retaining the net charge of wild-type Periphilin, had <10% of wild-type activity 7 days post-transduction, and approximately one quarter of wild-type activity after 21 days. The NTD(Y>S) variant appeared less impaired, with 20% of wild-type activity after 7 days and 50% after 21 days. However, the NTD(Y>S) variant was expressed at significantly higher levels than the NTD(R>K) variant, suggesting that the impairment of the NTD(R>K) and NTD(Y>S) variants would be comparable at identical expression levels (Figure 5F).
Arginine-tyrosine interactions have been proposed to be stabilized by negatively charged residues in FUS family proteins. Mutation of negatively charged residues in FUS family proteins decreases their overall phase separation potential (38). In contrast, the negatively charged residues in the NTD were not required for silencing. Indeed, Periphilin variant NTD(DE>NQ) with all 22 aspartate or glutamate residues mutated to asparagine or glutamine, respectively, had repression activity similar to wild-type (Figure 5E) despite having a lower expression level than the wild-type, Y>S and R>K variants (Figure 5F).

Disordered polypeptides with self-associating or RNA-binding properties partially complement NTD deletion
The similarity of the amino acid sequence bias in the Periphilin NTD and the disordered regions of RNA-binding domains from FUS family proteins raises the question of whether these low-complexity sequences have similar biophysical properties, which in Periphilin contribute directly to HUSH-dependent silencing. Insertion of the disordered portion of the RNA-binding domain from FUS (Figure 6A) into the Δ1-127 variant restored HUSH repression activity to approximately 25% of wild-type (Figure 6B, NTD::FUS-RBD). To determine whether self-association of a disordered polypeptide per se is sufficient to support HUSH silencing we generated variants containing the prion domain from yeast SUP35, or the prion-like domain of FUS in place of the NTD (NTD::SUP35 and NTD::FUS-PLD, respectively). Prion domains have a different sequence bias: enrichment of glutamine, asparagine and tyrosine and depletion of charged residues (Figure 6A) (36). Prion domains form highly stable steric zipper-type amyloid fibers distinct from the reversible associative polymers formed by FUS (37,42). Nevertheless, the NTD::SUP35 and NTD::FUS-PLD variants restored repression activity to 30% and 20% of wild-type (Figure 6C), respectively, suggesting that prion-like aggregation partially functionally complements the NTD, to an extent comparable to that of the arginine/glycine-rich region of FUS.
Tyrosine residues are essential for the aggregation of FUS and the amyloidogenic properties of prion proteins (38,39,42,43). Mutation of all tyrosine residues to serine in the complementing sequences of FUS-PLD and SUP35 abrogated their silencing activity (Figure 6C, NTD::FUS(Y>S) and NTD::SUP(Y>S)).
A large proportion of proteins with low-complexity disordered regions bind RNA via arginine-rich sequences. Many of these RNA-binding proteins self-associate into liquid or gel phases, or form amyloid (or amyloid-like) fibers (44,45). The same arginine/glycine-rich regions of FUS family proteins that mediate phase separation also bind RNA, and RNA binding nucleates higher-order assembly of FUS (40). Moreover, prion proteins are strongly enriched for RNA-binding proteins (36). Formation of ribonucleoprotein complexes (RNPs), in particular with mRNA, through phase separation or fibril formation is emerging as a central mechanism of co- and post-transcriptional regulation (44). To assess the potential contribution of RNA binding by Periphilin to silencing, we replaced the NTD with RNA-binding polypeptides from two RNA-binding proteins (Figure 6A), Y-box-binding protein 3 (YBX3) and Aly/RNA export factor 2 (ALYREF2). YBX3 is a member of the cold shock domain (CSD) protein family that binds mRNA without sequence specificity via a disordered C-terminal tail rich in aromatic, basic and phosphorylated residues (46,47). YBX3 was recently shown to repress translation of certain mRNAs (48). ALYREF2 contributes to mRNA export by packaging mRNA into RNPs through interactions with an arginine-rich disordered N-terminal tail. The NTD::YBX3 variant restored HUSH repression activity to the greatest extent of any of the complementing sequences tested, reaching 50% of wild-type Periphilin activity 21 days post-transduction (Figure 6B). In contrast, the NTD::ALYREF2 variant did not restore repression. We note that the complementing YBX3 sequence and the Periphilin NTD both have a greater number of alternating positively and negatively charged residues than the other complementing sequences (Figure 6A).
Western blots showed that some of the complementing NTD sequences altered the expression levels: the inactive FUS(Y>S) sequence boosted expression and the ALYREF2, FUS-RBD and YBX3 sequences reduced expression relative to wild-type (Figure 6D). Efforts to measure RNA binding by Periphilin were hampered by the insolubility of purified protein constructs containing the NTD.

DISCUSSION
We have identified the key structural and biochemical properties of Periphilin necessary for epigenetic silencing by the HUSH complex. The C-terminal coiled-coil domain directs HUSH complex assembly by dimerizing and binding TASOR through α-helical coiled-coil interactions. How the N-terminal region (NTD) of Periphilin contributes to silencing is more difficult to pinpoint due to its intrinsic structural disorder and its propensity to aggregate. We note that self-aggregation of the NTD correlates with HUSH function, as truncations that inhibit aggregation also inhibit silencing. As in self-associating disordered regions from many other proteins including FUS-family proteins, the NTD of Periphilin is enriched in tyrosine and arginine residues. These residues are required for HUSH transgene repression activity. Arginine-tyrosine π-stacking interactions are essential for the aggregation of FUS, and tyrosine residues contribute to the amyloidogenic properties of prion proteins. Hence, NTD self-association through arginine-tyrosine π-stacking interactions could play a role in HUSH silencing. Consistent with this notion, lysine failed to functionally substitute for arginine in silencing assays with our NTD(R>K) Periphilin variant.
Alongside the broad similarities between the Periphilin NTD and disordered regions from other arginine/tyrosine-rich proteins, the NTD has certain distinguishing properties. The sequence complexity of the NTD is not as low as in the disordered regions that drive phase separation of FUS family proteins. Periphilin lacks tyrosine residues flanked on both sides by serine or glycine to form [G/S][Y/F][G/S] motifs, which are hallmarks of fiber-forming Low-complexity Aromatic-Rich Kinked Segments (LARKS). Periphilin also contains a greater proportion of charged residues than FUS-family and prion proteins and may acquire further negative charges through phosphorylation. Moreover, the NTD induces the formation of solid aggregates rather than liquid-like phases, hydrogels or fibers. These distinguishing features may explain why the NTD was not fully complemented by the disordered regions from FUS or Sup35 in our HUSH silencing assays.
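The [G/S][Y/F][G/S] pattern mentioned above can be searched for with a regular expression. The sketch below is illustrative only and is not the method used in this study: the function name is hypothetical, the second test sequence is a made-up repeat, and a match to this three-residue pattern does not by itself establish that a segment forms LARKS, which are defined structurally.

```python
import re

# Lookahead with a capture group so that overlapping matches
# (e.g. GYGYG contains two GYG triplets) are all reported.
LARKS_LIKE = re.compile(r"(?=([GS][YF][GS]))")


def find_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched triplet) for each match."""
    return [(m.start() + 1, m.group(1))
            for m in LARKS_LIKE.finditer(seq.upper())]


if __name__ == "__main__":
    # Segment quoted in the Discussion: no tyrosine or phenylalanine
    # flanked on both sides by G/S.
    print(find_motifs("SFYSSHYA"))
    # Hypothetical glycine/serine/tyrosine repeat for comparison.
    print(find_motifs("GSYGSYGS"))
```

Scanning a full-length sequence this way gives a quick count of candidate motifs; the ZipperDB energies cited in the text address a different question (predicted steric zipper stability) and are not reproduced here.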
On balance, the similarities between the NTD and self-associating disordered regions from FUS-family proteins outweigh the differences. Indeed, counterbalancing the differences listed above, the NTD does contain sequences that resemble LARKS or are predicted to have amyloid-forming potential, for example SFYSSHYA, with a stacking free energy of -27 kcal/mol predicted by ZipperDB (49). In addition, the negatively charged residues in the NTD, though more abundant than in FUS and Sup35, are not essential for HUSH function. Furthermore, membraneless compartments formed by biomolecular condensates have been reported previously to have the characteristics of a solid rather than a liquid or gel (50,51). Hence, the most plausible mechanism for Periphilin NTD aggregation is via arginine-tyrosine π-stacking interactions, like FUS-family proteins but with greater cooperativity, resulting in a more abrupt transition from the soluble state to a solid aggregated state. Whether and how self-association via this mechanism translates into silencing activity in the HUSH complex remains unclear.
The biochemical properties of the NTD suggest that one of its functions in HUSH-dependent silencing may be to bind RNA. A majority of proteins with arginine-rich disordered regions bind RNA and self-associate into ribonucleoprotein (RNP) fibers or condensates (44,45). Polymerization of proteins, such as proteins from the cold shock domain (CSD) family, on mRNA is emerging as a central mechanism to repress protein expression co- or post-transcriptionally (44). Notably, the disordered RNA-binding region of CSD-family protein YBX3, which binds certain mRNAs and represses their translation (48), functionally complemented the Periphilin NTD to a greater extent than any of the other sequences we tested. As in other CSD proteins, the disordered RNA-binding region of YBX3 is enriched in aromatic, basic and phosphorylated residues and binds mRNA without sequence specificity (46,47). Whether Periphilin forms RNPs with mRNA remains unknown. Binding of Periphilin to mRNA could explain the known propensity of HUSH to silence genes that are actively being transcribed (16,17). In support of HUSH binding to nascent transcripts, Periphilin and TASOR (as C3orf63) were identified in a proteomic screen for protein-mRNA interactions in human cells (26,52). Moreover, artificially increasing transcription of a transgene increases recruitment of HUSH to that locus (16). Why the HUSH complex preferentially binds to intronic LINE-1 elements within actively transcribed genes with some degree of sequence specificity (16,17) remains to be determined. We note that the cold shock domains of YBX3 and other Y-box binding proteins bind to specific sequences or structures in the untranslated regions of their target mRNAs (48,53). Although the HUSH complex does not appear to contain any classical RNA recognition motifs or cold shock domains (17), the possibility that TASOR or MPP8 contain a motif conferring specificity for RNA sequence or structure cannot be excluded.
The work presented and discussed here prompts us to propose the following model for how Periphilin may function in silencing. The NTD of Periphilin may bind nascent transcripts, with multiple Periphilin molecules binding to each target mRNA. The self-aggregation properties of the NTD could then lead to the formation of large mRNPs. Transcripts within these mRNPs would be less accessible to transcription and translation machinery, thereby repressing expression. Other HUSH components or effectors, tethered via the C-terminal domain of Periphilin, could then sense and modify the epigenetic landscape or chromatin structure at the target site. This would include recruitment of chromatin-remodeling ATPase MORC2 and deposition of the transcriptionally repressive H3K9me3 mark by SETDB1. Further studies will be necessary to test and refine this model. This work provides a foundation to design new epigenetic therapies targeting HUSH to treat autoimmune diseases, cancer and retroviral infections.

DATA AVAILABILITY
The structure factors and atomic coordinates were deposited in the Protein Data Bank with code PDB: 6SWG, DOI:10.2210/pdb6SWG/pdb. The original experimental X-ray diffraction images were deposited in the SBGrid Data Bank (data.SBGrid.org), with Data ID 714, DOI:10.15785/SBGRID/714. The CUT&RUN data were deposited in the Gene Expression Omnibus (GEO) database under accession code GSE155824, with additional controls and peak list information available in entry GSE155693.