The ATP-dependent chromatin remodelling enzyme Uls1 prevents Topoisomerase II poisoning

Abstract
Topoisomerase II (Top2) is an essential enzyme that decatenates DNA via a transient Top2-DNA covalent intermediate. This intermediate can be stabilized by a class of drugs termed Top2 poisons, resulting in massive DNA damage. Thus, Top2 activity is a double-edged sword that needs to be carefully controlled to maintain genome stability. We show that Uls1, an adenosine triphosphate (ATP)-dependent chromatin remodelling (Snf2) enzyme, can alter Top2 chromatin binding and prevent Top2 poisoning in yeast. Deletion mutants of ULS1 are hypersensitive to the Top2 poison acriflavine (ACF), activating the DNA damage checkpoint. We map Uls1's Top2 interaction domain and show that this, together with its ATPase activity, is essential for Uls1 function. By performing ChIP-seq, we show that ACF leads to a general increase in Top2 binding across the genome. We map Uls1 binding sites and identify tRNA genes as key regions where Uls1 associates after ACF treatment. Importantly, the presence of Uls1 at these sites prevents ACF-dependent Top2 accumulation. Our data reveal the effect of Top2 poisons on the global Top2 binding landscape and highlight the role of Uls1 in antagonizing Top2 function. Remodelling Top2 binding is thus an important new means by which Snf2 enzymes promote genome stability.

200nM supercoiled pUC18 plasmid DNA (scDNA) was incubated for 30mins at 30°C with the indicated amounts of Top2 and either etoposide or ACF. Addition of ACF induced DNA cleavage, seen by the appearance of linear DNA, at lower concentrations than the positive control Top2 poison, etoposide. (D) uls1∆ ACF sensitivity is suppressed by the top2-1 hypomorph at the semi-permissive temperature (30°C) but not at the permissive temperature (23°C). Additionally, deletion of TOP1 does not result in ACF sensitivity. These data reinforce the point that ACF-induced lethality in uls1∆ strains is specifically due to Top2, as it can be suppressed by reducing Top2 protein level but not by
Two independent biological replicates looking at Top2 protein levels in WT (HFY250) and uls1∆ (HFY252) in the presence and absence of ACF. Signal intensity was quantified using ImageJ and the numbers below each lane display Top2 protein intensity normalised to the amount in the absence of ACF. There is a mild increase in the level of Top2 when ACF is added. However, this increase is not markedly different between WT and uls1∆ strains and is therefore unlikely to explain the dramatically different phenotype of WT and uls1∆ yeast exposed to ACF. (D) 10-fold serial dilutions of the indicated genotypes showing that Uls1 needs to be nuclear for its function and that the first 349 amino acids contain a nuclear localisation sequence (NLS). uls1∆ 1-349 (HFY234) phenocopies uls1∆ (HFY71). However, its function is fully rescued by addition of an SV40 NLS (HFY281).
Supplementary Figure S4. Top2 does not stimulate Uls1's ATPase activity. (A) ATP hydrolysis rates for the indicated proteins. The graph shows the average ± the standard deviation of three independent experiments. 50nM wildtype Top2 (HFP185) or the ATPase-dead E66Q mutant (HFP271) was incubated with or without 100µM salmon sperm DNA. (B) 15nM Uls1 (HFP350) and/or 50nM Top2 E66Q (HFP271) was incubated with or without 100µM salmon sperm DNA. Uls1 has weak DNA-stimulated ATPase activity, which is not significantly further stimulated by Top2. Top2 E66Q was used so that the ATPase activity measured was preferentially that of Uls1.

ChIP-seq library preparation
Because little DNA is recovered after immunoprecipitation, two experimental replicates are combined before ChIP-seq library preparation to ensure there is enough sample DNA for amplification and sequencing.
Step 1 of library preparation repairs the DNA ends. To 80μl pooled DNA, add 20μl MMX1 (1x T4 ligase buffer with ATP, 0.4mM dNTPs, 15 units T4 DNA polymerase, 10 units Klenow DNA polymerase, 30 units T4 polynucleotide kinase) and incubate for 60 minutes at 20°C. DNA is purified by addition of a 1:1 v/v ratio of AMPure XP magnetic beads, incubating for 5 minutes before washing twice with 200μl 70% EtOH using a magnetic rack. Residual EtOH is removed before elution of DNA in 41μl H2O.
Step 2 of library preparation adds an additional adenine nucleotide to DNA ends to which adapters will later be ligated. To 41μl DNA, add 9μl MMX2 (1x Klenow buffer, 2mM dATP, 15 units Klenow exo-), incubating for 30 minutes at 37°C. DNA is purified as in step 1 using a 1:1 v/v of Agencourt AMPure XP magnetic beads, eluting in a final volume of 20μl H2O.
Step 3 of library preparation ligates adapters onto the DNA; these are used later as a template for PCR amplification of ChIPed DNA fragments to generate the final tagged DNA library. Adapter is ordered as two oligonucleotides (HFO424/425, Illumina paired end adapter sequence). HFO425 is phosphorylated as per manual specifications using T4 polynucleotide kinase (Thermo Fisher EK0032) before annealing to HFO424 by mixing an equimolar ratio of the two oligos and heating to 95°C for 5 minutes, then reducing the temperature by 5°C every 5 minutes until reaching 25°C. Adapter is then run on a 1% agarose gel, gel purified and stored at -20°C before use. To ligate adapter, add 30μl MMX3 (8nM adapter, 1x T4 DNA ligase buffer with ATP, 400 units T4 DNA ligase) to 20μl DNA and incubate overnight at room temperature. DNA is purified as in step 1 with a 1:1 v/v ratio of Agencourt AMPure XP magnetic beads, eluting in 20μl H2O.
Step 4 of library preparation amplifies the DNA samples using primers complementary to the adapters now ligated onto each DNA molecule. Each primer also contains an additional unique sequence tag, which is used to identify each sample after sequencing of the DNA. Details of the primers used can be found in Table 6-6. For each sample, 3 PCR reactions are set up containing 3μl DNA and 47μl MMX4 (1X HF buffer, 3 units Phusion DNA polymerase, 0.3μM oligos (HFO426 and sequencing primer), 0.4μM dNTPs, 200mM Trehalose). The PCR is carried out by denaturing for 3 mins at 98°C before 16-20 cycles of amplification (98°C 15 secs, 60°C 25 secs, 68°C 1 min) and a final step at 68°C for 5 mins.
The three PCRs are then pooled and run on a TAE gel containing 1% agarose and 0.5μg/ml EtBr. Once the gel has run long enough to separate free adapter from amplified DNA, bands are cut from the gel at a size of 600bp and under, avoiding contamination with adapter. DNA is purified from the gel using Qiagen MinElute columns (Qiagen, 28006) as per kit specifications, except that the gel was melted at 37°C, eluting from the column in 10μl EB.

Building the W303 genome annotation
A whole-genome sequence for the W303 strain of S. cerevisiae was published by Matheson et al. (2017), containing 16 nuclear chromosomes, the mitochondrial genome and additional plasmids present in the sequenced strain. We were unable to obtain a copy of the Matheson genome annotation so generated our own. Initial annotation was carried out by inputting the W303_LYZE genome sequence tRNA genes were identified, with 299 being present in the S288C reference. Where an unidentified gene had high similarity to Y prime helicase elements, this was labelled as "Y' element" (54 items), and where it had high similarity to gag pol genes, this was labelled as "TKP/TY" (86 items). Any elements that could not be identified were labelled as "unknown" (4/5634 ORFs). Gene lengths were checked against S288C and a note was made for each gene as to whether they were the same or not, with 524/5634 differing in size in W303 compared to S288C. Additional information for each ORF was extracted from the YGP S288C reference using Excel, including gene names, aliases and gene ontology information. ARS were identified using Biopython Bio.SeqIO (Cock et al., 2009). Centromeric sequences were identified by blasting the nucleotide sequence of the S288C CEN1-16. CEN5 was identified by blasting for conserved centromeric elements CDEI and CDEIII as it had little homology to S288C CEN5.

Peak calling was carried out using MACS2 subcommands (Zhang et al., 2008). PCR duplicates were filtered using macs2 filterdup. An IP pileup track was generated using macs2 pileup, extending reads to the average fragment size. The INP local lambda track was built from three macs2 pileup (-B) tracks in which reads were extended in both directions by half the average fragment size (termed "d"), 500bp (termed "1kb_slocal"), or 2500bp (termed "5kb_llocal"). 1kb_slocal and 5kb_llocal were normalised to "d" sized fragments using macs2 bdgopt (-m multiply). All tracks were combined to generate the "local lambda" background track using macs2 bdgcmp (-m max), and this track was floored at the genome-wide background noise (number of reads in INP x average fragment length / genome size) using macs2 bdgopt (-m max). IP and local lambda were normalised to counts per million (CPM) using macs2 bdgopt (-m multiply). A p-value statistical track was generated using macs2 bdgcmp (-m ppois). Peak calling was carried out using macs2 bdgpeakcall with a p-value cut-off of 0.1. IP/INP fold enrichment tracks were generated using macs2 bdgcmp.
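The background logic described above can be illustrated at the level of individual positions. The following is a minimal Python sketch, not the MACS2 implementation: the function names are hypothetical, the three input tracks are assumed to be already scaled to "d"-sized fragments, and the Poisson tail is taken as P(X >= k), which may differ from the exact tail convention used by macs2 bdgcmp (-m ppois).

```python
import math

def genome_background(n_input_reads, fragment_len, genome_size):
    """Expected input coverage per base if reads were spread uniformly."""
    return n_input_reads * fragment_len / genome_size

def local_lambda(d_track, slocal_track, llocal_track, background):
    """Per-position maximum of the three input tracks and the genome-wide background."""
    return [max(d, s, l, background)
            for d, s, l in zip(d_track, slocal_track, llocal_track)]

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), with integer k >= 0 (assumed tail convention)."""
    cdf_below_k = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf_below_k)

# Illustrative numbers only (not taken from the datasets described here).
bg = genome_background(n_input_reads=1_000_000, fragment_len=200, genome_size=12_000_000)
lam = local_lambda([1, 5, 2], [3, 1, 1], [2, 2, 2], background=2.5)
p_values = [poisson_upper_tail(k, l) for k, l in zip([10, 4, 3], lam)]
```

Taking the maximum at each position means that a region is only scored as enriched if the IP signal exceeds the most conservative of the available background estimates.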

Differential analysis
Differential analysis compares signal at peak regions between different datasets. To do this, the peaks called in all Top2/Uls1 datasets must first be combined into a single list of the regions at which differential analysis will be performed. This is done using bedtools intersectBed and mergeBed (Quinlan and Hall, 2010).
For example, for Uls1 ChIP the following commands were used:
intersectBed -a $
A bedgraph file is then generated containing signal only at the peak regions (all_peaks_merged.bed) using bedtools intersectBed (-wb) (Quinlan and Hall, 2010).
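The pooling of peak calls from several datasets into one non-overlapping region list can be sketched in Python as follows. This is an illustration of the merging logic only, not the bedtools commands used here; the function name and the example coordinates are hypothetical, and intervals are assumed to be BED-style (chrom, start, end) tuples with half-open coordinates.

```python
def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals."""
    merged = []
    for chrom, start, end in sorted(intervals):
        same_chrom = merged and merged[-1][0] == chrom
        if same_chrom and start <= merged[-1][2]:
            # Extend the previous region instead of starting a new one.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]

# Peaks pooled from two hypothetical datasets.
pooled = [("chrI", 100, 200), ("chrI", 150, 300), ("chrI", 400, 500), ("chrII", 100, 200)]
all_peaks_merged = merge_intervals(pooled)
```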
A boxplot of all coverage and peak signal is then plotted using an R script, and datasets are compared pairwise using Cohen's d (Cohen, 1988) to assess the size of any difference between them: small (>0.2), medium (>0.5) or large (>0.8). This used the following code:  V4, range=0, names=c("all_s1", "all_s2", "peaks_s1", "peaks_s2
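For reference, Cohen's d for two samples can be computed as the difference in means divided by the pooled standard deviation. The sketch below is written in Python rather than R and is not the script used here; the function names and the "negligible" label for values at or below 0.2 are assumptions added for illustration.

```python
import statistics

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of two samples."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_sd = (((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

def effect_size_label(d):
    """Classify |d| with the thresholds quoted above."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "medium"
    if size > 0.2:
        return "small"
    return "negligible"  # label assumed for illustration

# Toy signal values for two datasets at the same peak regions.
d = cohens_d([2, 4, 6], [1, 3, 5])
label = effect_size_label(d)
```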

Peak annotation
Peaks were annotated to our W303 annotation using bedtools closest (Quinlan and Hall, 2010).
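Conceptually, this step assigns each peak to the nearest annotated feature on the same chromosome. A simplified Python sketch of that idea is given below; it is not bedtools closest, the function name and feature names are hypothetical, and distance is taken here as the gap in bp between intervals (0 for overlapping or abutting intervals), which may differ from the distance convention that bedtools reports.

```python
def closest_feature(peak, features):
    """Return the feature on the same chromosome with the smallest gap to the peak."""
    chrom, start, end = peak
    candidates = [f for f in features if f[0] == chrom]
    if not candidates:
        return None
    # Gap is 0 when intervals overlap, otherwise the distance between nearest edges.
    return min(candidates, key=lambda f: max(0, f[1] - end, start - f[2]))

# Hypothetical annotation entries: (chrom, start, end, name).
annotation = [
    ("chrI", 100, 500, "geneA"),
    ("chrI", 1150, 1500, "tRNA1"),
    ("chrII", 1000, 1100, "geneB"),
]
nearest = closest_feature(("chrI", 1000, 1100), annotation)
```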

Transcription level analysis
Genic peaks were annotated with their gene transcription level by comparing to a published dataset for S288C where RNA-seq was carried out on cells before and after exposure to hypoxia. We used the dataset at the 0 hour timepoint, where cells had been grown to mid-log phase (Bendjilali et al., 2016).
Within this dataset we discarded all genes to which 0 reads mapped and selected only chromosomal ORFs (not within the 2μ plasmid/mitochondrial genome or corresponding to special RNA structures which had no corresponding ORF). The RPKM values for the remaining genes were ranked from low to high and annotated as being the "bottom 20%" (1280 genes), "mid 60%" (3829 genes) or "top 20%" (1278 genes) in terms of transcription levels. Excel was then used to annotate each genic peak within a dataset with its transcription level using the OFFSET and MATCH functions.
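The ranking step can be sketched in Python as below. This is an illustration rather than the Excel procedure used here: the function name and gene names are hypothetical, the input is assumed to have already been filtered as described, and the simple rounding of the 20% cut-offs will not necessarily reproduce the exact group sizes quoted above.

```python
TAIL_FRACTION = 0.2  # fraction of genes placed in each of the bottom and top groups

def label_by_rank(rpkm_by_gene):
    """Label each gene as bottom 20%, mid 60% or top 20% by RPKM rank."""
    ranked = sorted(rpkm_by_gene, key=rpkm_by_gene.get)  # low to high
    n = len(ranked)
    n_tail = round(n * TAIL_FRACTION)
    labels = {}
    for i, gene in enumerate(ranked):
        if i < n_tail:
            labels[gene] = "bottom 20%"
        elif i >= n - n_tail:
            labels[gene] = "top 20%"
        else:
            labels[gene] = "mid 60%"
    return labels

# Ten hypothetical genes with RPKM values 1..10.
toy_rpkm = {f"gene{i}": float(i + 1) for i in range(10)}
toy_labels = label_by_rank(toy_rpkm)
```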

Transcription start site analysis
Analysis was carried out using the R-based packages rtracklayer, ChIPseeker and GenomicFeatures (Lawrence, 2013; Lawrence et al., 2009; R Core Team, 2018; Yu et al., 2015). The input files for this analysis were our MACS2 peak files.

Repetitive region analysis
This analysis uses unfiltered reads, i.e. reads that map to multiple regions have not been removed.
Signal height cannot be used here to decide whether the ChIPed protein is bound, because the more repetitive the sequence, the higher the signal will be. However, if a pairwise comparison of two datasets within a repetitive region shows significantly higher enrichment in one dataset than in the other, it can reasonably be inferred that the ChIPed protein is enriched in the "higher" dataset.
Analysis was carried out at subtelomeric loci (within 5kb of each chromosome end), tRNA genes (as within genome reference), TKP/TY transposons (as within genome reference), Y' elements (as within genome reference) and the rDNA locus. We identified a partial fragment of the rDNA locus within our W303 reference by using blastn (Altschul et al., 1990) to search for the sequence of a single rDNA repeat from chromosome 12 of the S288C reference (Engel et al., 2013), aligning it to the sequence of chromosome 12 within our W303 reference. According to SGD (Stanford University, 2012), the sequence of a single rDNA repeat within the S288C reference is on Chromosome 12 at 459,797-468,931. The top hit from this blast search within our W303 reference is on Chromosome 12 at 478,018-478,813, containing the sequence of part of the 35S rDNA locus.
First, a .bed file was generated containing the coordinates of all the regions within one repetitive class, for instance the 141 tRNA genes within our W303 reference. This was made within Excel and saved as a tab-delimited file. Data tracks were then generated with signal extracted at each repetitive region using intersectBed (Quinlan and Hall, 2010), together with a peak list containing the SUM score under each peak in each dataset, using the following commands for each dataset:
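As an illustration of what a SUM score per region represents, the following minimal Python sketch sums bedGraph signal over one region. It is not the bedtools invocation used here: the function name and example values are hypothetical, and it assumes that the score is the per-base signal summed across the region, so that each bedGraph interval is weighted by the number of bases overlapping the region.

```python
def sum_score(region, bedgraph):
    """Sum of per-base signal across one (chrom, start, end) region."""
    chrom, start, end = region
    total = 0.0
    for b_chrom, b_start, b_end, value in bedgraph:
        if b_chrom != chrom:
            continue
        overlap = min(end, b_end) - max(start, b_start)
        if overlap > 0:
            total += value * overlap  # weight signal by overlapping bases (assumption)
    return total

# Hypothetical bedGraph rows: (chrom, start, end, signal).
track = [("chrI", 50, 150, 2.0), ("chrI", 150, 250, 1.0), ("chrII", 100, 200, 5.0)]
score = sum_score(("chrI", 100, 200), track)
```

Because both datasets in a pairwise comparison are scored over identical coordinates, the inflation of signal caused by repetitiveness affects both equally, and only the difference between datasets is interpreted.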

Top2 vs Uls1 comparative analysis
This analysis allows comparison of Top2 and Uls1 ChIP data at all regions defined as peaks in either Uls1 ChIP or Top2 ChIP.
First, two merged peak lists are generated, one for Top2 and one for Uls1, by concatenating the peak files for each protein. Next, the Top2 and Uls1 peak lists are compared using intersectBed (Quinlan and Hall, 2010), with the following commands:
intersectBed -a $ 3n unique_${PEAK2}_temp.bed > unique_${PEAK2}.bed
where: PEAK#= Merged peak file for ChIP to be compared (e.g. all Top2 peaks, or all Uls1 peaks)
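The comparison amounts to splitting each merged peak list into peaks that overlap a peak in the other list and peaks that are unique to it. A minimal Python sketch of that logic follows; it is not the bedtools commands used here, the function names and coordinates are hypothetical, and any overlap of at least 1bp is assumed to count as shared.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least 1bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def split_shared_unique(peaks_a, peaks_b):
    """Split peaks_a into those overlapping any peak in peaks_b and those that do not."""
    shared, unique = [], []
    for peak in peaks_a:
        if any(overlaps(peak, other) for other in peaks_b):
            shared.append(peak)
        else:
            unique.append(peak)
    return shared, unique

# Hypothetical merged peak lists.
top2_peaks = [("chrI", 100, 200), ("chrI", 500, 600)]
uls1_peaks = [("chrI", 150, 250)]
shared_top2, unique_top2 = split_shared_unique(top2_peaks, uls1_peaks)
```

Running the function a second time with the arguments swapped gives the corresponding shared and unique lists for the other protein.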