Crystal structure of the Tof1-Csm3 (Timeless-Tipin) fork protection complex

The Tof1-Csm3 fork protection complex has a central role in the replisome – it promotes the progression of DNA replication forks and protects them when they stall, while also enabling cohesion establishment and checkpoint responses. Here, I present the crystal structure of the Tof1-Csm3 complex from Chaetomium thermophilum at 3.1 Å resolution. The structure reveals that Tof1 is an extended alpha-helical repeat protein which is capped at its C-terminal end by Csm3, a small helical bundle protein. I also characterize the DNA binding properties of the complex and a cancer-associated peptide-binding site. This study provides the molecular basis for understanding the functions of the Tof1-Csm3 complex, its human orthologue the Timeless-Tipin complex and additionally the Drosophila circadian rhythm protein Timeless.


Introduction
As well as error-free DNA synthesis, the eukaryotic replication fork must coordinate processes such as establishing chromosome cohesion, activating the S phase checkpoint and transferring epigenetic material. Furthermore, the polymerases and helicases which form the core of the replication machinery are regulated to couple unwinding and synthesis, protect the fork when it is blocked, and integrate external signals. The coordination of all of these processes is required to maintain genome stability [1]. This becomes especially important in conditions of replication stress, which is a common feature of cancer cells [2,3]. Indeed, many non-essential replisome-associated proteins are upregulated and become essential in cancer [4-6].
The Tof1-Csm3 fork protection complex was identified as a non-essential chromosome cohesion factor [7,8] that interacts with Topoisomerase 1 [9] to link concatenation and fork regulation [10]. Additionally, it has an important role in protecting stalled forks and enabling their restart [11-13], coupling the replicative helicases and polymerases [14], promoting fork progression [15,16], mediating the S phase checkpoint [17,18] and maintaining genome stability at CAG repeats [19]. The mammalian orthologue of Tof1-Csm3 is the Timeless-Tipin complex [20], which has similar functions [21-24] and additionally regulates the fork in response to oxidative stress [25]. Timeless also has a PARP1 binding (PAB) domain at its C-terminus that is important for double-strand break repair [26,27], and interacts with RPA through the C-terminus of Tipin [28,29]. Timeless was first discovered as a circadian rhythm regulator in Drosophila [30]. However, Drosophila also contains a homologue of mammalian Timeless, known as Tim2, which is likely to be the true orthologue of Tof1 [31]. Nevertheless, it has been suggested that mammalian Timeless links the circadian rhythm with DNA replication [32,33].
A lack of structural information has precluded progress in understanding the specific role that Tof1-Csm3 plays at replication forks, and how this relates to the diverse phenotypes resulting from its mutation. A crystal structure of the N-terminal domain of Timeless shows that it forms an Armadillo repeat protein [34], while 2D cryo-EM classes of the yeast replisome show that Tof1-Csm3 binds in front of the fork to stabilize incoming DNA [35]. Here, I present the structure of the Chaetomium thermophilum Tof1-Csm3 complex. The structure reveals that the complex is folded as a single unit, with Csm3 forming an alpha-helical bundle that caps the Armadillo repeats of Tof1. This suggests a structural role for this complex at the fork. The crystallographic packing in my structure reveals a peptide-binding patch that is affected by a cancer-associated mutation in human Timeless, and I map a double-stranded DNA binding activity to a minimal Tof1-Csm3 complex. The structure also enables sequence alignment of Tof1 with human and Drosophila Timeless, clarifying the similarity of their structures and the differences in their functions.

Structure of the Tof1-Csm3 complex
To determine the structure of the Tof1-Csm3 complex, multiple constructs from the mildly thermophilic fungus Chaetomium thermophilum were screened for expression and purification. Crystals were obtained for many of these (Fig. S1A), but only formed in customized crystallization screens designed for challenging complexes. These screens are detailed in Figure S1B, and were also previously used to crystallize another complex [36]. The only construct resulting in diffracting crystals was Tof1(1-728)ΔL1,2,3-Csm3(48-157). Here, three predicted disordered loops were deleted from Tof1: L1 (256-363), L2 (420-434) and L3 (558-585) (Fig. S1A). These crystals diffracted anisotropically to 3.1 Å (Table 1), and the dataset could be solved by molecular replacement using the N-terminal fragment of Timeless (PDB 5MQI) [34], with the remaining half of the protein manually built, exploiting molecular dynamics force-field refinement [37] and contact prediction [38]. A final Rwork/Rfree of 23.7/26.4% was achieved. The entire structure is well resolved, aside from the N-terminal portion of Csm3 and two small loops in Tof1 (Fig. S2). The main crystal contacts occur at the N-terminus of Tof1, and thus Csm3 and the very C-terminus of Tof1 have relatively high B factors and lower map quality (Fig. S2C).
The structure reveals that, instead of containing distinct domains, the entire complex forms an extended alpha-helical repeat protein (Fig. 1A). Tof1 begins with two helices linked by a beta-hairpin, followed by eight 3-helix Armadillo repeats (Arm1-8). The previous Timeless N-terminal domain structure is a fragment of this structure ending after Arm-5. For this fragment, Tof1 and Timeless are clearly highly structurally related (Fig. 1B). Csm3 further extends the alpha-helical repeat structure by forming a five-helix bundle which packs on the C-terminus of Tof1. Csm3 shows some similarity to a tetra-helical bundle helix-turn-helix fold, but the first helix is broken into two (Fig. 1C). The closest structural homologue identified by PDBeFold [39] is the DNA binding domain of the small terminase from bacteriophage SF6 (Fig. 1C) [40]. However, in contrast to this domain, the helical bundle of Csm3 is flattened such that it no longer has a hydrophobic core; it therefore does not appear to be an independently folded structure, but rather acts as a cap on the C-terminus of Tof1 (Fig. 1C). The interface between the two is largely hydrophobic with some salt bridges, and involves a large proportion of Csm3 (Fig. 1D), which explains why each protein is required to stabilize the other [41]. Helices α25 and α26 from Arm-8, plus the following helix α27, contain all the interaction sites for Csm3 (Fig. 1D). Tof1 helices α25 and α26 pack predominantly against Csm3 α3. Tof1 helix α27 inserts into the concave structure of Csm3, making hydrophobic interactions with Csm3 helices α2 and α3.
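As a bookkeeping aid, the composition of the crystallized construct can be tallied directly from the residue ranges quoted above. The following is a minimal sketch, not part of the original analysis: the function names are illustrative, and it assumes that each loop was removed cleanly with no replacement linker and that purification tags are not counted.

```python
# Residue ranges as quoted in the text (inclusive numbering).
TOF1_RANGE = (1, 728)
DELETED_LOOPS = {"L1": (256, 363), "L2": (420, 434), "L3": (558, 585)}
CSM3_RANGE = (48, 157)


def span_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def retained_segments(full, deletions):
    """Return the inclusive segments of `full` left after removing `deletions`.

    Assumes the deletions do not overlap and lie within `full`.
    """
    segments = []
    cursor = full[0]
    for start, end in sorted(deletions):
        if start > cursor:
            segments.append((cursor, start - 1))
        cursor = end + 1
    if cursor <= full[1]:
        segments.append((cursor, full[1]))
    return segments


if __name__ == "__main__":
    segments = retained_segments(TOF1_RANGE, DELETED_LOOPS.values())
    tof1_residues = sum(span_length(s, e) for s, e in segments)
    print("Tof1 segments retained:", segments)
    print("Tof1 residues retained:", tof1_residues)
    print("Csm3 residues:", span_length(*CSM3_RANGE))
```

Under these assumptions the three deletions remove 151 of the 728 Tof1 residues, which gives a sense of how much predicted disorder had to be trimmed before diffracting crystals were obtained.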

Relation of Tof1 homologues
With the structure of the fungal Tof1-Csm3 complex, it is possible to gain insight into the structures of the orthologous protein Timeless and its homologue, the circadian rhythm regulator Drosophila Timeless (CR-Timeless) (Fig. 2). From this analysis it is immediately clear that the Timeless protein from different eukaryotes has the same overall fold, with the hydrophobicity profile of all helices up to α26 conserved. Interestingly, helix α27, the major Csm3/Tipin interaction site, is very highly conserved in all the DNA replication Tof1/Timeless proteins but absent in the Drosophila CR-Timeless. This is clearly indicative of the separate function of this protein. Furthermore, Loop 1 is highly conserved in the DNA replication Tof1/Timeless proteins (Fig. S3A), but has an entirely different sequence in CR-Timeless. This suggests Loop 1 has an important role in DNA replication.
Interestingly, the presence of an additional small C-terminal portion of Tof1 (residues 728-900) greatly increases the stability of the Tof1-Csm3 complex, despite preventing crystal diffraction ( Timeless. Confirming this assignment, the RaptorX server, which determines the fold de novo by evolutionarily predicted sequence-contact restraints, independently generates a structure from the C. thermophilum sequence which is highly similar to the human PAB domain (Fig. S3C). This assignment is particularly notable because budding yeast contains no PARP proteins, and indeed important PARP1-binding residues are not conserved in the fungal proteins (Fig. S3B). Surprisingly, these residues are conserved in CR-Timeless.

Interactions of the Tof1-Csm3 complex
Other Armadillo repeat proteins, such as β-catenin and importin-α, often bind peptides within the interior of their α-solenoid structure [42,43]. As previously noted, there is a highly conserved cleft within this interior towards the N-terminus of Tof1 [34], lined with basic and hydrophobic residues. In my structure, the purification tag of a symmetry copy occupies this cleft (Fig. 3A). Furthermore, this interaction occurs in all three copies in the asymmetric unit despite the lack of symmetry in the packing, which suggests the patch has a very high propensity for peptide binding (Fig. 3A). Intriguingly, in human Timeless, one arginine lining this pocket, Arg40 (C. thermophilum Lys51), has been found mutated six times to cysteine and once to proline in different cancers in the COSMIC Sanger database [44]. This is the most common cancer-associated missense mutation of Timeless, with the second being Pro1043 (five times), which is a key PARP1-interacting residue (Fig. S3B). This implies Arg40 has functional importance. Additionally, some residues, such as Arg47 and Arg54, are highly conserved in fork protection Tof1/Timeless but not CR-Timeless (Fig. 2B). Considering the positive charge of this pocket, I tested whether it is a functional DNA binding site. In an EMSA, Tof1(long)-Csm3 bound to dsDNA, but not ssDNA, with modest affinity (Fig. 3B). However, a quadruple alanine mutation of the peptide-binding patch (K50/R54/R98/R173) had no effect on this activity (Fig. 3B), suggesting that this patch has another function. As this N-terminal site is not involved in DNA binding, I next tested whether the other positively charged region of the complex, the C-terminus of Tof1 with Csm3, harbored a DNA binding site (Fig. 3C). I was able to generate a stable minimal complex of Tof1-Csm3 containing only helices α22-28 of Tof1 with Csm3 (Fig. S1A). Interestingly, this minimal complex retained dsDNA binding activity (Fig. 3B) together. Importantly, I show that the Tof1-Csm3 and Timeless-Tipin complexes are highly related (Fig. 1B, 2), and thus likely play the same role at the fork, while the Drosophila CR-Timeless has the same overall fold but clear differences in sequence related to its separate role (Fig. 2)  studies which artificially truncated this fold. This structure suggests the protein has a scaffolding or mechanical role at the fork, which is presumably further regulated by the large intrinsically disordered regions in both Tof1 and Csm3. A recent 2D cryo-electron microscopy study has shown that Tof1-Csm3 binds ahead of the replicative helicase and reduces the flexibility of incoming DNA [35]. Such a mechanical function would fit well with my structure, given its rigid architecture and dsDNA binding properties. My structure also reveals a highly conserved peptide-binding patch that may interact with a partner protein such as the replicative helicase [14] or Mrc1/Claspin [11,47]. Intriguingly, one arginine in this binding patch has been detected as a cancer-associated mutation. Normally, Timeless is overexpressed in cancer, and cells become dependent on the protein to combat cancer-caused replication stress [6]. Future studies will be able to ascertain whether this mutation has a positive or negative effect on Timeless activity.
Overall, my structure provides a basis for understanding the interactions, mutations and function of both the fork protection complex and the circadian rhythm regulator Timeless protein at a molecular level.

Conflict of interest
The author declares no conflict of interest.

Acknowledgements
I would like to thank beamline scientists and support staff at the EMBL-operated PETRA III beamline P14 at DESY. This work was funded by DFG grant GR5152/3-1 and a postdoctoral fellowship from the Alexander von Humboldt Foundation.

5'CTGCGGTTCGTTCTCCGATCGG
Expression and purification of Tof1-Csm3 constructs
All Tof1-Csm3 constructs were expressed and purified using the same method. The appropriate plasmid was transformed into BL21(DE3)Star (Novagen) cells and grown in media supplemented with 50 µg/ml kanamycin. Large terrific broth expression cultures were inoculated 1 in 100 with an overnight starter culture and grown at 30 °C with shaking at 200 rpm until they reached an A600 of 0.5. The temperature was reduced to 17 °C, and cultures were induced with 0.4 mM IPTG overnight, followed by harvesting by centrifugation and storage of bacterial pellets at -80 °C until use. Thawed pellets were resuspended in 50 mM Tris-HCl pH 8.0, 500 mM NaCl, 1 mM TCEP, 10 mM imidazole, 1 EDTA-free protease inhibitor tablet/50 mL (Roche), and 30 U/mL DNase I. Lysis was performed by two passages through a Microfluidics M-110P microfluidizer at 150 MPa. The lysate was cleared by centrifugation for 1 hour at 60,000 × g and then loaded on a 5 mL HisTrap FF column (GE Healthcare) equilibrated in 50 mM Tris-HCl pH 8.0, 500 mM NaCl, 1 mM TCEP, using an ÄKTA Purifier FPLC system. The column was then washed with 16 column volumes of 25 mM imidazole in the same buffer, and then eluted with a 10 column volume gradient of 25-250 mM imidazole.
The data were collected at 100 K and at a wavelength of 0.980 Å at beamline P14 of the EMBL-operated PETRA III ring at DESY. The data were integrated using XDS [53], and then merged either using Aimless [54] to 3.6 Å or using the STARANISO server [55] to 3.09 Å (Table 1).
A homology model of C. thermophilum Tof1 residues 1-488 was generated from the structure of the corresponding region of human Timeless [34] (PDB 5MQI) using the SWISS-MODEL server [56]. From this, all sidechains were truncated to alanine and all loops deleted. Three copies were found using the STARANISO-processed data and PHASER [57] with a TFZ score of 11.1. Initial refinement was performed using Phenix Refine [58] and the non-corrected data to 3.6 Å. Helices were manually placed in the density, and Phenix Autobuild [59] was run occasionally to reduce model bias. A continuation of the alpha-helical repeat structure from the Timeless fragment was clear, and so this was exploited for sequence and topology assignment. The sequence was too short for the last four helices and so these were assigned to Csm3. Once the model was largely complete, refinement was continued using Buster [60,61]  Csm3 copy was the best defined and used for all structural figures, unless otherwise stated, and these were generated using PyMOL [63].

Electrophoretic Mobility Shift Assays
The ssDNA oligonucleotide had the following sequence: 5'-Cy3-GTAGTTTGTACTGGTGACGA. The dsDNA substrate was generated by mixing this 1:1 with the complementary oligonucleotide 5'-TCGTCACCAGTACAAACTAC, melting at 95 °C for 2 minutes, and then slow cooling at room temperature to anneal. DNA substrates were used at a final concentration of 50 nM. Samples were prepared in HBS-200 with an additional 10% glycerol, and incubated on ice for 30 minutes. Samples were loaded on a 6% polyacrylamide gel and run at 70 V for 50 minutes in a Tris-glycine buffer system. After running, gels were scanned with a Pharos FX fluorescence imaging system (Bio-Rad) using excitation/emission wavelengths of 532/605 nm.
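The complementarity of the two oligonucleotides can be checked computationally before annealing. The following is a minimal sketch rather than part of the original protocol: the function names are illustrative, the Cy3 label is omitted from the sequence, and only the four standard bases are handled.

```python
# Oligonucleotide sequences as given in the text, written 5'->3'.
# The 5' Cy3 label on the first strand is not represented here.
SS_OLIGO = "GTAGTTTGTACTGGTGACGA"
COMPLEMENT_OLIGO = "TCGTCACCAGTACAAACTAC"

# Watson-Crick pairing table for the four standard DNA bases.
_PAIRING = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_PAIRING)[::-1]


def forms_blunt_duplex(top, bottom):
    """True if `bottom` pairs with `top` along its full length with no overhangs."""
    return reverse_complement(top) == bottom.upper()


if __name__ == "__main__":
    print("Length (nt):", len(SS_OLIGO))
    print("Reverse complement:", reverse_complement(SS_OLIGO))
    print("Blunt duplex:", forms_blunt_duplex(SS_OLIGO, COMPLEMENT_OLIGO))
```

For the sequences listed, the check confirms that the two 20-nucleotide strands are fully complementary, so the annealed substrate is expected to be a blunt-ended 20 bp duplex.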