CryoEM structures of human CMG-ATPγS-DNA and CMG-AND-1 complexes

DNA unwinding in eukaryotic replication is performed by the Cdc45-MCM-GINS (CMG) helicase. Although the CMG architecture has been elucidated, its mechanism of DNA unwinding and replisome interactions remain poorly understood. Here we report the cryoEM structure at 3.3 Å of human CMG bound to fork DNA and the ATP-analogue ATPγS. Eleven nucleotides of single-stranded (ss) DNA are bound within the C-tier of MCM2-7 AAA+ ATPase domains. All MCM subunits contact DNA, from MCM2 at the 5′-end to MCM5 at the 3′-end of the DNA spiral, but only MCM6, 4, 7 and 3 make a full set of interactions. DNA binding correlates with nucleotide occupancy: five MCM subunits are bound to either ATPγS or ADP, whereas the apo MCM2-5 interface remains open. We further report the cryoEM structure of human CMG bound to the replisome hub AND-1 (CMGA). The AND-1 trimer uses one β-propeller domain of its trimerisation region to dock onto the side of the helicase assembly formed by Cdc45 and GINS. In the resulting CMGA architecture, the AND-1 trimer is closely positioned to the fork DNA while its CIP (Ctf4-interacting peptide)-binding helical domains remain available to recruit partner proteins.

The CMG tracks along the leading-strand template in the 3′-to-5′ direction (Fu et al., 2011), with the N-tier ring of MCM2-7 at the leading edge of the advancing helicase (Georgescu et al., 2017).
Strand separation is proposed to be achieved by a modified version of steric exclusion, whereby the lagging strand penetrates the N-tier of the CMG before separation (Langston & O'Donnell, 2017).
The mechanism of translocation by which the CMG couples ATP hydrolysis to processive DNA unwinding is the current focus of intense research efforts. Based on structural analysis of bacteriophage, viral and bacterial systems (Enemark & Joshua-Tor, 2006; Gao et al., 2019; Itsathitphaisarn, Wing, Eliason, Wang, & Steitz, 2012; Singleton, Sawaya, Ellenberger, & Wigley, 2000), a consensus has emerged for a sequential rotary mechanism of DNA unwinding by replicative DNA helicases. In this mechanism, ATP is sequentially hydrolysed by successive ring subunits so that each ring position cycles through ATP, ADP and apo states. In turn, the ATP state determines allosterically the position of the DNA-binding loops, which adopt a staircase arrangement matching the DNA spiral bound within the ring pore. The sequential hydrolysis of ATP around the ring causes the coordinated motion of the DNA-binding loops, resulting in translocation of the DNA substrate through the ring.
A complicating feature when trying to analyse CMG translocation is that, unlike the homohexameric helicases of simpler organisms, the MCM2-7 motor of the CMG is a hetero-hexamer of six related but distinct subunits (Bochman, Bell, & Schwacha, 2008). Indeed, biochemical measures of DNA unwinding by purified fly CMG showed that ATP binding and hydrolysis are not equally important at all MCM ring interfaces (Eickhoff et al., 2019; Ilves, Petojevic, Pesavento, & Botchan, 2010). Furthermore, biological evidence in yeast shows that the importance of DNA binding is different among MCM subunits (Lam et al., 2013; Ramey & Sclafani, 2014). Recent cryoEM analyses of yeast CMG have led to the proposal of alternative translocation mechanisms, based on 'pumpjack' or 'inchworm' movements of the N- and C-tier of the MCM ring (Abid Yuan et al., 2016). A recent structural study of the fly CMG in conditions of DNA-fork unwinding (Eickhoff et al., 2019) imaged four distinct states of the helicase; the states formed the basis for an asymmetric model of DNA unwinding that accounted for the different roles of the MCM2-7 subunits in translocation.
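As a purely illustrative sketch of the sequential rotary scheme described above, the toy model below treats the ring as six positions whose nucleotide states shift by one position per hydrolysis event. The starting pattern, the state labels and the function names (`advance`, `states_seen_at`) are assumptions made for illustration only; the model carries no structural or kinetic information.

```python
from collections import deque

# Hypothetical starting pattern for a six-position ring (illustrative only):
# four ATP-bound sites followed by one ADP-bound site and one empty (apo) site.
INITIAL_STATES = ["ATP", "ATP", "ATP", "ATP", "ADP", "apo"]


def advance(states, steps=1):
    """Return the ring after `steps` sequential events.

    Each event shifts the whole nucleotide pattern by one position, so that
    any given position moves through ATP, then ADP, then apo, then ATP again.
    """
    ring = deque(states)
    ring.rotate(-steps)
    return list(ring)


def states_seen_at(position, states):
    """List the states visited by one ring position over a full cycle."""
    return [advance(states, step)[position] for step in range(len(states))]


if __name__ == "__main__":
    for step in range(len(INITIAL_STATES) + 1):
        print(step, advance(INITIAL_STATES, step))
```

In this idealised picture the ring returns to its starting pattern after six events, and every position visits each of the three states once per cycle.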
The critical insights provided by these initial landmark studies have not been sufficient to settle the important issue of the mechanism of DNA translocation by the CMG, and therefore further structural investigations are needed. It is especially important to obtain high-resolution cryoEM maps that will allow the determination of accurate atomic models of the helicase bound to fork DNA substrates, to elucidate unambiguously key aspects of the mechanism of translocation on DNA such as the protein-DNA interface and the geometry of the ATP-binding sites. Equally important is to obtain high-resolution information on the interactions of the CMG with other core replisome components. Furthermore, published structural analyses focused on CMGs from simpler model systems such as yeast or Drosophila, and no structural evidence is currently available for vertebrate CMG.
Here we report the cryoEM structure at 3.3 Å of human CMG bound to a fork DNA substrate in the presence of ATPγS. We also present the cryoEM structure of human CMG bound to AND-1, a core replisome component that acts as a platform for recruitment of replisome components to the replication fork. Unique features captured in our structures provide insights into DNA translocation and formation of larger replisome assemblies by the human CMG helicase.

Expression and purification of human CMG-ATPγS-DNA
To maximise our chances of producing correctly assembled human CMG, we used transient transfection of suspension-free HEK293 cells with a plasmid system encoding all 11 subunits of the CMG assembly. After co-expression of MCM2-7, Cdc45 and GINS, human CMG was purified by Ni²⁺- and Streptactin-affinity chromatography (Supplementary figure 1A). A large endogenous protein that co-purified at sub-stoichiometric levels with the CMG over the two-step purification was identified as AND-1, a known replisome component. AND-1 co-purification indicates a tight constitutive association with the CMG in the human replisome, in agreement with the known association of AND-1's orthologue Ctf4 with the yeast CMG (Gambus et al., 2006).
To capture a high-resolution snapshot of human CMG poised to translocate on a fork DNA substrate, we decided to use the ATP analogue ATPγS. Streptactin-bound CMG was incubated with buffer containing a fork DNA substrate and ATPγS before elution with desthiobiotin (Supplementary figure 1B). The DNA consisted of a 40 bp duplex region with 30 nt tails and resembled closely a fork DNA that had been designed to measure CMG's helicase activity, with a 3′ poly-dT tail for helicase loading in the correct orientation for fork unwinding and a 5′ GC-rich tail that inhibits helicase binding (Petojevic et al., 2015).
CryoEM data were collected on a Titan Krios operating at 300 keV using a K2 Summit detector and processed with RELION-3 (Scheres, 2012). After 2D and 3D classification and refinement, we obtained a 3.29 Å map of CMG-DNA-ATPγS from a set of 213,527 particles, and a 3.41 Å map of the CMG C-tier, comprising the ring of AAA+ ATPase domains, after masking of the N-tier (Supplementary figures 2 and 3). Both maps were used to build a molecular model of CMG-ATPγS-DNA. The excellent quality and high resolution of the map allowed an accurate description at atomic level of the protein-DNA interface and ATP-binding sites of the human CMG (Figure 1A).

Overall structure
The 11-subunit assembly of the human CMG shows the familiar architecture first demonstrated for the yeast and Drosophila CMG (Costa et al., 2011; Georgescu et al., 2017): a two-tiered ring of MCM2-7 proteins, with the Cdc45 and GINS coactivators bound together to the N-tier portion of MCM2, MCM5 and MCM3, so that Cdc45 faces the MCM2-5 interface of ATPase domains (Figure 1B). Each AAA+ ATPase domain contains a nucleotide-binding site at the subunit interface in the C-tier ring, and interacts with DNA via two β-hairpin loops named presensor-1 (PS1) and helix-2 insert (H2I) (Iyer, Leipe, Koonin, & Aravind, 2004) that line the pore of the C-tier ring (Supplementary figure 4A). Flexible anchorage between MCM subunits is provided by a domain-swapped helix in each ATPase domain, which tethers each MCM to its neighbouring subunit (Supplementary figure 4B).
A continuous chain of eleven thymidine nucleotides is bound in a right-handed B-form spiral within the C-tier channel of ATPase domains. The single-stranded (ss) DNA contacts all six MCM subunits, from MCM2 with its 5′-end to MCM5 with the 3′-end, and thus traverses all MCM interfaces except MCM2-5 (Figure 2A, B). No clear density is visible within the N-tier of the CMG for either single-stranded DNA or the double-strand portion of our fork DNA substrate. The likely explanation for this observation is that the slowly-hydrolysable ATPγS nucleotide has permitted the engagement of the CMG helicase with the leading-strand portion of the fork DNA substrate but prevented its translocation to the ss-dsDNA nexus.
Three of the six MCM interfaces in the ring (MCM6-4, MCM4-7 and MCM7-3) are bound to ATPγS, whereas the MCM3-5 and MCM2-6 interfaces contain ADP as the product of ATPγS hydrolysis and the MCM2-5 interface is empty (Figure 2B). In accordance with the apo status of MCM5, the MCM2-5 gate remains ajar and the MCM2-7 C-tier ring adopts a shallow right-handed spiral conformation (Supplementary Figure 5).
In addition to their N- and C-tier domains, each MCM subunit contains a smaller C-terminal winged helix (WH) domain. The WH domains of MCM2 and MCM6 were well resolved in the focused C-tier map and could therefore be modelled in the density (Figure 1B). Density for the WH domain of MCM5 could be identified in the lumen of the C-tier pore, but was not of sufficient quality to allow modelling. The similar MCM2 and 6 WH domains sit on the rim of the C-tier and interact with each other with approximate two-fold symmetry. The first 14 proline-rich amino acids of the MCM3 isoform used in our study bind at the interface between the MCM3 N-tier and the GINS Psf3, likely extending and stabilising the GINS-MCM ring interface (Supplementary figure 6).

DNA binding
In the structure, the ssDNA is embedded within the pore of the C-tier ring ( The role of the invariant PS1 lysine is remarkably similar to that of K506 of the E1 papillomavirus replicative DNA helicase (Enemark & Joshua-Tor, 2006). The ssDNA is kept in close contact with each MCM subunit by two hydrogen bonds between phosphates of the second and third nucleotide in each binding site and main-chain nitrogens of the PS1 residue after the invariant lysine and of a first-strand residue in the H2I hairpin (Figure 3).
Besides these polar contacts, the protein-DNA interface has substantial hydrophobic character: small aliphatic side chains of valine and alanine in both H2I and PS1 loops pack against the ribose-phosphate backbone of the DNA, creating a continuous hydrophobic surface in the C-tier pore that matches the spiral of the DNA (Figure 3). In addition to making extensive contacts with the DNA backbone, the MCM subunits use the H2I loop to interact with the bases: a pair of conserved H2I residues, consisting of a basic and an aromatic/hydrophobic amino acid six residues apart ([+]x₆[Ω/Ψ] motif; +, basic; Ω, aromatic; Ψ, large aliphatic) contact the third and fourth thymidine in each binding site (Figure 3  . This local helical folding is driven by anti-parallel β-strand pairing of the two residues preceding the serine with the second β-strand of the PS1 loop in the preceding MCM subunit. This intersubunit interaction helps merge the H2I and PS1 loops of individual MCMs into a continuous DNA-binding staircase that extends around the pore in the C-tier, as noted recently for the archaeal homo-hexameric MCM ring (Meagher, Epling, & Enemark, 2019). As expected for the MCM subunit at the bottom of the staircase, H2I αN is disordered in MCM5.

ATP binding and hydrolysis
In the structure, five of the six ATP-binding sites in the MCM2-7 ring are occupied by a nucleotide (Figure 4  The ATP status of the ADP-bound MCM2 subunit appears anomalous given its position at the top of the ring staircase. Several indicators point to MCM2 acting as a 'seam subunit' (Eickhoff et al., 2019) that has only partially engaged with the rest of the C-tier ring: the smaller interface area with MCM6 (1391 Å², instead of ~2000 Å² for the other nucleotide-occupied MCM interfaces), the disordered conformation of its DNA-binding element H2I αN, and the higher B value attained during real-space refinement. The ADP moiety of MCM2 is also unusual in the way it sits in the P-loop, as the nucleotide is shifted so that its β-phosphate occupies the position occupied by the γ-phosphate in the ATPγS-bound interfaces (Supplementary figure 11).
Overall, the observations relative to ATP status and DNA binding in the MCM C-tier ring are consistent with a sequential rotary mechanism of ATP hydrolysis as the basis for translocation of the human CMG. The structural integrity of the C-tier during translocation is provided by a domain-swapped helix (Supplementary figure 4) that provides a flexible tether between contiguous MCM subunits, in a fashion similar to that recently described for the replicative gp4 DNA helicase (Gao et al., 2019).

Interaction with AND-1
Analysis of purified human CMG overexpressed in HEK293 cells revealed the co-purification of sub-stoichiometric amounts of endogenous AND-1, a known replisome factor and the human orthologue of yeast Ctf4 (Supplementary figure 1). Our previous work had shown that yeast Ctf4 acts as a recruitment hub for replisome proteins, tethering multiple factors at the fork via its trimeric structure (Simon et al., 2014; Villa et al., 2016). AND-1 shares its oligomeric nature with Ctf4, although it appears to have a distinct mechanism of binding to its replisome partner, Pol α/primase (Kilkenny et al., 2017).
Co-purification of endogenous AND-1 indicated a strong constitutive interaction with the human CMG complex. We therefore decided to co-express AND-1 together with the components of the human CMG, and succeeded in purifying a 12-subunit CMG assembly which we refer to here as CMGA (Supplementary figure 12). CryoEM analysis of CMGA using Warp for particle picking (Tegunov & Cramer, 2019) and cryoSPARC for image reconstruction (Punjani, Rubinstein, Fleet, & Brubaker, 2017) yielded a 6.77 Å map that was readily interpretable and permitted the unambiguous docking of the high-resolution structure of human CMG and the crystal structure of the AND-1 trimer that we reported earlier (Kilkenny et al., 2017) (Supplementary figure 13).
The structure shows that the disk-shaped AND-1 trimer docks edge-on at a near perpendicular angle onto the leading face of the CMG (Figure 6). Despite AND-1 being full-length, only the SepB-like domain of AND-1 is visible in the map, indicating that the N-terminal β-propeller domain and its extended C-terminal portion spanning the HMG box are flexibly oriented in the trimeric structure. Relative to the CMG, the trimeric AND-1 disk is arranged so that its N-terminal segments are located ahead of the fork and in proximity of the parental double-stranded DNA. In contrast, the helical structure of the SepB domain and the relative C-terminal extensions project away from the CMG (Figure 6).
AND-1 binds at the perimeter of the CMG, engaging both Cdc45 and GINS with the first and last blade of one of its β-propeller domains (Figure 7). The CMG-AND-1 interface is formed by the B-domain of Psf2 that projects towards the concave surface formed by blades 1 and 6 of AND-1's β-propeller, as well as the helical portion of Cdc45 that links its two DHH domains. The interface buries only 1087 Å², a surprisingly small area for a constitutive interaction (Figure 7). The limited resolution of our structure is insufficient for unambiguous identification of interface amino acids. However, we can determine that the interface is of mixed hydrophobic and hydrophilic nature, and that the tight binding of AND-1 to the CMG despite the relatively limited interface might be driven by the presence of charge-charge interactions that become solvent-excluded upon CMGA formation.

DISCUSSION
In this paper, we have used cryoEM to capture a high-resolution view of the human CMG bound to a fork DNA substrate in the presence of ATPγS. Our map of human CMG-ATPγS-ssDNA allowed us to visualise unambiguously critical features of the complex, such as its protein-DNA interface and the nucleotide-binding sites, and to represent them in an accurate atomic model. We have also reported an intermediate-resolution structure of the CMGA assembly, which described the mode of interaction of CMG with the core replisome factor AND-1.

DNA binding
In our structure, all six MCM subunits contact ssDNA, spanning a total of 11 nucleotides. The footprint of an MCM subunit on ssDNA covers four nucleotides, rather than two as previously reported (Figure 3) (Eickhoff et al., 2019), with two overlapping nucleotides between neighbouring subunits. Interaction of MCM6, 4, 7 and 3 with DNA takes place via an identical set of contacts mediated by both PS1 and H2I DNA-binding loops (Figure 3). A significant difference is that MCM6 and MCM2 use aromatic residues at a conserved H2I position to unstack the nucleotides at the 5′-end of the DNA and disrupt the B-form DNA. These aromatic residues intermesh with consecutive bases much as the teeth of a cogwheel; such contacts appear well suited to avoid slippage and transmit torque when an MCM subunit engages the leading DNA strand emerging from the N-tier, at the top of the staircase. Overall, the arrangement of PS1 and H2I loops within the C-tier ring follows closely the DNA spiral, lowering steadily their vertical reach from MCM2 at the top of the binding staircase to MCM5 at the bottom (Figure 2).
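The footprint figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below assumes an idealised, uniform register in which each subunit covers four nucleotides and shares two with its neighbour; the function name `nucleotides_spanned` and the uniform-step assumption are ours, not part of the structural analysis.

```python
def nucleotides_spanned(n_subunits, footprint=4, overlap=2):
    """Nucleotides covered by contiguous subunits with overlapping footprints.

    Assumes every subunit covers `footprint` nucleotides and shares `overlap`
    of them with the next subunit (an idealisation of the staircase).
    """
    if n_subunits <= 0:
        return 0
    step = footprint - overlap
    return footprint + step * (n_subunits - 1)


# MCM6, 4, 7 and 3 make the full set of contacts.
full_contact_span = nucleotides_spanned(4)

# Hypothetical upper bound: all six subunits making full contacts.
all_six_full_span = nucleotides_spanned(6)

# Number of ssDNA nucleotides observed in the C-tier pore.
observed_nucleotides = 11

if __name__ == "__main__":
    print(full_contact_span, observed_nucleotides, all_six_full_span)
```

Under these assumptions, the eleven observed nucleotides fall between the span of the four fully engaged subunits and the span expected if all six were fully engaged, which is compatible with MCM2 and MCM5 making only partial contacts at the two ends of the spiral.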
Earlier structural work on yeast and fly CMG had shown that DNA can be bound via two different sets of MCM subunits: MCM6, 4 and 7 (Abid  or MCM2, 3, 5 and 6 (Georgescu et al., 2017; Goswami et al., 2018). A recent cryoEM analysis of fly CMG in the act of translocating on DNA revealed the existence of several different conformational states, which appear to encompass and extend the previously described DNA-bound states of the CMG (Eickhoff et al., 2019). In light of this analysis, our CMG-DNA structure would most likely correspond to state 2B, in which MCM6, 4, 7 and 3 contact DNA and are bound to ATP, with MCM2 in the process of exchanging ADP for ATP and re-engaging with DNA at the 5′-end. Thus, the emerging evidence from this wealth of structural data strongly indicates multiple modes of asymmetric DNA binding in the C-tier ring as a key feature of DNA translocation by the CMG.

ATP site occupancy and hydrolysis
The site occupancy and hydrolysis status of ATPγS in our CMG-DNA structure is in general agreement with the sequential rotary model of ATP utilisation but also reveals some showing that loss of ATP binding by MCM3 and MCM5 caused the largest decrease in ATP hydrolysis rates (Ilves et al., 2010), and with a four-fold reduction in DNA unwinding caused by an 'arginine finger'-to-alanine mutation in fly MCM5 (Eickhoff et al., 2019). Whether the open state of the apo MCM2-5 interface represents a natural intermediate state in the translocation cycle, or rather a stalled or paused state of the helicase, remains to be established.
At any rate, the unexpected indication of ATPγS hydrolysis provides evidence that the CMG has engaged productively with the fork DNA substrate and might have undergone a limited degree of translocation.
The ADP-bound status of MCM2, at the top of the staircase and interacting with the 5′-end of the ssDNA, is apparently inconsistent with a sequential rotary model of ATP hydrolysis. We have already described several lines of evidence indicating that MCM2 behaves as a 'seam' subunit (Eickhoff et al., 2019). An intriguing possibility is that the observed ATPγS hydrolysis by MCM2 might be a sign of translocation with reverse polarity, which could have been prompted by idling of the helicase upon incubation with slowly-hydrolysable ATPγS. Evidence that the CMG can backtrack on ssDNA has been provided by recent single-molecule studies (Burnham, Kose, Hoyle, & Yardimci, 2019).

DNA translocation
The staircasing arrangement of the DNA-binding loops and the ATP-hydrolysis status in the MCM ring are both supportive of a sequential rotary mechanism of DNA translocation for the human CMG. Differences, such as the observed ATP-hydrolysis status of MCM3, with the proposed model of asymmetric translocation (Eickhoff et al., 2019), remain to be explained and might be species-specific.
Our structure captures a high-resolution snapshot of the CMG trapped on fork DNA with ATPγS and does not provide conclusive evidence concerning models of asymmetric translocation. However, insight into asymmetry in MCM2-7 behaviour comes from analysis of the solvation free-energy for formation of the interfaces between contiguous ATPase domains (ΔⁱG) in the C-tier ring, using the EBI PISA server (Krissinel & Henrick, 2007). The analysis shows striking differences in ΔⁱG values among MCM interfaces. Although the three ATPγS interfaces, as well as the ADP-bound MCM3-5 interface, bury similar surface areas (between more than 2-fold higher energy gains and 7-fold lower P-values than for the MCM7-3 ATPγS and the MCM3-5 ADP interfaces (Supplementary figure 14). Thus, the ΔⁱG analysis indicates that the contacts binding together the MCM4-7 and MCM6-4 interfaces are much tighter and more specific than in the other MCM interfaces. These observations might provide a structural basis for the finding that human MCM4, 6 and 7 can be recovered as a stable heterotrimeric complex from HeLa cells (Ishimi, 1997). They further suggest that MCM subunits 4, 6 and 7 might behave as a rigid body in the asymmetric mode of DNA translocation. This would be in agreement with biochemical evidence that loss-of-ATP-binding K-to-A mutations in fly MCM6 and 4 cause only modest reductions in DNA unwinding (Ilves et al., 2010), and that loss of ATP hydrolysis at the MCM6-4 and MCM4-7 interfaces caused by alanine mutation of the 'arginine finger' in MCM4 and 7 has equally modest effects on unwinding (Eickhoff et al., 2019).
The evolutionary invariance of catalytic residues in all six MCM2-7 ATPases represents a challenge for models of asymmetric translocation that consider ATP binding and hydrolysis to be important for a subset of MCM subunits. The following points in this regard can be made: as far as we can determine in our cryoEM map, all catalytic and sensor MCM residues at each nucleotide-bound interface engage correctly with the nucleotide and are therefore potentially capable of catalysis. Furthermore, symmetric translocation is adequate for DNA replication by homo-hexameric DNA helicases in viruses, bacteria and archaea, implying that it represents the evolutionary consensus for DNA translocation. Finally, differentiation of the eukaryotic MCM into six distinct proteins may have evolved to endow the CMG with its unique mode of loading and activation and possibly termination. Asymmetric translocation might therefore represent an adaptation to cope with MCM sequence diversification rather than a process of optimisation. Consequently, the eukaryotic CMG might be capable of considerable mechanistic flexibility, concerning the role played by each of its MCM subunits during translocation.

CMGA
Our cryoEM structure of the human CMGA elucidates the interaction mechanism of the replisome component AND-1 with the CMG. AND-1 does not contact the MCM proteins and binds to the side of the CMG where Cdc45 and GINS are located. Thus, an important role of Cdc45 and GINS, in addition to activating the CMG helicase activity, is to mediate CMG's interaction with AND-1. Only the trimer of AND-1's SepB-like domains is visible in the map, indicating that its N-terminal β-propeller and extended C-terminal region are flexibly arranged relative to its central trimeric structure. AND-1 is docked onto the CMG like a rigid body, with a single β-propeller wedged in between Cdc45 and GINS. This mode of CMG binding is similar to the mode of interaction that was recently described for yeast Ctf4 with the CMG (Yuan et al., 2019), suggesting that the resulting architecture of the CMGA is functionally important and has been evolutionarily preserved.
Because of the high interaction angle of the disk-like AND-1 trimer relative to the plane containing the leading edge of the CMG, the N-terminal β-propeller domains of AND-1, not visible in our structure, would be placed in the trajectory of the parental DNA. This striking feature of the CMGA architecture indicates that AND-1/Ctf4 can in principle contact the parental DNA ahead of the CMG's leading edge of translocation. Clearly, these findings point to further discoveries and unexpected observations that await future structural studies of the eukaryotic replisome.
We had originally reported that Ctf4 can interact with the CMG via a CIP (Ctf4-Interacting Peptide) motif present in the N-tail of yeast Sld5 (Simon et al., 2014), as well as in Ctf4's multiple protein partners (Villa et al., 2016). In the light of our current observations, we believe that the mode of AND-1 interaction with the CMG described here and reported for yeast Ctf4 represents the principal mode of AND-1/Ctf4 recruitment to the replisome. Binding by the CIP of yeast Sld5 might further secure the association of Ctf4 to the yeast CMG, as well as possibly act as a safety mechanism for keeping Ctf4 anchored to the fork when its primary interaction site with the CMG is disrupted. A CMG-binding motif equivalent to the yeast CIP is not found in human Sld5 or any other human CMG components, indicating that the Ctf4-Sld5 CIP interaction is unique to budding yeast.

METHODS
(Q9BRX5) and Sld5 (Q9BRT9) were synthesised using the GeneArt Gene Synthesis service (ThermoFisher). GenBank EAX04367 codes for an MCM3 isoform (853 aa) that contains 45 additional residues at the N-terminus relative to Swissprot P25205.
ORFs were codon optimised for overexpression in human cells and designed with flanking restriction sites for insertion into the ACEMam1 and 2 vectors of the MultiMam transient system (Vijayachandran et al., 2011). MCM4 was encoded with an N-terminal His₈-TEV tag, while Psf2 was encoded with a C-terminal TEV-2xStrepII tag. Multi-cassette constructs encoding MCM4-6-7, MCM2-3-5 and Cdc45-Psf1-Psf2-Psf3-Sld5 were generated making use of the I-CeuI/BstXI sites in the MultiMam vectors. For expression of human CMG-AND-1 (CMGA), the full-length ORF of human AND-1 (IMAGE cDNA clone 6514641) was cloned with an N-terminal 2xStrepII-TEV tag into the ACEMam2 construct expressing MCM2-3-5, while the C-terminal Strep tag fused to the Psf2 gene was removed.
Mammalian tissue culture and protein production. Three hours post-transfection, 4 mM valproic acid (Sigma, #P4543) was added to the cultures to enhance protein expression. Cultures were returned to shaking incubation for four days, before being harvested by centrifugation at 500 × g for 10 minutes at 10 °C. Cell pellets from ~1.2 L cell culture were subsequently resuspended in 40 mL of chilled, sterile PBS supplemented with SIGMAFAST EDTA-free protease inhibitor cocktail (Sigma, #S8830).
Washed cells were again harvested by centrifugation, then snap-frozen in 50 mL centrifuge tubes using liquid nitrogen, and stored at -80 °C.
The filtered sample was applied to a 5 mL HisTrap HP column (GE Healthcare, #17-5248-02), pre-equilibrated with Buffer N and 40 mM imidazole using a peristaltic pump. All subsequent chromatography steps were performed on an ÄKTA Purifier (GE Healthcare). The loaded column was washed twice with 5 column volumes of Buffer N with 40 mM imidazole and bound protein was eluted in reverse flow using Buffer N with 300 mM imidazole. 2 mL fractions corresponding to ~10 mL of the elution peak were pooled and applied to a 1 mL StrepTrap column (GE Healthcare, #28-9075-46) pre-equilibrated with Buffer N with 2 mM DTT. After sample loading, the column was washed with 10 column volumes of Buffer N and 4 column volumes of Buffer S (25 mM Hepes pH 7.5, 50 mM potassium chloride, 5 mM magnesium acetate, 2 mM DTT). The CMG was eluted in reverse flow using Buffer S with 15 mM d-Desthiobiotin (Sigma, #D1411); 0.5 mL fractions corresponding to the ~1.5 mL elution peak were analysed by SDS-PAGE and stored at 4 °C.
To purify the CMG-DNA complex, the CMG purification protocol was followed until the StrepTrap column loading step. After sample loading, the column was washed for 5 column volumes using Buffer N with 2 mM DTT, followed by 5 column volumes of Buffer S. To form the CMG-DNA complex, ~900 µL of 4 µM forked DNA duplex and 100 µM ATPγS (Jena Bioscience, #NU-406-50) in Buffer S was applied to the column at 0.05 mL/min. The column was first washed in reverse flow for 4 column volumes with Buffer S and 100 µM ATPγS, and the CMG-DNA complex was eluted in reverse flow using Buffer S, 100 µM ATPγS and 15 mM d-Desthiobiotin. 0.5 mL fractions corresponding to the ~1.5 mL elution peak were analysed by SDS-PAGE and stored overnight at 4 °C. The peak fraction was used for grid preparation.
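For readers reproducing the on-column complex formation, the quantities implied by the loading step can be worked out directly from the stated volume, concentrations and flow rate. The sketch below is a simple unit conversion; the variable and function names are ours, and the ~900 µL volume is treated as exact for the purpose of the calculation.

```python
# Values quoted in the protocol above (the volume is approximate in the text).
LOAD_VOLUME_UL = 900.0      # DNA/ATPγS mixture applied to the column, in µL
DNA_CONC_UM = 4.0           # fork DNA concentration, in µM
ATPGS_CONC_UM = 100.0       # ATPγS concentration, in µM
FLOW_RATE_ML_MIN = 0.05     # loading flow rate, in mL/min


def nanomoles(volume_ul, conc_um):
    """Amount of solute in nmol (µL × µM gives pmol; divide by 1000 for nmol)."""
    return volume_ul * conc_um / 1000.0


def loading_time_min(volume_ul, flow_ml_min):
    """Time needed to apply the sample at the given flow rate, in minutes."""
    return (volume_ul / 1000.0) / flow_ml_min


dna_nmol = nanomoles(LOAD_VOLUME_UL, DNA_CONC_UM)
atpgs_nmol = nanomoles(LOAD_VOLUME_UL, ATPGS_CONC_UM)
contact_time_min = loading_time_min(LOAD_VOLUME_UL, FLOW_RATE_ML_MIN)

if __name__ == "__main__":
    print(f"fork DNA applied: {dna_nmol:.1f} nmol")
    print(f"ATPγS applied:    {atpgs_nmol:.0f} nmol")
    print(f"loading time:     {contact_time_min:.0f} min")
```

At the slow flow rate used, the sample takes roughly 18 minutes to pass over the column, which sets the approximate time available for DNA engagement before the wash.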
To purify the CMGA complex, the CMG purification protocol was followed using cell cultures that had been transfected with the ACEMam2 construct expressing AND-1 and MCM2-3-5.
Preparation of fork DNA. A fork DNA substrate was generated using two 70 nt oligonucleotides, based on a design by Petojevic and colleagues (Petojevic et al., 2015): Oligonucleotides were supplied PAGE-purified by IDT and resuspended to 100 µM in TE buffer pH 8 (Invitrogen, #AM9849). Equimolar amounts were mixed in 200 µL aliquots at 40 µM in TE supplemented with 50 mM NaCl, prior to annealing. The resulting fork DNA consisted of a 40 bp duplex region and a 30 nt fork comprising a 3′ poly-dT tail for CMG loading in the correct orientation for unwinding and a 5′ GC-rich tail that does not bind CMG (Petojevic et al., 2015).
Cryo-EM grids of CMGA complex were prepared as above, with two variations. Firstly, ATPγS was added to the protein sample at a final concentration of 1 mM approximately 1 hour before grid preparation. Secondly, a 5% glutaraldehyde cross-linking solution was added in 1:10 volume ratio to the protein sample approximately 10 minutes before grid preparation. The protein sample was incubated on ice at all times.
Grid samples were initially screened on a Talos Arctica (FEI) operating at 200 keV. High-resolution data were subsequently acquired for a single grid on a Titan Krios (FEI) operating at 300 keV. Automated data collection for Single Particle Analysis (SPA) was performed using the EPU package (FEI). Grid preparation, screening and data collection were performed at the Cryo-EM facility in the Department of Biochemistry.
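The substrate geometry and annealing mixture described above can be checked with a few lines of arithmetic. In the sketch below, the interpretation that 40 µM refers to the concentration of each oligonucleotide in the annealing mix is an assumption, as are all variable names.

```python
# Substrate geometry as described in the text.
DUPLEX_BP = 40          # length of the double-stranded region
TAIL_NT = 30            # length of each single-stranded tail

# Annealing mixture; 40 µM is ASSUMED to be the concentration of each oligo.
STOCK_CONC_UM = 100.0   # oligonucleotide stock concentration
ANNEAL_CONC_UM = 40.0   # concentration in the annealing mix
ALIQUOT_UL = 200.0      # volume of each annealing aliquot

# Each strand contributes one duplex arm and one tail.
oligo_length_nt = DUPLEX_BP + TAIL_NT

# Volume of each 100 µM stock needed per aliquot (C1 * V1 = C2 * V2).
stock_volume_per_oligo_ul = ALIQUOT_UL * ANNEAL_CONC_UM / STOCK_CONC_UM

# Volume left for TE and NaCl once both oligonucleotides are added.
remaining_volume_ul = ALIQUOT_UL - 2 * stock_volume_per_oligo_ul

# Maximum amount of annealed fork per aliquot, in nmol.
fork_nmol_per_aliquot = ALIQUOT_UL * ANNEAL_CONC_UM / 1000.0

if __name__ == "__main__":
    print(oligo_length_nt, stock_volume_per_oligo_ul,
          remaining_volume_ul, fork_nmol_per_aliquot)
```

The 40 bp duplex plus a 30 nt tail accounts for the full length of each 70-mer, and under the stated assumption a 200 µL aliquot needs 80 µL of each stock.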
CryoEM data processing. Statistics for data collection and processing are reported in Table S1.
CMG-DNA. Data processing was performed on the Cambridge Service for Data-Driven Discovery (CSD3) high-performance computer cluster, using RELION-3 (Zivanov et al., 2018). Motion correction for all 3,694 movies was performed using 5 × 5 patch alignment in MotionCor2 (Zheng et al., 2017) with 'InFmMotion' activated to take into account frame motion blurring. CTF correction was performed using GCTF (Zhang, 2016) against non-dose-weighted averages, with 'equiphase averaging' activated. Micrographs yielding resolution limit estimates of >6 Å were discarded. Laplacian-of-Gaussian (LoG)-based auto-picking with a default threshold of -0.1 was used to pick a total of 845,596 particles across 3,619 micrographs. This particle set was used for all downstream processing without the use of template-driven picking procedures. Particles were initially extracted 4×-binned with a box size of 90 pixels (4.28 Å/pixel). For iterative rounds of 2D classification, the option to 'Ignore CTFs until first peak' was activated. This yielded a final set of 365,202 high-resolution CMG particles, which was then subjected to 3D classification using an internally generated initial model. Particles from 3 out of 4 resulting classes were pooled (297,183 total), re-extracted without binning (360 pixels, 1.07 Å/pixel) and refined against a suitably re-scaled initial model. Subsequent soft mask generation and re-refinement with solvent-flattened FSCs yielded a first high-resolution map of CMG at 3.50 Å, with clear density for ssDNA in the pore. CTF refinement, Bayesian particle polishing and additional masked refinement of the polished particles using solvent-flattened FSCs improved the resolution to 3.24 Å.
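The binned and unbinned extraction parameters quoted above are related by simple scaling, and the pixel size also sets the best resolution a box can encode (the Nyquist limit, twice the pixel size). The following sketch checks these relationships; the function names are ours and the calculation is a consistency check rather than part of the processing pipeline.

```python
# Unbinned extraction parameters quoted in the text.
UNBINNED_PIXEL_A = 1.07   # Å per pixel
UNBINNED_BOX_PX = 360     # box edge in pixels
BIN_FACTOR = 4            # binning used for initial extraction


def binned_parameters(pixel_size_a, box_px, factor):
    """Pixel size and box edge after binning by an integer factor."""
    return pixel_size_a * factor, box_px // factor


def nyquist_limit_a(pixel_size_a):
    """Finest resolvable spacing for a given pixel size (two pixels per period)."""
    return 2.0 * pixel_size_a


binned_pixel_a, binned_box_px = binned_parameters(
    UNBINNED_PIXEL_A, UNBINNED_BOX_PX, BIN_FACTOR
)

# The physical box edge is unchanged by binning.
box_edge_unbinned_a = UNBINNED_PIXEL_A * UNBINNED_BOX_PX
box_edge_binned_a = binned_pixel_a * binned_box_px

nyquist_binned_a = nyquist_limit_a(binned_pixel_a)
nyquist_unbinned_a = nyquist_limit_a(UNBINNED_PIXEL_A)

if __name__ == "__main__":
    print(f"binned: {binned_pixel_a:.2f} Å/pixel, {binned_box_px} px box")
    print(f"box edge: {box_edge_unbinned_a:.1f} Å")
    print(f"Nyquist: {nyquist_binned_a:.2f} Å binned, "
          f"{nyquist_unbinned_a:.2f} Å unbinned")
```

The binned data are limited to about 8.6 Å, which is adequate for 2D and initial 3D classification, whereas re-extraction without binning is required before refinement can approach the reported resolutions near 3.3 Å.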
Inspection of the polished map identified some unresolved heterogeneity in the structure, particularly in the C-tier ATPase domains of the MCM ring. The unbinned polished particle set was therefore subjected to a further round of 3D classification (3 classes) without alignments. The majority of the particles (213,527) formed a single DNA-bound CMG class, which was used for further processing. A second class (30,696 particles) was deemed poor quality and discarded, while a third CMG class (52,960 particles) did not have bound DNA.
The high-resolution DNA-bound CMG class was subjected to refinement with solvent-flattened FSCs, yielding a final resolution of 3.29 Å and providing clearer definition of the C-terminal ATPase domains. This map was used for the majority of model building. However, the ATPase domains of MCM2 and 5 still exhibited some heterogeneity. To resolve this, all particles were first shifted to the centre of mass of the MCM2-7 ATPase ring and density outside this ring was subtracted from the shifted particles, facilitating focussed 3D classification (without alignment) of the MCM2-7 C-tier. The majority of particles (181,401; 85%) populated the first of two classes. This class presented improved definition for the ATPase domains of MCM2 and 5, and also revealed good density for the C-terminal domains of MCM2 and 6. Masked, solvent-flattened refinement of this class resulted in a C-tier map extending to 3.41 Å that was used to complete model building.
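The particle counts reported for the CMG-DNA dataset can be tallied as a simple bookkeeping sketch. The stage labels and variable names are illustrative; the counts are those quoted in the text. The remainder calculation assumes that no particles were lost between pooling of the 3D classes and the later 3D classification without alignment, which the text does not state explicitly.

```python
# Particle counts quoted in the text, in processing order.
stages = [
    ("LoG auto-picking", 845_596),
    ("after 2D classification", 365_202),
    ("pooled 3D classes", 297_183),
    ("DNA-bound CMG class", 213_527),
    ("focussed C-tier class", 181_401),
]

# Fraction of particles retained at each step relative to the previous one.
retention = {}
for (_, previous), (label, current) in zip(stages, stages[1:]):
    retention[label] = current / previous
    print(f"{label}: {current:,} particles ({100 * retention[label]:.1f}% retained)")

# Particles in neither the DNA-bound class nor the DNA-free class
# (assumes the pooled set entered this classification unchanged).
dna_free = 52_960
remainder = 297_183 - 213_527 - dna_free
print(f"particles outside the two reported CMG classes: {remainder:,}")
```

The last retention value corresponds to the 85% figure given for the focussed classification.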
CMGA. Data processing was performed using Warp (Tegunov & Cramer, 2019) for motion correction, CTF estimation, automated particle picking and particle extraction, and cryoSPARC (Punjani et al., 2017) for all subsequent steps. A total of 140,128 particles were automatically picked and extracted from 1,844 micrographs. An initial round of 2D classification identified 6 classes that presented sensible densities; contributing particles were used to generate two distinct initial models that were characterised as CMG (18,439 particles) and AND-1 (9,600 particles). The CMG initial model exhibited weak additional density adjacent to the GINS/Cdc45 subunits, suggesting a mixed population of CMG and CMGA in the data. Accordingly, the CMG initial model and particle set were subjected to hetero-refinement for 2 classes, resulting in a CMG-only particle set (7,860 particles) and a CMGA particle set (10,579 particles). The latter was again subjected to hetero-refinement for 2 classes to remove any remaining CMG-only particles, resulting in an improved CMGA map derived from 8,888 particles.
The highest-quality CMG, CMGA and AND-1 models derived from the initial 2D classification were then used to drive iterative rounds of hetero-refinement against the entire 140,128-particle dataset. Once suspected AND-1 particles had been eliminated, CMG particles were gradually removed from the CMGA dataset until a clean set of 15,393 particles produced a map for CMGA with continuous density for the trimeric AND-1 SepB domain. This set was finally subjected to non-uniform refinement to deliver a final map with a resolution of 6.77 Å.
Single-stranded DNA was built in Coot (Emsley & Cowtan, 2004). Models for the ATPase domains of MCM2 and 5, and for the C-terminal domains of MCM2 and 6, were completed using the focussed C-terminal map. The complete CMG-DNA model was subsequently refined using phenix.real_space_refine (Adams et al., 2010) with bond-length and angle restraints for bound ATPγS, magnesium and zinc ions.
For the CMGA structure, the CMG structure and the crystal structure of the AND-1 trimer (PDB ID 5OGS) were docked in the map and subjected to rigid-body refinement in phenix.real_space_refine (Adams et al., 2010). The N-tier ring of MCM proteins was treated as a single rigid body, whereas individual ATPase domains in the C-tier were allowed to move independently.
Statistics for real-space refinement are reported in Table S2. Figures were prepared using UCSF Chimera (Pettersen et al., 2004) and ChimeraX (Goddard et al., 2018).