Tracking in atomic detail the functional specializations in viral RecA helicases that occur during evolution

Many complex viruses package their genomes into empty protein shells, and bacteriophages of the Cystoviridae family provide some of the simplest models for this. The cystoviral hexameric NTPase, P4, uses chemical energy to translocate single-stranded RNA genomic precursors into the procapsid. We previously dissected the mechanism of RNA translocation for one such phage, ɸ12, and have now investigated three further highly divergent, cystoviral P4 NTPases (from ɸ6, ɸ8 and ɸ13). High-resolution crystal structures of the set of P4s allow a structure-based phylogenetic analysis, which reveals that these proteins form a distinct subfamily of the RecA-type ATPases. Although the proteins share a common catalytic core, they have different specificities and control mechanisms, which we map onto divergent N- and C-terminal domains. Thus, the RNA loading and tight coupling of NTPase activity with RNA translocation in ɸ8 P4 is due to a remarkable C-terminal structure, which wraps right around the outside of the molecule to insert into the central hole where RNA binds to coupled L1 and L2 loops, whereas in ɸ12 P4, a C-terminal residue, serine 292, forms a specific hydrogen bond to the N7 of the purine ring to confer purine specificity for the ɸ12 enzyme.


INTRODUCTION
Viruses protect their genome by condensing it into a compartment, the virion. Many complex viruses rely on rapid encapsidation by energy-dependent transport of the nucleic acid into an empty preformed capsid (procapsid). This process requires the presence of portal complexes, which are conduits for nucleic acid molecules, and molecular motors that convert the chemical energy gained from nucleoside triphosphate (NTP) hydrolysis into mechanical movement, resulting in nucleic acid translocation.
Some viruses, including herpesvirus and tailed double-stranded DNA (dsDNA) bacteriophages, package their genome using a multi-protein packaging motor (terminase) that transiently assembles at a single vertex (1)(2)(3)(4). These complexes are relatively elaborate, consisting of a large dodecameric portal that is an integral part of the capsid and an oligomeric, transiently associated terminase, neither of which can work in the absence of the other. The ATPase-nuclease terminase subunit is responsible for recruiting the viral DNA to the procapsid. Compacting relatively stiff dsDNA into the small volume of the procapsid has a high energy cost. Single-molecule experiments have revealed that viral packaging proteins can exert forces as high as 110 pN on dsDNA, making them some of the strongest known biological motors (5).
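To put the 110 pN figure in perspective, the following back-of-envelope sketch converts a force into mechanical work per base pair translocated. Only the 110 pN value comes from the text; the rise per base pair of B-form dsDNA (0.34 nm) and the thermal energy at room temperature (about 4.11 pN·nm) are standard textbook values assumed here, and the function names are hypothetical. The estimate treats the peak force as if it acted over a full base-pair step, which is an upper-bound simplification.

```python
# Back-of-envelope estimate, not a model of any specific motor.
# Assumed constants (not taken from the paper):
K_B_T_PN_NM = 4.11       # thermal energy k_B*T at ~298 K, in pN*nm
RISE_PER_BP_NM = 0.34    # axial rise per base pair of B-form dsDNA, in nm


def work_per_bp(force_pn, rise_nm=RISE_PER_BP_NM):
    """Mechanical work (pN*nm) to move one base pair against force_pn."""
    return force_pn * rise_nm


def in_kbt(energy_pn_nm):
    """Express an energy given in pN*nm in units of k_B*T."""
    return energy_pn_nm / K_B_T_PN_NM


if __name__ == "__main__":
    w = work_per_bp(110.0)  # 110 pN is the force quoted in the text
    print(f"{w:.1f} pN*nm per bp, i.e. {in_kbt(w):.1f} k_B*T")
```

Under these assumptions the work per base pair near the peak force is several times the thermal energy, which illustrates why dsDNA packaging is described as energetically costly.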
Similarly, dsRNA bacteriophages of the Cystoviridae family (bacteriophages ɸ6 through to ɸ14, and ɸ2954) encapsidate single-stranded RNA (ssRNA) genomic precursors into procapsids (6). However, their packaging machinery is less complex, consisting of a hexamer that is at the same time the physical portal and the active genome translocating motor (7,8). Although this motor shares the same function of translocating the genomic nucleic acid into the procapsid, the challenges differ between ssRNA and dsDNA. ssRNA is significantly more flexible (persistence length l_p ~1-2 nm) than dsDNA (l_p ~50 nm) (9), and the packaging densities are less than those found for dsDNA viruses (10); therefore, high forces are probably not required. However, naturally occurring ssRNAs, such as the genomic precursors, exhibit extensive local secondary structure (11,12), and thus the packaging motor has to exhibit helicase activity.
*To whom correspondence should be addressed. Tel: +44 1865 287560; Fax: +44 1865 287549; Email: erika@strubi.ox.ac.uk
P4 NTPases are structural components of the procapsid, built by co-assembly of 120 copies of the major structural protein P1 with ~10 copies of the viral RNA-dependent RNA polymerase P2, 10 hexamers of P4 and 12 trimers of the assembly cofactor P7 (24) (Figure 1). In bacteriophage ɸ6, P4 hexamers nucleate procapsid assembly in vitro (7,25), are essential for genome packaging (21) and also have a role in transcription (21,26). Up to 12 P4 hexamers lie on the 5-fold symmetry axes of facets of the procapsid (16,24,27), creating a symmetry mismatch. Although the P4 hexamer constitutes the packaging motor, the specificity for viral RNA is mediated by RNA-binding sites on the P1 shell, which recognize three distinct packaging signals on the genomic precursors (28,29).
Previous studies have revealed the structure and mechanism of ɸ12 P4 (30)(31)(32). P4 is a protein of ~35 kDa, which can assemble into a hexameric ring. NTP-binding sites are located on the external perimeter of the ring at the interfaces between adjacent subunits, whereas the nucleic acid binding sites are found in the central channel (31) (Figure 1). P4 proteins are the only known RNA-specific helicases belonging to helicase Superfamily 4 (SF4) (33). SF4 encompasses mainly DNA helicases and is characterized by five conserved sequence motifs (H1, H1a, H2, H3 and H4) (34). Motifs H1, H1a and H2 are involved in nucleotide binding and hydrolysis, whereas H3 is involved in the coupling of NTP hydrolysis to nucleic acid translocation, and H4 in oligonucleotide binding. Crystal structures of ɸ12 P4 at different key catalytic states of the protein unveiled a power stroke mechanism by which a conformational change associated with sequential NTP hydrolysis is responsible for RNA translocation (31,35,36).
P4 NTPases show little sequence similarity; however, they are believed to share a common architecture and mechanism of action. When recombinant P4 proteins are studied in isolation, they show variation in their in vitro biochemical properties (Table 1): ɸ8 and ɸ13 P4 NTPases form stable complexes with RNA and their ATPase activities are strongly stimulated by RNA (ɸ8 has no detectable ATPase activity in the absence of RNA), whereas ɸ6 and ɸ12 P4s bind RNA transiently and are only weakly stimulated; the isolated P4 hexamers of ɸ8 and ɸ13 have measurable helicase activities in vitro, in contrast to ɸ6 P4, which only acquires processive helicase activity in the context of the procapsid (30); the ɸ12 P4 hexamer has low translocation processivity and lacks helicase activity (36); the NTPase activity of ɸ12 P4 is specific to purine bases (26), whereas the other P4s can also accept pyrimidine bases (8,40). These differences in biochemical properties are presumably reflected in the hexamer architecture and structural details of different domains. To gain further insights into RNA loading, interaction and translocation mechanisms and the structural evolution of these packaging enzymes, we have solved the crystal structures of three additional P4 proteins, from ɸ8, ɸ13 and from the prototype virus of the cystoviral family, ɸ6. We also report here the structural and/or biochemical characterization of ɸ12 P4 mutants to explain nucleotide specificity and RNA recognition. We compare these structures with that of wild-type ɸ12 P4, whose structure has already been reported (31), creating a series of structurally related viral packaging motors.
Recombinant P4 proteins were expressed in Escherichia coli BL21(DE3) or B834(DE3) and purified to homogeneity as previously described (31,32,42). Briefly, E. coli cells were grown at 37 °C in Luria-Bertani medium until the OD at 540 nm reached 0.5-0.6. Cultures were then chilled on ice and induced with 1 mM isopropyl-β-thiogalactopyranoside. Induced cells were further incubated for 12-14 h at 17-18 °C, harvested by centrifugation and lysed with a French pressure cell. P4 proteins were purified by chromatography: Heparin and Q-sepharose columns (GE Healthcare) followed by size exclusion chromatography (Superdex 200, GE Healthcare).

Crystallization
Crystallization conditions of the P4 proteins have been previously described (32,42). In brief, crystals of ɸ6 P4Δ310 proteins were grown at 24 °C from a 3.5 mg/ml protein solution in 20 mM HEPES (pH 8.0), 5 mM MgCl2, 2 mM CaCl2, 5 mM adenosine diphosphate (ADP) and 100 mM NaCl, and they appeared after 9 months in drops in which 3 µl of protein had been mixed with 3 µl of a reservoir solution consisting of 6% PEG 4000 and 90 mM sodium acetate (pH 4.5). Crystals were cryo-protected by transferring them into reservoir solution with a final glycerol concentration of 25% before freezing in a nitrogen-gas stream at −173 °C.
From a 12 mg/ml protein solution, ɸ13 P4 crystals were grown at 20 °C using 100 mM Tris-HCl (pH 7.0), 900 mM trisodium citrate and 200 mM NaCl as precipitant. Crystals were cryo-protected as for ɸ6 P4Δ310, but using a final glycerol concentration of 20%.
The ɸ8 P4 crystals were grown at 24 °C in 100 mM sodium acetate (pH 4.6) and 2.2 M ammonium sulphate as a precipitant. Drops consisted of 0.9 µl of protein at a concentration of 3 mg/ml, 0.9 µl of reservoir solution and 0.4 µl of 100 mM dithiothreitol (DTT). Crystals of ɸ8 P4Δ281 obtained from a protein solution concentrated to 5 mg/ml appeared in 100 mM Tris (pH 8.0) and 18% PEG 1000. Crystals were cryo-protected following the protocol for ɸ6 P4Δ310.
Crystals of ɸ12 P4 mutants were obtained in a solution composed of 10% PEG 1500 in 100 mM sodium acetate (pH 4.8) and 5 mM AMPcPP. Crystals of wild-type ɸ12 P4 with UTP were obtained with the same precipitant and 5 mM UTP.
The structure of ɸ13 P4 was solved by single-wavelength anomalous dispersion as described elsewhere (47). The substructure was determined using the program SHELX (48), and phases were refined using SHARP (49). After 6-fold non-crystallographic symmetry averaging using General Averaging Program (unpublished program available from D. I. Stuart or J. M. Grimes), an interpretable electron density map was obtained into which the structure could be built.
The structure of ɸ6 P4 was solved by molecular replacement with the crystal structure of ɸ13 P4 as a search model. The search model included one hexamer in which each chain was truncated to the conserved ATPase core of the protein. A weak molecular replacement solution comprising two truncated hexamers was found by the program AMoRe (50). The preliminary phases were greatly improved by 12-fold non-crystallographic symmetry averaging and phase extension from low resolution using General Averaging Program. The last 34 residues of the ɸ6 P4Δ310 construct were not visible in the electron density; their absence might be due to proteolysis, which would explain the long crystallisation period.
The structure of ɸ8 P4 was initially solved by single-wavelength anomalous dispersion from crystals of the selenomethionine-labelled protein in space group P622 containing one monomer in the asymmetric unit. HKL2MAP (48) was used to identify the selenium sites, which were then fed into PHENIX AUTOSOL (51), resulting in an interpretable electron density map for the ATPase core domain. The electron density corresponding to the rest of the protein was not interpretable owing to the statistically disordered crystal reported previously (42). The hexameric P4 was formed by applying the crystallographic symmetry and used as a search model for molecular replacement with the program PHASER (46) to find a solution for ɸ8 P4 (R32 space group) and ɸ8 P4Δ281 (P2₁2₁2 space group).
Manual building was performed with the program COOT (52) and restrained refinement (with TLS) with either AUTOBUSTER (53)

Hydrogen-deuterium exchange mapping
Previously published hydrogen-deuterium exchange (HDX) data for ɸ8 P4 were used (37) and mapped onto the high-resolution structure presented in this work using average rate colouring as described (37).

ATPase activity of mutants
ATPase activity of ɸ12 P4 binding-site mutants was assayed using the EnzChek phosphate assay kit (Invitrogen) (39).

Evolutionary analysis of structures
The coordinates of the ATPase core of P4 from ɸ8 (residues 104-261) were submitted to the DALI Server (56), a program that identifies and ranks proteins by structural similarity. The DALI search returned 47 proteins with significant structural similarity to P4. All these proteins were then truncated to their core ATPase domains and superimposed onto one another using the program SHP, and a matrix of structural relationships was calculated (57).
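As an illustration of how a pairwise matrix of structural relationships can be turned into a tree, here is a minimal sketch using average-linkage (UPGMA) clustering. This is a generic technique swapped in for illustration; the text does not state which tree-building method was applied to the SHP matrix. The distance values and the labels in the usage example are invented placeholders, not results from this work, and `upgma` and `make_dist` are hypothetical helper names.

```python
from itertools import combinations


def make_dist(triples):
    """Build a symmetric lookup {frozenset({a, b}): distance} from (a, b, d) triples."""
    return {frozenset((a, b)): d for a, b, d in triples}


def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples."""
    clusters = {name: (name, 1) for name in names}  # label -> (subtree, size)
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[merged] = ((tree_a, tree_b), n_a + n_b)
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    # Placeholder distances, purely illustrative (NOT measured values).
    example = make_dist([
        ("phi6", "phi13", 0.4), ("phi6", "phi12", 0.8), ("phi13", "phi12", 0.8),
        ("phi6", "phi8", 1.0), ("phi13", "phi8", 1.0), ("phi12", "phi8", 1.0),
        ("cellular", "phi12", 1.2), ("cellular", "phi6", 1.4),
        ("cellular", "phi13", 1.4), ("cellular", "phi8", 1.6),
    ])
    print(upgma(["phi6", "phi13", "phi12", "phi8", "cellular"], example))
```

UPGMA assumes a constant rate of divergence and yields a rooted tree; other methods, such as neighbour joining, relax that assumption and may give a different topology from the same matrix.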

Overall fold
All P4 proteins form a hexameric ring with a central channel varying in size from 13 to 21 Å (30 Å for ɸ8 P4Δ281) and an external diameter of ~100 Å (Figure 2). However, the hexamers have different charge distributions on their surfaces (Supplementary Figure S1) and different outline shapes: ɸ6 P4, ɸ8 P4 and ɸ13 P4 form hexagonal notched rings, whereas ɸ12 P4 has a smoother contour. The subunit interface within hexamers varies in size from ~1500 to 1900 Å², and the number of hydrogen bonds, salt bridges and hydrophobic interactions shows substantial variation (Supplementary Table S3). The interfaces within the P4 hexamers are more polar than expected for a stable oligomer. This is because rings of hexameric helicases are generally required to open to load the nucleic acid strand into the central cavity (Table 1) (58,59). The rounder ɸ12 P4 subunits bury the biggest surface area and form the highest number of hydrogen bonds and salt bridges, whereas the interaction area is least for ɸ8 P4, which harbours fewer hydrogen bonds and only three salt bridges. The buried area does not correlate with P4 ring stability. For example, ɸ12 P4 has been shown to exhibit frequent ring opening unless it is bound to the procapsid (38), leading to low translocation processivity (36). On the other hand, ɸ8 P4 is a processive translocase and opens only when loading a new RNA strand into the central channel (37). Ring stability correlates instead with the fraction of buried polar interactions (hydrogen bonds and salt bridges) per buried area. The less stable ɸ6 and ɸ12 hexamers have 0.016 and 0.018 polar contacts per Å², respectively, whereas the more stable ɸ8 and ɸ13 exhibit values of 0.13 and 0.15, respectively.

ATPase core domain
Within the hexamer, the different P4 monomers adopt similar orientations and can be divided into three domains: an N-terminal region (110-150 residues), a central core NTPase domain of ~160 residues and a smaller C-terminal domain (~40-50 residues) (blue, grey and red, respectively, in Figure 2). Strikingly, despite low overall sequence conservation ranging from 9 to 21% amino acid sequence identity, the key structural features of the ATPase core domain (motifs H1, H1a, H2, H3 and H4) are well conserved (Figure 3A and B). The ATPase domain is a Rossmann-type nucleotide-binding domain consisting of a twisted seven-stranded β-sheet with   Table 2), except for one residue in motif H4 (residue K241 in ɸ12 P4), which has no equivalent in ɸ8 P4 (see explanation for this later in the text). It is therefore likely that all cystoviral P4 NTPases use an RNA translocation mechanism similar to that described for ɸ12 P4 (31), although details may vary, especially for ɸ8 P4 where a tight coupling between ATPase activity and RNA binding is observed (Table 1).
Structural classification based on the ATPase core domain shows that cystovirus P4 proteins are closely related to each other and only distantly related to other P-loop ATPases (Figure 4 and Supplementary Figure S2

N-terminal domain
The structural conservation across P4 proteins of the central ATPase core domains does not extend to the N- and C-terminal domains. Most of the N-terminal domain residues of P4 from ɸ6 and ɸ8 are visible in our crystal structures (starting from amino acid residues 2 and 12, respectively), whereas ɸ13 P4 lacks the first 32 residues [which are predicted to be disordered (60)]. In all P4 structures, the N-terminal domain covers the apical part of the hexamer (Figure 2), and in ɸ12 P4, an N-terminal domain α-helix projects from one subunit to the adjacent one, giving the hexamer a more rounded appearance. ɸ6 P4 lacks such a helix and might stabilize the hexamer by strengthening subunit interfaces with nucleotides. ɸ6 P4 is the only P4 that needs nucleotides and divalent cations to form hexamers (7). It is also conceivable that NTP binding triggers a conformational change in the ɸ6 P4 subunits allowing them to form hexamers. Interestingly, ɸ8 and ɸ13 P4s also lack such a stabilizing helix; however, the first 12 and 31 residues, respectively, are not visible in the crystal structures and might play such a stabilizing role.
The N-terminal domains of cystoviral P4s are highly divergent (Figures 2, 3B and C). However, in ɸ6 and ɸ13, more than half of their residues can be superimposed with a root-mean-square deviation of 2.1 Å, including two parallel helices and two small anti-parallel β-sheets, creating a topologically identical sub-domain (Figure 3C). In ɸ8 and ɸ12, the N-terminal domains have higher secondary structure content but are completely unrelated to each other and to those in ɸ6 and ɸ13. In ɸ12 P4, the N-terminal domain is composed of two orthogonal α-helices and three anti-parallel β-sheets (Figure 3C). The ɸ8 P4 N-terminal domain is composed of two helices separated by a four-stranded anti-parallel β-sheet (Figure 3C). Structural alignment searches against the PDB returned no significant matches for any of the N-terminal domains, aside from a weak structural similarity (43 of 87 residues within 3.7 Å) of ɸ8 P4 to one half of a C2 domain (a domain involved in targeting proteins to cell membranes; Figure 3C). Intriguingly, ɸ8 lacks the P8 nucleocapsid protein layer present in other cystoviruses so that P4 proteins (together with the P1 shell) interact directly with the viral lipid membrane (10).

C-terminal domain
The C-terminal domain of P4 comprises ~40-50 amino acid residues downstream of the ATPase core (Figure 2) and is expected to be located at the bottom of the hexamer and to be essential for binding to the capsid protein P1 (38,61). The C-terminal domains of P4 proteins diverge substantially. In ɸ6 and ɸ13, the C-termini are predicted to be disordered with little secondary structure (60), and indeed, no density for these domains could be found in our crystal structures. In contrast, the corresponding regions in ɸ8 and ɸ12 are predicted to be mostly ordered (60) with a C-terminal helix preceded by a flexible loop. In ɸ12 P4, the strand following the arginine finger motifs extends back into the ATP-binding site, contributing two residues (Y288 and S292), which help position the nucleotide ring (see later in the text). The density for the amino acid chain then disappears to re-emerge as a C-terminal helix stacked at the bottom of the hexamer (Figure 2). In ɸ8 P4, the strand following the arginine finger motifs does not extend as far as the ATP-binding site but instead climbs back along the side of the hexamer (partially disordered) to re-emerge as a C-terminal helix at the top of the hexamer (Figure 2B), followed by a loop that dives into the central channel, restricting its diameter by more than half (see later in the text for more discussion on the C-terminal domain).

Nucleotide binding site
The ɸ6 P4 was crystallised with ADP-Mg2+ bound in the nucleotide binding site, whereas P4 from ɸ8 and ɸ13 were crystallised in their apo form. As for ɸ12 P4, and other hexameric NTPases, the nucleotide binding sites in ɸ6 P4 are located at the interfaces between neighbouring subunits. The ADP phosphate groups are bound via the conserved Walker A (H1) motif residues (K132, S133) (Figure 5); a conserved glutamate E150 (H1a) is positioned to catalyse the nucleophilic attack on the γ-phosphate, whereas D187, a conserved aspartate in the Walker B motif (H2), co-ordinates the magnesium ion. A sensor motif detecting the presence or absence of the γ-phosphate of NTP and modulating allosteric transitions of the RNA binding loop L2 in response to ATP binding and hydrolysis was identified in P4 from ɸ12 (N234) (31). The equivalent residue in ɸ6 P4, N232, is positioned to contact the γ-phosphate of the NTP (Figure 5) and might fulfil the same role. As the mechanism of NTP binding and hydrolysis is similar, it is likely that the equivalent conserved residues in P4 from ɸ8 and ɸ13 (Figure 5 and Table 2) play analogous roles. It has been shown that ɸ12 P4 possesses two essential 'arginine fingers' (35). We find that all P4 proteins follow this unusual pattern (Figure 5 and Table 2). Arginine fingers can contact the γ-phosphate of the triphosphate from a neighbouring subunit, and the insertion of this residue in a catalytic site is believed to stabilize the transition state, thus facilitating ATP hydrolysis. Arginine fingers in P4 proteins are all contributed from the same region (a loop between two strands in the C-terminal region) but display different conformations (Figure 5). In P4 from ɸ6, ɸ12 and ɸ13, the arginine fingers point towards the catalytic sites, making the subunits competent and primed for hydrolysis. However, in ɸ8 P4, these residues are displaced >8 Å from that position and therefore cannot contribute to catalysis.
This suggests that in ɸ8 P4, extensive conformational changes occur as a consequence of nucleotide and/or oligonucleotide binding, which render the enzyme competent for catalysis. Indeed, nucleotide binding kinetics revealed a first-order rate-limiting step, which is consistent with a conformational change associated with ATP binding (39,62).
In RecA-like ATPases, bound nucleotides are stabilized by stacking of the adenine moiety between side chains, but these side chains are not conserved and are contributed from different regions. In RepA and T7 helicases, the ATP base stacks against residues belonging to the subunit carrying the catalytic site. In ɸ12 P4 (31), as in RepA (63), the nucleotide base is sandwiched between Y288 from the catalytic subunit and Q278 from the neighbouring subunit. In ɸ6 P4, a much looser stacking of the nucleotide base is observed, with only one side chain (F275) stabilizing the adenine ring (Figure 5). From our structures, we predict similar loose arrangements in P4 from ɸ8 and ɸ13, where F247 (from the same subunit) and F301 (from a neighbouring subunit) seem to be in the correct orientation to stack the nucleotide base. The difference in the arrangement of the nucleotide binding motifs is likely to explain the mechanism of base-specific hydrolysis in different P4s. Of the P4s, only ɸ12 is purine specific, with pyrimidines also being accepted by ɸ6, ɸ8 and ɸ13 (Table 1).
To understand this catalytic mechanism in detail, we performed site-directed mutagenesis of the residues in ɸ12 P4 involved in binding the nucleotide ring and analysed the mutants structurally and biochemically. In ɸ12 P4, the stacking interaction is critical for nucleotide binding, as replacement of the tyrosine with alanine (Y288A) completely abolished ATP binding and ATPase activity (Table 1) so that the apoprotein structure is found even in the presence of ATP (data not shown). However, the mutation Q278A had only a moderate effect on ATPase activity and virtually no effect on the structure of the bound ATP analogue AMPcPP when compared with the wild-type (Figure 6A and C), primarily increasing the K_M as a result of reduced nucleotide affinity (Table 1). Hence, the stacking interactions primarily determine nucleotide affinity but not specificity. A specific feature in ɸ12 P4 is a hydrogen bond between the hydroxyl of S292 and N7 of the purine ring. The substitution S292A did not prevent ATP binding but completely abolished ATPase activity owing to misplacement of the triphosphate moiety in the active site (Figure 6D). A displacement is also seen when the AMPcPP-bound wild-type structure is compared with that of the UTP-bound hexamer (Figure 6A and B). This confirms that pyrimidine triphosphates can bind the hexamer without being hydrolysed (36) and should act as competitive inhibitors. Indeed, we find that UTP effectively competes with ATP and inhibits hydrolysis (data not shown). Hence, purine specificity is achieved by locking the base by hydrogen bonding to the N7 site of a purine. The correct coordination of the base results in the precise alignment of the nucleotide that is essential for catalysis so that UTP is misaligned and not hydrolysed. This is probably the mechanism underpinning the dependence of helicase efficiency on the type of nucleotide. For example, T7 gp4 helicase activity is optimal in the presence of dTTP (58).
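The kinetic consequences described above (a raised K_M for a lower-affinity mutant, and a non-hydrolysed nucleotide acting as a competitive inhibitor) can be illustrated with the standard Michaelis-Menten rate law. The sketch below is a textbook single-site model assumed for illustration only; it ignores cooperativity within the hexamer, and all parameter values are arbitrary placeholders rather than measurements from this work. The function name `rate` is hypothetical.

```python
def rate(s, vmax, km, inhibitor=0.0, ki=float("inf")):
    """Michaelis-Menten rate with an optional competitive inhibitor.

    A competitive inhibitor leaves vmax unchanged and scales the apparent
    K_M by (1 + [I]/K_i).
    """
    km_apparent = km * (1.0 + inhibitor / ki)
    return vmax * s / (km_apparent + s)


if __name__ == "__main__":
    # Arbitrary units; placeholder values, not data from the paper.
    vmax, km = 1.0, 1.0
    print("no inhibitor:        ", rate(1.0, vmax, km))
    print("with inhibitor:      ", rate(1.0, vmax, km, inhibitor=2.0, ki=1.0))
    print("higher K_M (mutant): ", rate(1.0, vmax, 5.0))
    print("saturating substrate:", rate(1e6, vmax, km, inhibitor=2.0, ki=1.0))
```

In this model both a weaker-binding mutant and a competing nucleotide lower the rate at sub-saturating substrate, but the maximal rate is recovered at saturating substrate. A mutation that misaligns the triphosphate, by contrast, would be expected to lower the maximal rate itself.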

Nucleic acid binding site
It has been proposed that P4 hexamers bind nucleic acid through their central channel via two protruding loops named L1 and L2 (31) (Figure 3A and B, Supplementary Figure S3). Mutagenesis studies confirmed that these loops are essential for nucleic acid binding and translocation (30,35,37). Structurally homologous loops were reported to bind ssDNA and ssRNA, respectively, in crystals of the E1 helicase of bovine papilloma virus and Rho of E. coli (59). The L1 loops in P4 are rich in residues that contribute to flexibility (in ɸ12 P4 they are disordered), whereas the L2 loops are mainly composed of hydrophilic residues, amongst them a lysine, which in ɸ12 P4 (K241) was shown to be essential for RNA binding (35). The structures of P4 from ɸ6 and ɸ13 show ordered L1 loops, which line the central channel and contact the L2 loops (Supplementary Figure S2). The L2 loops are found with lysine residues (K239 and K265, respectively) projecting towards the centre of the channel, in the same position as K241 in ɸ12, suggesting a conserved mechanism for binding and translocating RNA. Although the L2 loop of ɸ8 P4 contains hydrophilic residues (DDENVD), it does not project a lysine side chain towards the central channel. Nevertheless, the L1 loop contains a motif (LKK) that has been shown to be crucial for RNA binding (35). The first lysine of this motif (K185) is found in the equivalent position to K241 of ɸ12 P4 and is also seen interacting with D220 of loop L2. We therefore postulate that K185 (loop L1) in ɸ8 P4 plays the same role in RNA binding as K241 (loop L2) in ɸ12 P4, and that the coupling of the movement of the L1 and L2 loops to ATP hydrolysis via motion of helix 6, as proposed for ɸ12, may be a general feature of all P4 molecules (Supplementary Figure S2).
The importance of the L1 loop is further supported by mutational analysis in ɸ12 P4: deleting the L1 loop central residues T202-T203-S204 or mutating them into the equivalent residues of ɸ8 P4 (LKK) completely abolishes the ATPase activity (Table 1). This demonstrates that the integrity of the L1 loop is essential for ATP hydrolysis, despite being distal to the ATP active site.

RNA loading in ɸ8 P4 and the structural basis of processive translocation
The ɸ8 P4 ATPase activity is tightly coupled to ssRNA translocation, as it will only hydrolyse ATP in the presence of ssRNA. As noted earlier in the text, the RNA binding motif LKK in loop L1 is located in the middle of the central channel (37). Nucleic acids are likely to bind in the channel, ensuring topological enclosure of the strand and processive translocation.
Based on transient cooperative exposure of subunit interfaces to HDX on RNA binding (residues 198-209 in Figure 7), it was suggested that RNA enters the central channel via a transient ring opening (37). The deletion of the C-terminal portion of the protein (residues 282-321) more than doubles the diameter of the central channel (from 13 to 30 Å), as the C-terminus wraps upwards from the base of the hexamer, along the inter-subunit cleft, to stick down into the central channel (Figure 8). As the C-terminal domain (i) is necessary for ATP hydrolysis (data not shown), (ii) restricts the diameter of the central channel and (iii) blocks the interface through which RNA is thought to be loaded, we postulate that the C-terminal region needs to be displaced by RNA for ring opening and subsequent ATP hydrolysis to occur. To verify this hypothesis, previous HDX experiments (37) were further analysed by mapping them onto the ɸ8 P4 structure.
The C-terminal region exhibits the fastest HDX within the protein (Figure 7). However, the distal C-terminal portion that extends into the central channel is marginally protected in the absence of RNA and becomes fully exposed only on addition of RNA, implying that this region becomes further exposed, presumably by expulsion from the central channel (Figure 8B). Thus, it appears that ɸ8 P4 has developed a specific mechanism to regulate ATPase activity and couple it with ssRNA binding, such that RNA displaces the C-terminal domain to allow ATP hydrolysis to occur. This would explain the tight coupling observed between ATP hydrolysis and translocation.

CONCLUSION
The current study broadens our understanding of the mechanism used by dsRNA bacterial viruses to package their RNA genomes during assembly. Interestingly, P4 proteins are only remotely related to packaging ATPases of dsDNA viruses such as gp17 from bacteriophage T4 (64) or pUL15 from Herpes simplex virus 1 (65), which have more complicated portal complexes. Recently, however, it has been suggested that the ATPase of the phi29 DNA packaging motor is a member of the hexameric AAA+ superfamily (66), indicating that the mechanism of nucleic acid packaging might be similar.
A structure-based phylogeny (Figure 4) suggests that the RecA-like proteins may be the closest cellular relatives of the P4, with ɸ12 being the most similar to the cellular proteins, ɸ8 being rather divergent and ɸ6 and ɸ13 rather similar to each other and intermediate in terms of divergence from the cellular proteins. These structural variations map onto the various functional specializations of the molecules so that although the motors have a common catalytic mechanism, they have developed somewhat different specificity and control mechanisms. We identify a specific hydrogen bond (between serine 292 and N7 of the purine ring) responsible for the purine specificity of the NTP hydrolysis reaction catalysed by ɸ12 P4 and find that an extraordinary insertion of the C-terminal peptide into the central channel of the hexamer explains the tight coupling of ATPase activity and RNA translocation in ɸ8. Furthermore, the ɸ8 P4 structure revealed a novel mechanism of power transduction to the RNA in which RNA is engaged with the L1 loop, which, in turn, is coupled to the L2 loop. Comparison between the P4 structures suggests that coupling between the two loops may be a general mechanistic feature of P4 and perhaps other SF4 helicases.
Figure 7. Mapping of HDX data on the ɸ8 P4 structure. HDX rates are coloured from slow-exchange (blue) to fast-exchange rates (red). Previously measured HDX rates (37) for ɸ8 P4 in the presence/absence of AMP, ADP, ATP and RNA (as indicated) were mapped onto the ɸ8 P4 monomer structure. The central box shows on the left, the orientation of all the monomers of the figure within the hexamer, and on the right, the same monomer in which the N- and C-terminal domains are coloured in blue and red, respectively.
Overall, the P4 machine represents a remarkable test bed where, by virtue of high mutational rates over long periods of time, nature has been able to devise a range of functional variations on the basic theme of regulated RNA translocation, resulting in an array of systems where although the molecular engine remains largely similar, the ignition and transmission systems have diverged markedly.