The way is the goal: how SecA transports proteins across the cytoplasmic membrane in bacteria

Abstract In bacteria, translocation of most soluble secreted proteins (and outer membrane proteins in Gram-negative bacteria) across the cytoplasmic membrane by the Sec machinery is mediated by the essential ATPase SecA. At its core, this machinery consists of SecA and the integral membrane proteins SecYEG, which form a protein conducting channel in the membrane. Proteins are recognised by the Sec machinery by virtue of an internally encoded targeting signal, which usually takes the form of an N-terminal signal sequence. In addition, substrate proteins must be maintained in an unfolded conformation in the cytoplasm, prior to translocation, in order to be competent for translocation through SecYEG. Recognition of substrate proteins occurs via SecA—either through direct recognition by SecA or through secondary recognition by a molecular chaperone that delivers proteins to SecA. Substrate proteins are then screened for the presence of a functional signal sequence by SecYEG. Proteins with functional signal sequences are translocated across the membrane in an ATP-dependent fashion. The current research investigating each of these steps is reviewed here.


INTRODUCTION
In bacteria, Sec-dependent translocation of proteins across the cytoplasmic membrane can occur by two different mechanisms: (i) a translationally coupled mechanism, which is conserved in all organisms and which is mediated by the signal recognition particle (SRP), and (ii) a bacteria-specific mechanism that is uncoupled from protein synthesis, which is mediated by the ATPase SecA (Fig. 1). In the translationally coupled pathway, recognition of nascent Sec substrates by the SRP ultimately results in binding of the ribosome to the protein-conducting channel in the cytoplasmic membrane (SecYEG), such that the nascent substrate protein is effectively synthesised directly across (or inserted directly into) the membrane (Saraogi and Shan 2014). If a Sec substrate protein is not recognised by the SRP, it is targeted for translocation by the SecA-mediated pathway (Lee and Bernstein 2001; Schierle et al. 2003). In this pathway, translocation of substrate proteins is independent of (i.e. 'uncoupled from') protein synthesis (Josefsson and Randall 1981a,b; Randall 1983). In Escherichia coli, a substantial proportion of the proteome is dependent on the SecA-mediated pathway for localisation, including most outer membrane proteins (OMPs) and soluble periplasmic proteins (PPs) (Oliver and Beckwith 1981, 1982a,b; Huber et al. 2005a). The near universal conservation of SecA in other bacteria suggests that this pathway is similarly important in all bacteria (van der Sluis and Driessen 2006).
Figure 1. Approximately 20% of all proteins synthesised in E. coli are ultimately translocated across the membrane by the Sec machinery. A minority of these newly synthesised proteins (∼7.5%) are integral cytoplasmic membrane proteins (IMPs), most of which are thought to be inserted into the membrane in a translationally coupled fashion (left). The rate of insertion of these proteins is ultimately limited by the rate of translation elongation (∼20 amino acids/second). A much larger fraction of newly synthesised Sec substrates (∼13.5% of all proteins synthesised) are translocated across the membrane by a bacteria-specific mechanism, which is dependent on the ATPase SecA (right). The rate of SecA-mediated translocation is uncoupled from protein synthesis and is much faster (>125 amino acids/second) than the rate of translation elongation (∼20 amino acids/second), which could allow the simultaneous synthesis of multiple substrate proteins destined for the same SecYEG channel.
SecA-mediated translocation can be divided into two steps: (i) targeting of substrate proteins to the membrane-bound translocation machinery and (ii) translocation through SecYEG across the membrane. Research over the past several years has significantly advanced our understanding of the mechanisms of both of these steps. Because several recent reviews have focussed on the mechanism of translocation (Park and Rapoport 2012; Chatzi et al. 2013; Collinson, Corey and Allen 2015), this review focuses more closely on the steps preceding translocation across the membrane.

Why do bacteria have a SecA-mediated translocation pathway?
One perennial question is: why do bacteria have two translocation pathways? Although there is no definitive answer to this question, one potential reason is that bacteria contain a limited number of SecYEG channels. Most bacteria do not extensively invaginate their cytoplasmic membranes or form subcellular compartments dedicated to protein secretion (e.g. the endoplasmic reticulum of eukaryotes). In addition, the bacterial Sec machinery shares the cytoplasmic membrane with a host of other machineries (e.g. the respiratory chain, cytochromes, F-ATPases, small-molecule transporters, flagella, other secretory systems, etc.). Estimates of relative protein abundance from protein synthesis rates suggest that there are not enough channels to support translocation exclusively by the translationally coupled mechanism (Li et al. 2014). Approximately 21% of all newly synthesised proteins are PPs, OMPs or integral cytoplasmic membrane proteins (IMPs) (Li et al. 2014). Assuming that an exponentially growing E. coli cell contains around 50 000 ribosomes (Bremer and Dennis 2008), stoichiometric ratios of proteins derived from ribosome profiling experiments suggest that there are ∼5000 SecY molecules per cell (Li et al. 2014). Some studies have estimated that there are as few as 500 copies of SecY per cell, but the number of ribosomes per cell in these instances is proportionately smaller (Matsuyama, Akimaru and Mizushima 1990; Wang et al. 2015). Numbers derived from Li et al. (2014) also suggest that ∼1.8 million Sec substrate proteins (IMPs, PPs and OMPs), with a combined length of around 420 million amino acids, are synthesised per generation. If translocation were purely cotranslational, all copies of SecY in the cell would be occupied and each would need to translocate around 50 amino acids per second at standard rates of growth (generation time of 25-30 min), i.e. more than double the maximum rate of translation elongation (Bremer and Dennis 2008).
Thus, cotranslational translocation alone is unlikely to keep pace with the rate of production of Sec substrate proteins. Other 'back-of-the-envelope' calculations have yielded similar conclusions (Pugsley 1993; Collinson, Corey and Allen 2015). However, if translocation is divorced from protein synthesis, multiple ribosomes could simultaneously synthesise proteins that are destined for translocation through the same SecYEG channel (Fig. 1). If translocation is much faster than translation elongation, translocation of all of these proteins could be accomplished in the same amount of time it would take to translocate a single protein cotranslationally. For example, assuming only IMPs are inserted cotranslationally (Ulbrandt, Newitt and Bernstein 1997; Schibich et al. 2016), ∼3700 SecYEG channels (i.e. ∼75% of the total) would be needed to insert all IMPs, and the rate of cotranslational insertion would be about 30 amino acids per second, much closer to the rate of translation elongation in vivo (Li et al. 2014). If the remaining ∼1300 SecYEG channels are left to translocate ∼1.2 million PPs and OMPs per generation (with a total of around 230 million amino acids), the minimum rate of SecA-mediated translocation would be ∼125 amino acids per second. If the channel is dimeric during SecA-mediated translocation (see discussion under 'The SecYEG complex' below), this number would be closer to 250 amino acids per second.
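The back-of-the-envelope estimate above can be reproduced in a few lines. The following is a minimal sketch, assuming the rounded figures quoted in the text (∼5000 SecY copies, ∼420 million amino acids of Sec substrate in total, ∼230 million of which belong to PPs and OMPs). The IMP load of ∼190 million amino acids is obtained here by subtraction, and the single generation time of 27.5 min (the midpoint of the 25-30 min range) and the function name are illustrative assumptions rather than values taken from the cited studies.

```python
# Back-of-the-envelope load on SecYEG channels, using rounded figures from the text.
# Assumption (not from the source): one generation time of 27.5 min,
# the midpoint of the 25-30 min range quoted above.

GENERATION_TIME_S = 27.5 * 60   # assumed generation time, in seconds
ELONGATION_RATE = 20            # translation elongation, amino acids per second

TOTAL_RESIDUES = 420e6          # all Sec substrates (IMPs, PPs and OMPs)
PP_OMP_RESIDUES = 230e6         # PPs and OMPs only
IMP_RESIDUES = TOTAL_RESIDUES - PP_OMP_RESIDUES  # derived by subtraction


def required_rate(residues, channels, generation_time_s=GENERATION_TIME_S):
    """Amino acids each channel must move per second to clear the load in one generation."""
    return residues / (channels * generation_time_s)


# Scenario 1: every Sec substrate is translocated cotranslationally through ~5000 channels.
all_cotranslational = required_rate(TOTAL_RESIDUES, 5000)

# Scenario 2: IMPs occupy ~3700 channels; PPs and OMPs share the remaining ~1300.
imp_rate = required_rate(IMP_RESIDUES, 3700)
seca_rate = required_rate(PP_OMP_RESIDUES, 1300)

print(f"purely cotranslational: {all_cotranslational:.0f} aa/s per channel")
print(f"IMPs only, cotranslational: {imp_rate:.0f} aa/s per channel")
print(f"PPs and OMPs via SecA: {seca_rate:.0f} aa/s per channel")
print(f"SecA rate relative to elongation: {seca_rate / ELONGATION_RATE:.1f}x")
```

With these assumptions the sketch gives roughly 51, 31 and 107 amino acids per second, respectively. The last value is somewhat lower than the ∼125 amino acids per second quoted above and rises to about 118 for a 25-min generation time, so the exact figures are sensitive to the assumed generation time and to rounding of the inputs.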

The Sec machinery
The SecYEG complex
The central component of the Sec machinery is the evolutionarily conserved protein-conducting channel in the cytoplasmic membrane (Park and Rapoport 2012). In bacteria, this channel is formed by an integral membrane protein complex composed of SecY, SecE and SecG, which are present in a 1:1:1 stoichiometry. The main component, SecY, is homologous to the eukaryotic Sec61α and archaeal SecY proteins (Park and Rapoport 2012; Collinson, Corey and Allen 2015). SecY contains 10 transmembrane domains arranged in pseudo-2-fold symmetry, which forms an aqueous channel in the cytoplasmic membrane that is shaped like an hourglass (Fig. 2A-F). At the centre of the hourglass, there is a narrow constriction that is lined by long-chain aliphatic residues (Van den Berg et al. 2004) (Fig. 2G-I). Substrate proteins pass through this constriction during translocation across the membrane (Cannon et al. 2005; Li et al. 2016) (Fig. 2C, F and I). In the resting state, the channel is blocked from the exterior by a small α-helical 'plug' domain (Van den Berg et al. 2004; Li et al. 2007) (Fig. 2A, D and G). Finally, SecY contains a lateral gate between the halves of the protein, which opens to allow partitioning of transmembrane helices and signal sequences into the membrane (Fig. 2G). Opposite the lateral gate is a 'hinge' that links the two halves of SecY (Van den Berg et al. 2004). SecE and SecG appear to stabilise SecY. SecE binds to the exterior of SecY spanning both sides of the hinge (Van den Berg et al. 2004) (Fig. 2D and G), and SecY is rapidly degraded in its absence (Taura et al. 1993). SecG is not essential for translocation (Brundage et al. 1990; Nishiyama, Hanada and Tokuda 1994), but mutations disrupting SecG decrease the rate of translocation (Nishiyama, Hanada and Tokuda 1994). In addition, genetic evidence suggests that SecG stabilises the nontranslocating form of the channel (Belin et al. 2015).
Structural and biophysical studies suggest that SecYEG occupies at least three conformations: (i) closed, (ii) partially open and (iii) open (Van den Berg et al. 2004; Zimmer, Nam and Rapoport 2008; Allen et al. 2016; Li et al. 2016). In the closed, nontranslocating state, the plug domain blocks the exterior opening to the constriction, and the lateral gate is tightly closed (Van den Berg et al. 2004) (Fig. 2A, D and G). Single-molecule fluorescence measurements suggest that binding of ADP-bound SecA to SecYEG results in a small increase in the diameter of the channel, resulting in formation of the 'part-open' conformation (Allen et al. 2016). However, binding of SecA to ATP causes a large dilation of the constriction and a partial destabilisation of the plug (Zimmer, Nam and Rapoport 2008; Allen et al. 2016) (Fig. 2B, E and H). Opening of the channel could be further stabilised during translocation by the intercalation of a signal sequence or transmembrane helix into a binding site on the exterior of the lateral gate (Fig. 2F and I). Mutations known as prl mutations (Bieker, Phillips and Silhavy 1990) have been isolated in the genes encoding all three components of the channel. These mutations allow the translocation of proteins with defective (or absent) signal sequences in vivo (Bieker, Phillips and Silhavy 1990) and appear to destabilise the closed form of the channel (Van den Berg et al. 2004; Li et al. 2007; Belin et al. 2015).
Early studies suggested that the channel is very narrow. Folding, even of relatively small substrate proteins, prevents translocation across the membrane in vivo (Randall and Hardy 1986), and the introduction of stably folded elements into a Sec substrate protein results in trapping of partially translocated intermediates in vitro (Uchida, Mori and Mizushima 1995). However, the exact size of the channel formed by SecYEG is a matter of some debate. Molecular dynamics simulations of the SecYEG monomer suggest that it could accommodate structures up to ∼16 Å (Schulten 2006, 2007; Tian and Andricioaei 2006), and experimental evidence suggests that the channel can expand to ∼22 Å (Bonardi et al. 2011).
Several lines of evidence suggest that SecYEG normally dimerises, but the physiological role of the dimers, if any, is unknown. The most widely accepted dimer interface is located at the back of the hinge domain, resulting in a 'back-to-back' arrangement (Veenendaal, van der Does and Driessen 2001; Mori et al. 2003; Deville et al. 2011). High-resolution structures suggest that one copy of SecYEG interacts with SecA during translocation (Zimmer, Nam and Rapoport 2008; Li et al. 2016). However, conclusions from mechanistic studies investigating the requirement for dimerisation of SecYEG in SecA-mediated translocation are mixed (Osborne and Rapoport 2007; Deville et al. 2011). Furthermore, it has been proposed that SecYEG could (also) form 'front-to-front' dimers, in which SecYEG protomers interact with each other via the lateral gate, in order to accommodate substrate proteins with more extensive tertiary structure (Bonardi et al. 2011; Das and Oliver 2011). However, a front-to-front dimer would preclude the interaction of SecY with many auxiliary Sec components (see below), which appear to interact with the lateral gate (Sachelaru et al. 2013, 2014; Botte et al. 2016).

SecA
SecA is required for the translocation of most proteins in E. coli (Oliver and Beckwith 1981, 1982a,b). The ATPase activity of SecA occurs at the interface of two nucleotide-binding domains (NBD-1 and NBD-2) (Schmidt et al. 1988; Lill et al. 1989; Hunt et al. 2002) (Fig. 3A, dark blue and cyan, respectively), which are related to those of RecA-like helicases (Hunt et al. 2002; Sharma et al. 2003; Ye et al. 2004). The primary structure of NBD-1 is interrupted by a domain known as the polypeptide crosslinking domain (PPXD) (Hunt et al. 2002) (Fig. 3A, light blue), which contacts the substrate polypeptide during translocation (Bauer and Rapoport 2009). C-terminal to NBD-2 is an α-helical domain that is composed of two subdomains (Hunt et al. 2002): (i) the α-helical scaffold domain (HSD) (Fig. 3A, red) and (ii) the α-helical wing domain (HWD) (Fig. 3A, orange). The HSD contains a two-helix finger (2HF) near the C-terminus, which contacts the substrate protein and plays a critical role in protein translocation (Erlandson et al. 2008). In addition, most SecA proteins contain a C-terminal tail (CTT) that is not resolved in high-resolution structures. In E. coli, the CTT contains a small zinc-binding domain (ZnBD) that is required for the efficient interaction of SecA with its binding partner SecB (Fekkes et al. 1997, 1999). However, the ZnBD is present in SecA in many species that lack SecB (van der Sluis and Driessen 2006), suggesting that the ZnBD (and the CTT) has another function. For example, it has been suggested that the CTT could autoinhibit SecA by competing for interaction with substrate proteins, although the significance of this activity is unknown (Gelis et al. 2007).
Figure 2 (caption, partial). … LG) and hinge are indicated, and the aliphatic residues lining the pore ring constriction are depicted as sticks (green). The translocating peptide is coloured blue (F and I). The structure of SecA in the 3DIN and 5EUL structures has been cut away to more clearly illustrate the conformational changes in the channel. Structural models were rendered using UCSF-Chimera v 1.12 (Pettersen et al. 2004).
SecA undergoes a large conformational change upon interaction with substrate protein or SecYEG (Zimmer, Nam and Rapoport 2008; Chen et al. 2015) (Fig. 3B). In the X-ray crystal structure of SecA from Bacillus subtilis (1M6N), the PPXD is positioned near the HWD (Hunt et al. 2002), a state known as the 'closed' conformation. However, upon interaction with substrate protein or SecYEG, the PPXD undergoes a large rotation and translation that brings it into proximity with NBD-2 (and away from the HWD) (Zimmer, Nam and Rapoport 2008; Chen et al. 2015), the 'open' conformation. SecA binds to substrate protein in the groove between NBD-1/-2 and the PPXD (Zimmer and Rapoport 2009). 'Opening' of the clamp encloses the substrate protein, which is thought to stabilise its interaction with SecA (Zimmer, Nam and Rapoport 2008; Gold et al. 2013). Opening of the clamp also activates the ATPase activity of SecA by increasing the rate of nucleotide exchange (Fak et al. 2004; Gold et al. 2013). The PPXD occupies several part-open conformations in different high-resolution structures (Gelis et al. 2007; Chen et al. 2015). For example, NMR studies suggest that ∼10% of the protein is in the closed conformation while ∼90% occupies a 'partially open' conformation (Gelis et al. 2007). These intermediate conformations probably represent transition states between the closed and open conformations but could serve another as-yet undetermined function.
Figure 3 (caption, partial). … (Gelis et al. 2007). The locations of NBD-1 (dark blue), NBD-2 (cyan), PPXD (light blue), HSD (red) and HWD (orange) are indicated. The CTT is absent in high-resolution structures and is not depicted. The approximate locations of the substrate binding site between the PPXD and NBD-2 and the signal sequence binding site between the PPXD and the HWD are likewise indicated. (B) Structures of SecA illustrating the large translational and rotational movement of the PPXD (light blue) during conversion between the closed (1M6N) (Hunt et al. 2002), part-open (4YS0) (Chen et al. 2015) and open (3DIN) (Zimmer, Nam and Rapoport 2008) conformations. (C) Structure of SecA (grey, PPXD in light blue) in complex with SecYEG (purple) (3DIN) from two angles (Zimmer, Nam and Rapoport 2008). This structure illustrates the deep penetration of the two-helix finger (2HF; green) into the SecYEG channel (purple) and binding of the TM6/7 loop by the PPXD of SecA. Structural models were rendered using UCSF-Chimera v 1.12 (Pettersen et al. 2004).
Binding of SecA to SecYEG involves extensive contact between the two proteins and results in conformational changes in both proteins (Mori and Ito 2006; Zimmer, Nam and Rapoport 2008; Das and Oliver 2011; Li et al. 2016) (Fig. 3C). For example, the 2HF of SecA inserts deep into the channel, and SecA binds the large cytoplasmic TM6/7 loop of SecY between the PPXD and the HSD (Zimmer, Nam and Rapoport 2008). Binding of SecA to ATP appears to destabilise the closed form of the channel (Zimmer, Nam and Rapoport 2008; Allen et al. 2016; Li et al. 2016).
The oligomeric state of SecA during translocation has been a matter of some dispute. It is well established that SecA forms homodimers in solution (Akita et al. 1991; Driessen 1993; Hirano, Matsuyama and Tokuda 1996; Doyle, Braswell and Teschke 2000; Woodbury, Hardy and Randall 2002). X-ray crystal structures of the SecA dimer suggest several different dimer interfaces (for example, see Hunt et al. 2002; Vassylyev et al. 2006; Zimmer, Li and Rapoport 2006; Papanikolau et al. 2007). Site-specific crosslinking studies indicate that SecA prefers one of these conformations when overproduced in vivo (Banerjee, Lindenthal and Oliver 2017). However, purified SecA probably populates several different dimers in solution (Woodbury, Hardy and Randall 2002; Kusters et al. 2011; Auclair, Oliver and Mukerji 2013). The role of dimerisation (if any) is unclear. Dimerisation appears to enhance protein translocation (Driessen 1993; Jilaveanu, Zito and Oliver 2005; Jilaveanu and Oliver 2006; Kusters et al. 2011; Gouridis et al. 2013). However, monomeric versions of SecA can promote protein translocation (Or, Navon and Rapoport 2002; Or et al. 2005), and high-resolution structures of the SecA-SecYEG complex indicate that SecA docks with SecYEG in a 1:1 stoichiometry (Zimmer, Nam and Rapoport 2008; Li et al. 2016).

Sec targeting signals
Proteins destined for SecA-mediated translocation across the membrane share two common features. First, they contain an internally encoded targeting signal that allows them to be recognised by the Sec machinery, which usually takes the form of an N-terminal signal sequence (Hegde and Bernstein 2006). Second, all substrate proteins contain features that allow them to be maintained in an unfolded conformation prior to translocation (Randall and Hardy 1986; Schatz and Dobberstein 1996).

N-terminal signal sequences
In the 1970s, Blobel and colleagues proposed that secreted proteins contained peptide sequences at their N-termini that allowed them to be recognised by the translocation machinery (Blobel and Dobberstein 1975a,b). These N-terminal signal sequences were first identified genetically in bacteria by mutations that prevented translocation of reporter fusion proteins across the cytoplasmic membrane (Emr, Schwartz and Silhavy 1978; Bassford and Beckwith 1979). Subsequent work indicated that signal sequences have a conserved primary structure, which consists of a hydrophobic core flanked by shorter N- and C-domains (Fig. 4A) (von Heijne 1990; Hegde and Bernstein 2006). The N-domain, located N-terminal to the hydrophobic core, is positively charged and may play a role in orienting the signal sequence in the membrane (von Heijne 1990; Andersson, Bakker and von Heijne 1992). The hydrophilic C-domain is less positively charged than the N-domain and contains a recognition site for signal peptidase-1 or -2, which cleaves the signal sequence from the precursor during translocation to form the mature protein (Josefsson and Randall 1981a,b; Perlman and Halvorson 1983; Wolfe and Wickner 1984; von Heijne 1990; Hegde and Bernstein 2006). Processing by signal peptidase-2 is associated with lipidation of the N-terminal cysteine of the mature protein (Hegde and Bernstein 2006). All three domains can vary in length. However, an analysis of signal sequence-containing proteins from E. coli K-12 in the UniProtKB database indicates that the median signal sequence length in E. coli is 22 amino acids, with a minimum of 15-16 amino acids (Fig. 4B).
The decision whether to export substrate proteins by the translationally coupled pathway or by the translationally uncoupled pathway depends on the hydrophobicity of the signal sequence (Lee and Bernstein 2001; Bowers, Lau and Silhavy 2003; Schierle et al. 2003; Huber et al. 2005a). Very hydrophobic signal sequences are recognised by the SRP and are targeted for translationally coupled translocation (Huber et al. 2005a; Schibich et al. 2016). Those that fail to be recognised by the SRP appear to be targeted to the SecA-dependent pathway by default (Lee and Bernstein 2001; Schierle et al. 2003).

Other targeting signals
Some Sec substrate proteins appear to contain additional targeting signals that allow them to be recognised by the Sec machinery. For example, it has been suggested that the molecular chaperone SecB can recognise a subset of Sec substrate proteins (Derman et al. 1993; Kumamoto and Francetic 1993; Prinz et al. 1996; Randall et al. 1997, 1998; Knoblauch et al. 1999). In addition, recent work suggests that SecA can recognise polypeptide sequences in the mature portion of some substrate proteins (Chatzi et al. 2017). Finally, at least one protein (the SodA protein of Rhizobium leguminosarum), which lacks a signal sequence, can be recognised by the Sec machinery in both Rhizobium and E. coli (Krehenbrink, Edwards and Downie 2011).
Figure 4 (caption, partial). … the hydrophobic core (grey) and the C-domain (C; black). If present, the signal peptidase recognition site is contained at the C-terminal portion of the C-domain and results in cleavage of the signal sequence from the mature Sec substrate protein during translocation. (B) Analysis of the length of E. coli signal sequences in the UniProtKB database. Protein entries in the UniProtKB database for E. coli K-12 were screened for those containing the key feature 'signal peptide'. The lengths of these signal peptides were then determined and plotted as a histogram. The median signal sequence length of this set (22) is indicated. (C) The hydrophobicity of the signal sequences in (B) was determined according to Huber et al. (2005a) and plotted as a histogram. The minimum hydrophobicity required for SRP recognition indicates that most N-terminal cleavable signal sequences are targeted for SecA-mediated translocation.

Folding of the substrate protein
Because the channel is only large enough to accommodate unfolded proteins (Van den Berg et al. 2004; Schulten 2006, 2007; Tian and Andricioaei 2006; Bonardi et al. 2011; Li et al. 2016), Sec substrate proteins must be kept unfolded in the cytoplasm (Randall and Hardy 1986; Teschke et al. 1991; Uchida, Mori and Mizushima 1995; Li et al. 2016). Mutations that slow folding can compensate for defects in targeting caused by defective signal sequences (Teschke et al. 1991; Song and Park 1995). Indeed, many proteins (e.g. some normally cytoplasmic proteins and heterologously expressed proteins) are refractory to translocation because they rapidly fold in the cytoplasm before they can be transported through SecYEG (Huber et al. 2005b, 2010; Steiner et al. 2006). Because most substrates of the SecA-mediated pathway exist transiently as full-length cytoplasmic intermediates (Josefsson and Randall 1981a,b), bacteria have evolved multiple mechanisms to prevent premature cytoplasmic folding. For example, the signal sequences of some precursor proteins can slow their folding (Park et al. 1988; Beena, Udgaonkar and Varadarajan 2004). In addition, molecular chaperones (e.g. SecB) can bind to a subset of Sec substrates and prevent cytoplasmic folding (Collier et al. 1988; Kumamoto and Gannon 1988). Finally, some substrate proteins require covalent modifications (e.g. disulfide bonds) in order to fold stably, and these modifications can only be made after the protein has been localised to the correct compartment (Hatahet, Boyd and Beckwith 2014).

Recognition of substrate proteins by the Sec machinery
Substrate proteins must be recognised by some component of the Sec machinery. For many years, it was generally assumed that SecB recognised substrate proteins and delivered them for SecA-mediated translocation by interacting with SecA (Hartl et al. 1990; Fekkes et al. 1998; Driessen and Nouwen 2008). However, recent research also implicates SecA and SecYEG in substrate protein recognition, and it seems likely that all three components are involved in targeting substrate proteins for SecA-mediated translocation.

Recognition by SecB
SecB is a homotetrameric molecular chaperone, which is required for the efficient translocation of a subset of proteins exported by the SecA-mediated pathway. SecB interacts with its nascent substrate proteins cotranslationally (Kumamoto and Francetic 1993), and mutations that disrupt the secB gene cause the translocation of maltose-binding protein (MalE) to become fully posttranslational (Kumamoto and Gannon 1988). (A significant portion of newly synthesised MalE is translocated cotranslationally, although translocation begins at much longer nascent chain lengths than is typical for SRP-mediated translocation (Josefsson and Randall 1981a,b; Schierle et al. 2003), which is typical of many substrates of the SecA-mediated pathway (Josefsson and Randall 1981a,b).) Biochemical studies supported the idea that substrate proteins were transferred from SecB to SecA and then translocated through SecYEG (Hartl et al. 1990). SecB interacts with SecA (den Blaauwen et al. 1997; Fekkes et al. 1997; Randall and Henzl 2010), and substrate protein strengthens this interaction (Fekkes et al. 1998). Finally, mutant SecB proteins that are defective for interaction with SecA accumulate in a substrate-bound form in vivo (Gannon and …). However, SecB cannot be the only, or even the primary, protein that recognises substrates of the SecA-mediated pathway. Escherichia coli mutants lacking SecB are viable (Kumamoto and Gannon 1988; Shimizu, Nishiyama and Tokuda 1997) and are defective in the translocation of a relatively small subset of proteins (Kumamoto and Beckwith 1985; Baars et al. 2006). Even for these substrate proteins, translocation is only partially defective in the absence of SecB (Kumamoto and Beckwith 1983). Finally, SecB is not found in all bacteria (van der Sluis and …).

Direct recognition of substrate proteins by SecA
One possibility is that SecA recognises its substrate proteins directly. The only protein components required for translocation in vitro are SecA, SecY and SecE (Brundage et al. 1990), suggesting that one of these proteins can recognise substrate proteins. SecA binds directly to signal sequence-like peptides (Gelis et al. 2007; Auclair et al. 2010; Zhang et al. 2016). In addition, the presence of a signal sequence increases the affinity of SecA for unfolded proteins (Kebir and Kendall 2002; Gouridis et al. 2009) and alters the behaviour of SecA towards substrate protein (Eser and Ehrmann 2003). These findings suggest that SecA can directly recognise proteins containing signal sequences. SecA has also been implicated in the recognition of internally encoded targeting signals (Chatzi et al. 2017).
SecA also interacts cotranslationally with nascent Sec substrates in vivo (Chun and Randall 1994; Huber et al. 2017). This interaction appears to be mediated by a specific interaction between SecA and the ribosome (Huber et al. 2011, 2017). SecA binds to the ribosome near the site where nascent chains emerge into the cytoplasm (Huber et al. 2011; Singh et al. 2014), and disrupting this interaction causes a partial defect in SecA-mediated translocation (Huber et al. 2011). In addition, it strongly disrupts the interaction between SecB and its nascent substrate proteins (Huber et al. 2017), suggesting that the interaction of SecA with nascent substrates precedes the interaction of SecB with these proteins. It is not yet known whether SecA recognises all substrate proteins cotranslationally or only a subset. One recent study found that binding of SecA to the ribosome was required for the insertion of the IMP RodZ (Wang, Yang and Shan 2017). This requirement could explain the dependence of MreB, which binds to RodZ, on SecA for its localisation (Govindarajan and Amster-Choder 2017). However, a different study found that SecA interacted with all (or most) nascent Sec substrates (Huber et al. 2017). One explanation is that SecA normally recognises all substrate proteins cotranslationally but that only a subset require cotranslational recognition for insertion.

Recognition of substrate proteins by the lateral gate of the SecYEG channel
Finally, it is possible that the channel itself recognises substrate proteins. It has been suggested that signal sequences are required to 'unlock' SecYEG prior to translocation (Hizlan et al. 2012), which could serve as a recognition step. In addition, the presence of a signal-sequence-like peptide can stimulate SecA-mediated translocation of signal sequence-less substrate proteins when added in trans (Gouridis et al. 2009), suggesting that SecY itself may recognise substrate proteins. However, direct recognition by SecYEG seems unlikely since SecY would need to screen all newly synthesised substrate proteins, even those synthesised in the cytoplasm.
Together, this research suggests that substrate recognition occurs in two steps: initial recognition and quality control (Fig. 5). First, SecA recognises the substrate protein. Alternatively, molecular chaperones, such as SecB, recognise substrate proteins and deliver them to the translocation machinery by interacting with SecA. Second, interaction of the signal sequence with SecYEG serves as a quality control step ensuring that only proteins with functioning signal sequences are translocated across the membrane.
Figure 5. Proposed pathways for targeting substrate proteins for SecA-mediated translocation. It appears that substrate proteins are initially recognised by the Sec machinery by two different mechanisms: (1) SecA cotranslationally recognises nascent Sec substrate proteins as they emerge from the ribosome by virtue of an internally encoded targeting signal. SecA may then recruit SecB to the substrate protein (1a) or deliver the protein directly to SecYEG (3). (2) Alternatively, a subset of substrate proteins may be recognised by SecB and delivered to the Sec machinery through the interaction of SecB with SecA (2a), which ultimately delivers the protein to SecYEG (3). Incorporation of the signal sequence into the lateral gate of SecYEG may serve as a final quality control step to prevent the translocation of proteins with signal-sequence-like regions in their primary structure (4).

The role of ribosome-associated chaperones in determining the timing of translocation
Different substrate proteins are delivered to SecYEG for translocation with differing kinetics in vivo (Josefsson and Randall 1981a,b). In E. coli, one factor that influences the timing of delivery is the ribosome-associated chaperone Trigger Factor (TF) (Oh et al. 2011). Mutations disrupting the gene encoding TF (tig) cause the translocation of multiple SecA substrate proteins to become more cotranslational (Ullers et al. 2007;Oh et al. 2011) and can suppress the translocation defect caused by mutations in secB (Lee and Bernstein 2002;Ullers et al. 2007). This phenotype is reminiscent of the ability of chloramphenicol, which causes translocation to become more cotranslational at subinhibitory concentrations (Kadokura and Beckwith 2009), to suppress translocation defects in many sec mutants (Lee and Beckwith 1986). The physiological importance of the delay in targeting caused by TF is unknown. One idea is that competition between TF and the SRP influences the choice of translocation pathways, perhaps by making the SRP more selective (Eisner et al. 2003, 2006;Bornemann, Holtkamp and Wintermeyer 2014;Ariosa et al. 2015). Alternatively, it is possible that TF prevents excessive cotranslational translocation by the SecA-mediated pathway, which is toxic (van Stelten et al. 2009).

The mechanism of SecA-mediated protein translocation
The mechanism of SecA-mediated translocation has been investigated intensively, and these studies have yielded a wealth of mechanistic evidence. SecA can initiate translocation in both the ATP- and ADP-bound forms, and the subsequent translocation of substrate proteins through SecYEG requires rounds of ATP binding and hydrolysis (Schiebel et al. 1991). The rate-limiting step in the ATPase cycle of SecA is nucleotide exchange, and interaction of SecA with substrate protein and with SecYEG increases the rate of nucleotide exchange (Fak et al. 2004). Biochemical studies have suggested that each round of ATP binding and hydrolysis results in the translocation of ∼50 amino acids (Tani et al. 1989;Schiebel et al. 1991;van der Wolk et al. 1997), and the length of time required for translocation increases with increasing length of substrate protein (Tomkiewicz et al. 2006). Binding of SecA to ATP results in high-affinity binding to SecYEG and in the protection of a large portion of SecA from proteolytic digestion, indicating that SecA undergoes a large conformational change upon binding to ATP (Economou and Wickner 1994). At later stages of translocation, the proton-motive force (PMF) also assists in translocation by an unknown mechanism (Schiebel et al. 1991).
Despite this wealth of evidence, the molecular mechanism of SecA-mediated translocation is disputed. Several distinct mechanistic models have been proposed to account for the above observations, and these models can generally be divided into three types: (i) processive, (ii) probabilistic and (iii) mixed processive/probabilistic. Processive models depend entirely on mechanical pushing force provided by SecA (e.g. see Gouridis et al. 2013). In contrast, probabilistic models rely on Brownian movement of the polypeptide chain through a channel (e.g. see Allen et al. 2016). Finally, mixed models contain both processive and probabilistic elements (e.g. see Bauer et al. 2014).
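The step size of ∼50 amino acids per ATPase cycle quoted above implies a simple lower bound on the number of ATP binding and hydrolysis rounds needed for a substrate of a given length. The sketch below is only a back-of-the-envelope illustration, assuming that every cycle moves exactly the same number of residues; the function name and the 350-residue example are hypothetical and are not taken from the cited studies.

```python
import math

# Approximate step size reported in the biochemical studies cited above.
RESIDUES_PER_ATP_CYCLE = 50


def min_atp_cycles(protein_length: int, step: int = RESIDUES_PER_ATP_CYCLE) -> int:
    """Smallest number of ATPase cycles that could move `protein_length`
    residues if every cycle moved exactly `step` residues (an assumption)."""
    if protein_length < 0 or step <= 0:
        raise ValueError("protein_length must be >= 0 and step must be > 0")
    return math.ceil(protein_length / step)


# Hypothetical 350-residue substrate protein.
print(min_atp_cycles(350))
```

Under this assumption, the number of cycles grows in proportion to substrate length, which is at least qualitatively in line with the observation that longer substrate proteins take longer to translocate.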

Processive translocation by mechanical pushing
Traditionally, SecA has been viewed as a mechanical pump that pushes proteins through SecYEG (van der Wolk et al. 1997). In order to translocate substrate proteins in discrete steps of 50 amino acids, processive models would require a movement within SecA of ∼75 Å (or two movements of ∼37 Å, van der Wolk et al. 1997), assuming that the substrate protein is purely α-helical. Most plausible processive models require SecA to dimerise or to oligomerise in order to account for such large translational motions (Gouridis et al. 2013). Multimerisation would also be consistent with the similarity of SecA to RecA-like helicases, which frequently multimerise in order to unwind RNA molecules (Ye et al. 2004). In these models, binding of SecA to ATP causes a conformational change in the SecA dimer, which pushes the protein through SecYEG (Schiebel et al. 1991) (Fig. 6A). Subsequent hydrolysis of ATP to ADP results in resetting of the motor to the pre-translocation state and could also result in a second round of translocation (van der Wolk et al. 1997). One key feature of processive models is that translocation is unidirectional: some feature prevents the retrograde translocation of the substrate protein when the motor protein resets to its pre-translocation state.
Recent research suggesting that SecA is monomeric during translocation has called into question the validity of most classical processive models (Or, Navon and Rapoport 2002;Or et al. 2005;Zimmer, Nam and Rapoport 2008). However, one recent study suggests that SecA could cycle between dimeric and monomeric forms during translocation (Gouridis et al. 2013). If true, such a mechanism could explain some of the apparently contradictory results surrounding the oligomeric state of the protein.
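The ∼75 Å movement required by processive models follows from simple arithmetic on the step size. The sketch below reproduces that estimate, assuming the standard rise of 1.5 Å per residue along the axis of an ideal α-helix; the function and constant names are illustrative only.

```python
# Axial rise per residue of an ideal alpha-helix, in angstroms.
ALPHA_HELIX_RISE_A = 1.5


def stroke_length(residues_per_step: float,
                  rise_per_residue: float = ALPHA_HELIX_RISE_A) -> float:
    """Length (in angstroms) of polypeptide moved in one step, assuming the
    chain has a uniform rise per residue along the direction of travel."""
    return residues_per_step * rise_per_residue


single_stroke = stroke_length(50)   # one movement per ATPase cycle
half_stroke = single_stroke / 2     # two equal movements per cycle
print(single_stroke, half_stroke)
```

A more extended polypeptide would have a larger rise per residue, so the α-helical assumption gives the shortest stroke compatible with a 50-residue step.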

Probabilistic translocation by ratcheted diffusion
An alternative to translocation by mechanical pushing is ratcheted diffusion of the substrate protein. In this model, SecA serves as a regulator protein, which causes opening of the SecYEG channel in the presence of translocating polypeptide (Fig. 6B) (Allen et al. 2016). In brief, binding of SecA to ADP causes SecYEG to occupy a partially open conformation, and binding of SecA to ATP causes SecYEG to occupy the fully open conformation. In the partially open conformation, only amino acids with small side chains can pass through the constriction in SecYEG. However, the presence of a translocating polypeptide in the channel is sensed by SecYEG and the 2HF of SecA and results in exchange of ADP for ATP. Binding to ATP causes SecA to open the channel, allowing the translocation of polypeptides containing bulky amino acids. This model requires a single copy of SecA but is still consistent with the basic parameters of SecA-mediated translocation determined by biochemical experiments. In addition, it provides an explanation for (i) the importance of the 2HF in SecA-mediated translocation (Erlandson et al. 2008;Zimmer, Nam and Rapoport 2008); (ii) how limiting the rate of nucleotide exchange promotes protein translocation (Fak et al. 2004); and (iii) why prl mutations, which appear to destabilise the channel, promote more promiscuous translocation (Van den Berg et al. 2004). Finally, a diffusional model could explain the rates of translocation observed in vivo (Simon, Peskin and Oster 1992). One issue not addressed by this model is how translocation is driven in the forward direction. It is possible that interaction with periplasmic chaperones, such as PpiD (Antonoaea et al. 2008), prevents backsliding in vivo, and several additional mechanisms have been suggested (Allen et al. 2016).
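The question of directionality can be illustrated with a generic toy model of a Brownian ratchet. The sketch below is not an implementation of the model of Allen et al. (2016) and does not include nucleotide-dependent channel gating; it only shows, under invented parameters, how any feature that blocks backsliding converts unbiased diffusion into net forward movement. The function name, the ratchet spacing and the assumption that the chain cannot slide back past its starting point are all hypothetical.

```python
import random
from typing import Optional, Tuple


def simulate(steps: int, ratchet_spacing: Optional[int], seed: int = 0) -> Tuple[int, int]:
    """Unbiased 1D random walk of the chain position, in residues.

    The chain can never slide back past position 0 (an assumption). If
    `ratchet_spacing` is given, it also cannot slide back past the highest
    multiple of that spacing it has reached so far, a stand-in for any
    feature that prevents backsliding. Returns (final position, final floor).
    """
    rng = random.Random(seed)
    position = 0
    floor = 0  # lowest position the chain is currently allowed to occupy
    for _ in range(steps):
        position += rng.choice((-1, 1))  # forward and backward equally likely
        if ratchet_spacing is not None:
            # Move the floor up to the highest ratchet site passed so far.
            floor = max(floor, (position // ratchet_spacing) * ratchet_spacing)
        position = max(position, floor)  # backward moves stop at the floor
    return position, floor


free_position, _ = simulate(10_000, ratchet_spacing=None, seed=1)
ratcheted_position, _ = simulate(10_000, ratchet_spacing=10, seed=1)
print(free_position, ratcheted_position)
```

Because both runs use the same sequence of random steps, the ratcheted chain can never end up behind the free chain; how far ahead it gets depends on the spacing of the ratchet sites.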

The push-and-slide model (mixed processive and probabilistic)
Finally, it has been suggested that translocation could proceed by a 'push-and-slide' mechanism, which includes elements of both processive and probabilistic models (Bauer et al. 2014) (Fig. 6C). This mechanism is also diffusional in nature, but the direction of translocation is biased due to pushing by the 2HF. Binding of SecA to SecYEG results in opening of the channel, allowing the movement of polypeptides through the channel. Binding to ATP results in translation of the 2HF. The tip of the 2HF contains a conserved aromatic residue, which is thought to bind to substrate protein and 'push' the substrate polypeptide through SecYEG (Erlandson et al. 2008;Bauer et al. 2014). Although this movement is not large, diffusion of the polypeptide through SecYEG allows for the large step size (Schiebel et al. 1991;Bauer et al. 2014). Afterwards, hydrolysis of ATP resets the 2HF to its pre-translocation position. This model has many of the same strengths as the ratcheted diffusion model (Allen et al. 2016) and provides an explanation for the overall directionality of translocation. However, it is not clear how the aromatic residue in the 2HF 'lets go' of the substrate polypeptide after ATP hydrolysis in order to prevent retrograde translocation. In addition, this model relies on a large translational movement by the 2HF, but biochemical studies suggest that such a movement is not required for translocation.

The involvement of the auxiliary Sec components in SecA-mediated translocation
SecYEG associates with several auxiliary Sec components, including SecD, SecF, YajC and YidC, to form a supercomplex known as the holotranslocon (Duong and Wickner 1997;Schulze et al. 2014). This complex has been implicated in the assembly of IMPs (Botte et al. 2016;Komar et al. 2016). However, mutations affecting each of these components can cause defects in SecA-mediated protein translocation in vivo (Gardel et al. 1987, 1990;Pogliano and Beckwith 1994;Samuelson et al. 2000), suggesting that the holotranslocon or its individual constituents assist in SecA-mediated translocation.
Figure 6. Proposed mechanisms for SecA-mediated translocation. Mechanistic models for translocation can be grouped into three classes: (A) processive models, (B) probabilistic models (ratcheted diffusion) and (C) mixed processive/probabilistic models ('push-and-slide'). (A) Processive models require a 'power stroke' that results in the translocation of around 5 kDa (∼50 amino acids) per round of ATP binding and hydrolysis. In order to account for the large 'step size' for each round of translocation, most processive models require SecA to oligomerise. Binding of SecA to ATP results in a conformational change that mechanically pushes the substrate protein through SecYEG. It has been proposed that hydrolysis of ATP could result in a second pushing step (van der Wolk et al. 1997). (B) In the ratcheted diffusion model (Allen et al. 2016), translocation is probabilistic. SecA gates opening of the channel, allowing the substrate protein to translocate through the channel by diffusion. In the ADP-bound form, SecA causes the channel to occupy a part-open conformation that allows limited diffusion of the polypeptide chain. However, the presence of a polypeptide chain in the channel is sensed by the 2HF of SecA, which promotes nucleotide exchange. Binding to ATP opens the channel, allowing free diffusion of the polypeptide chain. (C) In the 'push-and-slide' mechanism (Bauer et al. 2014), binding of SecA to the channel results in opening of the channel and allows the polypeptide chain to diffuse freely through it. The direction of diffusion is biased by pushing from the 2HF: binding to ATP results in a translation of the 2HF, which pushes the polypeptide chain through SecYEG. ATP hydrolysis resets the 2HF without pulling on the polypeptide chain.

SecDF
In E. coli, mutations in the secD and secF genes cause a strong defect in the translocation of many SecA substrate proteins and a severe cold-sensitive growth defect (Gardel et al. 1987, 1990). The products of these genes are two IMPs, which form a complex in the cytoplasmic membrane (Gardel et al. 1990;Duong and Wickner 1997). SecD and SecF are encoded in the same operon in E. coli and are produced as a single polypeptide chain in many bacteria (Bolhuis et al. 1998). Each protein contains six transmembrane helices, and the pair of proteins resembles complementary halves of a proton-driven pump (Gardel et al. 1990;Tseng et al. 1999). Genetic and biochemical studies suggest that SecDF assists in the later steps of SecA-mediated translocation (Gardel et al. 1990;Nouwen et al. 2005;Tsukazaki et al. 2011). Recent high-resolution structures of SecD and SecF suggest that the large periplasmic domains of these proteins bind to translocating substrate proteins and ratchet them into the periplasm by a mechanism dependent on the PMF (Tsukazaki et al. 2011;Ficici, Jeong and Andricioaei 2017;Furukawa et al. 2017).

YajC
In E. coli, YajC is encoded in the same polycistronic message as SecD and -F and appears to interact with the SecDF complex (Pogliano and Beckwith 1994;Duong and Wickner 1997). However, the role of YajC in SecA-mediated protein translocation is unknown, and mutations in the yajC gene do not cause a detectable translocation defect on their own (Pogliano and Beckwith 1994).

YidC
YidC is a homologue of the proteins Oxa1p in mitochondria and Alb3 in chloroplasts, which are required for protein translocation in these organelles, and depletion of YidC causes a pleiotropic translocation defect in E. coli (Samuelson et al. 2000). It has been suggested that YidC promotes the translocation (or insertion) of a distinct subset of IMPs, which do not require SecA or SecYEG for insertion (Froderberg et al. 2003;Celebi et al. 2006;du Plessis, Nouwen and Driessen 2006). It is therefore possible that the defective translocation of SecA substrate proteins in YidC-depletion strains is a secondary consequence of the defective insertion of some other essential protein (Samuelson et al. 2000). However, an uncharacterised mutation (ssaF), which is tightly linked to the yidC locus, can suppress the temperature sensitivity of a secA51 mutant (Oliver 1985).

MATERIALS AND METHODS
The figures given in the subsection 'Why do bacteria have a SecA-mediated protein translocation pathway?', the graphical abstract and Fig. 1 were derived from the relative protein abundances estimated by Li et al. (2014) from ribosome profiling. Ribosome profiling measures the rate of protein synthesis rather than steady-state abundance and theoretically provides a better basis for estimating the rate of transport of newly synthesised proteins. It is possible that ribosome profiling overestimates the relative abundance of proteins produced at low levels (which could explain the high rate of synthesis estimated for IMPs). However, examination of several protein complexes containing both IMPs and soluble proteins (e.g. NADH reductase, F-ATPase, cytochrome bo oxidase, etc.) suggests that there is not a systematic bias in IMPs produced at similar levels. We estimated the absolute abundance of each protein by multiplying the abundance calculated by Li et al. (2014) by the ratio of the number of ribosomes in an exponentially growing E. coli cell (50 000) (Bremer and Dennis 2008) to the abundance of ribosomal protein uL23 estimated by Li et al. (2014) (103 687). OMPs and PPs were identified by searching the UniProtKB database (The UniProt 2017) for E. coli K-12 proteins containing the keyword 'signal peptide' in the PTM/Processing category, and IMPs were identified by searching UniProtKB for E. coli K-12 proteins with the keyword 'transmembrane'. The length of each protein was determined from its entry in the UniProtKB database (The UniProt 2017).
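The scaling step described above can be written out explicitly. The sketch below uses only the two constants quoted in the text; the function and variable names are illustrative and are not taken from any analysis script.

```python
# Constants quoted in the text above.
RIBOSOMES_PER_CELL = 50_000         # Bremer and Dennis 2008
UL23_RELATIVE_ABUNDANCE = 103_687   # uL23 estimate from Li et al. (2014)

# Scaling factor that maps the uL23 estimate onto the ribosome count per cell.
SCALE = RIBOSOMES_PER_CELL / UL23_RELATIVE_ABUNDANCE


def absolute_abundance(relative_abundance: float) -> float:
    """Rescale a relative abundance from Li et al. (2014) so that
    ribosomal protein uL23 corresponds to 50 000 copies per cell."""
    return relative_abundance * SCALE


print(round(SCALE, 3))
print(absolute_abundance(UL23_RELATIVE_ABUNDANCE))
```

The factor is a little under 0.5, so each rescaled value is roughly half of the corresponding ribosome profiling estimate.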

CONCLUDING REMARKS
We propose a working model for targeting and translocation by the SecA-mediated translocation pathway based on the literature reviewed here: PPs and OMPs are recognised by the presence of an N-terminal signal sequence and the absence of significant tertiary structure. Substrate proteins are recognised cotranslationally by SecA, but translocation itself is uncoupled from translation and is largely post-translational, possibly resulting from the activity of TF. Alternatively, molecular chaperones can recognise substrate proteins and target them for translocation by interacting with SecA. After delivery to SecYEG, SecA promotes translocation of the proteins across the membrane. Although the mechanism of translocation is not known, it seems clear that translocation must be at least partially probabilistic in order to account for both the large step size and the speed of translocation.
While this model is tidy, it is also incomplete. For example, it does not provide an explanation for the requirement of SecA for SRP-mediated translocation (Schierle et al. 2003) or the binding preference of SecA for nascent IMPs in vivo (Huber et al. 2017). In neither case does SecA appear to be responsible for driving translocation since binding of SecA to SecYEG and binding of ribosomes to SecYEG are mutually exclusive (Wu et al. 2012). However, SecA is not thought to be involved in the recognition of IMPs since the SRP carries out this step (Saraogi and Shan 2014).
Another question is: How many targeting pathways are there? Recent research suggests SecA can recognise substrate proteins by multiple different mechanisms (Chatzi et al. 2017;Huber et al. 2017;Wang, Yang and Shan 2017). In addition, molecular chaperones could expand the repertoire of the SecA-mediated pathway by specifically recognising a subset of proteins and delivering them to SecA. For example, SecB targets unfolded substrate proteins to SecA under certain conditions (Derman et al. 1993;Fekkes et al. 1998), and overexpression of Hsp70 and GroEL can compensate for translocation defects in some mutant strains of E. coli (Wild et al. 1996).
Finally, a number of critical questions about the mechanism of translocation still need to be answered: Is SecA-driven translocation processive or probabilistic or a mixture thereof (Bauer et al. 2014;Allen et al. 2016)? If probabilistic, does translocation require mechanical pushing force from SecA (Bauer et al. 2014) or not (Allen et al. 2016)? Is SecA monomeric during translocation in vivo (Or, Navon and Rapoport 2002;Or et al. 2005), or does translocation require oligomerisation of SecA (Gouridis et al. 2013)? Similarly, what is the oligomeric state of SecYEG during translocation? Finally, what is the role of the PMF in translocation (Enequist et al. 1981;Schiebel et al. 1991)? Despite over 40 years of research, the Sec machinery continues to provide a rich source of inquiry.

ACKNOWLEDGEMENT
We thank J. Yule and H. Brownsill for critical readings of the manuscript.

Conflict of interest.
None declared.