Phase variation (PV) of surface molecules and other phenotypes is a major adaptive strategy of pathogenic and commensal bacteria. Phase variants are produced at high frequencies and in a reversible manner by hypermutation or hypervariable methylation in specific regions of the genome. The major mechanisms of PV involve site-specific recombination, homologous recombination, simple sequence DNA repeat tracts or epigenetic modification by the dam methylase. PV rates of some of these mechanisms are subject to the influence of genome maintenance pathways such as DNA replication, recombination and repair while others are independent of these pathways. For each of these mechanisms, the rate of generation of phase variants is controlled by intrinsic and dispensable factors. These factors can impart environmental regulation on switching rates while many factors are subject to heterogeneity both within isolates of a species and between species. A major gap in our understanding is whether these environmental and epidemiological variations in PV rate have a major impact on fitness. Experimental approaches to studying the biological relevance of differing PV rates are being developed, and a recent intriguing finding is of a co-ordination of switching rates in the phase variable P-pili of uropathogenic bacteria.
The maintenance of fitness is a severe challenge to bacterial pathogens and commensals. This challenge comes from their exposure to rapidly changing environments, evolving host responses, competition/predation by other microorganisms (e.g. bacteriophages) and genetic, immunological and behavioural variations in their host populations. Many of these selective pressures are severe and are repetitively focused on specific molecules, particularly surface structures, or phenotypic states of bacterial cells. This type of selection is thought to have exerted a strong secondary selection for localized increases in mutation or recombination rate in bacterial genes referred to as ‘contingency loci’ (Moxon et al., 1994). These loci enable bacterial populations to adapt to or survive selective pressures through the generation and outgrowth of genetic variants that are ‘fitter’, i.e. better adapted to a specific environment, than the majority of the population.
A variety of mutational or recombinatorial mechanisms are responsible for the ‘localized hypermutation’ associated with contingency loci and for the rapid generation of large numbers of genetic variants (Moxon et al., 1994). These mechanisms can produce alterations in the sequence and hence structures of specific determinants (Fig. 1a). This process is often referred to as antigenic variation [see van der Woude & Baumler (2004) for a discussion of the definition of antigenic variation]. Alternatively, there is a reversible change in the expression of a particular locus (Fig. 1c). These latter changes are responsible for the phenomenon of phase variation (PV) (Hallet, 2001; Moxon et al., 2007). This phenomenon of high frequency, reversible mutational or recombinatorial events driving phase variable changes in gene expression has been described for many bacterial genes in a wide range of bacterial species. These genes exhibit wide variations in the rates of generation of phase variants. Progress in characterization of the mechanistic determinants of the rate of localized hypermutation in phase variable loci is discussed in the context of recent advances in our understanding of the epidemiological and environmental variations in mutability (Fig. 2). A particular emphasis is placed on the involvement of genome maintenance pathways in determining PV rates. The final sections consider the important question of the impact of variations in PV rates on the fitness of bacterial pathogens and commensals.
Mechanisms, distribution and rates of PV
The two mechanisms of localized hypermutation most commonly associated with PV are site-specific recombination and slippage within simple sequence repeat tracts (van der Woude & Baumler, 2004; Moxon et al., 2007). In the former process, changes in the orientation of DNA sequences alter the positioning of promoters relative to genes (Dybvig et al., 1993; Henderson et al., 1999). In the latter, changes in the lengths of DNA sequences alter the reading frame or the relative positions of promoter elements (Bayliss & Moxon, 2005). Several of the phase variable events involving site-specific recombination simultaneously switch expression of one allele of a gene to ‘on’ while another is switched ‘off’ and so produce a type of antigenic variation (see Fig. 1b). In most cases, however, PV produces a change in gene expression. In the simplest scenario, switching occurs between two different expression states, ‘on’ and ‘off’, such as those produced by site-specific inversion of a single DNA fragment or by repeat-mediated translational switching when only one initiation codon is present in the gene. Note that ‘on’ variants can be associated with both a high and an intermediate/low level of gene expression depending on the prevailing environmental conditions, but this transcriptional regulation should be regarded as a component of the ‘on’ PV state rather than a separate one. Multiple PV states are observed when PV is mediated by simple sequence repeats present within promoter regions. For example, the nadA gene of Neisseria meningitidis can produce three levels of expression (high, intermediate and low) with each expression state being associated with different numbers of repeats (Martin et al., 2005). Recently, three levels of expression were detected due to changes in the number of 5′-CAAT repeats present in the reading frame of the Haemophilus influenzae lic2A gene (Dixon et al., 2007).
This gene contains initiation codons in two reading frames and each of these was associated with a different level of expression, resulting in PV between high, low and no detectable expression. The relevance of these findings to the other H. influenzae genes, several of which contain initiation codons in multiple frames, or to phase variable genes in other organisms requires further exploration. Localized hypermutation due to recombination between tandem duplications of regions of the genome is observed in some phase variable loci and changes gene expression by altering the copy number of specific genes (Kroll & Moxon, 1988). Finally, intragenomic recombination between dispersed alleles of a gene occasionally results in loss of gene expression due to insertion of alleles containing premature stop codons (Criss et al., 2005). Stochastic changes in the methylation of promoter elements can also alter gene expression and cause PV (Henderson et al., 1999). Although this mechanism of PV does not involve localized hypermutation, there are many parallels with the hypermutable systems and so this mechanism of PV will also be discussed herein.
Phase variable genes are widely distributed in the genomes of pathogenic and commensal bacteria. These genomes may contain a single phase variable gene or multiple loci subject to these stochastic changes in gene expression. Not surprisingly, bacterial species exhibit wide variations in the numbers of phase variable genes present in their genomes. The numbers of repeat-mediated loci are the easiest to assess, although even here the lack of an exact definition of the number of repeats required for PV hinders genomic analyses. The H. influenzae strain Rd, N. meningitidis strain MC58 and Campylobacter jejuni strain 11168 are suggested to contain, respectively, 12, 40 and 27 phase variable loci (Hood et al., 1996b; Parkhill et al., 2000; Saunders et al., 2000). Assessments of the numbers of other types of phase variable loci have been more difficult to derive and few attempts have been made to collate this information for multiple types of phase variable genes. Recent analysis of the genome of UPEC strain CFT073, for example, indicates the presence of 12 potential fimbrial encoding loci but it is uncertain how many of these are phase variable or whether there are additional phase variable genes (Totsika et al., 2008). Once the number of phase variable loci is known, it is possible to calculate the potential number of genotypes. Hence, if there are 12 loci and each can exist in two PV states (i.e. ‘on’ and ‘off’) then there are 2^12 or 4096 genotypes. If, however, each locus can exist in three PV states then there would be 3^12 or 531 441 potential genotypes. These numbers provide an indication of the vast repertoire of variants accessible through PV of multiple loci and hence the relative importance of this phenomenon to bacterial pathogenesis.
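The genotype counts quoted above can be reproduced with a short sketch. It assumes, as the text does, that every locus switches independently and has the same number of PV states; the function name potential_genotypes is purely illustrative.

```python
def potential_genotypes(n_loci: int, states_per_locus: int = 2) -> int:
    """Count combinations of PV states across independently switching loci.

    Assumes every locus has the same number of expression states.
    """
    if n_loci < 0 or states_per_locus < 1:
        raise ValueError("n_loci must be >= 0 and states_per_locus >= 1")
    return states_per_locus ** n_loci


if __name__ == "__main__":
    # 12 loci with two states ('on'/'off') and with three states each.
    print(potential_genotypes(12, 2))  # 4096
    print(potential_genotypes(12, 3))  # 531441
```

Loci with different numbers of states would require a product over the per-locus state counts rather than a single power.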
This review emphasizes the importance of how often phase variable events occur, i.e. the PV rate. In general, the review will discuss PV rate in relative terms rather than providing definitive numbers. This is because PV rates are reported in a variety of ways in the scientific literature, which in part reflects differences in the mechanisms of PV. The most common approach to reporting data of this type is to provide a frequency of phase variants in a given population and it is important to note that PV frequencies usually exceed 1 × 10^−5 variants per total number of cells (i.e. at least three orders of magnitude higher than a basal mutation frequency). However, a frequency does not reflect how often mutations occur at the molecular level, as phase variants accumulate in bacterial populations both through new mutational or recombinational events and through replication of existing phase variants. PV rates have been estimated for switching by mutations in repetitive DNA using a variety of methods, of which the most accurate is that described by Saunders et al. (2003) as their method encompasses both forward and reverse mutational events. The output of these estimations is a number of events per cell division. This approach may not, however, be relevant to site-specific recombination, as these events are not linked to DNA replication but rather occur in a time-dependent manner such that the number of events occurring per cellular division may vary as a function of the rate of replication of the bacterial population. As with the derivation of bacterial genomic mutation rates (Hall & Henderson-Begg, 2006), some standardization and collation of PV rate data are required to facilitate progress in this field.
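The distinction between a PV frequency and a PV rate can be illustrated with a minimal deterministic sketch. It is not the method of Saunders et al. (2003); it simply assumes a constant forward and reverse switching rate per cell per division, equal growth of both phases and no selection. The function name and the rate used in the example are hypothetical.

```python
def variant_frequency(mu_fwd, mu_rev, generations, f0=0.0):
    """Fraction of variant cells after a given number of generations.

    mu_fwd: assumed switching rate (per cell per division) into the variant phase
    mu_rev: assumed switching rate back out of the variant phase
    f0:     starting fraction of variants in the population
    """
    f = f0
    for _ in range(generations):
        # Variants that do not revert, plus non-variants that newly switch.
        f = f * (1.0 - mu_rev) + (1.0 - f) * mu_fwd
    return f


if __name__ == "__main__":
    rate = 1e-4  # hypothetical PV rate, chosen only for illustration
    for n in (1, 10, 20, 30):
        print(n, variant_frequency(rate, rate, n))
```

Under these assumptions the same underlying rate gives a frequency that keeps rising with the number of generations elapsed, which is why a frequency alone cannot be read as a rate.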
Intrinsic determinants of PV rate
Under steady-state conditions, phase variable genes will generate phase variants at a constant rate. The major determinants of PV rate in static environments are therefore intrinsic either to the mechanism of localized hypermutation or to the genome in which the phase variable gene is located. These factors fall into two classes: cis-acting factors and trans-acting factors (Fig. 2). Most cis-acting factors are integral components of the mechanism of hypermutation. Trans-acting factors can either be integral to the switching mechanism (e.g. a site-specific recombinase) or part of the general replication machinery of the cell. A particular interest of this thematic review is whether replication, recombination and repair (RRR) genes have a major influence on switching rates. Experimental evidence pertaining to the factors controlling the mutability of phase variable loci is discussed in this section while the environmental regulation of PV rates and the epidemiological variations responsible for strain-to-strain differences in switching rates are discussed in the next section. In both sections, the discussion is broken up according to the different mechanisms of localized hypermutation and particular emphasis is placed on well-described exemplars of each mechanism.
Two classes of factors are required for site-specific recombination to mediate PV. These are a site-specific recombinase and a specific target sequence for this recombinase. At least two copies of the target sequence must be present and these copies must flank a promoter or part of the reading frame of a gene such that a change in the orientation of the DNA fragment located between the target sequences alters gene expression or the structure of the gene product. These systems can exhibit varying degrees of complexity with two or more recombinases acting on the same target sequence in a single locus (McClain et al., 1991), one recombinase acting on multiple occurrences of the same target sequence (Coyne et al., 2003) and overlapping recombination events (Chopra-Dewasthaly et al., 2008). In addition, other factors can influence the activities of these systems. These include the sequences flanking the target sequence, the presence/absence of binding sites for other DNA-binding proteins, transcriptional activity across the locus and variations in expression of the recombinases. General RRR genes are not thought to have a major influence on PV by this mechanism although this area has not been extensively studied. The fim system has been studied in detail and is considered below as an example likely to have parallels in other similar systems (Fig. 3).
PV of type 1 fimbriae expression in Escherichia coli strain K12 is controlled by two recombinases, FimB and FimE, which mediate inversion of a 314-bp DNA fragment containing the promoter for expression of fimbrial structural proteins (Klemm et al., 1986; McClain et al., 1991). The recombinases act on 9-bp inverted repeats present at either end of the DNA fragment (Gally et al., 1996; McCusker et al., 2008). One recombinase, FimB, mediates flipping of the DNA fragment in both directions at similar efficiencies while the other, FimE, has a very high bias towards switching in the ‘on-to-off’ direction (McClain et al., 1991). The result of the differential activities of these two recombinases is that the switch has a higher ‘on-to-off’ switching rate than ‘off-to-on’ when both proteins are fully active. Expression of FimE is, however, regulated by Rho, which destabilizes the fimE mRNA by prematurely terminating transcription of this gene (Hinde et al., 2005). The Rho-dependent binding site is present within the invertible element such that FimE expression is reduced when the switch is in the ‘off’, but not ‘on’, orientation. This autoregulation of FimE resets the switching environment facilitating ‘off-to-on’ switching by FimB. Alterations in the inverted repeats significantly perturb recombination (Gally et al., 1996; McCusker et al., 2008). In contrast alterations in the DNA sequences flanking these repeats, the recombinase-binding elements, have varying effects on the relative binding and recombinase activities of FimB and FimE with the former enzyme being the most tolerant of alterations (Holden et al., 2007a; McCusker et al., 2008). Similarly, alterations in the binding sites for a range of transcriptional regulators can also have profound effects on the activities of the two recombinases. These changes can adjust the bias in the switch for one or other direction. 
The key catalytic residues of FimB and FimE have been inferred by homology and shown to be essential for the function of FimE (Smith & Dorman, 1999). One mutation, a substitution of arginine-to-lysine at position 59, increased the ‘off-to-on’ switching activity of FimE. Thus the balance between the two directions of switching in the fim locus is influenced by the action of two trans-acting factors acting in concert with a series of cis-acting factors and exhibits a potential for alterations in switching rates due to adaptive mutations.
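The effect of the two recombinases on the balance of the switch can be sketched as a two-state model. This is an illustrative simplification and not a published model of fim: it assumes that FimB inverts the element at the same rate in both directions, that FimE acts only in the ‘on-to-off’ direction, that the two rates add, and that regulation (e.g. by Rho) and selection are ignored. All names and rate values are hypothetical.

```python
def steady_state_on_fraction(k_off_to_on, k_on_to_off):
    """Equilibrium fraction of 'on' cells for a reversible two-state switch."""
    total = k_off_to_on + k_on_to_off
    if total <= 0:
        raise ValueError("at least one switching rate must be positive")
    return k_off_to_on / total


def fim_on_fraction(fimb_rate, fime_rate):
    """'On' fraction when FimB flips both ways and FimE only 'on-to-off'.

    Both rates are assumed constants in the same arbitrary units.
    """
    return steady_state_on_fraction(fimb_rate, fimb_rate + fime_rate)


if __name__ == "__main__":
    print(fim_on_fraction(1e-3, 0.0))   # FimB alone: unbiased switch
    print(fim_on_fraction(1e-3, 8e-3))  # FimE active: population biased to 'off'
```

In this sketch any reduction in the FimE contribution, such as the reduced FimE expression in the ‘off’ orientation described above, moves the equilibrium back towards ‘on’.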
Phase variable changes in gene expression by homologous recombination involve the interdependent interactions of duplicated sequences and trans-acting RRR enzymes. In the simplest systems, changes in gene expression result from alterations in gene copy number. Indeed hypermutability in these systems may differ from normal gene amplification only because the duplicated sequences can persist in genomes for many generations in the absence of selection. The best known example of this mechanism of PV involves changes in expression of the serotype b capsule of H. influenzae. All the genes required for production of the serotype b capsule of H. influenzae are encoded in an 18-kb locus. In one of the two lineages of H. influenzae serotype b strains, this entire locus is duplicated with each copy being flanked by IS1016, an insertion sequence (Kroll et al., 1991, 1993). A deletion in a gene required for capsular export in one copy of the two duplicated loci results in the selective maintenance of this duplication as deletion of one copy of the locus generates bacterial cells unable to export the capsule and hence defective for growth. These defective colonies are generated at frequencies of 0.25 during stationary phase growth indicating that this duplication is highly unstable (Hoiseth et al., 1985). Colonies or biological isolates with three or more copies of the locus have also been detected indicating that amplifications in the copy number of this locus can occur (Corn et al., 1993). The inactivation of recA reduces production of defective colonies with one copy of the locus, indicating that hypermutability results from homologous recombination (Hoiseth et al., 1986). The involvement of other RRR genes in these recombination events and the effects of variations in RecA activity/expression are not known.
PV by differential methylation involves reversible changes in methylation by a DNA sequence-specific methylase of sites present in promoter elements. The changes in methylation are associated with alterations in binding of transcription factors and hence cause switches in gene expression. Complex interactions between these transcription factors, other DNA-binding proteins and processes such as DNA replication are responsible for the stochastic switches in methylation. PV of a number of genes is known to be mediated by methylation of 5′-GATC sequences by the Dam methylase. PV of the pap locus is described below as an indication of the complexity of the Dam-mediated phase variable systems (van der Woude & Baumler, 2004).
The pap locus of uropathogenic E. coli encodes the pyelonephritis-associated (P) pilus and contains three genes (papA, papB and papI) in two divergently transcribed operons (Fig. 4). The promoter region of papBA, encoding the P-pili structural subunits (PapA) and a regulatory protein (PapB), contains a proximal and a distal set of three binding sites for the leucine-responsive regulatory protein (Lrp). PV of P-pili occurs as a result of differential methylation of 5′-GATC sites present in each of the sets of Lrp binding sites (Braaten et al., 1994). In the ‘off’ state, the proximal 5′-GATC site is unmethylated and bound by the transcriptional regulator Lrp. The transition to the ‘on’ state requires PapI, which forms complexes with Lrp and promotes binding to hemimethylated binding sites (Nou et al., 1995; van der Woude et al., 1995). Following DNA replication, PapI-Lrp complexes bind to the transiently hemimethylated distal Lrp binding sites preventing further methylation. There is a simultaneous loss of Lrp from binding at the proximal Lrp sites resulting in full methylation of these sites and activation of transcription. These transitions are facilitated by the regulatory activities of PapB, CAP (catabolite activator protein), H-NS and RimJ (White-Ziegler et al., 1998, 2002; Weyand et al., 2001). Mutations in the Lrp-binding sites can alter the rates of switching or ‘lock’ expression in either the ‘on’ or ‘off’ phases (Kaltenbach et al., 1998). The locus can also be ‘locked’ into the ‘off’ phase by inactivation of Dam, Lrp or PapI or by overexpression of Dam. Other mutations in Lrp alter the rate of switching. Another intriguing aspect of pap switching is the autoregulatory activities of PapB (Forsman et al., 1989). Low levels of this protein activate papI expression by binding to a high-affinity PapB-binding site in the papI promoter. In this way, PapB stimulates ‘off-to-on’ switching.
High levels of PapB, in contrast, bind to low-affinity binding sites in the papB gene and repress expression thereby favouring ‘on-to-off’ switching. The levels of PapB are also controlled by the sensitivity of papB mRNA to degradation by RNAseIII. The autoregulatory activities of PapB serve to keep the pap locus in a dynamic state, which may be an essential feature of the mechanism of PV of this locus. The sensitivity of PV in this locus to environmental regulation is reviewed in a later section, but this brief description indicates the web of interactions between cis-acting sequences, locus-specific factors and general transcriptional regulators required to control PV by differential methylation.
Simple sequence repeats
Tandem arrangements of repetitive DNA sequences are highly mutagenic due to the potential for slippage of the DNA strands during, for example, DNA replication (Streisinger et al., 1966; Levinson & Gutman, 1987). Many phase variable genes contain repetitive sequences consisting of multiple identical repeat units (Hood et al., 1996a; Saunders et al., 1998, 2000; Parkhill et al., 2000). These repeat tracts may be present within reading frames or promoter elements leading to translational or transcriptional switches in gene expression (Fig. 5). The repeats located within genes usually consist of nontrimeric repeats and mediate switching between ‘on’ and ‘off’ PV states. Some repeats located towards the ends of genes are involved in altering protein function while multiple expression states have been found for PV events involving repeats located in the 5′-ends of reading frames. The promoter-associated repeats may be located within the core promoter wherein mutations alter binding of the RNA polymerase or within upstream regions such that mutations alter the interactions between transcriptional regulators and the RNA polymerase [reviewed in Bayliss & Moxon (2005)]. Two examples are described below which demonstrate the relative importance of cis-acting factors and RRR genes for controlling the stability of repetitive DNA in phase variable genes. As this mechanism of PV is present in a range of bacterial commensals and pathogens each with a different composition of RRR genes, this discussion is followed by some general points on the factors likely to be involved in controlling PV in other bacterial species.
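The reason why intragenic repeats that mediate ‘on’/‘off’ switching are usually nontrimeric can be shown with a small sketch of translational switching. It assumes a single initiation codon and a known repeat number at which the downstream sequence is in frame; the function name and the repeat numbers used are hypothetical and are not taken from any particular gene.

```python
def translational_state(unit_length, n_repeats, in_frame_repeats):
    """Return 'on' if the tract keeps the downstream sequence in frame.

    unit_length:      nucleotides per repeat unit (e.g. 4 for a tetranucleotide)
    n_repeats:        current number of repeat units in the tract
    in_frame_repeats: assumed repeat number at which the gene is in frame
    """
    # Net change in tract length relative to the in-frame reference.
    shift = (n_repeats - in_frame_repeats) * unit_length
    return "on" if shift % 3 == 0 else "off"


if __name__ == "__main__":
    # Hypothetical tetranucleotide tract that is in frame at 22 repeats.
    for n in range(21, 26):
        print(n, translational_state(4, n, 22))
    # A trinucleotide tract never shifts the frame, whatever its length.
    print({translational_state(3, n, 10) for n in range(5, 15)})
```

With a tetranucleotide unit, gain or loss of a single repeat shifts the frame and only every third repeat number restores it, whereas a trimeric unit changes protein length without switching expression. Genes with initiation codons in more than one frame would need more than two states.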
Long tracts of tetranucleotide repeats were found in the reading frames of 12 genes within the H. influenzae strain Rd genome (Hood et al., 1996a). Longer repeat units (penta and hepta) and one short dinucleotide repeat tract have been found in other strains, but in general PV in this species is mediated by tetranucleotide repeats. A reporter construct was utilized for measuring PV rates in mod, which contains a 5′-AGTC tract in strain Rd and encodes a putative DNA methyltransferase of a type III restriction-modification system (De Bolle et al., 2000). Repeat number is an important determinant of PV rate for this locus with an increase from 17 to 38 repeat units being associated with a fourfold increase in switching rate. The role of trans-acting factors in controlling PV rate from this locus has been examined in detail. One critical observation was that mutations in mismatch repair (MMR) genes do not destabilize the repeats (Bayliss et al., 2002). A similar lack of change in PV rate has been observed for other mutations including inactivation of the exonucleases, ExoI and RecJ (Bayliss et al., 2004; Kumar et al., 2008). In contrast, significant increases in PV rate resulted from deletion of the Klenow fragment of DNA polymerase I or from inactivation of RNAseHI (Bayliss et al., 2002, 2005). These findings imply a major role for lagging strand DNA synthesis in controlling the PV rate of genes containing repeat units of four or more nucleotides.
The genome sequences of N. meningitidis have revealed a propensity for mononucleotide repeats consisting of Cs or Gs as a mechanism of PV (Saunders et al., 2000; Snyder et al., 2001; Martin et al., 2003). Repeats of this type may control PV of >25 genes with a further 15 or more genes being subject to PV due to tetra, penta or other longer repeat units. Comparisons of one phase variable gene between strains indicated a significant role for the length of a polyC tract in controlling PV rate (Richardson et al., 2002). Mutations in mutS or mutL produce 100- to 1000-fold increases in switching rate for genes containing mononucleotide repeats, but not genes with tetranucleotide repeats (Richardson & Stojiljkovic, 2001; Martin et al., 2004). In contrast, inactivation of dam does not influence mutability, which is consistent with the absence of a homolog for the gene, mutH, linking Dam methylation to mismatch repair in E. coli. Extensive investigations of roles for other RRR genes in controlling PV rates have not been performed in species utilizing repetitive DNA for PV.
These studies of repeat-mediated PV in H. influenzae and N. meningitidis and the extensive studies of the mutability of repetitive DNA in E. coli (Strauss et al., 1997; Morel et al., 1998; Kim et al., 2006) all highlight a strong link between the activities of RRR genes and the stability of repeat tracts. Significant differences in complements of RRR genes are exhibited by three of the other bacterial genera, namely Helicobacter, Campylobacter and Mycoplasma, in which this mechanism of PV is a major feature. Most notably, these species lack MMR genes while repeat units contain fewer than four nucleotides, suggesting either that PV rates may be particularly high in these species or that there are alternative pathways for reducing mutability in repetitive DNA tracts. Current evidence for C. jejuni supports the former hypothesis (Linton et al., 2000). Continuing analyses of the roles of RRR genes in determining repeat-mediated PV rates are required to underpin confident predictions of PV rates for a range of bacterial species.
Epidemiological variations in and environmental regulation of PV rate
Phase variable genes are mediators of many interactions between bacterial cells and their environments. The fluctuations in the strength and frequency of these selective pressures combined with alternative adaptive strategies have an impact on whether or not PV rate is maintained at a particular level or indeed if a phase variable gene is retained within a particular strain. One consequence is likely to be strain-to-strain (i.e. epidemiological) variations in PV rate of a particular locus or of multiple loci. Another consequence is the environmental regulation of PV rates, although the occurrence of such regulation is less intuitive than epidemiological variation. Transcriptional and translational responses to environmental cues are the most widespread strategy for adaptation to environmental changes. Localized hypermutation, in contrast, generates adaptive variants in a stochastic manner before environmental change. This ‘contingent’ response is thought to provide an advantage in the face of strongly bactericidal challenges (Moxon et al., 1994, 2007). It is apparent, however, that the PV rate of many phase variable genes is under environmental control (van der Woude & Baumler, 2004). The rationale for this control would appear to be that generation of large numbers of variants only when required is more advantageous than constant production. One aspect of this rationale that has been widely explored in the context of genome-wide mutation rates is whether exposure to ‘stress’, such as the absence of nutrients or the presence of low levels of antibiotics, induces the generation of genetic variants and hence facilitates adaptation (Blazquez et al., 2003; Pettersson et al., 2005; Galhardo et al., 2007). The examples below highlight some of our current knowledge with respect to epidemiological variations in PV rate and environmental regulation of switching rates.
These examples indicate that switching rates in some genes are under strong regulation by environmental factors while evidence for environmental regulation of other phase variable genes is weak or nonexistent. These examples also highlight the importance of epidemiological variations in PV rate, an area deserving more attention as it may have a crucial influence on persistence, transmissibility and virulence of bacterial pathogens and commensals.
Epidemiological variations in PV rate by site-specific recombination can arise from changes in the sequences of cis-acting regulators of recombination or from changes in the presence or activity of the recombinases or other regulatory proteins. Similarly, environmental control of mutability can be exerted in these systems through control of the relative levels of the factors required for switching. In simple systems this could mean varying the levels of the recombinases in response to external signals. In more complex systems, each trans-acting factor could be regulated by different signals enabling quantitative and qualitative responses to multiple environmental signals. Observations of either epidemiological or environmental variations in PV rate require detailed knowledge of the switching mechanisms or functional comparisons. As a result, few systems have been studied in much detail.
The E. coli fim switch is responsive to multiple external signals. The recombinases, FimB and FimE, are inversely regulated by temperature through the action of H-NS on their promoters with FimB being induced at high temperatures while FimE is repressed (Olsen et al., 1998). This regulation combined with the higher activity and bias of FimE towards ‘on-to-off’ switching means that at low temperatures, the population is largely in the ‘off’ phase while ‘on’ variants are induced at significant levels at temperatures found during colonization of host niches. The rate of switching is also responsive to other signals associated with host environments including levels of various branched-chain amino acids (particularly leucine), alanine, sialic acid and N-acetylglucosamine (Lahooti et al., 2005; Sohanpal et al., 2007). These signals exert an influence on switching rates either through regulation of the levels of FimB or through binding of regulatory proteins, such as Lrp and IHF, to the fimS invertible element and consequent alterations in the relative recombinase activities of FimE and FimB. Many of these signals are associated with the induction of a host immune response and may enable regulation of the levels of ‘on’ and ‘off’ variants in the bacterial population relative to host immune status (Chu & Blomfield, 2007). The proposed advantage of this regulation is that the bacterial population can utilize the metabolites associated with an immune response while limiting clearance by immune effectors. This fairly simple scenario is further complicated by the cross-regulation of fimbrial expression by other surface-expressed molecules, which are themselves regulated by the environment and/or by PV. Thus type 1 fimbriae expression is repressed by PapB in many clinical isolates of E. coli (Holden et al., 2006).
The complexity of environmental regulation has meant that few studies have tackled epidemiological variations in fim switching rates. Phase variable type 1 fimbriae are found in most E. coli isolates and in other related species. A study of 50 uropathogenic E. coli isolates demonstrated variation in fim switching rates and localized some of this variation to specific changes in cis-acting sequences within and adjacent to fimS (Leathart & Gally, 1998). Additional recombinases with the capacity to cause ‘flipping’ of fimS have been found and these recombinases are more prevalent in uropathogenic E. coli isolates than in faecal isolates (Bryan et al., 2006; Xie et al., 2006; Hannan et al., 2008). Similarly, the related species Klebsiella pneumoniae has an additional gene, fimK, that regulates the frequency of switching and reduces generation of ‘on’ variants (Struve et al., 2008). There is, therefore, significant potential for variability in the fim switching rates both within and between species, which may have a major impact on the types of diseases caused by these organisms.
Few studies have been performed to detect epidemiological variations in recombination-mediated PV rates. One indication of the existence of such variations has come from studies of naturally-occurring RecB variations in meningococcal populations (Salvatore et al., 2002). The RecBCD pathway is required for antigenic variation of the neisserial pilin and meningococcal strains belonging to the ET-37 hypervirulent lineage carry multiple mis-sense mutations in RecB. These isolates have elevated recombination frequencies in the pilin loci. It is likely that naturally occurring variations of this type in RRR genes will generate epidemiological variations in PV rates by this and related phase variable mechanisms.
The involvement of multiple cis- and trans-acting factors in controlling methylation-mediated PV events provides a significant resource for adjustments of switching rates. A potent effect of environmental conditions can be channeled through changes in the relative levels and activities of the regulators of switching. Indeed, in many cases, environmental regulation is an inherent part of this switching mechanism as transcription factors are involved in controlling the changes associated with ‘on’ and ‘off’ states. There is also ample potential for epidemiological variations in switching rates but this aspect has not received significant attention. Variations in the activity of the dam gene, for example, could have a major influence on controlling PV rates by this mechanism, and dam mutants are known to exhibit dysregulation of many of these phase variable systems. There is, however, no evidence of significant diversity in the presence or absence of the dam gene in isolates of Enterobacteriaceae.
Progress has been made in characterizing both environmental and epidemiological variations in switching rates of the pap genes. Key factors involved in mediating phase variable changes in methylation of the pap promoter are all regulated by external signals (Hernday et al., 2004). There is a complex series of interactions with some proteins influencing both the rate of switching and the expression levels of other regulators of switching. Proteins with a direct influence on switching include Lrp, H-NS and CRP. H-NS protein can bind to both GATC sites, thereby blocking methylation and causing all cells to enter the ‘off’ phase within a single generation in response to low temperatures (White-Ziegler et al., 1998). The complex of CRP, Lrp and PapI protects the distal GATC site from methylation, but formation of this complex is sensitive to levels of cAMP, and hence switching is regulated by catabolites such as glucose (Weyand et al., 2001). The other direct regulators of switching, PapB and PapI, are in turn transcriptionally regulated by CRP, H-NS and RimJ (responsible for modification of ribosomal protein S5). RimJ, for example, represses papBA transcription in response to low temperatures, inhibiting ‘off-to-on’ switching (White-Ziegler et al., 2002). These various regulators enhance transition into the ‘off’ state in response to low temperatures, high osmolarity and high levels of glucose. A hint of epidemiological variation in switching rates was provided by comparison of the P-fimbrial switching rates of two E. coli clinical isolates with strain K12 (Holden et al., 2007b). Higher ‘off-to-on’ transition frequencies were observed for the clinical isolates. More recently, analysis of the regulatory regions of pap operons in 54 E. coli isolates has identified a number of variable regions in both the switch sites and the regulators papI and papB (Totsika et al., 2008). The functional consequences of this variation are discussed in the context of co-ordinated regulation of PV, but an intriguing feature was that sequence variations in the PapB-binding site within the papI promoter region involved alterations in the numbers of a 9-mer repeat. This suggests that the regulation of pap PV may itself be subject to PV due to slippage in a tandem DNA repeat tract. The acquisition of these epidemiological data provides a template of relevance to many other phase variable systems of this and other types.
Simple sequence repeats
The simplicity of repeat-mediated PV would appear to preclude sensitive regulation by environmental signals. Some of the current evidence for environmental regulation is discussed below. In contrast, there are abundant opportunities for epidemiological variation. The most striking evidence for differences in PV rate between isolates of a species is provided by the significant variations in repeat number (a major determinant of PV rate). The lic1 locus of H. influenzae, for example, varies from 5 to 57 5′-CAAT repeats (High et al., 1996). The H. influenzae mod locus varies in the sequence of the repeat units (5′-AGTC or 5′-AGCC), in repeat number and in whether repeats are present at all (De Bolle et al., 2000). Wide variations in repeat number of mono-, tetra- and pentanucleotide repeat tracts have also been noted for N. meningitidis (Jennings et al., 1999; van der Ende et al., 2000; Martin et al., 2003). The most widely studied genes are the iron acquisition genes, hmbR and hpuA. These genes exhibit variations from 7 to 17 nucleotides in their polyG tracts, which, as noted above, equate to 10–1000-fold variations in PV rate (Richardson et al., 2002).
Modal repeat numbers have not been examined in a concerted fashion for many phase variable genes. A consideration of the data on repeat number for sporadic isolates of H. influenzae suggested quite wide variations in the lengths of tetranucleotide repeats for some genes (Moxon et al., 2007). In contrast, the meningococcal hmbR and hpuA genes contained 7–9 repeats in 82 of 104 tracts of a range of epidemic and sporadic isolates (Richardson et al., 2002). Modal repeat numbers may reflect a selective advantage of a particular PV rate. Alternatively, there may be functional constraints on repeat tract length or molecular drivers, which favour particular lengths. An understanding of the latter requires detailed knowledge of the patterns of mutational events. Mutations in the tetranucleotide repeats of H. influenzae, for example, are dominated by changes of a single repeat unit (c. 90% of events) with deletions occurring more frequently than insertions (De Bolle et al., 2000). Thus the mutational events will tend to drive these repeats to ever-shorter lengths in the absence of a selection for mutability. Whether similar forces are at work in other species is unclear as patterns of mutability have rarely been examined in detail. Functional constraints may come from the maintenance of protein function or, for the transcriptionally associated repeats, promoter activity. The repeat tracts of the hif locus of H. influenzae or the fetA gene of Neisseria gonorrhoeae are located between the −10 and −35 elements of the core promoter (De Bolle et al., 2000). The spacing of these elements has to be of a specific length to obtain a high level of expression and, hence, tract length is constrained by this requirement. Surprisingly, significant variations in repeat number have been observed for the transcriptionally located repeats of the neisserial porA and nadA genes (van der Ende et al., 2000; Martin et al., 2005), indicating the difficulties of making generalizations. One emerging theme is significant flexibility in repeat number, meaning that determinations of modal repeat numbers may be informative with respect to understanding the importance of PV rate to the adaptive potential of individual loci.
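The argument that a deletion bias drives tracts towards shorter lengths in the absence of selection can be illustrated with a minimal random-walk sketch. Everything numerical here is an assumption chosen for illustration: the deletion probability of 0.6, the starting length of 20 units, the floor of one repeat unit and the function names are hypothetical, and only single-unit changes are modelled (the text reports that these account for c. 90% of events).

```python
import random

def simulate_tract(start_len, n_events, p_deletion, rng, min_len=1):
    """Follow one repeat tract through n_events single-unit mutations."""
    length = start_len
    for _ in range(n_events):
        if rng.random() < p_deletion:
            # Deletion of one repeat unit; the tract cannot fall below min_len.
            length = max(min_len, length - 1)
        else:
            # Insertion of one repeat unit.
            length += 1
    return length

def mean_final_length(start_len, n_events, n_lineages=2000,
                      p_deletion=0.6, seed=0):
    """Average tract length over many independent lineages (no selection)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    finals = [simulate_tract(start_len, n_events, p_deletion, rng)
              for _ in range(n_lineages)]
    return sum(finals) / len(finals)

if __name__ == "__main__":
    # Assumed deletion bias (0.6) versus an unbiased process (0.5).
    print(mean_final_length(20, 50, p_deletion=0.6))
    print(mean_final_length(20, 50, p_deletion=0.5))
```

With a deletion bias the mean length falls steadily, whereas the unbiased process stays close to the starting length. A selective advantage of mutability would have to counteract this drift to maintain long tracts.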
Epidemiological variations in mutability due to trans-acting factors are a major consideration for repeat-mediated PV. The most significant finding is of three classes of PV rate among neisserial strains, with high PV rates being associated with epidemic serogroup A isolates (Richardson et al., 2002). High PV rates could be reduced to low rates by complementation with mutS or mutL genes derived from strains with low PV rates. Though the PV rates of only two genes were examined, these isolates are likely to have high switching rates for all the genes containing mononucleotide repeats, presenting a significant opportunity to generate genetic variability within their populations. Strains with elevated PV rates also had high basal mutation rates (Richardson et al., 2002). The potential for wide variations in PV rate in other species utilizing mononucleotide repeats as a mechanism of PV should be considered in light of observed differences in basal mutation rates. A potential role for strain-to-strain variations in PV rate due to trans-acting factors is less likely for genes with four or more repeats as mutations in the RRR genes known to control mutability also reduce growth rate and, hence, are not as likely to occur in nature. A formal exploration of PV rate of these genes in multiple isolates is, however, required to test this hypothesis.
Examination of environmental regulation of switching rates of SSR phase variable genes is hampered by the absence of any obvious cis-acting sequences with the potential to render an interaction between mutability and an environmental signal (note: the expression of loci can be responsive to environmental regulation of transcription). As a result, experimentation has focused on determining whether environmental regulation of trans-acting factors alters switching rates. A likely candidate is the SOS response, which is induced in response to DNA damage and hence is indicative of a hostile external environment. The induction of the SOS response did not elevate PV rates of the tetranucleotide repeat tracts of H. influenzae (Sweetman et al., 2005). The SOS regulon of this species, however, lacks the error-prone DNA polymerases, which may account for the absence of a stimulation of switching rates. The overexpression of one of these polymerases, PolIV, in N. meningitidis produced a twofold increase of switching rates due to a mononucleotide repeat tract (Martin et al., 2004). It is unclear, however, whether this finding has relevance to environmental regulation of switching rates as a LexA homolog, and hence an E. coli-like SOS response, is absent from this species. In contrast, the meningococcal homolog of ExoVII is induced in response to host cell attachment and overexpression of this protein elevates switching rates of mononucleotide repeat-mediated phase variable genes by twofold (Morelle et al., 2005). This implies that a general elevation of PV rates will occur upon host cell attachment by meningococcal cells. Elevation of PV rates in this location may facilitate generation of phase variants with altered expression of adhesins or invasins favouring invasion of the host cell or detachment and spread to other niches. There is also evidence of an increase in meningococcal PV rates during transformation due to inhibition of MMR, one of the main regulators of PV rates in this species. This increase is thought to occur as a result of inhibition of MMR activity by ssDNA molecules imported into the cytoplasm during transformation (Alexander et al., 2004). There is, then, some limited evidence for environmental regulation of repeat-mediated PV rates, but no consensus on whether this is a general phenomenon or if the increases will have a significant impact on adaptation.
Adaptive advantages of differing PV rates
An unwritten but burning question in the field of PV is: ‘What is the importance of the PV rate?’ Switching rates of phase variable genes cover a wide spectrum due to heterogeneity in intrinsic factors, epidemiological variations in regulators and environmental regulation of switching rates. All of this implies that PV rate confers an adaptive advantage. There is, however, a formal possibility that PV rate only needs to exceed a certain threshold to provide an adaptive advantage. This section will consider our experimental and theoretical knowledge of differences in switching rates in the context of both single and multiple phase variable genes.
Theoretical analyses and computer modelling of the influence of PV rate on adaptation
PV involves a mutation generator and a selectable marker. The close linkage between these elements is critical as it significantly increases the ability of the generator to be inherited, i.e. to ‘hitch-hike’, with the beneficial allele that it generates. Another key aspect is the fact that phase variable systems do not generate deleterious mutations in other genes. Many theoretical studies have focused on mutator phenotypes and are not applicable to phase variable loci (Taddei et al., 1997). Leigh (1970) and Ishii et al. (1989), however, derived a simple two-locus model system in which a mutator locus generates and reverses a mutation in a selectable gene enabling survival in two different environments. They found that the evolutionarily favoured mutation rate is determined by the length of time between alterations in the environment and the strength of selection. This relationship is complicated by genetic drift in finite populations, which prevents evolution of such a phase variable system if the amount of time between switches in the environment is very long and selection is weak (Palmer & Lipsitch, 2006).
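The dependence of the favoured switching rate on the interval between environmental changes can be explored with a small deterministic sketch. This is not a reimplementation of the published models: it assumes an infinite population, two phases ‘A’ and ‘B’, an environment that alternates every `period` generations, a symmetric switching rate `mu`, and a fitness cost `s` for the mismatched phase. All names and parameter values are hypothetical.

```python
import math

def mean_log_growth(mu, period, s=0.9, n_periods=20):
    """Long-run mean log fitness of a population switching at rate mu."""
    x = 0.5  # fraction of cells currently in phase 'A'
    total = 0.0
    generations = period * n_periods
    for g in range(generations):
        # The environment alternates between favouring 'A' and favouring 'B'.
        env_a = (g // period) % 2 == 0
        w_a, w_b = (1.0, 1.0 - s) if env_a else (1.0 - s, 1.0)
        w_bar = x * w_a + (1.0 - x) * w_b      # mean fitness this generation
        total += math.log(w_bar)
        x = x * w_a / w_bar                    # selection
        x = x * (1.0 - mu) + (1.0 - x) * mu    # reversible phase switching
    return total / generations

def best_rate(period, rates=(1e-5, 1e-4, 1e-3, 1e-2, 1e-1)):
    """Switching rate (from a coarse grid) with the highest long-run growth."""
    return max(rates, key=lambda mu: mean_log_growth(mu, period))

if __name__ == "__main__":
    for period in (10, 100, 1000):
        print(period, best_rate(period))
```

In this sketch, rapid environmental alternation favours a higher switching rate than slow alternation, because the cost of lagging behind each change outweighs the cost of continually producing mismatched variants. Finite-population effects such as drift, which the text notes can prevent evolution of the system, are deliberately left out.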
The complex fim switch of E. coli has also been subject to theoretical analysis. Models of this system have incorporated the observation that the combination of ‘on’ and ‘off’ variants in the population determines the level of fimbriation, which has quantitative effects on stimulation of the innate immune system. The ‘game theory’ model of Wolf (2005) indicated that a phase variable system of this type would evolve as a factor of the time variance of environmental switches and frequency-dependent selection of variants. Chu & Blomfield (2007) incorporated more experimental observations into their model. Their finding was that this system regulates fimbriation by preventing the population from reaching a steady state, where half the cells are fimbriated. This is achieved through quantitative adjustments of the balance between the two recombinases and enables rapid resetting of the level of fimbriation of the population in response to environmental signals or selection.
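A generic two-state switch makes the role of the balance between the two switching directions concrete. The sketch below is not the Chu & Blomfield model; it assumes constant per-generation rates `k_on` (‘off-to-on’) and `k_off` (‘on-to-off’), for which the fimbriate fraction relaxes exponentially towards k_on / (k_on + k_off). Function names and rate values are hypothetical.

```python
import math

def steady_state_on(k_on, k_off):
    """Equilibrium fraction of 'on' cells for constant switching rates."""
    return k_on / (k_on + k_off)

def fraction_on(t, k_on, k_off, f0=0.0):
    """Fraction of 'on' cells after t generations, starting from f0."""
    f_star = steady_state_on(k_on, k_off)
    # Relaxation towards equilibrium occurs at the summed rate k_on + k_off.
    return f_star + (f0 - f_star) * math.exp(-(k_on + k_off) * t)

if __name__ == "__main__":
    # Equal rates: the population heads towards half the cells fimbriated.
    print(steady_state_on(1e-3, 1e-3))
    # A strong 'on-to-off' bias holds the population well below one half
    # and shortens the time needed to reset the level of fimbriation.
    print(steady_state_on(1e-3, 1e-1))
    print(fraction_on(50, 1e-3, 1e-1, f0=1.0))
```

Under these assumptions, equal rates in both directions lead to the half-fimbriated steady state, while a shift in the relative rates moves the equilibrium away from one half and speeds the approach to it.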
A key implication of these models is of a significant role for the frequency of environmental alterations and the strength of fitness effects of phase variants in determining mutation rates. Deriving experimental values for these factors will be a significant challenge. The incorporation of actual experimental findings on factors controlling mutability will be easier. Both of these aspects are important for acquiring testable hypotheses from this approach. Models will inform experimental investigations into the role of PV rate in mediating adaptation to environmental fluctuations.
Experimental evidence of an adaptive advantage of differing PV rates
Construction of non-phase variable derivatives has been the main approach utilized for investigating the importance of switching rates on adaptation due to phase variable loci. In one example of this approach, uropathogenic E. coli carrying a phase-locked ‘off’ fim locus were constructed and found to be attenuated relative to wild-type strains in disease models of cystitis or urinary tract infections (Gunther et al., 2002; Snyder et al., 2006). In contrast, phase-locked ‘on’ variants were either not attenuated or slightly more virulent in these models. Most recently, the expression of the E. coli recombinase, FimX, in K. pneumoniae was shown to increase both the frequency of ‘on’ variants of the fim locus and the associated ability to produce biofilms (Rosen et al., 2008). As biofilm production was correlated with urinary tract persistence, these experiments indicate that a heightened switching rate may facilitate colonization of the urinary tract. Thus there is the potential to move from the phase-locked experiments showing the importance of switching per se to more intricate experiments on the adaptive advantages of switching rates.
Examination of PV-mediated escape of killing of N. meningitidis by a bactericidal monoclonal antibody has generated evidence of the importance of the rate of switching for adaptation (Bayliss et al., 2008). Escape was examined in an in vitro assay in which bacterial populations are subject to multiple cycles of growth and killing in the presence of the antibody and human serum as a source of complement. Escape of mAb B5 was mediated by phase variants with alterations in the repeat tract of lgtG and hence absence of the epitope recognized by the antibody. A mutation was constructed in the mutS gene resulting in a c. 1000-fold increase in PV of the mAb B5 phenotype. This mutator strain out-competed the wild-type strain during escape of the mAb B5 antibody in both the in vitro assay and during escape of passive protection in an in vivo model of infection, providing evidence that heightened PV rates can confer a competitive advantage. Escape in this assay was due to phase variants present in the inoculum and indeed the mismatch repair mutation facilitated escape of the antibody even when small inocula (c. 5000 CFU) were used. These findings suggest that elevated PV rates may evolve as a result of a change in either the severity of selection (e.g. the adaptive immune response) or the size of a bottleneck. The development of in vivo models of this type will generate further evidence of the importance of PV rate and will permit assessment of whether the small differences generated by epidemiological or environmental factors can provide an adaptive advantage in biologically relevant situations.
Independence or co-ordination of PV rates in multiple loci
The mutational events responsible for PV are thought to occur in a stochastic fashion independent of events in other phase variable loci. This dependency on mutation rate will influence the fluctuations in appearance and survival of phase variants. Thus, for example, a cell with three phase variable loci in the ‘off’ phase will produce three genotypes with one of these loci switched ‘on’. The numbers of variants of each of these genotypes will then be determined by the PV rate of each locus subject to some population-to-population variability engendered by the stochastic nature of the PV events. If these three genotypes have identical or even similar functions, then selection will favour the propagation of the most common genotype, and there is the potential for an ordered appearance of phase variants (Fig. 6a). If the genotypes have different functions, then selection will favour propagation of the best adapted phase variant in the population, which will not always be the most common. Nonselective bottlenecks could also have a severe impact on which variants survive in certain situations. Nevertheless, there is a potential for PV rates to be a major determinant of the turnover of phase variable populations. Examination of the waves of high levels of bacteremia in mice infected with Borrelia hermsii has found some ordering of the appearance of antigenic variants due to the differing switching rates of the loci encoding these proteins (Barbour et al., 2006). A similar ordering was not apparent for the multiple phase variable Opa proteins of N. gonorrhoeae during experimental infections of human volunteers, although in this case the relative switching rates of the opa genes were unknown (Jerse et al., 1994). The general relevance of these findings will require more detailed examinations of other model systems and epidemiological analyses of natural infections.
Another consequence of the dependency on PV rate is that variants with two loci switched ‘on’ will be generated at low frequencies and can only be propagated if they impart a significantly high selective advantage. Modelling of phase variable populations indicates that phase variants with switches in two genes will always appear in small populations (1 × 10⁶ cells) at observed PV rates for H. influenzae genes (De Bolle et al., 2000). The independence of PV events should not, therefore, prevent the rapid generation of multiple phase events in small populations on a repeated basis in most phase variable populations. This may be particularly important during, for example, immune evasion when it may be necessary to switch ‘off’ one surface determinant and switch ‘on’ a related determinant.
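The expected number of double variants under independent switching can be estimated with a back-of-envelope sketch. It assumes growth from a single cell without selection or reversion, so that the fraction of cells switched at a locus after g generations is 1 − (1 − rate)^g, and it multiplies the two fractions on the assumption of independence. The rates used are illustrative placeholders, not measured H. influenzae values, and the function names are hypothetical.

```python
def variant_frequency(pv_rate, generations):
    """Fraction of cells switched at one locus, ignoring reversion."""
    return 1.0 - (1.0 - pv_rate) ** generations

def expected_double_variants(pop_size, rate_a, rate_b, generations):
    """Expected number of cells switched at two independent loci."""
    return (pop_size
            * variant_frequency(rate_a, generations)
            * variant_frequency(rate_b, generations))

if __name__ == "__main__":
    # About 20 doublings take one cell to roughly 1 x 10^6 cells.
    for rate in (1e-5, 1e-4, 1e-3):  # assumed per-generation PV rates
        print(rate, expected_double_variants(1e6, rate, rate, 20))
```

Under these assumptions, double variants are expected in a population of this size once both per-generation rates reach about 1 × 10⁻⁴, and they become abundant at higher rates, whereas they are expected to be rare when both rates are near 1 × 10⁻⁵.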
Recent analyses of phase variable P-pili have uncovered evidence for the co-ordination of PV events (Totsika et al., 2008). Uropathogenic E. coli contain multiple loci encoding phase variable P-pili. Each of these loci was observed to exhibit allelic variation in the switch regions and in the regulators PapI and PapB. Expression of the various PapI alleles in a reporter system indicated that these variant trans-acting regulators had differential effects on the PV rates depending upon the allelic variation in the switch. Allelic variation in the high-affinity PapB-binding site of the papI promoter also modified switching rates. The authors of this paper concluded that there could be a co-ordinated regulation of the phase variable genes (Fig. 6b and c). Thus switching ‘on’ of one gene will increase the PV rates of another specific phase variable allele favouring progression to switching ‘on’ of this allele. This second allele will then favour switching of a third allele. Cross-regulation of expression of surface markers has been described for fimbriae, P-pili and flagella in E. coli (Holden et al., 2006; Lane et al., 2007). The co-ordination of switching rates was proposed to form a component of this cross-regulation, which is thought to prevent coexpression of surface structures with antagonistic functions.
Fitness implications of differing PV rates
Previous sections have discussed how the differing PV rates of the fim locus impact on fitness. There are a number of other potential situations in which PV rate could impact on fitness. One situation in which there is strong circumstantial evidence for an influence of switching rate was provided by the examination of PV rates of meningococcal isolates (Richardson et al., 2002). High PV rates were much more common among epidemic isolates than nonepidemic isolates, suggesting that rapid transmission may have been responsible for the increase in mutability. Transmission of pathogenic bacteria is often mediated by small populations. If phase variants are required for colonization or survival in the new host, high PV rates will enhance transmission by significantly increasing both the number of transmitted populations containing appropriate phase variants and the number of phase variants in each population. These bottlenecks associated with transmission may also apply to other stages of the infection process (Moxon & Murphy, 1978). Contiguous spread of Salmonella enterica from the intestinal tract to other organs can also be subject to stochastic selection (i.e. nonselective bottlenecks) (Grant et al., 2008). The ability of PV to enable rapid adaptation following severe reductions in bacterial population size may therefore be applicable to many aspects of bacterial pathogenesis.
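The two effects of a high PV rate on transmission through a bottleneck can be separated in a short sketch. It assumes that each transmitted cell is an independent draw from a donor population in which the required phase variant is present at frequency `variant_freq`; the bottleneck size of 100 cells and the variant frequencies are hypothetical values chosen only to show the trend.

```python
def p_contains_variant(n_cells, variant_freq):
    """Probability that a transmitted population holds at least one variant."""
    return 1.0 - (1.0 - variant_freq) ** n_cells

def expected_variants(n_cells, variant_freq):
    """Mean number of variants carried through the bottleneck."""
    return n_cells * variant_freq

if __name__ == "__main__":
    bottleneck = 100  # assumed number of transmitted cells
    for freq in (1e-4, 1e-3, 1e-2):  # assumed variant frequencies in the donor
        print(freq,
              p_contains_variant(bottleneck, freq),
              expected_variants(bottleneck, freq))
```

Raising the variant frequency increases both quantities: more transmission events carry at least one suitable variant, and each successful event carries more of them on average.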
Meningococcal epidemics occur in the face of increasing levels of immunity in the host population. High PV rates facilitate escape of such adaptive responses as many surface molecules are subject to ‘on-off’ switches in gene expression. Thus there may be a strong selection for an increase in PV rates as the host population becomes immune to the circulating strain. Immune selection may also have led to the prevalence of high PV rates among the epidemic meningococcal strains (Richardson et al., 2002) and may be applicable to the variations in PV rates of other bacterial species. A similar scenario may be invoked for bacteriophage infections, as these organisms frequently attach to surface molecules. This may be particularly relevant for C. jejuni. Many bacteriophages have been identified for this bacterial species, which exhibits multiple phase variable loci and may have high PV rates for these loci. The presence of phase variable restriction-modification systems in many bacterial genomes also suggests a strong selective effect from bacteriophage infection (Dybvig et al., 1998; Zaleski et al., 2005). Again there is a potential for the waxing and waning of waves of bacteriophage infections to have a major influence on the switching rates of bacterial populations.
Phase variable systems present significant contrasts, with some systems exhibiting a high level of complexity involving multiple factors and levels of regulation and others having much simpler mechanisms for generating and regulating the production of phase variants. As this field develops, we should find out whether complexity is a common feature or a specialization of particular types of phase variable genes. These systems also differ in their interactions with RRR genes. Those based on site-specific recombination appear to be largely independent of these factors while generation of mutations during repetitive DNA-mediated PV is intimately linked to the activities of these factors. This overlap of genome maintenance functions and localized hypermutation requires further investigation as a selection for elevated PV rates may drive changes in, or even loss of, the pathways required for preventing mutations. Of particular interest may be the essential RRR genes, such as components of the replicative DNA polymerase, which have not been investigated in any detail. Finally, we need to understand more about the fitness effects of both large and small variations in PV rates in order to interpret the observations on epidemiological differences in switching rates and to reach an appreciation of the degree to which PV contributes to the commensal and pathogenic behaviour of bacteria.
The author thanks the reviewers and Richard Haigh for helpful comments during revision of this manuscript. C.D.B. is supported by an RCUK Fellowship.