Abstract

The archetypal two-component signal transduction systems include a sensor histidine kinase and a response regulator, which consists of a receiver CheY-like domain and a DNA-binding domain. Sequence analysis of the sensor kinases and response regulators encoded in complete bacterial and archaeal genomes revealed complex domain architectures for many of them and allowed the identification of several novel conserved domains, such as PAS, GAF, HAMP, GGDEF, EAL, and HD-GYP. All of these domains are widely represented in bacteria, including 19 copies of the GGDEF domain and 17 copies of the EAL domain encoded in the Escherichia coli genome. In contrast, these novel signaling domains are much less abundant in bacterial parasites and in archaea, with none at all found in some archaeal species. This skewed phyletic distribution suggests that the newly discovered complexity of signal transduction systems emerged early in the evolution of bacteria, with subsequent massive loss in parasites and some horizontal dissemination among archaea. Only a few proteins containing these domains have been studied experimentally, and their exact biochemical functions remain obscure; they may include transformations of novel signal molecules, such as the recently identified cyclic diguanylate. Recent experimental data provide the first direct evidence of the participation of these domains in signal transduction pathways, including regulation of virulence genes and extracellular enzyme production in the human pathogens Bordetella pertussis and Borrelia burgdorferi and the plant pathogen Xanthomonas campestris. Gene-neighborhood analysis of these new domains suggests their participation in a variety of processes, from mercury and phage resistance to maintenance of virulence plasmids. It appears that the real picture of the complexity of phosphorelay signal transduction in prokaryotes is only beginning to unfold.

1 Introduction

The availability of the complete sequences of more than 40 microbial genomes representing eight of the 10 main bacterial phyla and both major branches of archaea (http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/new_micr.html) increasingly impacts our understanding of microbiology, providing ample data for the analysis of the metabolism, cell organization, and evolution of prokaryotes [1,2]. Cross-genome comparisons improve our understanding of each particular genome, allowing prediction of general (biochemical) functions of uncharacterized genes based on their conserved domain organization, gene neighborhood (operon organization), and phylogenetic patterns (presence in some species but not others) [3–5].

Domain architecture has proven particularly informative for analyzing multi-domain proteins involved in signal transduction. In sensor histidine kinases, several new domains have been described, such as the phosphotransfer Hpt domain [6], the heme- and flavin-binding PAS domain [7,8], the extracellular ligand-binding Cache domain [9], the cGMP-binding GAF domain [10,11], and the HAMP linker domain [12] (see [13–17] for recent reviews). Analysis of the downstream signal transduction module revealed an even greater diversity. In addition to well-known response regulators, which consist of a CheY-like phosphoacceptor domain and a helix–turn–helix (HTH) DNA-binding domain, bacterial genomes were found to encode a variety of response regulators with unusual domain organization, featuring still poorly characterized domains, such as GGDEF [18–20], EAL [19,21], and HD-GYP [22,23].

Because of their complex domain organization, signaling proteins are often poorly annotated in sequence databases such as GenBank, most often just as ‘sensor protein’ or ‘response regulator’. However, detailed sequence and structure analyses of these novel domains have been performed and sequence alignments are currently available in several protein domain databases, including SMART [24], COGs [4], and Pfam [25] (Table 1).

Table 1. Conserved domains of the bacterial signal transduction systems

| Domain name | Length (aa) | Function | Structure | SMART (a) | COG (a) | Pfam (a) | Reference |
|---|---|---|---|---|---|---|---|
| Sensor module (b) | | | | | | | |
| FliY | ∼220 | amino acid binding | 2lao | PBPb | 3438 | PF00497 | [26] |
| Cache | ∼120 | small ligand binding | | | 3290 (d) | PF02743 | [9] |
| MHYT | ∼200 | metal binding? | transmembrane | | 3300 | | |
| PAS | ∼100 | FAD, heme, and cinnamic acid binding | 2phy | PAS | 2202 | PF00989 | [13] |
| | | | | PAC | | PF00785 | |
| GAF | ∼150 | cGMP binding, photopigment binding | 1f5m | GAF | 2203 | PF01590 | [10,11] |
| HAMP | ∼50 | dimerization? | 1joy | HAMP | 2770 | PF00672 | [12] |
| His kinase 1 | ∼80 | phosphoacceptor, dimerization | 1joy | HisKA | 0642 (d) | PF00512 | [14,15,17] |
| His kinase 2 | ∼120 | phosphorylation of His kinase 1 domain | 1bxd | HATPase | 0642 (d) | PF02518 | [14,15,17] |
| Hpt | ∼100 | phosphoacceptor | 2a0b | HPT | 2198 | PF01627 | [6,14] |
| Response module (b) | | | | | | | |
| CheY | ∼120 | phosphoacceptor | 2che | REC | 0784 | PF00072 | [15,51] |
| HTH | ∼120 | DNA binding | | (c) | (c) | (c) | [15,17] |
| AAA | ∼300 | σ54-binding ATPase | 1d2n | AAA | 2204 | PF00158 | [52] |
| GGDEF | ∼170 | c-diGMP formation? | similar to 1cjv? | DUF1 | 2199 | PF01590 | [19,33] |
| EAL | ∼250 | c-diGMP hydrolysis? | | DUF2 | 2200 | PF00990 | [19] |
| HD-GYP | ∼170 | phosphodiesterase? | | | 2206 | | [22] |

(a) Protein domain databases that contain sequence alignments of these signaling domains include SMART (http://smart.embl-heidelberg.de [24]), COGs (http://www.ncbi.nlm.nih.gov/cog [4]), and Pfam (http://www.sanger.ac.uk/Software/Pfam [25]).

(b) GGDEF, EAL, and HD-GYP domains occur in fusions with sensor module domains as well as response module domains (CheY). Therefore, their classification as parts of response modules is based on their predicted functions and requires experimental verification.

(c) HTH domains of response regulators are not listed as separate domains in SMART, COGs, or Pfam.

(d) Members of this COG contain more than one domain.

Here, we briefly review the diversity of newly discovered prokaryotic signaling domains and discuss the emerging complex picture of prokaryotic signal transduction.

2 Recently discovered prokaryotic signaling domains

The archetypal two-component signal transduction systems include a sensor module, which consists of an extracytoplasmic or membrane-associated sensor input domain and a cytoplasmic histidine kinase domain with ATPase and phosphoacceptor subdomains (Table 1), and a response regulator, which consists of a receiver CheY-like domain and a DNA-binding domain. The functions of these domains and of the PAS domain, which is commonly found in the sensor module, have been reviewed recently [13–17] and will not be considered here in detail.

2.1 Extracytoplasmic ligand-binding sensor domains

Periplasmic (or, in Gram-positive bacteria, extracytoplasmic) ligand-binding sensor domains are extremely diverse. The most common type of such domains (FliY-type, Table 1) is homologous to the periplasmic solute-binding protein components of ATP-dependent transport systems [26]. Several sensor kinases, for example Escherichia coli EvgS, contain duplicated FliY-type domains followed by a transmembrane segment that anchors them to the membrane. Another periplasmic ligand-binding domain, Cache, is found in sensor kinases and in the extracytoplasmic parts of methyl-accepting chemotaxis proteins, such as Bacillus subtilis McpA and McpB [9]. There are many other types of ligand-binding sensor domains that are apparently specific for the recognition of narrow groups of substrates, such as metals, citrate, or nitrate.

Despite their diversity, many sensor modules share the same domain architecture: an N-terminal transmembrane segment (likely an uncleavable signal peptide), a relatively large (100–300 aa) periplasmic domain, and a second transmembrane segment, followed by a HAMP domain and a cytoplasmic signal-transducing domain. In addition to extracytoplasmic ligand-binding domains, membrane-bound signaling domains also exist, as exemplified by the recently identified MHYT, a predicted metal-binding, redox-sensing domain (MYG, T.A. Gaidenko, A.Y. Mulkidjanian, and C.W. Price, submitted for publication). The diversity of sensor domains probably reflects the wide range of environmental stimuli that elicit regulatory responses in bacterial cells.
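The shared sensor-module architecture described above can be written down as a simple ordered check. The following is only a minimal sketch: it assumes that a protein has already been annotated with non-overlapping segments by some upstream tool, and the `Segment` class, the segment labels, the function name, and the example coordinates are all hypothetical and not taken from any database or from the original analysis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    kind: str   # hypothetical labels: "TM", "periplasmic", "HAMP", "cytoplasmic"
    start: int  # position of the first residue (1-based)
    end: int    # position of the last residue (inclusive)


def has_typical_sensor_topology(segments: List[Segment],
                                min_peri: int = 100,
                                max_peri: int = 300) -> bool:
    """Check for TM, periplasmic domain, TM, HAMP, cytoplasmic domain, in that order."""
    ordered = sorted(segments, key=lambda s: s.start)
    kinds = [s.kind for s in ordered]
    if kinds != ["TM", "periplasmic", "TM", "HAMP", "cytoplasmic"]:
        return False
    peri = ordered[1]
    peri_len = peri.end - peri.start + 1
    # The 100-300 aa range for the periplasmic domain comes from the text.
    return min_peri <= peri_len <= max_peri


# Invented coordinates for illustration only, not taken from a real protein.
example = [
    Segment("TM", 10, 30),
    Segment("periplasmic", 31, 200),
    Segment("TM", 201, 221),
    Segment("HAMP", 222, 272),
    Segment("cytoplasmic", 273, 500),
]
print(has_typical_sensor_topology(example))
```

A strict equality test on the ordered labels is used here for clarity; a real survey would probably need to tolerate additional domains, such as PAS or GAF, between HAMP and the cytoplasmic output domain.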

2.2 GAF domain

The GAF domain was originally described as a non-catalytic cGMP-binding domain conserved in cyclic nucleotide phosphodiesterases [27]. Subsequently, this domain was recognized in cyanobacterial adenylate cyclases and, finally, in histidine kinases and certain other proteins [10]. In spite of limited sequence similarity, the structure of the GAF domain turned out to be very similar to that of the PAS domain [11], indicating their common ancestry. In bacterial and plant phytochromes, the GAF domain contains a small insertion with a conserved Cys residue that serves for covalent attachment of photopigments [28,29]. GAF domains have also been found in association with a variety of other protein domains, such as PEP-dependent phosphotransferase (PTS Enzyme I [30]), PP2C-type protein phosphatase, NtrC-type ATPase, GGDEF, and EAL [10].

2.3 GGDEF domain

The GGDEF domain (Fig. 1A) was first discovered in the response regulator PleD that controls cell differentiation in the swarmer-to-stalked cell transition in Caulobacter crescentus [18]. PleD and its cognate histidine kinase PleC were first described as members of a typical two-component signal transduction system. However, instead of a typical CheY-HTH domain organization, PleD was found to consist of a CheY domain and a previously uncharacterized domain that was dubbed GGDEF based on its conserved sequence motif (Fig. 1A) [18]. This observation attracted little attention until it turned out that GGDEF is encoded in many bacterial genomes (Table 2), including 19 copies in E. coli and four copies in B. subtilis [22].

Fig. 1. Consensus sequences of the recently discovered signaling domains. A: GGDEF domain. B: EAL domain. C: HD-GYP domain. The residue numbering is from T. maritima protein TM0107 (A), E. coli YdiV (B), and A. aeolicus aq_2027 (C). The GGDEF motif comprises residues 114–118 in A, the EAL motif comprises residues 29–31 in B, and the HD-GYP motif corresponds to residues 54–55 and 115–117 in C. Consensus sequences of the 218 GGDEF domains, 128 EAL domains, and 36 HD-GYP domains encoded in complete microbial genomes (Table 3) were drawn using the SeqLogo program [60] in the WWW-based implementation by Steven Brenner (http://www.bio.cam.ac.uk/seqlogo). The letters in each position represent amino acid residues found in that position; the height of each letter reflects the fraction of sequences with the corresponding amino acid residue in that position (the degree of conservation). The total height of each column indicates the statistical importance of the given position. The residues are colored as follows: N, Q – green; K, R, H – blue; D, E – red; F, L, I, M, V – yellow; the rest – purple.
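The two heights described in this legend can be made concrete with a small calculation. The sketch below assumes the conventional sequence-logo definition, in which the total height of a column is its information content in bits and each letter's height is its frequency multiplied by that value. The text does not state which exact variant or small-sample correction the SeqLogo program applied, so this is an illustration rather than a reimplementation; the function name and the handling of gaps are assumptions.

```python
import math
from collections import Counter


def column_logo_heights(column, alphabet_size=20):
    """Return (column height in bits, {residue: letter height in bits})."""
    # Assumption: gap characters are ignored when computing frequencies.
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0, {}
    total = len(residues)
    freqs = {r: n / total for r, n in Counter(residues).items()}
    # Shannon entropy of the observed residue distribution.
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    # Information content: maximum entropy for 20 amino acids minus observed.
    info = math.log2(alphabet_size) - entropy
    heights = {r: p * info for r, p in freqs.items()}
    return info, heights


# A fully conserved column versus one split evenly between two residues.
for column in ("GGGG", "DDEE"):
    info, heights = column_logo_heights(column)
    print(column, round(info, 2), {r: round(h, 2) for r, h in heights.items()})
```

Under this definition an invariant position reaches the maximum height of log2(20) bits, while a position shared equally by two residues loses exactly one bit.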

Table 2. Inventory of signaling domains in complete prokaryotic genomes

Columns, left to right: Species (a); Genome size (kb); Total number of proteins; Sensor module (b): Cache, PAS, GAF, HisKin, CheY (c), Hpt; Response module (b): CheY (d), GGDEF, EAL, HD-GYP.
Bacteria
Mesorhizobium loti7036675223610621525732181
Pseudomonas aeruginosa62645565642963e19117533213
Escherichia coli46394289314928e563219170
Bacillus subtilis421541001014333e0136430
Bacillus halodurans42024066815536e2148422
Mycobacterium tuberculosis44123918023150013120
Vibrio cholerae403338272030541e11104941229
Caulobacter crescentus401737373266622824511100
Synechocystis sp.357331692262842e17741e23132
Mycobacterium leprae326827200315005320
Xylella fastidiosa26792766041145220331
Deinococcus radiodurans264925800732150251654
Lactococcus lactis236522660217007000
Neisseria meningitidis218421210115005000
Thermotoga maritima18611846642901129+2f09
Haemophilus influenzae183017090104026000
Campylobacter jejuni16411654540711111+100
Helicobacter pylori166815661004119000
Aquifex aeolicus1551152217440051181
Chlamydia pneumoniae123010520101002000
Treponema pallidum1138103110210141+103
Chlamydia trachomatis10428940201002000
Borrelia burgdorferi9118500104126111
Rickettsia prowazekii11118340004005110
Mycoplasma pneumoniae8166770000000000
Ureaplasma urealyticum7526110000000000
Buchnera sp. APS6415640000000000
Mycoplasma genitalium5804670000000000
Archaea
Archaeoglobus fulgidus217824200255140111000
Halobacterium sp. NRC-120142058015714316000
M. thermoautotrophicum17511869015416308000
Pyrococcus abyssi176517651101001000
Pyrococcus horikoshii1739∼17501001011000
Aeropyrum pernix1670∼17200000000000
Methanococcus jannaschii166517150000000000
Thermoplasma acidophilum156514780010000000

(a) The names and data for non-obligate parasites are in bold. Complete genome sequences and corresponding references are available in the NCBI Entrez Genome division at http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/micr.html.

(b) Each number represents the number of proteins in a given genome that contain the corresponding domain; multiple occurrences of the same domain (e.g., PAS, CheY) on a single polypeptide chain are counted as one. The numbers are from the COG database (http://www.ncbi.nlm.nih.gov/COG, [4]) and/or from the results of iterative PSI-BLAST searches of domain-specific profiles against a database of proteins encoded in completely sequenced microbial genomes [38,53,54]. These numbers were also compared against those in SMART [24] and those reported in [41]. Complete lists of proteins that contain each particular domain are available at ftp://ncbi.nlm.nih.gov/pub/galperin/TwoCompCensus.html.

(c) CheY-like domains of the ‘hybrid’ sensor kinases (found on the same polypeptide chain with the His kinase domain, see [39]).

(d) CheY-like domains found in the response regulators (associated with HTH, ATPase, CheB, GGDEF, or HD-GYP domains), as well as stand-alone CheY-like domains.

(e) These numbers do not include signaling proteins of the LytS family (COG3275, COG2972), predicted to be divergent His kinases [40]. The discrepancies with the numbers reported in [39] are due to the addition of Synechocystis sp. His kinase slr1212 and response regulators slr0687, sll1544, sll1879, and slr2041 and the exclusion of stand-alone Hpt domain slr0073 from the list of histidine kinases.

(f) These domains are likely to be inactivated.
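The counting rule stated in footnote b to Table 2, under which repeated occurrences of a domain on one polypeptide chain count only once, amounts to counting distinct proteins per genome and domain. A minimal sketch follows, assuming the domain hits are available as (genome, protein, domain) tuples; the function name and the identifiers in the example are hypothetical and do not correspond to real database entries.

```python
from collections import defaultdict


def count_proteins_per_domain(hits):
    """Count distinct proteins carrying each domain in each genome.

    hits: iterable of (genome, protein_id, domain) tuples, one per domain
    occurrence, so a protein with two PAS domains contributes two tuples.
    """
    proteins = defaultdict(set)
    for genome, protein_id, domain in hits:
        # A set keeps each protein once, however many copies it carries.
        proteins[(genome, domain)].add(protein_id)
    return {key: len(ids) for key, ids in proteins.items()}


# Hypothetical hits: prot_A carries two PAS domains and one GGDEF domain.
hits = [
    ("genome_1", "prot_A", "PAS"),
    ("genome_1", "prot_A", "PAS"),
    ("genome_1", "prot_A", "GGDEF"),
    ("genome_1", "prot_B", "PAS"),
]
counts = count_proteins_per_domain(hits)
print(counts[("genome_1", "PAS")])
```

Because the same protein can appear under several domains, the per-domain counts in a row of such an inventory need not add up to the number of signaling proteins in the genome.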

Recently, it was shown that PleD mutants with an intact CheY domain but lacking the GGDEF domain are defective in flagellar degradation and stalk formation during cell differentiation in C. crescentus [20]. These data directly demonstrate the involvement of the GGDEF domain in signal transduction, a role that was previously proposed on the basis of the association of this domain with CheY and PAS domains [18] in multi-domain proteins.

Although the functions of most of the GGDEF domain-containing proteins remain uncharacterized, some clues have emerged from a study of the regulation of the biosynthesis of extracellular cellulose in Acetobacter xylinum (recently renamed Gluconacetobacter xylinus) [19]. An extensive study of this process by Benziman and colleagues showed that it is regulated by cyclic diguanylate (c-diGMP, bis(3′,5′)-cyclic diguanylic acid), a novel effector molecule that consists of two cGMP moieties bound head-to-tail [31]. The exact mechanism remains unclear, but it apparently involves c-diGMP binding to some membrane proteins that activate expression and/or secretion of cellulose synthetase [32]. The search for the enzymes that synthesize and hydrolyze c-diGMP resulted in the identification of six open reading frames (ORFs) with an almost identical domain composition [19]. All of these proteins contain N-terminal PAS domains, followed by the GGDEF domain and another uncharacterized domain, which was dubbed EAL based on its conserved sequence motif (see below). Based on the properties of mutants in which these ORFs were inactivated by insertions, Tal et al. concluded that the GGDEF domain of each of these proteins was responsible for its diguanylate cyclase activity [19].

This conclusion has not been directly verified by demonstrating the enzymatic activity of the recombinant protein, so there remains a (remote) possibility that the ORFs described by Tal et al. [19] just regulate expression of diguanylate cyclases and phosphodiesterases. However, the notion that the GGDEF domain is a diguanylate cyclase has recently received support from a detailed analysis of its sequence. Using iterative PSI-BLAST searches and threading, Pei and Grishin aligned the GGDEF domain with eukaryotic adenylate cyclases [33]. Although the level of sequence similarity between the two domains was low, conservation of the proposed nucleotide-binding loop, which corresponds to the GGDEF motif, was compatible with the cyclase activity of the GGDEF domain [33].

2.4 EAL domain

The EAL domain (Fig. 1B) was originally described in a study of BvgR protein in Bordetella pertussis [21]. Under the conditions in which the genes encoding the major virulence factors are activated, virulence-repressed (vrg) genes are turned off in a BvgR-dependent fashion. Both these processes are under the control of a two-component regulatory system, BvgAS [21]. These observations established BvgR as a component of the signal transduction system in B. pertussis. A direct interaction of BvgR with DNA was suggested based on the similar expression patterns and molecular masses of BvgR and a putative transcriptional regulator, previously demonstrated to bind to a regulatory sequence within the coding region of vrg genes [34]. However, no direct experimental evidence for such an interaction has been provided. Sequence comparisons indicated the presence of a BvgR-like domain in a number of other poorly characterized proteins from diverse bacteria [21].

The same BvgR-like domain, dubbed EAL after its conserved residues, was independently discovered in tandem with the GGDEF domain in putative diguanylate cyclases and phosphodiesterases that regulate cellulose synthesis in A. xylinum [19]. Since diguanylate cyclase activity has been assigned to the GGDEF domain (see above), the EAL domain emerged as a good candidate for the role of a diguanylate phosphodiesterase. Indeed, the sequence of this domain contains several conserved acidic residues that could participate in metal binding and potentially might form a phosphodiesterase active site (Fig. 1B).

Other experimentally characterized proteins containing the EAL domain are listed in Table 3. It should be noted that the YuxH (ComB) protein from B. subtilis, which was originally described as a transcriptional regulator of late competence genes, was later shown not to be required for this regulation [35]. The rtn gene of Proteus vulgaris, originally identified through its effect on infection by phages λ and N4 [36], was later found in E. coli. The effect of the rtn mutation was suppressed in cells grown on maltose, indicating that it might affect expression or membrane localization of LamB, which participates in both maltose transport and attachment of λ and N4.

3

Partly characterized proteins containing GGDEF, EAL, and HD-GYP domains

Organism, protein nameGenBank accession numberDomain organizationaFunction, operon structureReference
Cell differentiation
C. crescentus PleDL42554CheY-xCheY-GGDEFrequired for swarmer-to-stalked cell transition[18,20]
Biosynthesis of extracellular polysaccharides
A. xylinum DGC1AF052517PAS-GGDEF-EALrequired for cellulose biosynthesis[19]
Rhizobium leguminosarum CelR2AF121341CheY-GGDEFrequired for cellulose biosynthesis[55]
X. campestris RpfGAJ251547CheY-HD-GYPrequired for biosynthesis of extracellular polysaccharide[23]
Virulence, biosynthesis of extracellular proteins, adhesion
B. pertussis BvgRAF071567EALregulates transcription of vrgs[21]
Klebsiella pneumoniae FimKAAA25064HTH-EALfollows the operon, encoding type 1 (mannose-sensitive) fimbriae, is required for their adhesiveness, but not for their formationS. Clegg
V. cholerae MshH (VC0398 or YhdA)AF079406PER-GGDEF-EALprecedes the operon encoding type IV (mannose-sensitive) fimbriae, but is not required for its expression[56]
V. cholerae VieAAAC38449CheY-EAL-xCheY-HTHforms an operon with a gene specifically induced in infection[57]
Salmonella typhimurium AdrAAJ271071TM-GGDEFrequired for intercellular adhesion (biofilm formation), biosynthesis of extracellular polysaccharide[47,48]
Resistance to phages, toxic metals
E. coli RtnU83404PER-EALoverexpression confers resistance to phages λ and N4[36]
Pseudomonas stutzeri Urf2 (TnpM)AAC38223EALencoded on mercury resistance transposons, but not required for resistance; appears to affect transposition rate[45,46]
Aeromonas jandaei RrpXU67070xCheY-CheY-GGDEFpermits overexpression in E. coli of A. jandaeiβ-lactamase AsbB1, but not of β-lactamases AsbA1 or AsbM1[58]
Response to oxygen and light
E. coli DosBAA15160PAS-GGDEF-EALreversibly binds O2, CO, and NO with high affinity[59]
Synechocystis sp. Cph2BAA10536GAF-GAF-GGDEF-EAL-GAF-GGDEFphytochrome, dark-induced and repressed by light[29]

a Only readily discernible domains are listed. TM, multiple (four or more) transmembrane segments; xCheY, a CheY domain that is inactivated by mutations; PER, a periplasmic sensor module of typical topology followed by a single transmembrane segment.

2.5 HD and HD-GYP domains

Although the predicted phosphodiesterase activity of the EAL domain has not yet been demonstrated, some (predicted) signal transduction proteins do contain bona fide phosphodiesterase domains similar to the ones found in eukaryotic cyclic-nucleotide phosphodiesterases [37]. These domains belong to the recently identified superfamily of metal-dependent phosphohydrolases, designated the HD superfamily after the principal conserved residues implicated in metal binding and catalysis. This superfamily also includes such enzymes as bacterial dGTP triphosphohydrolase and the ppGpp(p) hydrolase SpoT [37]. The version of the HD-type domain that is fused with a CheY domain in response regulator-like proteins from several organisms (Table 4) has many additional highly conserved residues, including a conserved GYP motif (Fig. 1C); this domain was therefore dubbed HD-GYP [22]. Like the GGDEF and EAL domains, the HD-GYP domain was originally implicated in signal transduction on the basis of its association with CheY-like and other signaling domains [22]. Recently, its role in signaling has been demonstrated experimentally. In the plant pathogen Xanthomonas campestris, response regulator RpfG, which contains a CheY-like and an HD-GYP domain, has been shown to activate the synthesis of extracellular enzymes and the extracellular polysaccharide [23]. In-frame deletion of the rpfG gene abolished production of extracellular endoglucanase and significantly decreased the levels of polygalacturonate lyase and extracellular polysaccharide [23]. Increased expression of the cognate histidine kinase RpfC stimulated the production of these extracellular enzymes and even overcame the effect of the rpfG mutation. This latter result suggests that response regulators of the CheY-HD-GYP class, like RpfG, represent important, but not the only, output modules for their corresponding sensor kinases.

Table 4. Diversity of domain fusions in bacterial response regulators

| Domain organization | Experimentally studied examples | Examples identified solely from genomic sequences |
|---|---|---|
| Simple response regulators | | |
| CheY-HTH | E. coli ArcA, CitB, FimZ, UvrY, KdpE, NarL, NarP, OmpR, PhoB | E. coli YgiX, YedH, YlcA |
| | B. subtilis CitT, ComA, GerE, PhoP, ResD | B. subtilis YdbG, YdfI, YfiK, YhcZ, YocG, YufM, YvfU, YvqC, YxjL |
| CheY-AAA-HTH | E. coli AtoC, GlnG, HydG | E. coli YfhA, Cj1024c, HP0703, RP562, CT468, CPn0586, TP0519, BB0763 |
| CheY-GGDEF | C. crescentus PleD, R. leguminosarum CelR2 | BB0419, VC1086, RP237, Cj0643 |
| CheY-EAL | | slr1588, PA3947, VC1652 |
| CheY-HD-GYP | X. campestris RpfG | PA2572, PA4781, slr2100, sll1624, TM0186, TM1147, VC1087, VC1348, VCA0210, XF1113 |
| CheY-GGDEF-EAL | | Thiocystis violacea ORF5 (S54369)^a, XF0401 |
| HD-GYP-GGDEF | | aq_2027, DRA0342 |
| Response regulators with extracytoplasmic sensor domains | | |
| FliY-GGDEF | | VC1067, VCA0557 |
| FliY-HD-GYP | | TM1170 |
| PER-GGDEF | | VC2285, VC2454, VCA1082, PA0847 |
| PER-EAL | | |
| PER-GGDEF-EAL | | YhjK, PA2072, PA1433, slr2077 |
| PER-HD-GYP | | TM1682, VCA0895 |
| Response regulators with cytoplasmic sensor domains | | |
| PAS-GGDEF-EAL | A. xylinum DGC1, PHE | aq_1442 |
| CheY-PAS-GGDEF-EAL | | slr1305, PA4959, XF2624 |
| GAF-GGDEF | | E. coli YeaP, DRB0044, slr1143, sll0048, PA2771 |
| CheY-GAF-GGDEF | | slr0687 |
| GAF-GGDEF-EAL | | Rhizobium etli ORF1 (AF034831), Rv1354c, VCA0080, VCA0785, PA2567, ML1750 |
| GAF-PAS-GGDEF-EAL | | Azorhizobium caulinodans YntC (X63841), PA5017 |
| Fusions of sensory transduction domains from various signaling systems | | |
| GAF-PtsI | E. coli PtsP (AAB40476), Azotobacter vinelandii PtsP (Y14681) | PA0337, VC0672, mll3436 |
| CheY-RsbU | | PA3346, VCA1086, slr1983 |
| PAS-RsbU | B. subtilis RsbP (YvfP) | Rv1364c |

a For experimentally studied proteins, not listed in Table 2, GenBank accession numbers are given in parentheses. The names of proteins encoded in completely sequenced genomes are listed exactly as in genome annotations; the corresponding sequences can be retrieved from the NCBI WWW site at http://www.ncbi.nlm.nih.gov/Entrez/ or http://www.ncbi.nlm.nih.gov/COG/. Protein names shown in italics indicate presence of other domains in addition to those listed in the first column.

If the HD-GYP domain is indeed a phosphatase or a phosphodiesterase, its highly conserved sequence suggests high substrate specificity. Notably, at least two proteins, Aquifex aeolicus aq_2027 and Deinococcus radiodurans DRA0342, contain an HD-GYP-GGDEF domain combination [22]. Thus, the HD-GYP domain may be involved in the metabolism of cyclic diguanylate or in dephosphorylation of a phosphotransfer domain.

A modified version of the HD-GYP domain is fused to the C-terminus of the EAL domain in the ComB (YuxH) protein from B. subtilis, two A. aeolicus proteins, and three Vibrio cholerae proteins. This version lacks the conserved distal portion of the HD-GYP domain (Fig. 1C) and has certain substitutions in the characteristic metal-binding residues of the HD superfamily phosphohydrolases [37], which likely render the domain catalytically inactive.

3 Census of signaling domains in completely sequenced prokaryotic genomes

With the complete sequences of over 30 bacterial and archaeal genomes currently available, we were interested in obtaining accurate counts of the number of signaling domains in each of them. Sequence profiles were constructed for each of these domains (see Fig. 1) and compared using iterative BLAST searches [38] against a database of protein sequences encoded in each of the completely sequenced genomes. The results obtained (Table 2) are very close to those reported earlier for E. coli, Synechocystis sp., and C. crescentus [39–42], and reveal several interesting trends. First, some variations notwithstanding, these domains are abundant in the genomes of all free-living bacteria but are much less common in obligate parasites. This difference is particularly striking in the case of the GGDEF, EAL, and HD-GYP domains (compare the data in Table 2 for A. aeolicus and Helicobacter pylori, two bacteria with nearly the same number of genes, but with a free-living and parasitic life style, respectively). It appears that these newly discovered domains might be particularly important for sensing the more diverse environmental stimuli encountered by free-living or non-obligatory parasitic bacteria. The minimal genomes of mycoplasmas and Buchnera do not encode any signaling proteins at all (Table 2). Second, the signaling domains are generally less abundant and less evenly distributed in archaea than they are in bacteria. Although certain archaeal species, such as Archaeoglobus fulgidus and Methanobacterium thermoautotrophicum, encode a significant number of histidine kinases, CheY-like response domains, and PAS domains, the GGDEF, EAL, and HD-GYP domains have not been detected in any of the archaeal genomes sequenced thus far (Table 2). This result is particularly surprising because all of these domains are abundant in the hyperthermophilic bacteria A. aeolicus and Thermotoga maritima, which appear to have undergone horizontal gene exchange with the archaea on a massive scale [43,44]. Furthermore, two of the sequenced archaeal genomes, those of Aeropyrum pernix (the only sequenced representative of the Crenarchaeota, one of the two major archaeal branches) and Methanococcus jannaschii, do not appear to encode any of the currently recognized signaling domains (Table 2). This uneven phyletic distribution lends credence to a scenario whereby the two-component signaling system and the potential c-diGMP signaling system emerged early during bacterial evolution, with some of the components subsequently acquired by certain archaeal lineages via horizontal gene transfer.
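The kind of per-genome tally summarized in Table 2 can be sketched in a few lines of Python. The sketch assumes that domain hits have already been obtained (for example, from profile searches) and are available as (genome, protein, domain) records. The record layout, the function name `count_domains`, and the toy input are illustrative assumptions; they are not the procedure or the data used for the census described above.

```python
from collections import Counter, defaultdict

def count_domains(hits):
    """Count domain copies per genome.

    hits: iterable of (genome, protein_id, domain) tuples, one tuple per
    domain copy detected in a protein (hypothetical input format).
    Returns a dict mapping each genome to a Counter of domain copies.
    """
    census = defaultdict(Counter)
    for genome, _protein_id, domain in hits:
        census[genome][domain] += 1
    return dict(census)

# Toy records for illustration only; these are not real census data.
toy_hits = [
    ("genome_A", "prot1", "GGDEF"),
    ("genome_A", "prot1", "EAL"),    # same protein carries two domains
    ("genome_A", "prot2", "GGDEF"),
    ("genome_B", "prot3", "PAS"),
]
census = count_domains(toy_hits)
```

Because each genome maps to a `Counter`, a domain that was never detected in a genome simply reads as zero, which matches the way absent domains appear in a census table.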

4 Genomic context of the new signaling domains

The abundance of the uncharacterized signaling domains (Table 2) was one of the most unexpected features of bacteria revealed by genome sequencing. Indeed, none of the 19 copies of the GGDEF domain and 17 copies of the EAL domain encoded in the E. coli genome belongs to an experimentally characterized protein (see, e.g., COG2199 and COG2200 in the COG database, http://www.ncbi.nlm.nih.gov/COG [4]). The number of these domains in the recently sequenced genomes of P. aeruginosa and V. cholerae is even higher, and again the functions of the encoded proteins are as obscure as they probably are diverse (Table 3). Gram-positive bacteria appear to encode fewer of these signaling domains; there are only four proteins with the GGDEF domain and three proteins with the EAL domain in B. subtilis (Table 2), and none of them has been characterized either. Therefore, it is becoming increasingly clear that we have been missing major aspects of the regulatory circuits present in bacterial cells.

In the absence of direct experimental data, some clues to the range of functions of these newly described domains could be revealed by their genomic context, including operon structures and conserved domain fusions [5]. Unfortunately, the GGDEF-, EAL-, and HD-GYP-encoding genes are seldom found in operons, let alone conserved ones. Indeed, of the 29 E. coli genes that encode a GGDEF domain, an EAL domain, or both, only one, yfiN, forms a potential operon with another uncharacterized gene. Six more are paired into potential operons yddVU, yeaIJ, and yliEF. Thus, the majority of the genes coding for these domains in E. coli (and in most other bacteria) are not predicted to be in operons. Furthermore, in many cases the orientation of these genes is opposite to that of their nearest neighbors.
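The gene-neighborhood reasoning above (same orientation and close spacing suggest a potential operon, whereas opposite orientation argues against one) can be illustrated with a minimal Python sketch. The `Gene` record, the function `candidate_operon_pairs`, the 100-bp `max_gap` threshold, and the toy coordinates are all assumptions made for illustration; the operon predictions discussed in the text were not necessarily derived this way.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"

def candidate_operon_pairs(genes, max_gap=100):
    """Return adjacent gene pairs that lie on the same strand and are
    separated by at most max_gap base pairs (an arbitrary cutoff)."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1  # negative if the genes overlap
        if left.strand == right.strand and gap <= max_gap:
            pairs.append((left.name, right.name))
    return pairs

# Hypothetical coordinates, for illustration only.
toy_genes = [
    Gene("geneA", 100, 1000, "+"),
    Gene("geneB", 1020, 2000, "+"),  # same strand, 19-bp gap: candidate pair
    Gene("geneC", 2050, 3000, "-"),  # opposite strand to geneB: not paired
    Gene("geneD", 4000, 5000, "-"),  # same strand as geneC, but 999-bp gap
]
pairs = candidate_operon_pairs(toy_genes)
```

A heuristic of this kind only nominates candidates; conservation of the gene pair across genomes, as emphasized in the text, is the stronger line of evidence.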

An interesting feature of many GGDEF and EAL domain-encoding genes is their presence in various transposons. For example, an EAL-encoding Urf2 has been found at the end of the mer operons in transposons Tn21 and Tn501, although it is not required for mercury resistance and is deleted in Tn5053 [45]. A part of this gene, named tnpM for transposition modulator, has been shown to enhance Tn21 transposition by activating transposase expression and decreasing resolvase expression [46]. Whether the full-length Urf2 has the same activity is unknown. Genes for stand-alone EAL and GGDEF domains have also been found in the Lactococcus lactis transposon Tn5481.

4.1 Conserved fusions of novel signaling domains

Two-domain fusion proteins consisting of a phosphoacceptor CheY-like domain and either a GGDEF or an EAL domain were classified as response regulators even before the abundance of such fusions had become apparent [18,39,40]. A systematic analysis of domain organization of signaling proteins in completely sequenced bacterial genomes shows numerous domain fusions that pair novel output domains (GGDEF, EAL, and HD-GYP) not just with CheY-like domains but also with extracytoplasmic ligand-binding sensor domains or with cytoplasmic PAS and GAF sensor domains (Table 4). The variety of these multi-domain proteins seems to mirror that of sensor kinases. This circumstance apparently reflects an underlying uniformity of the mechanisms of signal transduction in the cell, from an N-terminal sensor domain to a transmitter domain to a C-terminal response output domain, and suggests that the novel domains comprise a distinct signaling (cyclic diguanylate-based?) system that complements the classical two-component system.
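The grouping of domain architectures used in Table 4 can be expressed as a simple rule-based classifier. The Python sketch below is a rough simplification: the domain sets, the order in which the rules are applied, and the function name `classify` are assumptions chosen to reproduce the four table headings, not a published algorithm. Architectures are given as lists of domain names because the name HD-GYP itself contains a hyphen.

```python
# Domain groups taken from the table headings; the rule order is an assumption.
EXTRACYTOPLASMIC_SENSORS = {"PER", "FliY"}
CYTOPLASMIC_SENSORS = {"PAS", "GAF"}
OTHER_SYSTEM_DOMAINS = {"PtsI", "RsbU"}

def classify(domains):
    """Assign a domain architecture (list of domain names, N- to C-terminus)
    to one of the four groups used in Table 4."""
    present = set(domains)
    if present & OTHER_SYSTEM_DOMAINS:
        return "fusion with another signaling system"
    if present & EXTRACYTOPLASMIC_SENSORS:
        return "extracytoplasmic sensor domain"
    if present & CYTOPLASMIC_SENSORS:
        return "cytoplasmic sensor domain"
    return "simple response regulator"

examples = {
    "CheY-GGDEF": classify(["CheY", "GGDEF"]),
    "PER-GGDEF-EAL": classify(["PER", "GGDEF", "EAL"]),
    "CheY-PAS-GGDEF-EAL": classify(["CheY", "PAS", "GGDEF", "EAL"]),
    "GAF-PtsI": classify(["GAF", "PtsI"]),
}
```

Checking for domains of other signaling systems first keeps GAF-PtsI out of the cytoplasmic-sensor group, in line with its placement in the table.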

Indeed, in several independent cases [18,20,21,23] it has been shown that predicted response regulators containing these novel domains are regulated by, and act in parallel with, the ‘standard’ (CheY-HTH) response regulators. They seem to provide an additional output module and, potentially, a means of feedback control. The systems that are regulated by these novel response regulators include those responsible for the interaction of the bacteria with the environment (fimbriae, extracellular proteins, virulence) and with each other (biofilm formation [47,48], Table 3). Although such systems are extremely important in vivo, they may act in response to stimuli that have not yet been replicated in vitro.

4.2 Cross-talk between different signaling systems

The variety of fusions between signaling domains discussed above extends to fusions of these domains with components of other signaling pathways, creating a complex network of regulatory interactions. Perhaps the most interesting case is a fusion of a GAF domain to Enzyme I of the PEP-dependent sugar:phosphotransferase system, first described for the E. coli PtsP protein [30]. PtsP and similar proteins encoded in the genomes of P. aeruginosa, V. cholerae, and Mesorhizobium loti might modulate the activity of the phosphotransferase system in response to the levels of cGMP or some other ligand that interacts with their N-terminal GAF domains.

Another notable example of cross-talk between different regulatory systems (Table 4) is the fusion of CheY-like and GAF domains with phosphatase domains of the PP2C type, found in RsbU-like regulators of the σB subunit, which participate in the stress response in B. subtilis and many other bacteria [49,50]. Such fusion proteins should be able to couple stress responses directly to perturbations in oxygen and/or cGMP levels.

5 Conclusions and perspectives

Comparative analysis of complete microbial genomes reveals a network of regulatory interactions that is much more complex than was assumed previously. This complexity is mostly limited to free-living bacteria, whereas parasites with degraded genomes have few, if any, sensory transduction systems. Functions of some of the signaling domains are already known, whereas the functions of others remain to be discovered. If the GGDEF and EAL domains indeed function as a diguanylate cyclase and a c-diGMP phosphodiesterase, respectively [19], c-diGMP could emerge as a major cell regulator in bacteria but, remarkably, not in archaea. Experimental characterization of the functions of these domains will significantly advance our understanding of the principles and mechanisms governing the prokaryotic regulatory machinery.

Note added in proof

While this paper was under review, the c-diGMP phosphodiesterase activity of the Acetobacter xylinum DGC1-like protein (see Table 3) was shown to be regulated by oxygen [61]. Also, we became aware of the data implicating GGDEF-containing proteins in hemin storage in Yersinia pestis [62] and in flagellar function in E. coli [63].

References

[1] Koonin, E.V., Galperin, M.Y. (1997) Prokaryotic genomes: the emerging paradigm of genome-based microbiology. Curr. Opin. Genet. Dev. 7, 757–763.

[2] Nelson, K.E., Paulsen, I.T., Heidelberg, J.F., Fraser, C.M. (2000) Status of genome projects for nonpathogenic bacteria and archaea. Nat. Biotechnol. 18, 1049–1054.

[3] Tatusov, R.L., Koonin, E.V., Lipman, D.J. (1997) A genomic perspective on protein families. Science 278, 631–637.

[4] Tatusov, R.L., Galperin, M.Y., Natale, D.A., Koonin, E.V. (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36.

[5] Galperin, M.Y., Koonin, E.V. (2000) Who's your neighbor? New computational approaches for functional genomics. Nat. Biotechnol. 18, 609–613.

[6] Matsushika, A., Mizuno, T. (1998) The structure and function of the histidine-containing phosphotransfer (HPt) signaling domain of the Escherichia coli ArcB sensor. J. Biochem. Tokyo 124, 440–445.

[7] Ponting, C.P., Aravind, L. (1997) PAS: a multifunctional domain family comes to light. Curr. Biol. 7, R674–R677.

[8] Zhulin, I.B., Taylor, B.L., Dixon, R. (1997) PAS domain S-boxes in Archaea, Bacteria and sensors for oxygen and redox. Trends Biochem. Sci. 22, 331–333.

[9] Anantharaman, V., Aravind, L. (2000) Cache – a signaling domain common to animal Ca2+-channel subunits and a class of prokaryotic chemotaxis receptors. Trends Biochem. Sci. 25, 535–537.

[10] Aravind, L., Ponting, C.P. (1997) The GAF domain: an evolutionary link between diverse phototransducing proteins. Trends Biochem. Sci. 22, 458–459.

[11] Ho, Y.S., Burden, L.M., Hurley, J.H. (2000) Structure of the GAF domain, a ubiquitous signaling motif and a new class of cyclic GMP receptor. EMBO J. 19, 5288–5299.

[12] Aravind, L., Ponting, C.P. (1999) The cytoplasmic helical linker domain of receptor histidine kinase and methyl-accepting proteins is common to many prokaryotic signalling proteins. FEMS Microbiol. Lett. 176, 111–116.

[13] Taylor, B.L., Zhulin, I.B. (1999) PAS domains: internal sensors of oxygen, redox potential, and light. Microbiol. Mol. Biol. Rev. 63, 479–506.

[14] Dutta, R., Qin, L., Inouye, M. (1999) Histidine kinases: diversity of domain organization. Mol. Microbiol. 34, 633–640.

[15] Grebe, T.W., Stock, J.B. (1999) The histidine protein kinase superfamily. Adv. Microb. Physiol. 41, 139–227.

[16] Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr. Opin. Microbiol. 3, 165–170.

[17] Stock, A.M., Robinson, V.L., Goudreau, P.N. (2000) Two-component signal transduction. Annu. Rev. Biochem. 69, 183–215.

[18] Hecht, G.B., Newton, A. (1995) Identification of a novel response regulator required for the swarmer-to-stalked-cell transition in Caulobacter crescentus. J. Bacteriol. 177, 6223–6229.

[19] Tal, R., Wong, H.C., Calhoon, R., Gelfand, D., Fear, A.L., Volman, G., Mayer, R., Ross, P., Amikam, D., Weinhouse, H., Cohen, A., Sapir, S., Ohana, P., Benziman, M. (1998) Three cdg operons control cellular turnover of cyclic di-GMP in Acetobacter xylinum: genetic organization and occurrence of conserved domains in isoenzymes. J. Bacteriol. 180, 4416–4425.

[20] Aldridge, P., Jenal, U. (1999) Cell cycle-dependent degradation of a flagellar motor component requires a novel-type response regulator. Mol. Microbiol. 32, 379–391.

[21] Merkel, T.J., Barros, C., Stibitz, S. (1998) Characterization of the bvgR locus of Bordetella pertussis. J. Bacteriol. 180, 1682–1690.

[22] Galperin, M.Y., Natale, D.A., Aravind, L., Koonin, E.V. (1999) A specialized version of the HD hydrolase domain implicated in signal transduction. J. Mol. Microbiol. Biotechnol. 1, 303–305.

[23] Slater, H., Alvarez-Morales, A., Barber, C.E., Daniels, M.J., Dow, J.M. (2000) A two-component system involving an HD-GYP domain protein links cell-cell signalling to pathogenicity gene expression in Xanthomonas campestris. Mol. Microbiol. 38, 986–1003.

[24] Schultz, J., Copley, R.R., Doerks, T., Ponting, C.P., Bork, P. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 231–234.

[25] Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Howe, K.L., Sonnhammer, E.L. (2000) The Pfam protein families database. Nucleic Acids Res. 28, 263–266.

[26] Tam, R., Saier, M.H., Jr. (1993) Structural, functional, and evolutionary relationships among extracellular solute-binding receptors of bacteria. Microbiol. Rev. 57, 320–346.

[27] Charbonneau, H., Prusti, R.K., LeTrong, H., Sonnenburg, W.K., Mullaney, P.J., Walsh, K.A., Beavo, J.A. (1990) Identification of a noncatalytic cGMP-binding domain conserved in both the cGMP-stimulated and photoreceptor cyclic nucleotide phosphodiesterases. Proc. Natl. Acad. Sci. USA 87, 288–292.

[28] Davis, S.J., Vener, A.V., Vierstra, R.D. (1999) Bacteriophytochromes: phytochrome-like photoreceptors from nonphotosynthetic eubacteria. Science 286, 2517–2520.

[29] Park, C.M., Kim, J.I., Yang, S.S., Kang, J.G., Kang, J.H., Shim, J.Y., Chung, Y.H., Park, Y.M., Song, P.S. (2000) A second photochromic bacteriophytochrome from Synechocystis sp. PCC 6803: spectral analysis and down-regulation by light. Biochemistry 39, 10840–10847.

[30] Reizer, J., Reizer, A., Merrick, M.J., Plunkett, G., Rose, D.J., Saier, M.H. (1996) Novel phosphotransferase-encoding genes revealed by analysis of the Escherichia coli genome: a chimeric gene encoding an Enzyme I homologue that possesses a putative sensory transduction domain. Gene 181, 103–108.

[31] Ross, P., Mayer, R., Benziman, M. (1991) Cellulose biosynthesis and function in bacteria. Microbiol. Rev. 55, 35–58.

[32] Weinhouse, H., Sapir, S., Amikam, D., Shilo, Y., Volman, G., Ohana, P., Benziman, M. (1997) c-di-GMP-binding protein, a new factor regulating cellulose synthesis in Acetobacter xylinum. FEBS Lett. 416, 207–211.

[33] Pei, J., Grishin, N.V. (2001) GGDEF domain is homologous to adenylyl cyclase. Proteins 42, 210–216.

[34] Beattie, D.T., Mahan, M.J., Mekalanos, J.J. (1993) Repressor binding to a regulatory site in the DNA coding sequence is sufficient to confer transcriptional regulation of the vir-repressed genes (vrg genes) in Bordetella pertussis. J. Bacteriol. 175, 519–527.

[35] Dubnau, D. (1991) Genetic competence in Bacillus subtilis. Microbiol. Rev. 55, 395–424.

[36] Chae, K.S., Yoo, O.J. (1986) Cloning of the lambda resistant genes from Brevibacterium albidum and Proteus vulgaris into Escherichia coli. Biochem. Biophys. Res. Commun. 140, 1101–1105.

[37] Aravind, L., Koonin, E.V. (1998) The HD domain defines a new superfamily of metal-dependent phosphohydrolases. Trends Biochem. Sci. 23, 469–472.

[38] Chervitz, S.A., Aravind, L., Sherlock, G., Ball, C.A., Koonin, E.V., Dwight, S.S., Harris, M.A., Dolinski, K., Mohr, S., Smith, T., Weng, S., Cherry, J.M., Botstein, D. (1998) Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 282, 2022–2028.

[39] Mizuno, T., Kaneko, T., Tabata, S. (1996) Compilation of all genes encoding bacterial two-component signal transducers in the genome of the cyanobacterium, Synechocystis sp. strain PCC 6803. DNA Res. 3, 407–414.

[40] Mizuno, T. (1997) Compilation of all genes encoding two-component phosphotransfer signal transducers in the genome of Escherichia coli. DNA Res. 4, 161–168.

[41] Koretke, K.K., Lupas, A.N., Warren, P.V., Rosenberg, M., Brown, J.R. (2000) Evolution of two-component signal transduction. Mol. Biol. Evol. 17, 1956–1970.

[42] Nierman, W.C., Feldblyum, T.V., Laub, M.T., Paulsen, I.T., Nelson, K.E., Eisen, J., Heidelberg, J.F., Alley, M.R., Ohta, N., Maddock, J.R., Potocka, I., Nelson, W.C., Newton, A., Stephens, C., Phadke, N.D., Ely, B., DeBoy, R.T., Dodson, R.J., Durkin, A.S., Gwinn, M.L., Haft, D.H., Kolonay, J.F., Smit, J., Craven, M.B., Khouri, H., Shetty, J., Berry, K., Utterback, T., Tran, K., Wolf, A., Vamathevan, J., Ermolaeva, M., White, O., Salzberg, S.L., Venter, J.C., Shapiro, L., Fraser, C.M. (2001) Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98, 4136–4141.

[43] Aravind L., Tatusov R.L., Wolf Y.I., Walker D.R., Koonin E.V. (1998) Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet. 14, 442–444.

[44] Nelson K.E., Clayton R.A., Gill S.R., Gwinn M.L., Dodson R.J., Haft D.H., Hickey E.K., Peterson J.D., Nelson W.C., Ketchum K.A., McDonald L., Utterback T.R., Malek J.A., Linher K.D., Garrett M.M., Stewart A.M., Cotton M.D., Pratt M.S., Phillips C.A., Richardson D., Heidelberg J., Sutton G.G., Fleischmann R.D., Eisen J.A., Fraser C.M. (1999) Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399, 323–329.

[45] Brown N.L., Misra T.K., Winnie J.N., Schmidt A., Seiff M., Silver S. (1986) The nucleotide sequence of the mercuric resistance operons of plasmid R100 and transposon Tn501: further evidence for mer genes which enhance the activity of the mercuric ion detoxification system. Mol. Gen. Genet. 202, 143–151.

[46] Hyde D.R., Tu C.P. (1985) tnpM: a novel regulatory gene that enhances Tn21 transposition and suppresses cointegrate resolution. Cell 42, 629–638.

[47] Romling U., Rohde M., Olsen A., Normark S., Reinkoster J. (2000) AgfD, the checkpoint of multicellular and aggregative behaviour in Salmonella typhimurium regulates at least two independent pathways. Mol. Microbiol. 36, 10–23.

[48] Zogaj X., Nimtz M., Rohde M., Bokranz W., Romling U. (2001) The multicellular morphotypes of Salmonella typhimurium and Escherichia coli produce cellulose as the second component of the extracellular matrix. Mol. Microbiol. 39, 1452–1463.

[49] Vijay K., Brody M.S., Fredlund E., Price C.W. (2000) A PP2C phosphatase containing a PAS domain is required to convey signals of energy stress to the sigmaB transcription factor of Bacillus subtilis. Mol. Microbiol. 35, 180–188.

[50] Koonin E.V., Aravind L., Galperin M.Y. (2000) A comparative-genomic view of the microbial stress response. In: Bacterial Stress Responses (Storz G., Hengge-Aronis R., Eds.), pp. 417–444. ASM Press, Washington, DC.

[51] Volz K. (1993) Structural conservation in the CheY superfamily. Biochemistry 32, 11741–11753.

[52] Neuwald A.F., Aravind L., Spouge J.L., Koonin E.V. (1999) AAA+: A class of chaperone-like ATPases associated with the assembly, operation, and disassembly of protein complexes. Genome Res. 9, 27–43.

[53] Natale D.A., Galperin M.Y., Tatusov R.L., Koonin E.V. (2000) Using the COG database to improve gene recognition in complete genomes. Genetica 108, 9–17.

[54] Galperin M.Y. (2001) Conserved ‘hypothetical’ proteins: new hints and new puzzles. Comp. Funct. Genomics 2, 14–18.

[55] Ausmees N., Jonsson H., Hoglund S., Ljunggren H., Lindberg M. (1999) Structural and putative regulatory genes involved in cellulose synthesis in Rhizobium leguminosarum bv. trifolii. Microbiology 145, 1253–1262.

[56] Marsh J.W., Taylor R.K. (1999) Genetic and transcriptional analyses of the Vibrio cholerae mannose-sensitive hemagglutinin type 4 pilus gene locus. J. Bacteriol. 181, 1110–1117.

[57] Lee S.H., Angelichio M.J., Mekalanos J.J., Camilli A. (1998) Nucleotide sequence and spatiotemporal expression of the Vibrio cholerae vieSAB genes during infection. J. Bacteriol. 180, 2298–2305.

[58] Alksne L.E., Rasmussen B.A. (1997) Expression of the AsbA1, OXA-12, and AsbM1 beta-lactamases in Aeromonas jandaei AER 14 is coordinated by a two-component regulon. J. Bacteriol. 179, 2006–2013.

[59] Delgado-Nixon V.M., Gonzalez G., Gilles-Gonzalez M.A. (2000) Dos, a heme-binding PAS protein from Escherichia coli, is a direct oxygen sensor. Biochemistry 39, 2685–2691.

[60] Schneider T.D., Stephens R.M. (1990) Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 18, 6097–6100.

[61] Chang A.L., Tuckerman J.R., Gonzalez G., Mayer R., Weinhouse H., Volman G., Amikam D., Benziman M., Gilles-Gonzalez M.A. (2001) Phosphodiesterase A1, a regulator of cellulose synthesis in Acetobacter xylinum, is a heme-based sensor. Biochemistry 40, 3420–3426.

[62] Jones H.A., Lillard J.W. Jr., Perry R.D. (1999) HmsT, a protein essential for expression of the haemin storage (Hms+) phenotype of Yersinia pestis. Microbiology 145, 2117–2128.

[63] Ko M., Park C. (2000) Two novel flagellar components and H-NS are involved in the motor function of Escherichia coli. J. Mol. Biol. 303, 371–382.