Recent studies identified c-di-GMP as a universal bacterial secondary messenger regulating biofilm formation, motility, production of extracellular polysaccharide and multicellular behavior in diverse bacteria. However, except for cellulose synthase, no protein has been shown to bind c-di-GMP and the targets for c-di-GMP action remain unknown. Here we report identification of the PilZ (‘pills’) domain (Pfam domain PF07238) in the sequences of bacterial cellulose synthases, alginate biosynthesis protein Alg44, proteins of enterobacterial YcgR and firmicute YpfA families, and other proteins encoded in bacterial genomes and present evidence indicating that this domain is (part of) the long-sought c-di-GMP-binding protein. Association of the PilZ domain with a variety of other domains, including likely components of bacterial multidrug secretion system, could provide clues to multiple functions of the c-di-GMP in bacterial pathogenesis and cell development.
The recent identification of bis-(3′-5′)-cyclic dimeric guanosine monophosphate, c-di-GMP, as a universal secondary messenger in bacteria was a key advance in microbiology, made possible, in part, by comparative genome analysis (Galperin et al., 2001; D'Argenio and Miller, 2004; Galperin, 2004; Jenal, 2004; Römling et al., 2005). The GGDEF (formerly DUF1) and EAL (formerly DUF2) domains, whose involvement in c-di-GMP turnover was discovered in the groundbreaking work by Moshe Benziman and co-workers (Tal et al., 1998), were found to be among the most abundant domains encoded in bacterial genomes, suggesting that c-di-GMP-dependent regulation was widespread in the bacterial world (Galperin et al., 2001; Galperin, 2005). Indeed, just in the past two years, c-di-GMP was implicated in regulating transition between motility and sessility in Escherichia coli and Salmonella typhimurium, twitching motility in Pseudomonas aeruginosa, biofilm formation in Vibrio cholerae and Yersinia pestis, and photosynthesis gene expression in Synechococcus elongatus (Huang et al., 2003; Kirillina et al., 2004; Simm et al., 2004; Thomas et al., 2004; Tischler and Camilli, 2004). An important advance was the recent demonstration that the diguanylate cyclase (c-di-GMP synthetase) activity resides in the GGDEF domain (Paul et al., 2004; Ryjenkov et al., 2005), whereas the EAL domain functions as c-di-GMP-specific phosphodiesterase, hydrolyzing c-di-GMP to linear diguanylate pGpG (Bobrov et al., 2005; Christen et al., 2005; Schmidt et al., 2005). Still, mechanisms of c-di-GMP-dependent signaling remain unknown, owing to the scarcity of data on the targets of c-di-GMP action.
In the original study of the regulation of the cellulose synthase in Acetobacter xylinum (currently Gluconacetobacter xylinus) and other bacteria, Benziman and co-workers detected c-di-GMP binding to the cellulose synthase with most label bound to its β-subunit, BcsB (Amikam and Benziman, 1989; Mayer et al., 1991). These data suggested that BcsB was the c-di-GMP binding protein, which was reflected in its SwissProt annotation. Subsequent studies, however, revealed that c-di-GMP was actually binding to a 200 kDa membrane-bound protein complex (Weinhouse et al., 1997), which has not been further characterized, but could correspond to the dimer of the α-subunit or to the second form of cellulose synthase whose single polypeptide chain contained both subunits (Saxena and Brown, 1995). Hence, it remained unclear which part of cellulose synthase, if any, would bind c-di-GMP and what its other cellular targets were. Our previous attempts to identify the c-di-GMP-binding adaptor protein by computational means have been unsuccessful, as no known protein exhibited the same phyletic distribution as the GGDEF and EAL domains (Römling et al., 2005). Here we report identification of the PilZ (‘pills’) domain (Pfam domain PF07238, Bateman et al., 2004) in the sequence of bacterial cellulose synthases and present evidence indicating that this domain is the long-sought c-di-GMP-binding protein.
2 RESULTS AND DISCUSSION
Development of twitching motility in P. aeruginosa is governed by about 40 genes (Mattick, 2002). Functions of most of them are known or could be predicted based on the available experimental data. PilZ, encoded by the P. aeruginosa PA2960 gene, is a 118 amino acid protein (Fig. 1) that remains one of the very few without an assigned function. pilZ mutants produce normal amounts of pilin but are unable to assemble functional pili (Alm et al., 1996). Close homologs of PilZ are encoded in many beta- and gamma-proteobacteria with distant homologs showing up in a variety of bacteria (Table S1 in the Supplementary Material). Sequence analysis of the PilZ protein was prompted by the discovery in Geobacter sulfurreducens of a response regulator GSU3263 with REC-PilZ domain architecture (Fig. 2), which suggested that PilZ might have a regulatory role. This analysis greatly benefited from the crystal structure of the V. cholerae protein VCA0042 (Protein Data Bank entry 1YLN), which contains a C-terminal PilZ domain, solved recently by the Midwest Center for Structural Genomics (R. Zhang, M. Zhou, S. Moyi, F. Collart and A. Joachimiak, to be published). In addition, an NMR structure of a stand-alone PilZ domain (PDB entry 1YWU) has been solved by the Northeast Structural Genomics Consortium (T.A. Ramelot, A.A. Yee, A. Semesi, C.H. Arrowsmith and M.A. Kennedy, to be published).
Iterative PSI-BLAST searches of the NCBI protein database, started from the C-terminal 120 amino acid fragment of GSU3263, retrieved more than 600 sequences from a variety of bacteria. Importantly, this search identified the PilZ domain near the C-terminus of the α-subunit of cellulose synthase from G. xylinus, E. coli, and other bacteria (Fig. 2). In contrast, no PilZ domain was detected in otherwise closely related eukaryotic cellulose synthases from the slime mold Dictyostelium discoideum and marine urochordate Ciona savignyi (Blanton et al., 2000; Matthysse et al., 2004). Eukaryotes do not seem to encode GGDEF domains and presumably do not produce c-di-GMP.
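For readers who wish to set up a comparable search, the following is a minimal sketch of how the seed fragment and an iterative search command could be prepared. It assumes the NCBI BLAST+ `psiblast` program, which is not necessarily the tool version used in this work; the file names, the database name, the number of iterations and the inclusion threshold are illustrative placeholders rather than the parameters of the original search, and the sequence in the example is a dummy string, not the GSU3263 sequence.

```python
from typing import List


def c_terminal_fragment(sequence: str, length: int = 120) -> str:
    """Return the last `length` residues of a protein sequence.

    If the sequence is shorter than `length`, the whole sequence is returned.
    """
    return sequence[-length:]


def build_psiblast_command(
    query_fasta: str,
    database: str,
    iterations: int,
    inclusion_evalue: float,
    output_path: str,
) -> List[str]:
    """Assemble an argument list for the BLAST+ `psiblast` program.

    All values passed in are assumptions chosen by the caller; nothing here
    reproduces the settings of the search described in the text.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", str(iterations),
        "-inclusion_ethresh", str(inclusion_evalue),
        "-out", output_path,
        "-outfmt", "6",  # tabular output, one hit per line
    ]


# Dummy 300-residue sequence standing in for a real protein (placeholder only).
toy_sequence = "M" + "A" * 299
seed = c_terminal_fragment(toy_sequence)

# Hypothetical file names, database and thresholds.
cmd = build_psiblast_command("seed.fasta", "nr", 5, 0.005, "hits.tsv")

print(len(seed))
print(" ".join(cmd))
```

The command is only assembled and printed here; it could be passed to `subprocess.run(cmd, check=True)` on a system where BLAST+ and a formatted protein database are installed.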
In addition, the PilZ domain was found in the Alg44 proteins that are required for alginate biosynthesis in P. aeruginosa and Azotobacter vinelandii (Maharaj et al., 1993; Mejia-Ruiz et al., 1997). Unlike cellulose, alginate is a polymer of mannuronic acid that is produced from GDP-mannuronate in a different biosynthetic pathway, which indicates that PilZ plays a regulatory, rather than enzymatic, role in controlling alginate formation and pili biogenesis in P. aeruginosa. Given that the phenotype of pilZ mutation (see above) is similar to that resulting from a deletion of the GGDEF-EAL domain protein FimX (Huang et al., 2003), these observations suggested that PilZ might serve as the c-di-GMP-binding domain of cellulose synthase and as the c-di-GMP-binding protein in other instances.
Binding of c-di-GMP by PilZ explains an earlier observation that an E. coli motility defect caused by an hns mutation could be reversed either by increased expression of the yhjH gene or by inactivation of the ycgR gene (Ko and Park, 2000). Since the yhjH gene encodes a stand-alone EAL domain, its overexpression results in a decreased cellular level of c-di-GMP, which indeed stimulates motility (Simm et al., 2004). The mutation reversal without decreasing the cellular level of c-di-GMP could have been caused by the loss of the c-di-GMP receptor. This suggests that the product of the ycgR gene, which contains a PilZ domain (Figs 1 and 2), serves as the cellular receptor for c-di-GMP. Indeed, YcgR and cellulose synthase are the only PilZ-containing proteins encoded in E. coli (see Table S1 in the Supplementary Material).
Several other PilZ-containing proteins have been experimentally characterized, revealing phenotypes that are consistent with its role as the c-di-GMP adaptor protein. An Azospirillum brasilense chemotaxis receptor Tlp1, which contains a C-terminal PilZ domain (Fig. 2), in addition to energy taxis, was found to be required for colonization of plant roots (Greer-Phillips et al., 2004). In Myxococcus xanthus, a protein kinase combining the Ser/Thr kinase and PilZ domains was shown to control onset of cell differentiation; its deletion caused premature differentiation, resulting in poor spore production (Muñoz-Dorado et al., 1991).
Judging from the structure of the PilZ domain (PDB: 1YWU), c-di-GMP binding might require its oligomerization or interaction with additional protein domains. Several alpha-proteobacterial proteins, such as Bradyrhizobium japonicum proteins Bll4394 and Blr5568, contain tandem duplications of the PilZ domain, whereas in VCA0042 and related proteins (PDB: 1YLN), PilZ is bound to a separate N-terminal domain, PilZNR (Fig. 2). The same two-domain organization, albeit with an apparently unrelated N-terminal domain (PilZN), is seen in the proteins of the YcgR family. Domain architectures of other PilZ-containing proteins (Fig. 2) include its fusions with signaling domains, such as the CheY-like receiver domain, GGDEF, EAL and HD-GYP domains, and are consistent with the notion that PilZ binds c-di-GMP. In Alg44 family proteins, the PilZ domain is fused to a domain very similar to HlyD, the membrane component of a multidrug secretion system (Lewis, 2001; Holland et al., 2005). Such association could explain the observed role of c-di-GMP in regulating protein secretion and production of extracellular polysaccharide. In addition, PilZ forms a number of clade-specific fusions with uncharacterized domains that are found only in proteins from beta- (e.g. CV2716), gamma- (e.g. VC2344, PA2989) or delta- (e.g. GSU0137, GSU0943) proteobacteria (data not shown).
As would be expected of a c-di-GMP adaptor protein, the phyletic distribution of the PilZ domain is generally similar to those of the GGDEF and EAL domains. Like GGDEF and EAL domains, the PilZ domain is encoded in many bacterial genomes, including those of early-branching bacteria Thermotoga maritima and Aquifex aeolicus, but not in any archaeal or eukaryotic genome. Some genomes encode multiple copies of the PilZ domain, up to 15 such genes in Bdellovibrio bacteriovorus. Among proteobacteria belonging to beta, gamma and delta subdivisions, chlamydia, spirochetes and several other lineages, there is an absolute correlation between presence or absence of consensus GGDEF and EAL domains and presence or absence of the PilZ domain (see Table S2 in the Supplementary Material). In alpha-proteobacteria, however, this correlation is not absolute. Intracellular bacterial parasites and symbionts, representing genera Bartonella, Brucella, Rickettsia, Ehrlichia and Wolbachia, appear to encode functional GGDEF domains but do not encode discernible PilZ domains. No PilZ domains have been found in certain GGDEF-encoding actinobacteria, cyanobacteria and firmicutes, including Staphylococcus aureus, which has been experimentally shown to respond to exogenous c-di-GMP (Karaolis et al., 2005). These organisms might harbor c-di-GMP adaptors other than PilZ or just PilZ-related domains that have diverged beyond recognition by sequence comparison alone.
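The comparison of phyletic patterns described above can be illustrated with a short sketch. It is not the procedure used to compile Table S2; the function name, the data layout and the four toy genomes are hypothetical and serve only to show how concordance between presence of GGDEF or EAL domains and presence of PilZ could be scored, and how discordant genomes (those with GGDEF or EAL but without a discernible PilZ, or the reverse) could be listed.

```python
from typing import Dict, List, Sequence, Set, Tuple


def phyletic_concordance(
    genomes: Dict[str, Set[str]],
    query: str = "PilZ",
    partners: Sequence[str] = ("GGDEF", "EAL"),
) -> Tuple[float, List[str]]:
    """Score agreement between two presence/absence patterns.

    A genome is concordant when the query domain and at least one partner
    domain are either both present or both absent. Returns the fraction of
    concordant genomes and the sorted names of the discordant ones.
    """
    discordant = []
    for name, domains in genomes.items():
        has_query = query in domains
        has_partner = any(p in domains for p in partners)
        if has_query != has_partner:
            discordant.append(name)
    total = len(genomes)
    agreement = (total - len(discordant)) / total if total else 0.0
    return agreement, sorted(discordant)


# Toy presence/absence data (hypothetical genomes, not taken from Table S2).
toy_genomes = {
    "genome_A": {"GGDEF", "EAL", "PilZ"},  # all three domains present
    "genome_B": set(),                     # none present
    "genome_C": {"GGDEF"},                 # GGDEF without a discernible PilZ
    "genome_D": {"GGDEF", "EAL", "PilZ"},
}

agreement, discordant = phyletic_concordance(toy_genomes)
print(agreement)   # 0.75
print(discordant)  # ['genome_C']
```

In this toy set, the single discordant genome mimics the situation noted for some alpha-proteobacteria and firmicutes, where GGDEF domains are encoded but no PilZ domain is detected.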
Sequence analysis shows that the PilZ domain is encoded in a variety of bacterial genomes with a phyletic pattern similar to those of the diguanylate cyclase (GGDEF) and c-di-GMP-specific phosphodiesterase (EAL) domains. The notion that PilZ serves as a c-di-GMP-binding adaptor protein is supported by its presence in bacterial cellulose synthases and other proteins and is consistent with the available experimental data. However, since most genetic data involve loss of function, they only demonstrate that the PilZ domain is necessary for c-di-GMP binding in many bacteria, but not whether it is sufficient for binding or requires additional protein domains. PilZ forms numerous domain associations, which could mediate the diverse signaling mechanisms of c-di-GMP. Many of these domain fusions involve uncharacterized protein domains, opening new avenues for further studies of c-di-GMP-mediated signal transduction.
D.A. was supported by Tel-Hai Academic College, Tel-Hai, and Sharett Institute of Oncology, Hadassah University Medical Center, Israel. M.Y.G. was supported by the Intramural Research Program of the National Library of Medicine at the National Institutes of Health. Funding to pay the Open Access publication charges for this article was provided by the NIH Intramural Research Program.
Conflict of Interest: none declared.