Abstract

In recent years there has been a dramatic increase in reports of glycosylation of proteins in various Gram-negative systems including Neisseria meningitidis, Neisseria gonorrhoeae, Campylobacter jejuni, Pseudomonas aeruginosa, Escherichia coli, Caulobacter crescentus, Aeromonas caviae and Helicobacter pylori. Although this growing list contains many important pathogens (reviewed by Benz and Schmidt [Mol. Microbiol. 45 (2002) 267–276]) and the glycosylations are found on proteins important in pathogenesis, such as pili, adhesins and flagella, the precise role(s) of the glycosylation of these proteins remain to be determined. Furthermore, the details of the glycosylation biosynthetic process have not been determined in any of these systems. The definition of the precise role of glycosylation and the mechanism of biosynthesis will be facilitated by a detailed understanding of the genes involved.

Introduction

Increasingly, reports of glycosylation of bacterial proteins are also identifying the genes required for this process. In this review we provide a brief description of each of the Gram-negative protein glycosylation systems in which there has been some investigation of the genes involved. We then conduct a comparative analysis of the published systems coupled with analysis of recently completed Gram-negative bacterial genomes to examine the genetics of glycosylation in an effort to reveal common themes and processes.

Description of glycosylation in Gram-negatives

Neisseria

Neisseria meningitidis is the causative agent of meningococcal disease. Pili are essential for the adherence of pathogenic Neisseria to host cells. Pili are polymers composed primarily of a single pilin subunit [2,3] and were first reported to be potentially glycosylated by Robertson et al. [4]. The structure and nature of this glycosylation were first determined in N. meningitidis strain C311, where pilin was shown to be glycosylated at serine 63 with an unusual trisaccharide: Gal (β1,4) Gal (α1,3) 2,4-diacetamido-2,4,6-trideoxyhexose (see Fig. 2) [5]. Pilin of Neisseria gonorrhoeae strain MS11 is glycosylated in the same region of the pilin molecule, but the modification is an O-linked disaccharide, Gal (α1,3) GlcNAc (see Fig. 2) [6], rather than the trisaccharide reported in N. meningitidis [6,7].

Fig. 2. Glycosylation structures. The identified glycosylation structures in Gram-negative bacteria reviewed in this paper. Modified from Power et al. [10], Stimson et al. [5], Castric et al. [10] and Thibault et al. [17].

To date, eight genes have been identified that are involved in glycosylation of pilin in Neisseria. The first gene was identified in N. meningitidis by Jennings et al. [8] and called pglA. PglANm has homology to type I glycosyltransferases and is postulated to catalyse the transfer of galactose to the diacetamido trideoxy-hexose (DATDH) via an α1,3 linkage (see Fig. 2). A homopolymeric tract of guanosine (G) residues is thought to mediate phase variation of pglANm [8]. Recent work has demonstrated that PglANm is also involved in the addition of the Gal (α1,3) residue in the disaccharide structures in MS11 and 8013SB (Power et al., submitted). The same observation has been reported by Bannerjee et al. [9], who call pglA ‘pgtA’.

A further glycosylation locus was identified in N. meningitidis strain C311#3 by Power et al. [10] and also in strain NMB by Kahler et al. [11]. The locus is polymorphic and contains from four to nine genes. In strain C311#3 the locus contains four genes. These include genes encoding a membrane spanning protein, PglFNm, and genes which may be involved in the biosynthesis and transfer of the unusual DATDH; an aminotransferase, PglCNm; a dehydratase, PglDNm; and a putative bifunctional enzyme with acetyltransferase and glycosyltransferase domains, PglBNm.

Two polymorphisms have been identified in the pglBCDNm locus in some strains: an insertion of approximately 2 kb containing two additional genes with homology to glycosyltransferases, pglGNm and pglHNm, and an insertion-deletion in pglBNm [11,12]. pglGNm and pglHNm contain homopolymeric tracts of G residues which may mediate phase variation. Two further genes, which are not linked to the loci described above, are involved in the biosynthesis of the trisaccharide: PglI, which has homology with acetyltransferases and is proposed to be involved in biosynthesis of the DATDH [13], and PglENm, a type 2 glycosyltransferase that catalyses the transfer of the terminal galactose to the C311 trisaccharide. pglENm contains heptanucleotide repeats (up to 53 copies of 5′-CAAACAA-3′) which mediate phase variation in laboratory conditions with a frequency of approximately 1/125 [12]. The phase variation of pglENm is responsible for switching between disaccharide and trisaccharide structures.
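As a rough illustration of why such repeat tracts can act as switches: a heptanucleotide unit is not a multiple of three, so each gain or loss of one copy shifts the downstream reading frame by one nucleotide. The Python sketch below counts tandem copies of the 5′-CAAACAA-3′ unit and the length of a homopolymeric G tract in a DNA string, and reports the frame offset contributed by a tract. The function names and the toy sequences are hypothetical, and which copy numbers correspond to the expressed state of pglENm is not stated here, so the sketch reports only the frame offset, not an on/off call.

```python
import re

HEPTAMER = "CAAACAA"  # repeat unit reported for pglE in the text


def count_tandem_repeats(seq, unit=HEPTAMER):
    """Return the number of copies in the longest tandem run of `unit` in `seq`."""
    runs = re.findall("(?:" + re.escape(unit) + ")+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def longest_homopolymer(seq, base="G"):
    """Return the length of the longest uninterrupted run of `base` in `seq`."""
    runs = re.findall(re.escape(base) + "+", seq.upper())
    return max((len(run) for run in runs), default=0)


def frame_offset(copies, unit_len=len(HEPTAMER)):
    """Reading-frame offset (0, 1 or 2) contributed by a tract of `copies` units."""
    return (copies * unit_len) % 3


if __name__ == "__main__":
    # Toy sequences (assumptions for illustration, not real pgl sequence).
    toy = "ATG" + HEPTAMER * 4 + "TTT"
    copies = count_tandem_repeats(toy)
    print(copies, frame_offset(copies))
    print(longest_homopolymer("AT" + "G" * 8 + "TC"))
```

Because 7 mod 3 = 1, tracts that differ by one copy always fall in different frames, whereas tracts that differ by three copies share a frame. For a homopolymeric tract the same reasoning applies with a unit length of one.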

The neisserial pilin glycosylation system has a number of interesting features: it has a high degree of polymorphism between strains, it has a large number of putatively phase-variable genes and the glycosylation genes do not appear to be linked with the structural gene of the protein they modify. The role of glycosylation in pathogenic Neisseria is not well characterised; however, several interesting observations have been made. Gubish et al. [14] reported that periodate or galactosidase treatment of pili markedly reduced attachment, suggesting the importance of galactose residues on pili for their attachment function. Hamadeh et al. [15] suggest that the binding of naturally occurring anti-gal antibodies to the terminal galactose may interfere with complement-mediated lysis, and Marceau and Nassif [16] have shown that the presence of the glycosylation at serine 63 can influence the amount of truncated or S-pilin (soluble pilin) produced.

Campylobacter

Flagellin glycosylation

Campylobacter jejuni is the leading cause of food-borne illness in North America. Flagella are required for Campylobacter motility and, as such, are an essential virulence determinant, required for colonisation of the gastrointestinal tract and invasion of intestinal epithelial cells. Flagellin is also the immunodominant protein recognised during infection and has been suggested to be an immunoprotective antigen [17].

Thibault et al. [17] have recently reported the structures of the flagellin-linked glycans from C. jejuni strain 81-176. The flagellin is modified on 19 Ser/Thr residues by the attachment of pseudaminic acid (Pse5Ac7Ac) or its derivatives: Pse5Am7Ac and Pse5Pr7Pr (Fig. 2) [17].

The first reports of genes involved in glycosylation of flagellin were in Campylobacter coli. Alm et al. [18] observed that the antigenicity and apparent molecular mass of the flagellin depended on the strain it was expressed in and not the flagellin amino acid sequence. In addition, they found that genes associated with conferring T1 or T2 antiserum binding were loosely linked with the flagellin structural genes and that during homologous recombination approximately 11% of strains switched their antiserum binding [18]. It was suggested that post-translational modifications accounted for the differences. Using this information Guerry et al. [19] identified two glycosylation genes downstream of the flagellin structural genes in C. coli: ptmA and ptmB, encoding a 3-oxoacyl-(acyl carrier protein) reductase homologue and a NeuA homologue (CMP-N-acetyl neuraminic acid synthetase) respectively. These genes have homologues in C. jejuni (% identity/similarity PtmA 79/86, PtmB 83/90) and are found in the same arrangement (Fig. 1) [19].

Fig. 1. Genetic arrangement of glycosylation loci in Gram-negative bacteria. Arrows above the thick black lines represent the size and orientation of ORFs. ORFs for which evidence of their involvement in glycosylation exists are indicated by a heavy black border. The name of the ORF is indicated above or below it. Boxes or lines within the ORFs represent homopolymeric tracts or heptanucleotide repeat regions that do or may mediate phase-variable expression. Black lines beneath the arrows represent the genome sequence. Breaks in these lines and double vertical lines indicate a discontinuation. The GenBank accession numbers of these sequences are indicated beneath these lines. Above or beneath these lines, polymorphisms described in the text are indicated by white boxes, lines from these boxes indicate the region of insertion. Homology between ORFs has been indicated by the ORFs being represented in the same colour. Different colours represent the different homologies and are described in the figure key. Small black arrows indicate proximity to other genes. The asterisks indicate genes that are not present in some strains of C. jejuni. This polymorphism consists of one gene in N. gonorrhoeae strain FA1090 but contains two ORFs in N. meningitidis strain Z2491.

Linton et al. [20] investigated the role of the N-acetyl neuraminic acid (NANA) biosynthetic gene neuB in glycosylation. Three copies of neuB were identified in C. jejuni strain NCTC 11168. Linton et al. [20] demonstrated that each of these neuB genes can complement a neuB mutant in Escherichia coli. NeuB1 was shown to be involved in lipooligosaccharide (LOS) biosynthesis [20,21] and is located in the LOS biosynthesis locus. The NeuB2 and NeuB3 mutants did not affect LOS expression. No phenotype was observed for the NeuB2 mutant in strain NCTC 11168 but this mutant resulted in a reduction in the apparent molecular mass of flagellin in strain G1. The NeuB3 mutants lacked flagella and were non-motile in all strains tested. This phenotype is consistent with NeuB3 playing a role in the glycosylation of flagella since mutations in flagellin glycosylation genes have been shown to inhibit flagella expression [22–24].

The genes neuB2 and neuB3 are both associated with a large locus that contains the flagellar structural genes, ptmA and ptmB, in addition to other potential glycosylation genes (Fig. 1). As well as neuB, there are multiple copies of the other NANA biosynthetic genes (two copies of neuA and a single copy of neuC) associated with the flagellar structural genes in the genome of strain NCTC 11168. NeuA3 was called ptmB and was described by Guerry et al. [19]. Further, it is reasonable to suggest that neuA2 and neuC2 may be involved in Pse5Ac7Ac biosynthesis. Lüneberg et al. [25] demonstrated that neuA and neuB are required for legionaminic acid biosynthesis in Legionella pneumophila and are also able to complement neuA and neuB mutations in E. coli. Legionaminic acid is similar to pseudaminic acid and is a 5Am,7Ac,8O-acetyl-3,5,7,9-tetradeoxy-l-glycero-d-galacto-non-2-ulosonic acid (Fig. 2).

A further gene involved in the glycosylation of flagellin has been identified by Thibault et al. [17]. They reported that the gene pseA was involved in the glycosylation of flagellin. Mutants with inactive PseA exhibited multiple glycoforms but at a lower pI range than wild-type. This shift toward the more acidic region of the IEF gel is consistent with the loss of a basic functionality on the Pse5Ac7Ac structure [17].

The publication of the C. jejuni genome (strain NCTC 11168) [26] allows examination of the relative location of the glycosylation genes. In the genome strain a large locus of approximately 50 genes is found adjacent to the flagellin structural genes. Many of these genes have homology to glycosylation genes in other species and some have been demonstrated to be involved in glycosylation in Campylobacter. In this locus there is an apparent duplication of functions with two copies of neuA, neuB, pseA, similar aminotransferases, acetyltransferases and multiple maf genes [20,27]. The role of the multiple maf genes is unknown, but it has been suggested that they may mediate frequent deletion or duplication events. Further, Karlyshev et al. [27] suggest that the maf gene products are likely to be cytoplasmic or inner membrane bound and may be involved in flagellin modification.

General glycosylation

In addition to the glycosylation of flagellin, other proteins have been reported to be glycosylated in C. jejuni via the ‘general’ glycosylation pathway. Recently Young et al. [28] have identified 22 proteins which are glycosylated. The glycosylation structure was determined by NMR to be a heptasaccharide consisting of GalNAc-α1,4-[Glcβ1,3-]GalNAc-α1,4-GalNAc-α1,4-GalNAc-α1,4-GalNAc-α1,3-Bac-β1,N-Asn-Xaa (Bac is bacillosamine, 2,4-diacetamido-2,4,6-trideoxyglucopyranose) [28].

A distinct locus has been identified which is responsible for ‘general’ glycosylation. Fry et al. [29] were the first to report the wlaB–H genes in C. jejuni 81116. This region was found to alter lipopolysaccharide (LPS) expression in E. coli when introduced on a plasmid. The region was therefore thought to be a LOS biosynthetic locus. Site-specific mutation of wlaF (pglBCj), wlaG (pglACj), wlaH (pglCCj), wlaI (pglDCj), wlaK (pglECj), wlaL (pglFCj) and wlaM (pglGCj) by Szymanski et al. [30] and wlaC (pglHCj), wlaD (pglICj), wlaE (pglJCj) by Linton et al. [31] revealed that these genes were not involved in LOS biosynthesis in C. jejuni strain 81-176 but were glycosylation genes involved in the ‘general’ glycosylation pathway. The genes pglBCj, pglCCj, pglDCj, pglECj have significant homology to neisserial glycosylation genes [10] and are present in the same order in the genome (Table 1 and Fig. 1).

Table 1. Genes involved in glycosylation in Gram-negative bacteria

Gene name Homology Evidence Accession number Reference Notes 
Campylobacter jejuni 
Flagellin/general glycosylation 
CjpglA glycosyltransferase 1 AF108897 [30] similar to NmpglA 
CjpglB putative oligosaccharide transferase †s AF108897 [28,30]  
CjpglC glycosyltransferase †s AF108897 [30] similar to Neisseria pglB 1st half 
CjpglD acetyltransferase †s AF108897 [30] similar to Cj1321, NmPglB, PaORFE, PaORFG 
CjpglE aminotransferase (degT family) AF108897 [30] similar to NmpglC, Cj1294, Cj1320, CcFlmB, PaORFA 
CjpglF dehydratase similar to NmpglD 33/49 †s AF108897 [30] similar to NmpglD 
CjpglG unknown †s AF108897 [30]  
pglH glycosyltransferase 1 p¢ Y11648 [31]  
pglI glycosyltransferase 2 p¢ Y11648 [31]  
pglJ glycosyltransferase 1 p¢ Y11648 [31]  
wlaJ transmembrane protein     
wlaB ABC-type transport protein p¢ Y11648 [29]  
galE galactose epimerase Y11648 [29]  
neuB2 N-acetyl neuraminic acid synthetase c§• CAB73754 [20]  
neuB3 (Cj1317) N-acetyl neuraminic acid synthetase c¶ CAB73744 [20]  
ptmA 3-oxoacyl-(acyl-carrier-protein) reductase (fabG)  AAB48075 [19]  
ptmB (neuA3) acylneuraminate cytidylyltransferase  AAB48074 [19]  
pseA (Cj1316) pseudaminic acid biosynthesis MS AAK58486 [17]  
Cj1293 nucleotide sugar dehydratase    similar to HpFlaA1, CcFlmA, Cj1319 
Cj1294 aminotransferase (degT family)    similar to NmpglC, Cj1320, CcFlmB, PaORFA 
Cj1296-97 aminoglycoside N 3′-acetyltransferase     
Cj1299 (acpP2) acyl carrier protein     
Cj1303 (FabH2) putative 3-oxoacyl-(acyl-carrier-protein) synthase     
Cj1304 (acpP3) probable acyl carrier protein     
Cj1311 (neuA2) acylneuraminate cytidylyltransferase     
Cj1312 rkpO putative polysaccharide biosynthesis protein    similar to CcFlmD, AcFlmD, HpFlmD 
Cj1313 acetyltransferase (GNAT)    similar to CcflmH, HpFlmH 
Cj1324 (pseA2) similar to pseA (32/48)     
Cj1319 nucleotide sugar dehydratase    similar to HpFlaA1, CcFlmA, Cj1293 
Cj1320 aminotransferase (degT family)    similar to NmpglC, CjPglE, Cj1294, CcFlmB, PaORFA 
Cj1321 acetyltransferase    similar to CjpglD, NmPglB, PaORFE, PaORFG 
Cj1327 neuB2     
Cj1328 neuC2     
Cj1329 putative sugar-phosphate nucleotide transferase     
Caulobacter crescentus 
Flagellin glycosylation 
FlmA nucleotide sugar dehydratase ¶MW¢ U27301 [22] similar to HpFlaA1, Cj1293, Cj1319, AFlmA 
FlmB aminotransferase (degT family)  U27301 [22] similar to NmpglC, CjPglE, Cj1294, Cj1320, PaORFA 
FlmC spsF, 3-deoxy-d-manno-octulosonate cytidylyltransferase  U27302 [22] KDO biosynthesis 
FlmD rkpO putative polysaccharide biosynthesis protein ¶MW¢ U27302 [22] similar to AcFlmD, HpFlmD, Cj1312 
FlmE unknown ¶MW¢ U27302 [22]  
FlmF unknown  U27302 [22]  
FlmG similar to eukaryotic O-linked GlcNAc transferase ¶MW¢ U28867 [22]  
FlmH acetyltransferase (GNAT) ¶MW¢ U28867 [22] similar to HpFlmH, Cj1313 
Pseudomonas aeruginosa 
Pilin glycosylation 
pilO unknown, putative transmembrane protein  CAA58769 [35] similar transmembrane profile to NmpglF 
Flagellin glycosylation 
orfA aminotransferase (degT family) §g AF332547 [34] similar to NmpglC, Cj1294, Cj1320, CjPglE, CcFlmB 
orfB acyl carrier protein AF332547 [34]  
orfC 3-oxoacyl-(acyl-carrier-protein) synthase III (fabH) AF332547 [34]  
orfD 3-oxoacyl-(acyl-carrier-protein) reductase (fabG) AF332547 [34]  
orfE acetyltransferase AF332547 [34] similar to Cj1321, CjPglD, NmPglB 
orfF o-halobenzoate 1,2-dioxygenase large subunit AF332547 [34]  
orfG acetyltransferase AF332547 [34] similar to Cj1321, CjPglD, NmPglB 
orfH unknown LPS (Rhizobium etli) AF332547 [34]  
orfI unknown AF332547 [34]  
orfJ spsF, 3-deoxy-d-manno-octulosonate cytidylyltransferase AF332547 [34] similar to CcFlmC 
orfK unknown AF332547 [34]  
orfL ubiquinone biosynthesis: 3-demethylubiquinone-9 3-methyltransferase (Nm) AF332547 [34]  
orfM 4-hydroxy-2-ketovalerate aldolase AF332547 [34]  
orfN(rfbC) glycosyltransferase 2 (X2) §g AF332547 [34]  
Neisseria 
Pilin glycosylation 
pglA glycosyltransferase 1 v§†∞ U73942 [8]  
pglB glycosyltransferase/acetyltransferase §†∞ AF014804 [10] similar to CjPglC, Cj1321, CjPglD, PaORFE, PaORFG 
pglB2 glycosyltransferase/ATP binding vMW AAK56074 [11]  
pglB2a glycosyltransferase vp  [12]  
pglC aminotransferase (degT family) §†∞ AF014804 [10] similar to CcFlmB, Cj1294, Cj1320, CjPglE, PaORFA 
pglD dehydratase §†∞ AF014804 [10] similar to CjpglF 
pglE glycosyltransferase 2 §† AY028717 [12]  
pglF transmembrane protein §†  [12]  
pglG glycosyltransferase 1 vp  [12]  
pglH glycosyltransferase 1 vp  [12]  
galE galactose epimerase §†∞  [8]  
pglI acetyltransferase (pfam01757) v§†  [13]  
pglJ sugar glycosyltransferase     
acpP2 acyl carrier protein     
fabF 3-oxoacyl-(acyl-carrier-protein) synthase II     
Escherichia coli 
AIDA-I, TibA (autotransporter adhesion) glycosylation 
Aah heptosyltransferase §∞†¢ AJ304444 [1]  
tibC heptosyltransferase §∞†¢c AAD46996 [42]  
rfaE ADP-heptose synthase §∞† BAB37358 [1]  
Aeromonas 
Flagellin glycosylation 
flmA nucleotide sugar dehydratase  AAD45656 [23] similar to HpFlaA1, Cc1293, Cj1319, CcFlmA 
flmB aminotransferase (degT family) ¶L¢ AAD45657 [23] similar to NmpglC, CjPglE, Cj1294, Cj1320, PaORFA, CcFlmB 
neuA neuA ¶L AAD45658 [23]  
flmD rkpO putative polysaccharide biosynthesis protein ¶L AAD45659 [23] similar to CcFlmD, HpFlmD, Cj1312 
neuB neuB  AAD45660 [23]  
Helicobacter pylori 
Flagellin glycosylation 
neuA (HP0326a) neuA §¶∞  [24]  
flmD (HP0326b) rkpO putative polysaccharide biosynthesis protein   [24] similar to CcFlmD, AcFlmD, Cj1312 
flmH (HP0327) acetyltransferase (GNAT)   [24] similar to CcFlmH, Cj1313 
flaA1 (HP0840) nucleotide sugar dehydratase   [46] similar to HpFlaA1, Cc1293, Cj1319, CcFlmA, AFlmA 

The gene name is based on the name by which it was referred to in the paper where it was described with reference to glycosylation or the name it was given in a genome annotation (where available).

The homology is based on the literature where it was described, BLASTP similarity searches and genome annotation.

The evidence column gives the experimental evidence for the gene's role in glycosylation. The symbols used are: MS, mass spectrometry; §, apparent MW change of flagellin/pilin; †, altered antisera binding to flagellin/pilin; ∞, altered sugar labelling; •, no phenotype observed in some strains; ¶, aflagellate and non-motile; g, member of ‘genomic island’ shown to be involved in A-type flagellin glycosylation; ¢, not involved in LPS biosynthesis; p, putative; c, complements E. coli mutant; L, affects LPS; s, altered antiserum binding to other proteins; v, phase-variable.
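Because the evidence column packs several symbols into one string (for example §†∞), a small lookup can make the table easier to read. The following Python sketch expands an evidence string using the symbol legend above. The function and variable names are hypothetical, and any symbol that the legend does not define (such as the MW that appears in some rows) is reported as undefined rather than guessed.

```python
from typing import List

# Symbol meanings transcribed from the Table 1 legend.
EVIDENCE_LEGEND = {
    "MS": "mass spectrometry",
    "§": "apparent MW change of flagellin/pilin",
    "†": "altered antisera binding to flagellin/pilin",
    "∞": "altered sugar labelling",
    "•": "no phenotype observed in some strains",
    "¶": "aflagellate and non-motile",
    "g": "member of genomic island involved in A-type flagellin glycosylation",
    "¢": "not involved in LPS biosynthesis",
    "p": "putative",
    "c": "complements E. coli mutant",
    "L": "affects LPS",
    "s": "altered antiserum binding to other proteins",
    "v": "phase-variable",
}


def decode_evidence(code: str) -> List[str]:
    """Expand an evidence string into legend entries, longest symbol first."""
    tokens = sorted(EVIDENCE_LEGEND, key=len, reverse=True)
    meanings = []
    i = 0
    while i < len(code):
        for tok in tokens:
            if code.startswith(tok, i):
                meanings.append(EVIDENCE_LEGEND[tok])
                i += len(tok)
                break
        else:
            # Symbol not defined in the legend: flag it instead of guessing.
            meanings.append("undefined symbol: " + code[i])
            i += 1
    return meanings


if __name__ == "__main__":
    for entry in decode_evidence("§†∞"):
        print(entry)
```

Matching the longest symbol first keeps the two-letter code MS from being read as separate single-letter symbols.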

Linton et al. [31] identified PEB3 and CgpA by purification using the lectin, soybean agglutinin (SBA), and subsequently Young et al. [28] identified a further 20 proteins using the same lectin. SBA specifically binds N-acetylgalactosamine (GalNAc)-containing glycoproteins. The number of aminotransferases, acetyltransferases, dehydratases and glycosyltransferases in the general glycosylation locus of C. jejuni (see Table 1, Fig. 1) suggests that the general glycosylation genes may be involved in the biosynthesis of a novel oligosaccharide. The high degree of similarity between these genes and neisserial glycosylation genes suggests that the oligosaccharides may be biosynthesised by a similar pathway (discussed below). Young et al. [28] demonstrated that PglBCj, which has homology to the STT3 subunit of the N-linked oligosaccharyltransferase of Saccharomyces cerevisiae, specifically affects the glycosylation of the identified glycoproteins. Further, they postulate that PglBCj may act as the oligosaccharyltransferase responsible for the transfer of the oligosaccharide to the Asn residue. Thus it appears that PglBCj may be responsible for the N-linkage to the protein. A homologue of PglBCj is not present in the neisserial genomes and this may account for the differences in amino acid linkage between the two systems.

Recently, C. jejuni 81-176 pgl mutants impaired in general protein glycosylation showed reduced ability to adhere to and invade INT407 cells and to colonise intestinal tracts of mice [32]. There are several lines of evidence that suggest that the flagellin and general glycosylation pathways, although sharing some of the same constituents, may be dedicated to modification of different target proteins. Szymanski et al. [30] reported that pgl mutants showed altered binding of typing sera to flagellin. O:23 sera had altered binding to flagellin in pglBCj, pglCCj and pglDCj mutants and O:36 sera had altered binding in pglCCj, pglFCj and pglGCj mutants. Young et al. [28] suggest that this phenotype may be the indirect result of the non-glycosylation of the potential flagellin biosynthesis protein Cj1565 (PflA). This could be an example of a possible role for protein glycosylation in the modulation of protein function.

Pseudomonas aeruginosa

Flagellin glycosylation genes (PAK)

Brimer and Montie [33] demonstrated that type A, but not type B, flagellins are glycosylated but the structure of the glycosylation has not yet been defined. Flagella of P. aeruginosa primarily consist of a FliC (flagellin) subunit. Arora et al. [34] identified a genomic island, adjacent to the flagellin biosynthesis genes, which contains 14 genes (Fig. 1). In heterologous expression studies they demonstrated that this locus contains all genes required for glycosylation of type-A flagellin. Arora et al. also insertionally inactivated the first and last genes of the island, orfA and orfN, resulting in a lower apparent molecular mass of the flagellin subunit on SDS–PAGE. OrfA has homology with aminotransferases, including PglCNm and PglECj (see Fig. 1, Table 1). OrfN has two regions of homology with glycosyltransferases of the GT-2 family including PglENm and PglICj.

Pilin glycosylation

The pili of P. aeruginosa are crucial for motility and adherence [35,36]. The pilin of P. aeruginosa strain 1244 (O7 immunotype) has been shown to be glycosylated with a trisaccharide: α5NBOHC47NFmPse (β2,4)-Xyl (β1,3) FucNAc (β1,3)- on the N-terminal serine residue (Fig. 2) [37,38]. This trisaccharide is the same as the O-antigen repeating unit found in the O7 immunotype LPS. Castric [37] speculates that the structural similarity between the strain 1244 pilin glycan and the O7 repeating unit suggests a common biosynthetic origin in which pilin glycosylation may occur as a branch of the pathway of O-antigen production.

In addition to the oligosaccharide biosynthesis genes, Castric [37] identified a further gene required for pilin structural glycosylation, pilO. This gene is adjacent to the pilin gene (pilA). PilO shows no significant homology with any characterised proteins [39]. The transmembrane profile shows it contains 13 membrane spanning regions with a similar profile to PglFNm, which is required for neisserial pilin glycosylation. Only some strains of P. aeruginosa express glycosylated pilin. PAO and PAK express non-glycosylated pilin and do not contain the pilO gene. The glycosylation of P. aeruginosa pilin contrasts with the situation in Neisseria and Campylobacter where the LOS/LPS and pilin glycosylation genes are separate.

Escherichia coli

Two adhesins of E. coli have been identified as being glycosylated, AIDA-I and TibA. AIDA-I is an autotransporter adhesin and is responsible for the diffuse adherence phenotype of E. coli strain 2787 (O126:H27). Benz and Schmidt [1] demonstrated that AIDA-I is modified by heptose residues in a heptose:AIDA-I molar ratio of 19:1. The glycosylation of AIDA-I is essential for AIDA-I-mediated adherence. Immediately upstream of the aidA coding sequence is the aah gene (Fig. 1). Benz and Schmidt reported that deletion of the aah gene results in reduced apparent molecular mass of the AIDA-I protein on SDS–PAGE, loss of digoxigenin hydrazide glycan labelling and loss of its ability to mediate adherence. Aah shares a low degree of similarity with (mono)heptosyltransferases [1]. A survey of strains conducted by Niewerth et al. [40] revealed that the aah gene is associated with all members of the aidA family examined. Subsequently, an adhesin of the same family as AIDA-I, TibA, has been demonstrated to have a similar heptose glycosylation [41]. TibA is chromosomally encoded in the ETEC strain H10407 and is immediately downstream of tibC, which encodes a protein that has a high degree of similarity to Aah (% identity/similarity 65/76). Moormann et al. [42] demonstrated that tibC can complement an aah mutant.

Caulobacter crescentus

Johnson et al. [43] first reported that the genes flmA, flmD, flmE, flmG and flmH were required for the synthesis of normal flagella in C. crescentus. Strains with mutations in any of these genes have a normal basal body and hook structure but fail to assemble a flagellar filament. Leclerc et al. [22] cloned, sequenced and insertionally inactivated flmA–H. The flmA, B, C and D genes have homology to genes encoding proteins involved in capsule, spore coat and LPS biosynthesis in other organisms (see Table 1). In unpublished work cited in Leclerc et al. [22], mutations in flmA, flmD, flmE, flmG and flmH result in a lower molecular mass flagellin. These mutations did not affect the LPS profile of the strains. The recently released C. crescentus genome [44] has allowed us to further characterise the flmGH locus described by Leclerc et al., confirming that it is part of the flagellin biosynthesis locus. Homologues of FlmA, FlmB, FlmD and FlmH are found in the glycosylation loci of several pathogens (see Table 1, Fig. 1). The flmA–flmB pair is unlinked to the other flm genes; flmA and flmB are homologous to dehydratases (FlaA1Hp, Cj1293, Cj1319, FlmAAc) and aminotransferases (degT family: pglCNm, pglECj, orfAPa, flmBAc, cj1294, cj1320), respectively. FlmDCc and FlmHCc have homologues in the C. jejuni flagellar glycosylation locus.

Aeromonas caviae

Gryllos et al. [23] have recently described five genes which are involved in motility, flagella and LPS O-antigen expression in A. caviae. Three of these genes, flmA, flmB and flmD, are homologous to C. crescentus flagellar glycosylation genes. The remaining two genes, neuA and neuB, are homologous to NANA biosynthesis genes. Mutation of flmB in five other mesophilic Aeromonas species resulted in loss of motility, flagella and adherence but did not alter LPS expression. A survey of Aeromonas strains revealed that only the flmA and flmB genes were present in all mesophilic Aeromonas spp. tested [23]. Strains with inactive copies of these genes, like the corresponding C. crescentus mutants, do not express flagella properly. It is interesting to note that inactivation of these genes affects LPS expression in only some strains [45].

Helicobacter pylori

Josenhans et al. [24] have recently identified a locus of three potential flagellin glycosylation genes adjacent to flagellar structural genes in H. pylori. These genes have homology to NeuAAp, FlmDAp and FlmHCc (see Table 1, Fig. 1). A flmD mutant was constructed and expresses a similar phenotype to other flagellin glycosylation gene mutants: it is non-motile, does not express normal flagella, and expresses lower molecular mass flagellin in SDS–PAGE. An interesting observation was that only extracellular flagellin appeared to be glycosylated. Josenhans et al. [24] suggest that the glycosylation of flagellin may be linked to its secretion.

The FlaA1 protein described by Creuzenet et al. [46] is a candidate for involvement in the glycosylation of H. pylori flagellin. FlaA1 has homology to FlmACc and PglFCj and was shown to be a C6 dehydratase/C4 reductase specific for UDP-GlcNAc. Creuzenet et al. suggest that the FlmACc/PglFCj/FlaA1 proteins may form part of the larger WbpM/TrsG/BplL family, which includes PglDNm [46]. Complementation of a WbpM mutant of P. aeruginosa by FlaA1 demonstrated that WbpM and FlaA1 are functionally equivalent [46].

Outlook

Genetic organisation

In many of the glycosylation systems reviewed here, the genes required for glycosylation are closely linked to the gene encoding their target protein. For example, in E. coli the genes encoding the AIDA-I and TibA proteins are immediately downstream of the genes which encode the proteins responsible for their glycosylation [1,47]. A similar arrangement is seen with the genes responsible for flagellar glycosylation in C. jejuni, P. aeruginosa and C. crescentus. The flagellin glycosylation genes reviewed in this paper, with the exception of some Caulobacter genes, are all found next to flagellin subunit genes or other genes involved in flagella biosynthesis. In Caulobacter, Campylobacter, Helicobacter and Aeromonas, flagellin glycosylation appears to be an integral part of the proper expression of flagella.

In contrast to the close association of the flagellin, AIDA-I and TibA glycosylation genes with the structural genes of the proteins they modify, there appears to be no such association between the general glycosylation pathway of Campylobacter or the pilin glycosylation pathway of Neisseria and their respective target proteins. The genes encoding the two proteins that have been identified as being glycosylated in Campylobacter are not linked to the general glycosylation locus. The C. jejuni general glycosylation locus also contains a copy of the galE gene, which is required for galactose utilisation. The similarity between this locus and the adjacent LOS locus suggests that the general glycosylation locus may have arisen by duplication of the LOS locus.

Common structures and common genes

There is considerable heterogeneity in the glycosylation structures and the genetic loci that encode their biosynthesis; however, there are also some notable common features which are discussed below.

PilO/PglF

PglFNm of Neisseria [12] and PilO of P. aeruginosa [39] are predicted to be transmembrane proteins and have similar transmembrane profiles (containing 13 and 12 transmembrane regions, respectively). They are required for glycosylation of pilin in their respective species. PglFNm is homologous to Wzx of Yersinia enterocolitica, which is a putative O-antigen ‘flippase’ [48]. PglFNm and PilO have little amino acid sequence similarity to each other but share a structural similarity, each consisting of 12 or 13 transmembrane domains. The presence of 13 transmembrane regions, in conjunction with the homology to other ‘flippase’ transport proteins, suggests that PglFNm may transport the di/trisaccharide across the inner membrane, where it may be ligated to pilin. This in turn suggests that in some systems the biosynthesis of the glycosylation oligosaccharide may proceed in a manner similar to Wzx-dependent LPS O-antigen biosynthesis [6,49].
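
A transmembrane profile of the kind compared here is usually derived from a hydropathy scan of the protein sequence. The method used for PglFNm and PilO is not stated in this review, so the following is only a minimal sketch, assuming a Kyte-Doolittle sliding-window average. The function names, the 19-residue window, the 1.6 cut-off and the toy sequence are illustrative assumptions, not data or methods from the studies cited.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def count_candidate_tm_segments(seq, window=19, threshold=1.6):
    """Count runs of consecutive windows whose mean exceeds the threshold.

    Each run is treated as one candidate membrane-spanning region.
    """
    count = 0
    in_segment = False
    for value in hydropathy_profile(seq, window):
        if value > threshold:
            if not in_segment:
                count += 1
                in_segment = True
        else:
            in_segment = False
    return count


# Hypothetical toy sequence: two hydrophobic stretches separated by polar loops.
loop = "DKSGN" * 5            # 25 polar residues
helix = "LIVA" * 5 + "L"      # 21 hydrophobic residues
toy = loop + helix + loop + helix + loop

print(count_candidate_tm_segments(toy))
```

A simple threshold scan such as this only gives a rough count; dedicated topology predictors would be needed before comparing predicted counts such as 12 and 13 with any confidence.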

Acyl carrier proteins (ACP)

ACPs are associated with glycosylation loci in C. jejuni, P. aeruginosa and N. meningitidis. ACPs are small, abundant, highly anionic proteins. They are involved in the biosynthesis of acyl chains in several pathways, including fatty acid (FA), phospholipid, membrane-derived oligosaccharide (MDO) and LPS (LOS) biosynthesis and the acylation of haemolysin. All these pathways, with the exception of MDO biosynthesis, require ACP to act as an acyl chain donor. This activity is mediated by a serine-linked phosphopantetheine prosthetic group. Vibrio cholerae requires RfbK, a specialised ACP, for the biosynthesis of the LPS constituent tetronate (a perosamine modified with 3-deoxy-l-glycero-tetronyl). RfbK acts to carry the four-carbon 3-deoxy-l-glycero-tetronyl group, which is then condensed with a molecule of GDP-perosamine [50].

A similar process could exist in the biosynthesis of the Campylobacter flagellin-linked glycan Pse5Pr7Pr (pseudaminic acid carrying three-carbon hydroxypropionyl groups). It seems likely that ACPs may play a similar role to RfbK and act as a donor of the hydroxypropionyl group, which is then condensed with Pse5Ac7Ac. ACP-propionyl is involved in the first step of the biosynthesis of FA with an odd number of carbons. In 1996 Guerry et al. [19] showed that PtmA, a FabG homologue, was involved in the glycosylation of flagella in C. coli. We have also noted a FabH homologue (CJ1303) closely linked to the flagella glycosylation locus (see Fig. 1). In addition to these Fab homologues, three ACP homologues (acpP2, 3, 4) are also found in this locus. ACP, FabG and FabH homologues are also present in the P. aeruginosa flagella glycosylation island.

acpP and fabF homologues are found adjacent to pglANm in N. meningitidis strain MC58. The structure of the Neisseria pilin-linked glycan has no obvious requirement for an acyl donor. Interestingly, the biosynthesis of MDOs requires ACP but does not require its acyl donor abilities [51,52]. The phosphopantetheine group which carries the acyl chains on ACP is not required for this activity. MDOs are chains of β1,2-linked glucose. The glycosyltransferase MdoH, which makes the MDO glucose chains, requires ACP for its activity. The exact role of ACP in the biosynthesis of MDO and the nature of its interaction with MdoH are unknown. This offers the intriguing possibility that ACPs may be required by, and may interact with, the glycosyltransferases that make the pilin-linked glycan and therefore constitute an essential part of some glycosylation systems. ACP2 from Campylobacter does not contain the conserved DSL motif which is believed to be required for the attachment of the prosthetic phosphopantetheine group [51,52]. The role of this ACP, and that of the ACP3Nm, if any, remains to be elucidated. The ACP3Nm is the focus of further study in our laboratory.

Biosynthesis of acetamido sugars

Several of the structures described to date contain heavily modified sugars such as pseudaminic acid and its derivatives, and DATDH (see Fig. 2). These sugars have acetamido groups in common. The biosynthetic pathway of a 4-acetamido-4,6-dideoxyhexose in E. coli has been described by Dietzler and Strominger [53], Matsuhashi and Strominger [54] and Marolda et al. [55]. These works suggest that the biosynthesis of these sugars requires a number of enzymes: dehydratases, which remove OH groups to create deoxy sugars, followed by aminotransferases and then acetyltransferases, resulting in acetamido sugars. All of the glycosylation loci examined in this study contain some, if not all, of these genes, often in similar genetic arrangements, for example the similar arrangement of pglCCj, pglDCj, pglECj and pglFCj in C. jejuni and pglBNm, pglCNm and pglDNm in N. meningitidis (Fig. 1). A number of dehydratases have been identified as being involved in glycosylation of proteins in Gram-negative bacteria (see Table 1). They can be separated into two groups based on homology: the larger membrane-bound PglDNm/PglFCj/(WbpM)-like proteins and those similar to FlaA1 (FlmACc, FlmAAc, Cj1319, Cj1293). Work by Creuzenet et al. [46] demonstrated that these two families can be considered functionally equivalent. Creuzenet et al. have also demonstrated that FlaA1 is a C6 dehydratase/C4 reductase specific for UDP-GlcNAc (the first step in the biosynthesis of 2,6-dideoxy sugars: the conversion of UDP-GlcNAc to Qui2NAc) [46,56,57]. The presence of DATDHs in both the C. jejuni ‘general’ protein-linked glycan and the N. meningitidis pilin-linked glycans suggests that PglDNm/PglFCj may catalyse the first step in the biosynthesis of 2,4-diacetamido-2,4,6-trideoxy sugars.

In all cases in this review these dehydratases are adjacent to an aminotransferase from the DegT/DnrJ/EryC1/StrS family (PFAM PF01041) of pyridoxal-phosphate-dependent aminotransferases (see Table 1, Fig. 1). PglCNm, Cj1294, Cj1320, PglECj, FlmBCj and FlmBAp are members of this family. A number of acetyltransferases have been identified that have similarity with PglDCj (Cj1321, PglBNm, ORFEPa and ORFGPa) and are adjacent to the above aminotransferases (see Table 1 and Fig. 1). Acetyltransferases with similarity to FlmHCc include FlmHHp and Cj1313.

The role of these genes in the neisserial pilin glycosylation system has been deduced from mutagenesis studies [10]. A number of biosynthetic proteins and a number of glycosyltransferases work in concert to produce the trisaccharide structure. PglB, C, DNm are responsible for biosynthesis of the DATDH and glycosyltransferases PglANm and PglENm add the two galactoses (see Fig. 2). Similar genetic organisation is seen in the general glycosylation locus of C. jejuni (PglA, C, D, E, FCj are homologous to PglA, B, B, C, DNm respectively [10]) and these genes are believed to be responsible for the biosynthesis of the DATDH bacillosamine [28].

Variability

Variability of protein-linked glycan structures may be due either to polymorphisms in glycosylation loci or to phase variation (high frequency reversible switching of gene expression) in the expression of glycosylation genes. Examples of both of these mechanisms of variation are seen in the systems reviewed here. A recent study by Dorrell et al. [58] utilised whole genome microarrays to analyse the presence or absence of genes in 11 C. jejuni strains, revealing a high degree of heterogeneity in both the flagellin modification and general glycosylation loci. Analysis of their unpublished data (http://www.sghms.ac.uk/depts/medmicro/bugs/GR-1858/) revealed that a number of the genes in the C. jejuni NCTC 11168 genome are absent from the 11 strains surveyed (marked by asterisks in Fig. 1). The pgl genes in Neisseria are perhaps the best described in terms of the potential for structural variation. The primary pglB–DNm locus has been reported to have two large polymorphisms, the insertion of two open reading frames between pglF and pglB and an insertion of approximately 2 kb which alters pglBNm [11,12] (see Fig. 1). A large number of putatively phase-variable genes are involved in the glycosylation of pilin in Neisseria: pglANm [8], pglB2Nm, pglENm, pglGNm and pglHNm [12]. Phase-variable genes do not appear to be part of other glycosylation systems, with the exception of a number of potentially phase-variable genes in the Campylobacter flagellin glycosylation loci; however, their role in glycosylation has not been examined.

Dedicated or shared pathways for glycosylation of proteins in Gram-negative organisms?

The biosynthetic pathways for glycosylation described in this review can be divided into two categories: those that use the whole LOS/LPS glycans and attach them to the target protein and those that have a dedicated biosynthetic pathway for protein-linked glycans. Pseudomonas pilin expressed by strain 1244 has an O-linked glycan that is identical to its LPS O-antigen [39]. The biosynthesis of these structures is presumably via the same pathway. Only a single gene, pilO, has been identified that is required for addition of the structure to pilin. The glycosylation of Aeromonas flagellin is mediated by a locus which also alters LPS expression in some strains [23].

In Neisseria, Campylobacter and Pseudomonas (flagella), glycosylation mutants do not alter LOS/LPS expression. The separation of the Campylobacter ‘general’ glycosylation pathway and the LOS biosynthesis pathway is remarkable considering the close linkage of the pglA–JCj genes and the genes responsible for LOS biosynthesis (Fig. 1). This separation of the flagellar glycosylation and LPS biosynthetic pathways is illustrated by the example of the three copies of the NANA biosynthesis gene neuA in C. jejuni [20]. All three neuA genes can complement an E. coli NeuA mutant. NeuA1 is located in the LPS biosynthesis loci of strains that contain NANA in their LPS and does not affect flagellin glycosylation [21]. NeuA2 and NeuA3 do not alter the expression of LPS but affect the glycosylation of flagellin. What mediates the separation of these biosynthetic pathways and links the ‘general’ biosynthetic pathway and its target proteins is unknown. The mechanisms that enable this separation of LPS oligosaccharide biosynthesis and the glycosylation biosynthesis are also unknown.

Conclusions

In this review we describe the state of the art in terms of the genetics of glycosylation in a number of Gram-negative systems. The characterised glycosylation systems vary in complexity from a single gene, in the case of the E. coli AIDA-I and TibA proteins, to complex systems utilising 10 or more genes. Despite this heterogeneity and the wide evolutionary distance represented by the organisms examined, we identified a number of conserved features in these glycosylation systems. (1) In seven of the nine systems where the location of the target protein gene is known, the genes responsible for glycosylation are adjacent to the gene for the protein they glycosylate. This feature provides an obvious approach to the identification of new glycosylation genes as well as of potentially glycosylated proteins. (2) The high frequency of amino/acetamido sugar biosynthesis genes and their common arrangement, most notably between the C. jejuni general and the N. meningitidis pilin glycosylation loci (Fig. 1), suggest a common evolutionary origin for these glycosylation loci. (3) The presence of ACPs. To date there is no direct experimental evidence that ACPs are required for glycosylation, but their presence in a number of glycosylation loci suggests they may play a role in glycosylation. (4) In C. jejuni and N. meningitidis the large number of polymorphisms between strains and the mechanisms for phase-variable expression of glycosylation genes within a strain suggest periodic immune or functional selection for variation in the glycan structure.

Many key questions remain unanswered as to the mechanism of glycosylation and the role of glycosylated proteins in the biology of these organisms. What is the precise mechanism for addition of the glycan to the glycosylated proteins, in particular the transferase involved in glycan addition and in recognition of the glycosylation site of the target protein? At what site in the cell are the target proteins glycosylated? What is the role of glycosylation in the biosynthesis of flagella, which are not correctly assembled in some glycosylation mutants? What permits or prevents cross-talk between biosynthetic pathways for LPS, capsule and protein glycosylation? The characterisation of the genetics of glycosylation will undoubtedly contribute to the evolving understanding of the mechanism of biosynthesis and the role of bacterial protein glycosylation in pathogenesis and other cellular functions.

Acknowledgements

MPJ Lab is supported by NHMRC grant 210310 and ARC grant DP02010205. The authors would like to thank Kate Seib for careful reading of the manuscript.

References

[1] Benz, I. and Schmidt, M.A. (2001) Mol. Microbiol. 40, 1403–1413.
[2] Virji, M., Kayhty, H., Ferguson, D.J., Alexandrescu, C., Heckels, J.E. and Moxon, E.R. (1991) Mol. Microbiol. 5, 1831–1841.
[3] McGee, Z.A. and Stephens, D.S. (1984) Surv. Synth. Pathol. Res. 3, 1–10.
[4] Robertson, J.N., Vincent, P. and Ward, M.E. (1977) J. Gen. Microbiol. 102, 169–177.
[5] Stimson, E. (1995) Mol. Microbiol. 17, 1201–1214.
[6] Parge, H.E., Forest, K.T., Hickey, M.J., Christensen, D.A., Getzoff, E.D. and Tainer, J.A. (1995) Nature 378, 32–38.
[7] Marceau, M., Forest, K., Beretti, J.L., Tainer, J. and Nassif, X. (1998) Mol. Microbiol. 27, 705–715.
[8] Jennings, M.P., Virji, M., Evans, D., Foster, V., Srikhanta, Y.N., Steeghs, L., van der Ley, P. and Moxon, E.R. (1998) Mol. Microbiol. 29, 975–984.
[9] Banerjee, A. (2002) J. Exp. Med. 196, 147–162.
[10] Power, P.M., Roddam, L.F., Dieckelmann, M., Srikhanta, Y.N., Tan, Y.C., Berrington, A.W. and Jennings, M.P. (2000) Microbiology 146, 967–979.
[11] Kahler, C.M., Martin, L.E., Tzeng, Y.L., Miller, Y.K., Sharkey, K., Stephens, D.S. and Davies, J.K. (2001) Infect. Immun. 69, 3597–3604.
[12] Power, P.M., Roddam, L.F., Fitzpatrick, S.Z., Srikhanta, Y.N. and Jennings, M.P. Mol. Microbiol. (submitted).
[13] Warren, M.J., Roddam, L.F., Power, P.M., Terry, T.D. and Jennings, M.P. Infect. Immun. (submitted).
[14] Gubish, E.R., Jr., Chen, K.C. and Buchanan, T.M. (1982) Infect. Immun. 37, 189–194.
[15] Hamadeh, R.M., Estabrook, M.M., Zhou, P., Jarvis, G.A. and Griffiss, J.M. (1995) Infect. Immun. 63, 4900–4906.
[16] Marceau, M. and Nassif, X. (1999) J. Bacteriol. 181, 656–661.
[17] Thibault, P., Logan, S.M., Kelly, J.F., Brisson, J.R., Ewing, C.P., Trust, T.J. and Guerry, P. (2001) J. Biol. Chem. 276, 34862–34870.
[18] Alm, R.A., Guerry, P., Power, M.E. and Trust, T.J. (1992) J. Bacteriol. 174, 4230–4238.
[19] Guerry, P., Doig, P., Alm, R.A., Burr, D.H., Kinsella, N. and Trust, T.J. (1996) Mol. Microbiol. 19, 369–378.
[20] Linton, D., Karlyshev, A.V., Hitchen, P.G., Morris, H.R., Dell, A., Gregson, N.A. and Wren, B.W. (2000) Mol. Microbiol. 35, 1120–1134.
[21] Gilbert, M., Karwaski, M.F., Bernatchez, S., Young, N.M., Taboada, E., Michniewicz, J., Cunningham, A.M. and Wakarchuk, W.W. (2002) J. Biol. Chem. 277, 327–337.
[22] Leclerc, G., Wang, S.P. and Ely, B. (1998) J. Bacteriol. 180, 5010–5019.
[23] Gryllos, I., Shaw, J.G., Gavin, R., Merino, S. and Tomas, J.M. (2001) Infect. Immun. 69, 65–74.
[24] Josenhans, C., Vossebein, L., Friedrich, S. and Suerbaum, S. (2002) FEMS Microbiol. Lett. 210, 165–172.
[25] Lüneberg, E., Zetzmann, N., Alber, D., Knirel, Y.A., Kooistra, O., Zahringer, U. and Frosch, M. (2000) Int. J. Med. Microbiol. 290, 37–49.
[26] Parkhill, J. (2000) Nature 403, 665–668.
[27] Karlyshev, A.V., Linton, D., Gregson, N.A. and Wren, B.W. (2002) Microbiology 148, 473–480.
[28] Young, N.M. (2002) J. Biol. Chem. 16, 16.
[29] Fry, B.N., Korolik, V., ten Brinke, J.A., Pennings, M.T., Zalm, R., Teunis, B.J., Coloe, P.J. and van der Zeijst, B.A. (1998) Microbiology 144, 2049–2061.
[30] Szymanski, C.M., Yao, R., Ewing, C.P., Trust, T.J. and Guerry, P. (1999) Mol. Microbiol. 32, 1022–1030.
[31] Linton, D., Allan, E., Karlyshev, A.V., Cronshaw, A.D. and Wren, B.W. (2002) Mol. Microbiol. 43, 497–508.
[32] Szymanski, C.M., Burr, D.H. and Guerry, P. (2002) Infect. Immun. 70, 2242–2244.
[33] Brimer, C.D. and Montie, T.C. (1998) J. Bacteriol. 180, 3209–3217.
[34] Arora, S.K., Bangera, M., Lory, S. and Ramphal, R. (2001) Proc. Natl. Acad. Sci. USA 98, 9342–9347.
[35] Semmler, A.B., Whitchurch, C.B. and Mattick, J.S. (1999) Microbiology 145, 2863–2873.
[36] Hahn, H.P. (1997) Gene 192, 99–108.
[37] Castric, P. (1995) Microbiology 141, 1247–1254.
[38] Comer, J.E., Marshall, M.A., Blanch, V.J., Deal, C.D. and Castric, P. (2002) Infect. Immun. 70, 2837–2845.
[39] Castric, P., Cassels, F.J. and Carlson, R.W. (2001) J. Biol. Chem. 276, 26479–26485.
[40] Niewerth, U., Frey, A., Voss, T., Le Bouguenec, C., Baljer, G., Franke, S. and Schmidt, M.A. (2001) Clin. Diagn. Lab. Immunol. 8, 143–149.
[41] Lindenthal, C. and Elsinghorst, E.A. (1999) Infect. Immun. 67, 4084–4091.
[42] Moormann, C., Benz, I. and Schmidt, M.A. (2002) Infect. Immun. 70, 2264–2270.
[43] Johnson, R.C., Ferber, D.M. and Ely, B. (1983) J. Bacteriol. 154, 1137–1144.
[44] Nierman, W.C. (2001) Proc. Natl. Acad. Sci. USA 98, 4136–4141.
[45] Rabaan, A.A., Gryllos, I., Tomas, J.M. and Shaw, J.G. (2001) Infect. Immun. 69, 4257–4267.
[46] Creuzenet, C., Schur, M.J., Li, J., Wakarchuk, W.W. and Lam, J.S. (2000) J. Biol. Chem. 275, 34873–34880.
[47] Lindenthal, C. and Elsinghorst, E.A. (2001) Infect. Immun. 69, 52–57.
[48] Skurnik, M., Venho, R., Toivanen, P. and al-Hendy, A. (1995) Mol. Microbiol. 17, 575–594.
[49] Feldman, M.F., Marolda, C.L., Monteiro, M.A., Perry, M.B., Parodi, A.J. and Valvano, M.A. (1999) J. Biol. Chem. 274, 35129–35138.
[50] Morona, R., Stroeher, U.H., Karageorgos, L.E., Brown, M.H. and Manning, P.A. (1995) Gene 166, 19–31.
[51] Therisod, H., Weissborn, A.C. and Kennedy, E.P. (1986) Proc. Natl. Acad. Sci. USA 83, 7236–7240.
[52] Therisod, H. and Kennedy, E.P. (1987) Proc. Natl. Acad. Sci. USA 84, 8235–8238.
[53] Dietzler, D.N. and Strominger, J.L. (1973) J. Biol. Chem. 248, 104–109.
[54] Matsuhashi, M. and Strominger, J.L. (1964) J. Biol. Chem. 239, 2454–2463.
[55] Marolda, C.L., Feldman, M.F. and Valvano, M.A. (1999) Microbiology 145, 2485–2495.
[56] Creuzenet, C. and Lam, J.S. (2001) Mol. Microbiol. 41, 1295–1310.
[57] Creuzenet, C., Urbanic, R.V. and Lam, J.S. (2002) J. Biol. Chem. 277, 26769–26778.
[58] Dorrell, N. (2001) Genome Res. 11, 1706–1715.