Sequence divergence in type III secretion gene clusters of the Burkholderia cepacia complex

The Burkholderia cepacia complex (BCC) comprises a group of bacteria associated with opportunistic infections, especially in cystic fibrosis patients. B. cenocepacia J2315, of the transmissible ET12 lineage, contains a type III secretion (TTS) gene cluster implicated in pathogenicity. PCR and hybridisation assays indicate that the TTS gene cluster is present in all members of the BCC except B. cepacia (formerly genomovar I). The TTS gene clusters of B. cenocepacia J2315 and B. multivorans are similar in organisation but have variable levels of gene identity. Nucleotide sequence data obtained for the equivalent region of the B. cepacia genome indicate the absence of TTS structural genes due to a rearrangement likely to involve more than one step.


Introduction
Members of the Burkholderia cepacia complex (BCC), comprising at least nine genomovars [1][2][3], are important opportunistic pathogens, especially in relation to cystic fibrosis (CF). Although representatives of each genomovar within the BCC have been isolated, B. cenocepacia (genomovar III) and B. multivorans (genomovar II) are by far the most common genomovars isolated from patients with CF [4].
Type III secretion systems (TTS) play crucial roles in the pathogenicity of a number of Gram-negative bacterial pathogens [5]. The genes encoding TTS systems and their secreted effectors are usually clustered in pathogenicity islands. TTS gene clusters have already been identified in the plant pathogen Ralstonia (formerly Burkholderia) solanacearum [6] and in B. pseudomallei, the causative agent of melioidosis, which carries three separate putative TTS gene clusters [7][8][9]. In a previous paper we reported the identification of TTS genes in B. cenocepacia J2315 (genomovar IIIa, ET12 lineage). Based on assays for the detection of two TTS structural genes, we concluded that TTS genes were present in all genomovars of the BCC, with the exception of genomovar I [10]. Recently, it was demonstrated that a TTS mutant of B. cenocepacia J2315 was attenuated for virulence in a murine model of infection [11], suggesting a role for TTS in pathogenicity.
In this paper, we report the complete sequence of the TTS gene cluster from strain J2315 and a comprehensive survey of its distribution amongst the BCC. Furthermore, we have used nucleotide sequencing to characterise the divergence between B. cepacia (genomovar I) and B. cenocepacia in the genomic region harbouring the TTS gene cluster, and we report the sequence of the B. multivorans (genomovar II) TTS gene cluster encompassing the region of divergence.

Bacterial strains and plasmids
BCC strains used in this study are listed in Table 1 and are taken mainly from the panel of strains representing diversity within the BCC [12,13]. B. cepacia E242 and B. multivorans E243 [14] were identified as genomovars I and II respectively using genomovar-specific PCR [15].

Construction of gene libraries
Genomic DNA was extracted from BCC strains as described previously [16]. DNA from strains J2315, E242 and E243 was employed to construct gene libraries using the SuperCos 1 Cosmid Vector Kit (Stratagene Europe) and the conditions recommended by the supplier.

Identification of hybridising genomic regions
Oligonucleotide primers, obtained from Sigma-Genosys, for PCR amplification of TTS genes were designed using the sequence information available for the J2315 TTS gene region [10], the hrp locus of R. solanacearum [6] or the TTS1 gene cluster of B. pseudomallei [8]. Amplified products, labelled with digoxigenin-11-2'-dUTP (DIG) (Roche Diagnostics Ltd.), were used as probes to identify clones containing homologous genes in the cosmid libraries of BCC strains J2315, E242 and E243. The presence of DIG on colony blots was detected using anti-DIG-AP Fab fragments and the chemiluminescent substrate CDP-Star (Roche Diagnostics Ltd.), following the procedure recommended by the supplier. Smaller hybridising fragments, identified by Southern blot analysis of digested cosmid clones, were sub-cloned into the plasmid vector pUC19 (Helena Biosciences).

Nucleotide sequencing and computer analyses
DNA was purified from putative clones using a QIAprep Spin Miniprep Kit (Qiagen Ltd.). Both strands of the cloned insert DNA were sequenced by primer walking (Lark Technologies, Inc.). Nucleotide and protein sequences were analysed using the GCG sequence analysis software package (Genetics Computer Group, University of Wisconsin). BLAST searches were conducted using the site http://www.ncbi.nlm.nih.gov/blast/blast.cgi.

PCR assays and Southern blots
The panel of strains was screened by PCR amplification using the oligonucleotide primer sets listed in Table 2. For Southern blots, DNA isolated from BCC strains using a Wizard Genomic DNA Purification Kit (Promega) was digested with XhoI, electrophoresed on a 0.7% agarose gel, and transferred to a nylon membrane using standard procedures. The procedures for the labelling of probes with DIG and subsequent detection of hybridising bands were as for colony blots.

Construction of sequence similarity phylogram
A phylogram for analysing relationships between TTS systems was constructed by concatenating protein sequences from the YscR, YscT and YscU families, and aligning the sequences using the CLUSTALW program at the site http://www.ebi.ac.uk. Evolutionary relationships were determined using the genetic distance-based neighbour-joining algorithms of the Data Analysis in Molecular Biology software (DAMBE; http://web.hku.hk~xxia.software.htm). Phylogenetic trees were drawn using TreeExplorer software (http://evolgen.biol.metro-u.ac.jp/TE/TE_man.html).
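The concatenation and distance steps can be illustrated with a minimal Python sketch. It assumes, for simplicity, that each protein family has already been aligned before concatenation, and it uses a plain proportion-of-differences measure (p-distance) as a stand-in for the distance model used in DAMBE, which is not specified here. The taxon names, sequences and function names are hypothetical placeholders, not data or code from this study.

```python
from itertools import combinations


def concatenate(families):
    """Join per-family aligned sequences into one string per taxon."""
    taxa = families[0].keys()
    return {t: "".join(fam[t] for fam in families) for t in taxa}


def p_distance(a, b):
    """Proportion of differing residues, skipping columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(seqs):
    """Pairwise p-distances keyed by alphabetically ordered taxon pairs."""
    return {(s, t): p_distance(seqs[s], seqs[t])
            for s, t in combinations(sorted(seqs), 2)}


# Toy aligned fragments for three invented taxa (not real YscR/T/U data).
ysc_r = {"taxonA": "MKLV", "taxonB": "MKLI", "taxonC": "MRLI"}
ysc_t = {"taxonA": "GA-T", "taxonB": "GAST", "taxonC": "GCST"}
ysc_u = {"taxonA": "PLLE", "taxonB": "PLLE", "taxonC": "PIVE"}

concat = concatenate([ysc_r, ysc_t, ysc_u])
dist = distance_matrix(concat)
```

A matrix of this kind is the input that a neighbour-joining routine would cluster into a tree; the tree-building step itself is omitted from the sketch.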

Nucleotide sequence of the J2315 TTS genes
Previously we reported the partial sequence of the TTS gene cluster from B. cenocepacia strain J2315, identifying bcscV and the bcscQR-virB1 region as being present on the same cosmid clone [10]. Following further sub-cloning of cosmid clones, we sequenced the rest of the TTS gene cluster from J2315 and deposited the information in GenBank (AY028431). Recently, Tomich et al. [11] submitted part of the strain J2315 sequence, describing it as the complete sequence, but omitting the bcscQR-virB1 region (Accession number AY166598). In strain J2315, the TTS gene cluster is flanked at one end by a gene encoding a predicted protein with significant homology to manganese transporter proteins (MTPs). MTPs share some homology with NRAMP proteins, which play a role in bacterial responses to reactive oxygen and may have a role in pathogenesis. An NRAMP protein has already been identified in B. cepacia [17]. In general, the best similarities for 11 B. cenocepacia TTS conserved proteins were against B. pseudomallei TTS1 and TTS2, R. solanacearum hrp or Xanthomonas spp. TTS proteins, with the exception of BcscS, where highest similarity and identity was obtained with a protein from the genus Erwinia. BcscK shared best similarity with BscK of Bordetella bronchiseptica (Table 3), but homologues of this protein are not present in all TTS systems. orf1 and orf2 encode putative proteins with best homology to transcriptional regulatory proteins and share greater similarity with each other than to any protein in the database. Upstream of the bcscQR-virB1 region is a putative ORF (Orf7) sharing similarity with the C-terminal DNA binding domains of various two-component response regulator proteins.

Distribution of the TTS genes
Previously, we used two conserved, structural TTS genes (bcscV and bcscQ) as targets in PCR assays and Southern blot hybridisations to determine the distribution of genes amongst members of the BCC. The designations bcsc and bpsc were used in our original study to distinguish BCC and B. pseudomallei TTS genes. The alternative bsc designation has also been proposed [11]. We used a similar approach to look at the distribution of putative genes that are not known conserved TTS genes but lie within or adjacent to the cluster (orf2, orf3, orf5, orf6 and MTP). Essentially, the PCR assays and blots suggested that strains of all genomovars (including genomovar I) contain MTP, orf2 and orf5 (encoding a putative asparagine synthetase); only B. cenocepacia IIIa and B. stabilis strains contain all of the genes assayed; B. multivorans, B. vietnamiensis, B. cenocepacia IIIb, B. dolosa, and B. ambifaria appear to lack at least part of orf6, a large ORF separating the known TTS genes (Table 1, Fig. 1). These data indicated that whilst B. cepacia (genomovar I) lacks TTS structural genes, it contains MTP, orf2 and orf5.
Although in most cases the results of PCR assays and Southern blots were in accordance, some PCR-negative blot-positive results were obtained. In addition, there are variations in the sizes of hybridising bands between genomovars for several of the probes, including those targeting bcscV and bcscQ [10]. We therefore chose to sequence TTS-related loci from representatives of B. cepacia (genomovar I) and B. multivorans to ensure that we had detailed data from representatives of each of the major distribution groups.

Nucleotide sequence of the B. cepacia (genomovar I) TTS region indicates absence of the TTS genes
Probes for strain J2315 orf5 and MTP were used to identify clones from a cosmid gene library of B. cepacia (genomovar I) strain E242. Probes to TTS structural genes did not hybridise with any of the library clones. A map of the nucleotide sequence obtained (GenBank Accession Number AY380559) is presented in Fig. 1. A putative asparagine synthetase gene whose predicted protein shares 98% identity with that of strain J2315 orf5 is present in strain E242. However, there is significant divergence from the strain J2315 sequence on either side of orf5. Although the complete MTP sequence was not obtained, it is clear that B. cepacia E242 carries the equivalent genes to B. cenocepacia J2315 MTP, orf1 and orf2 upstream of orf5, but the cluster of genes from bcscV to bcscU is entirely absent. Downstream of orf5 lies a remnant sequence with similarity to part of virB1 followed by an ORF with similarity to orf7. The genes bcscQ and bcscR, most of virB1 and all of orf6, except for the region overlapping with orf5, are missing from B. cepacia E242.

Nucleotide sequence of the B. multivorans TTS genes indicates similar gene organisation to B. cenocepacia but variable levels of sequence identity
PCR and hybridisation assays suggested that parts of the strain J2315 TTS gene cluster, especially some of orf6, might be missing from B. multivorans. We used several TTS probes from strain J2315 to identify hybridising clones from a cosmid library of B. multivorans strain E243 and sequenced a region incorporating orf6 and either side of orf5, the point of divergence between genomovar I and B. cenocepacia. The sequence obtained (from bcscC to bcscQ) is deposited in GenBank (AY380558). Essentially, the TTS gene cluster of strain E243 resembles that of strain J2315 in organisation and there is strong identity between the conserved TTS structural proteins (>90%) (Fig. 1(b)). However, although equivalents of orf4 and orf6 are present, there is poor homology in the C-terminal region of Orf4 and in parts of Orf6. In addition, there is considerable variation in the putative signal sequence of virB1. The sequence data also predicted a shorter orf6 that does not overlap with the divergently transcribed orf5 (Fig. 1).
Using the primers CP125 and CP240, we PCR amplified and sequenced the start of this orf5/orf6 overlapping region using strain BC7, another representative of the B. cenocepacia ET12 lineage. The results indicated the presence of an additional 134 bp that was absent from strain J2315. If incorporated into a complete orf6 sequence, this would lead to a stop codon in an equivalent position to that observed for B. multivorans E243 orf6. Thus it seems likely that the orf6 of strain J2315 has undergone a deletion that causes read-through into orf5. In order to determine whether the strain J2315 or strain BC7 sequence was more typical for this region, the PCR amplification was also carried out on two other ET12 strains. This confirmed that strain J2315 is different from other ET12 strains and B. multivorans in this region (data not shown). A consensus map of the B. cenocepacia/B. multivorans TTS region is presented in Fig. 1(b).
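The effect of a 134 bp difference on the reading frame can be made concrete. Because 134 is not a multiple of three, its presence or absence shifts the frame of all downstream codons and can remove the first in-frame stop codon. The Python sketch below demonstrates this on a short invented sequence with a 2 bp deletion (134 leaves the same remainder, 2, on division by 3); the sequences and the helper names first_stop and shifts_frame are illustrative assumptions and do not represent the orf6 sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq, start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def shifts_frame(indel_length):
    """An insertion or deletion shifts the frame unless it is a multiple of 3."""
    return indel_length % 3 != 0


# Invented example: an intact ORF with a stop codon (TAA) at index 12.
full = "ATGGCTGCAGGTTAAGCCGCTGCGTAGC"

# Remove 2 bp after the start codon; the downstream frame is now shifted.
deleted = full[:3] + full[5:]

print(shifts_frame(134))    # True
print(first_stop(full))     # 12
print(first_stop(deleted))  # None: translation reads through
```

In the toy example, the deletion variant has no in-frame stop within the fragment, which mirrors the read-through from orf6 into orf5 proposed for strain J2315.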

Discussion
The organisation of the B. cenocepacia/B. multivorans TTS genes does not resemble that of any other Burkholderia or R. solanacearum TTS cluster (Fig. 1). The lack of a gene sharing sequence identity with yscS immediately downstream of genes sharing identity with yscQ and yscR is particularly unusual. Instead, there is a putative gene similar to the virB1 gene of the Brucella suis type IV secretion system [18]. Genome sequence data for strain J2315 have revealed the presence of a cluster of putative virB-related type IV secretion genes on the same chromosome as the TTS genes (chromosome 2). This cluster includes a putative virB1 gene. The predicted protein sequences of the two VirB1 homologues share 36% identity and 44% similarity. Interestingly, the GC content of the two putative virB1 genes varies significantly. Whereas the TTS-linked virB1 has a GC content of 74%, the type IV secretion-linked virB1 has a GC content of 61%.
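GC content values of this kind are simple base-composition ratios. As a minimal sketch, the Python function below computes GC content as a percentage of unambiguous bases; the function name and the example sequence are invented for illustration and are not taken from either virB1 gene.

```python
def gc_content(seq):
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Invented 16-base fragment: 12 G/C bases and 4 A/T bases.
example = "GCGCGGCCATGCGCTA"
print(gc_content(example))  # 75.0
```

Ambiguous symbols such as N are excluded from the denominator, which is one common convention; other conventions divide by the full sequence length.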
A phylogram based on predicted BcscR, BcscT and BcscU protein sequences, and equivalents in other bacteria, confirmed the relationships indicated by BLASTP searches of the database (Fig. 2). The position of B. cenocepacia/B. multivorans on the phylogram is separate from, but closest to, a group comprising B. pseudomallei, R. solanacearum and Xanthomonas spp., but not including the B. pseudomallei Bsa TTS system proteins. This provides an interesting comparison within the genus Burkholderia. Whereas, as clearly indicated on the phylogram, the Bsa locus appears to have been obtained from outside the genus, the positions of B. pseudomallei TTS1 and TTS2 loci, and the B. cenocepacia TTS locus, reflect overall phylogenetic relationships between the bacterial species used in the comparison. Yet, both B. pseudomallei TTS1, which is present in B. pseudomallei but absent from close relatives B. mallei and B. thailandensis [8], and the B. cenocepacia TTS locus, demonstrate a degree of instability associated with genomic islands.
The overall GC content for the B. cenocepacia J2315 TTS gene cluster was 70.0%, compared to a genome average of 66.9%. Thus, GC content alone did not indicate the presence of a genomic island obtained from another organism. In addition, there are no tRNA genes, phage integrase genes or insertion sequences flanking the gene cluster. However, the apparent deletion of TTS structural genes from B. cepacia genomovar I is typical of the kind of instability often associated with pathogenicity islands. The organisation of genes in the equivalent region of genomovar I suggests that the differences observed are the result of multiple events. These could have been two separate deletions from genomovar I occurring either side of orf5, or a large deletion, followed by re-acquisition of orf5. The alternative scenario is that an ancestral B. cepacia acquired the TTS genes, and that all members of the BCC other than genomovar I are derived from this ancestor. However, this scenario is complicated by the apparent retention of part of the virB1 sequence. The nature of these events is difficult to resolve because we have seen only the current state of evolution, and have no knowledge of any intermediate stages. It is perhaps ironic that the genospecies chosen to retain the original epithet B. cepacia is genomovar I [4], the one genospecies that does not possess the TTS gene cluster. B. cepacia can be distinguished genetically from other members of the BCC by PCR assays for TTS genes [10,19].
The presence of the TTS structural genes in all members of the BCC other than those in genomovar I raises interesting questions concerning the role of TTS within the complex. B. cepacia was first reported as an onion pathogen, exemplified by the type strain, which is from genomovar I. It has been reported that onion maceration occurs in 100% of clinical and environmental genomovar I and B. cenocepacia strains [20]. These observations suggest that TTS is unlikely to be involved in the onion pathogenicity of the BCC, unless different genomovars employ different mechanisms. Using RT-PCR, we have been unable to detect expression of TTS genes in B. cenocepacia or B. multivorans in a variety of culture conditions shown to induce TTS systems in other bacteria (data not shown). This observation is consistent with the reported lack of variation in extracellular protein profiles between wild-type and TTS mutant strains grown under various culture conditions [11], and suggests that the signals required to induce the TTS system in the BCC are not easy to mimic in vitro. There are putative regulatory genes flanking the TTS structural genes. However, the fact that orf2, orf3 and orf7 are all retained by genomovar I might suggest that these genes are not involved in regulation of TTS.
B. cenocepacia J2315 is effectively an orf6 frame-shift mutant when compared with other ET12 lineage strains. The mutation occurs towards the C-terminus, leaving much of the predicted protein intact, but leading to a considerable increase in length. It is not clear whether Orf6 plays any role in TTS, other than the location of its large putative gene between the two TTS structural gene clusters. However, it is interesting that orf6 but not orf5 (the asparagine synthetase) is absent from genomovar I, and that Orf6 is maintained as a large open reading frame in B. multivorans. We have observed other events that would lead to mutations in strain J2315 not apparent in other members of the ET12 lineage [21]. There was much lower sequence similarity between B. multivorans and B. cenocepacia orf6 than between other genes of the cluster. This explains why orf6 was not detected in B. multivorans using probes derived from B. cenocepacia (Table 1).
There are no predicted proteins in the B. cenocepacia J2315 genome sequence with similarity to known TTS translocator proteins from animal pathogens or to the R. solanacearum HrpY TTS pilus component implicated in plant pathogenicity [22]. In addition, there are no matches when the genome is interrogated with B. pseudomallei Bops or Bips, the only known TTS secreted proteins from B. pseudomallei [9]. Thus, the role of the BCC TTS system is still unclear and the effectors remain to be identified. However, attenuation of virulence of a TTS mutant (bscN) in an animal model [11] does indicate that these genes are active in vivo, and that they may have a part to play in the opportunistic pathogenicity of the BCC.

Fig. 1. Map of the B. cepacia complex TTS gene cluster, including distribution of genes amongst the B. cepacia complex. Letter designations (V, S, C, D, J, K, L, N, T, U, R, Q) refer to genes/proteins with similarity to Yersinia ysc genes given the same designation. ORFs 1-7 are only given numbers. B1 shares identity with virB1 of Brucella suis. MTP shares identity with genes/proteins of the NRAMP family. Shading is used to indicate predicted protein similarity to structural TTS proteins (dots) and ORFs of unknown function linked to TTS genes (chequered). Black shading is used to indicate partial identity to virB1 sequence in genomovar I. For figure (a), boxes below the strain J2315 map are used to indicate target regions for PCR/Southern blot hybridisation screening. The distributions of these target regions, determined by PCR/Southern blot hybridisation screening, are indicated by shading as: present in all genomovars (black); present in all except genomovar I (diagonals); present in genomovar IIIa and B. stabilis only (white). Equivalent ORFs for B. cepacia genomovar I (strain E242) are indicated, along with % identity values between B. cenocepacia and B. cepacia predicted proteins. Figure (b) shows an amended consensus map of the B. cenocepacia/B. multivorans TTS gene cluster. Percentage identity values for predicted proteins of strain J2315 and strain E243 are given where known.

Table 3
Best BLASTP matches for predicted proteins within and flanking the B. cenocepacia TTS gene cluster