Abstract

Although it is well known that there is no long range colinearity in gene order in bacterial genomes, several regions are thought to be under strong structural constraints during evolution, so that gene order within them is extremely conserved. One such region is the str locus, containing the S10spcalpha operons. These operons contain genes coding for ribosomal proteins and a number of housekeeping genes. We compared the organisation of these gene clusters in 111 sequenced prokaryotic genomes (99 bacterial and 12 archaeal genomes). We also compared the organisation to the phylogeny based on 16S ribosomal RNA gene sequences and the sequences of the ribosomal proteins L22, L16 and S14. Our data indicate that there is much variation in gene order and content in these gene clusters, in both bacterial and archaeal genomes, and that differential gene loss has occurred on multiple occasions during evolution. We also noted several discrepancies between phylogenetic trees based on 16S rRNA gene sequences and trees based on sequences of ribosomal proteins L16, L22 and S14, suggesting that horizontal gene transfer played a significant role in the evolution of the S10spcalpha gene clusters.

Introduction

Shortly following the completion of the first two prokaryotic genomes (that of Haemophilus influenzae and that of Escherichia coli) it was noted that there is no long range colinearity in gene order in bacterial genomes [1,2]. Subsequent studies on more taxa confirmed this initial finding: apparently dynamic rearrangements have occurred frequently enough to break up operon structures [3,5] and although gene order is extremely conserved in closely related taxa, it rapidly becomes less conserved with evolutionary distance [6,7]. However, even in distantly related genomes, several highly conserved regions can be found, probably regions that are under strong structural constraints during evolution [5,7]. Systematic genome comparisons have revealed that functionally related genes tend to be neighbours more often than unrelated genes [8] and this provides strong support for the concept that conserved gene order could be correlated with physical interactions between the encoded proteins [9]. One region in which gene order generally appears to be conserved is the str locus that contains the S10spcalpha operons, encoding ribosomal proteins and a number of housekeeping genes [5,10].

In E. coli, 53 ribosomal proteins have been identified [11,12]. Approximately half of these are encoded by genes that are located at the str locus, while the rest are scattered around the genome in clusters of 1–4 genes. The genetic organisation of the ribosomal protein clusters is complex, with many operons containing genes for non-ribosomal proteins. In addition, the organisation of many ribosomal protein operons does not follow the promoter-structural gene-terminator paradigm [12]. The physiological relevance of this complex organisation is at present not entirely clear. In E. coli, the S10 operon contains the genes coding for ribosomal proteins S10, L3, L4, L23, L2, S19, L22, S3, L16, L29 and S17. The spc operon contains the genes coding for ribosomal proteins L14, L24, L5, S14, S8, L6, L18, S5, L30, L15 and L36. In addition, between the genes coding for L15 and L36, the secY gene is found, coding for a preprotein translocase. The alpha operon contains the genes coding for ribosomal proteins S13, S11, S4 and L17, with rpoA (coding for the α-subunit of RNA polymerase) inserted between S4 and L17. Subsequently, the organisation of the S10, spc and alpha gene clusters was determined for a number of other bacterial taxa (including Mycoplasma capricolum [13], Chlamydia trachomatis [14], Bacillus subtilis [15], Synechococcus sp. [16] and Sinorhizobium meliloti [15]), as well as for a number of archaeal species (including Sulfolobus solfataricus [17] and Halobacterium halobium [18]). While for several organisms the organisation of the S10, spc and alpha gene clusters was very similar to that seen in E. coli, deletions, insertions of additional genes and/or translocations of genes were often noticed. For example, in contrast to E. coli and H. influenzae, the spc gene clusters of B. subtilis and Mycoplasma genitalium contain three additional genes coding for non-ribosomal proteins: adk (coding for adenylate kinase), map (coding for methionine aminopeptidase) and infA (coding for translation initiation factor I) [5]. When comparing the gene order conservation in 35 sequenced prokaryotic genomes, Tamames [7] found varying levels of conservation of gene order (15–88%, expressed as the ratio between the number of times the gene is conserved in the run and the total number of times the gene is present) for members of these gene clusters.
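The conservation measure quoted from Tamames [7] is a plain ratio, which can be sketched as follows. The function name and the counts in the usage line are hypothetical and serve only to illustrate the arithmetic; they are not taken from [7].

```python
def conservation_ratio(times_in_run, times_present):
    """Percentage of occurrences of a gene in which it lies in the conserved run.

    times_in_run:  number of genomes in which the gene is conserved in the run
    times_present: number of genomes in which the gene is present at all
    """
    if times_present <= 0:
        raise ValueError("the gene must be present in at least one genome")
    if not 0 <= times_in_run <= times_present:
        raise ValueError("times_in_run must lie between 0 and times_present")
    return 100.0 * times_in_run / times_present


# Hypothetical counts, for illustration only.
example = conservation_ratio(22, 25)
```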

Although it has been hypothesised that genes coding for proteins involved in multiple interactions, including ribosomal proteins, are less likely to be horizontally transferred (the complexity hypothesis [19]), horizontal gene transfer has been described for some ribosomal protein genes, including S14 and L27 [20–23]. The case of the S14 gene is especially intriguing, as there seem to have been recurrent transfers of this gene between various bacterial groups [20]. Other studies have demonstrated the importance of ribosomal protein gene duplications and lineage-specific gene loss [21,24]. This suggests that many evolutionary forces are involved in shaping the organisation of ribosomal protein gene clusters.

Now that bacterial genome sequences are published almost weekly, it is possible to compare the organisation of S10, spc and alpha gene clusters in a wide range of taxa. In the present study we compared the organisation of the S10, spc and alpha gene clusters in 99 sequenced bacterial genomes. Twelve archaeal genomes were included for comparison. We also compared the organisation of S10, spc and alpha gene clusters with groupings obtained by comparing 16S ribosomal RNA gene sequences and the sequences of multiple ribosomal proteins.

Materials and methods

Genome sequences

We downloaded 99 bacterial and 12 archaeal genome sequences from the GenBank database. If several strains from a single species were sequenced we only included one. An overview of all taxa included (including strain number and GenBank accession number) is given in Tables S1 and S2 that can be found as supplementary data online at http://allserv.ugent.be/~tcoenye/cepacia/page40.html.

Sequence alignment and numerical analysis

16S ribosomal RNA gene and amino acid sequences from ribosomal proteins L16, L22 and S14 were extracted from the whole-genome sequence. Sequences were aligned using the emma interface (EMBOSS). Tree construction and bootstrap analyses (100 replicates) were performed using the Bionumerics 3.5 (Applied Maths) and Treecon [25] software packages. Phylogenetic trees were constructed using the neighbour-joining method [26] (no specific substitution model was applied). In the case of the genetic organisation of the S10, spc and alpha gene clusters, all individual genes were considered as multistate characters. Genes were considered to belong to one of the following categories: (i) present in the genome in the same place and order as in the E. coli and/or B. subtilis genomes; (ii) present in the genome but in a different place and/or order than in the E. coli and/or B. subtilis genomes or (iii) absent from the genome. Trees were constructed using the Bionumerics 3.5 software package, using the categorical coefficient. Absence of a gene from a genome was confirmed by performing a BLASTP analysis [27], using the ribosomal protein sequence of the closest relative as the query sequence.
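The character coding described above can be illustrated with a short sketch. It assumes that the categorical coefficient is the fraction of characters scored in the same state in both genomes; the Bionumerics implementation is not reproduced here, and the names `GeneState`, `encode` and `categorical_similarity` as well as the two profiles are hypothetical. The symbols follow the +, − (written as an ASCII hyphen below) and x notation of Fig. 2.

```python
from enum import Enum


class GeneState(Enum):
    SAME_PLACE = "+"  # category (i): same place and order as in the reference
    ELSEWHERE = "x"   # category (ii): present, but in a different place/order
    ABSENT = "-"      # category (iii): absent from the genome


def encode(profile):
    """Turn a string such as '++x-' into a list of GeneState values."""
    return [GeneState(symbol) for symbol in profile]


def categorical_similarity(a, b):
    """Percentage of genes scored in the same state in both genomes (assumed definition)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Hypothetical profiles over five genes, for illustration only.
genome_a = encode("+++x-")
genome_b = encode("++x+-")
similarity = categorical_similarity(genome_a, genome_b)  # 3 of 5 states agree
```

Treating the three categories as unordered states means that "present elsewhere" is no closer to "present in place" than to "absent", which is what distinguishes a categorical coefficient from a numerical distance.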

Results and discussion

Organisation of S10, spc and alpha gene clusters in bacterial genomes

When we compared the organisation of S10, spc and alpha gene clusters in 99 sequenced bacterial genomes, 42 different organisations were observed (see Table S1). Based on this organisation we constructed a dendrogram, using the categorical coefficient (Fig. 1). A schematic overview of the organisations observed is given in Fig. 2. Most variation occurs in the 3′ half of the spc gene cluster and in the alpha gene cluster, while the S10 gene cluster appears to be more conserved. It is worth noting that only 14 organisms showed the same organisation as seen in E. coli.
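The dendrogram in Fig. 1 was built by unweighted pair group average linkage (UPGMA). A generic, minimal sketch of that clustering step is given below; it is not the Bionumerics implementation, the function name `upgma` is hypothetical, and the input is assumed to be a distance matrix (for example 100 minus the percentage similarity). The three genomes and their distances are invented for illustration.

```python
def upgma(names, dist):
    """Cluster leaves by average linkage and return the tree as nested tuples.

    names: list of leaf labels
    dist:  symmetric matrix (list of lists) of pairwise distances
    """
    # Active clusters: id -> (subtree, number of leaves in it)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by the pair of cluster ids
    d = {
        frozenset((i, j)): dist[i][j]
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(d, key=d.get)  # closest pair of clusters
        i, j = sorted(pair)
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        del d[pair]
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean, which equals the mean over all leaf pairs.
        for k in clusters:
            d_ik = d.pop(frozenset((i, k)))
            d_jk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Hypothetical distances between three genomes, for illustration only.
tree = upgma(["A", "B", "C"], [[0, 10, 40], [10, 0, 50], [40, 50, 0]])
# A and B are joined first; C is then attached to the (A, B) cluster.
```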

Fig. 1. Dendrogram derived from the unweighted pair group average linkage of categorical coefficients between the organisation of S10, spc and alpha gene clusters in sequenced bacterial genomes.

Fig. 2. Schematic overview of the organisation of some S10, spc and alpha gene clusters in 99 bacterial genomes. The three lateral arrows below the gene names represent the operon organisation in Escherichia coli. +, presence; −, absence; x, found in another position in genome. Numbers in first column refer to organisation of the cluster as indicated in Table S1 (which taxa belong to which group can also be found in Table S1).

Many bacterial genomes do not encode all ribosomal proteins found in the S10spcalpha operons in E. coli (Table S1 and Fig. 2). While some of these gene losses appear to be specific for one or more lineages (for example L30 is absent from the genomes of members of the Chlamydiae, the ε-Proteobacteria, the Cyanobacteria and the mycoplasmas), other losses are restricted to one or a few members of a lineage (for example L2 is present in all bacterial genomes, except in that of Streptococcus mutans). Most gene loss is seen in Clostridium tetani: the genome of C. tetani appears to lack the genes that code for 10 ribosomal proteins found in the S10spcalpha operons in E. coli.

In a number of genomes, additional genes coding for non-ribosomal proteins can be found in the S10, spc and alpha gene clusters. In several cases the inserted genes encode unknown and/or hypothetical proteins (Table 1). Some of these are very short and it remains to be determined if these are true protein coding genes or open reading frames that occur by chance [28]. However, in several genomes, there is evidence for the insertion of true protein-coding genes in the S10, spc and alpha gene clusters (Table 1).

Table 1. Overview of non-ribosomal proteins inserted in the S10, spc and alpha gene clusters in sequenced bacterial genomes

Organism | Inserted non-ribosomal genes | Location
Bacillus halodurans C125 | Hypothetical protein | Between map and infA
Bifidobacterium longum NCC2705 | Hypothetical protein | Between S13 and rpoA
Corynebacterium diphtheriae NCTC13129 | Putative secreted protein, putative ABC transport system ATP-binding protein and putative ABC transport system integral membrane protein | Between S17 and L14
Corynebacterium diphtheriae NCTC13129 | Serine transporter, l-serine dehydratase and putative secreted amino acid hydrolase | Between L5 and S8
Corynebacterium diphtheriae NCTC13129 | Putative transport protein, putative sugar binding secreted protein, putative sugar ABC transport system membrane protein and putative ABC transport system membrane protein | Between L15 and secY
Corynebacterium diphtheriae NCTC13129 | Putative sialidase precursor and putative secreted protein | Between map and infA
Corynebacterium efficiens YS-314 | 2 Hypothetical proteins and putative glucose-6-phosphate dehydrogenase | Between L5 and S8
Corynebacterium efficiens YS-314 | Hypothetical protein | Between map and infA
Corynebacterium glutamicum ATCC13032 | 2 Hypothetical proteins | Between L5 and S8
Corynebacterium glutamicum ATCC13032 | Uncharacterised protein | Between map and infA
Coxiella burnetii RSA493 | Hypothetical protein | Between secY and S13
Haemophilus ducreyi 35000HP | InsA and InsB | Between S17 and L14
Helicobacter hepaticus ATCC51449 | Conserved hypothetical protein | Between L5 and S8
Lactococcus lactis IL1403 | Unknown protein | Between S14 and S8
Lactococcus lactis IL1403 | Hypothetical protein yvfC, IS 1077F transposase, hypothetical protein yvfD and IS 904H transposase | Between adk and infA
Mesorhizobium loti | 2 Unknown proteins | Between L18 and L15
Mycobacterium bovis AF2122/97 | Possible arylsulfatases ATSa and ATSb, conserved hypothetical protein and conserved transmembrane protein | Between S17 and L14
Mycobacterium bovis AF2122/97 | Possible protease IV sppA, possible d-xylulose kinase B and conserved hypothetical protein | Between L15 and secY
Mycobacterium leprae TN | Arylsulfatase pseudogene and 2 hypothetical proteins | Between S17 and L14
Mycobacterium leprae TN | Possible protease IV sppA, possible d-xylulose kinase B and conserved hypothetical protein | Between L15 and secY
Mycobacterium tuberculosis H37Rv | Arylsulfatase and 2 hypothetical proteins | Between S17 and L14
Mycobacterium tuberculosis H37Rv | Possible protease IV sppA, possible d-xylulose kinase B and conserved hypothetical protein | Between L15 and secY
Neisseria meningitidis Z2491 | 3 Hypothetical proteins | Between S10 and L3
Pirellula sp. strain 1 | Hypothetical protein | Between S10 and L3
Thermoanaerobacter tengcongensis MB4T | Hypothetical protein | Between map and infA

Multiple genes found in the S10spcalpha operons in E. coli are found outside these gene clusters in many of the bacterial genomes investigated (Table S1 and Fig. 2). Two different classes can be distinguished. In several genomes, genes coding for ribosomal proteins found in the S10spcalpha operons in E. coli are now found outside these clusters. These genes are not grouped together but are found at different positions, scattered throughout the genome. This is the case for Agrobacterium tumefaciens, Rhodopseudomonas palustris, Rickettsia conorii, Rickettsia prowazekii, S. meliloti, Neisseria meningitidis, Ralstonia solanacearum, Staphylococcus aureus, Pasteurella multocida, Photorhabdus luminescens, Pseudomonas aeruginosa, Pseudomonas putida, Pseudomonas syringae, Salmonella enterica, Shewanella oneidensis, Vibrio cholerae, Vibrio parahaemolyticus, Vibrio vulnificus, Xanthomonas axonopodis, Xanthomonas campestris, Xylella fastidiosa, Prochlorococcus marinus, Synechococcus sp., Synechocystis sp., Thermosynechococcus elongatus, Pirellula sp. and Treponema pallidum. However, it appears that in other genomes, multiple genes found in the S10spcalpha operons in E. coli have formed novel, separate ribosomal gene clusters (data not shown). For example, in the Campylobacter jejuni genome the genes infA, L36, S13, S11, S4, rpoA and L17 form a separate gene cluster located in a different position of the genome. Whether these genes or gene clusters found outside the S10, spc and alpha operons represent horizontal gene transfer followed by deletion of the original gene or gene clusters in the S10, spc and alpha operons, or are the result of one or more genomic rearrangements within the genome, is at present not clear. In several bacterial genomes, genes located in the S10spcalpha operons in E. coli are found in other ribosomal gene clusters (data not shown). For example, the S10 gene is located in the S12 ribosomal gene cluster in the genome of all species of the Chlamydiae and the Cyanobacteria. Similarly, the S4 gene of Mycoplasma gallisepticum is also located in the S12 cluster. There are also several examples of changes in gene order within the S10, spc and alpha gene clusters; for example, in Thermotoga maritima, infA is localised at the 3′ end of the L17 gene, while in Mycoplasma penetrans, S3 is localised between S17 and L29.

The genomes of several organisms included in this study consist of multiple replicons (A. tumefaciens, Brucella melitensis, Brucella suis, R. solanacearum, V. cholerae, V. vulnificus, V. parahaemolyticus, Leptospira interrogans and Deinococcus radiodurans). In all these organisms the S10, spc and alpha gene clusters were located on the largest replicon.

Organisation of S10, spc and alpha gene clusters in archaeal genomes

When we compared the organisation of S10, spc and alpha gene clusters in 12 sequenced archaeal genomes, 10 different organisations were observed (see Table S2). The organisation of the S10, spc and alpha gene clusters of Methanosarcina acetivorans, Pyrococcus furiosus, Archaeoglobus fulgidus, S. solfataricus, Halobacterium sp., Thermoplasma acidophilum and Methanothermobacter thermoautotrophicum is somewhat similar to the organisation in bacterial genomes, with most variation being localised in the 3′ half of the spc gene cluster and in the alpha gene cluster. However, the organisation of the S10, spc and alpha gene clusters of Aeropyrum pernix, Methanocaldococcus jannaschii, Methanopyrus kandleri, Nanoarchaeum equitans and Pyrobaculum aerophilum is totally different (Table S2), and these gene clusters actually appear to be absent in P. aerophilum and N. equitans.

We also noted the insertion of several genes coding for other ribosomal proteins between genes localised in the S10, spc and alpha gene clusters (data not shown). For example, the genes coding for ribosomal proteins L32 and L19 were inserted between the genes coding for L6 and L18 in all archaeal genomes (except in A. pernix, N. equitans and P. aerophilum), while L7 was inserted between S5 and L15 in T. acidophilum. The S10 gene is colocalised with the S12 gene cluster in all archaeal genomes (except those of P. furiosus, N. equitans and P. aerophilum), while the S4 gene is located between L24 and L5 in all archaeal genomes (except those of A. pernix, M. thermoautotrophicum, N. equitans and P. aerophilum). There are also several examples of genes normally found in the S10, spc and alpha gene clusters that now form a separate gene cluster at another location in the genome. This is, for example, the case for the S13, S4 and S11 genes in all archaeal genomes investigated, and for the L3, L4 and L23 genes in A. pernix.

Phylogenies based on amino acid sequences of ribosomal proteins L22, L16 and S14 and comparison with phylogenies based on 16S rRNA gene sequences and on organisation of S10, spc and alpha gene clusters

The 16S rRNA gene has been widely used to infer phylogenetic relationships among prokaryotes. There is however considerable concern that single-gene trees may not adequately reflect phylogenetic relationships, because of the possibility of horizontal gene transfer. For this reason, the sequences of protein coding genes have been used to deduce phylogenetic relationships between organisms, including genes coding for ribosomal proteins [29,30]. Data from the present study indicate that, from the ribosomal proteins encoded by genes localised in the S10, spc and alpha gene clusters, L22 and L16 are the most “stable” genes (i.e. they are present in all bacterial genomes in the same location within the S10, spc and alpha gene clusters). As there is some evidence that the S14 gene might be horizontally transferred [20], we also included the S14 protein in our phylogenetic analysis.

A phylogenetic tree based on 16S rRNA gene sequences is shown in Fig. 3. Overall, the phylogenies derived from L16 and L22 sequences were similar to each other and to the phylogeny derived from the 16S rRNA gene sequences (Table 2, Fig. 4). The main discrepancies between the 16S rRNA gene sequence based tree and the tree based on L16 sequences were: (i) the close relationships between the Actinobacteria and the Firmicutes; (ii) the separate positions of the mycoplasmas and members of the genus Clostridium; (iii) the fact that the β-Proteobacteria appear as a subgroup of the γ-Proteobacteria; (iv) the fact that the δ-proteobacterium Geobacter sulfurreducens does not group with the other Proteobacteria; (v) the separate position of the spirochaete L. interrogans. The main discrepancies between the 16S rRNA gene sequence based tree and the tree based on L22 sequences were: (i) the fact that the β-Proteobacteria appear as a subgroup of the γ-Proteobacteria; (ii) the fact that the δ- and ε-Proteobacteria seem unrelated to each other and to the other Proteobacteria; (iii) the separate position of L. interrogans; (iv) the close relationship between the Cyanobacteria and the Actinobacteria. The correlation between the grouping obtained based on 16S rRNA gene sequence similarity and S14 protein sequence similarities was lower, and several differences between both trees can be observed (Figs. 3 and 5). The main discrepancies between the 16S rRNA gene sequence based tree and the tree based on S14 sequences were: (i) the subdivision of the Actinobacteria; (ii) the separate position of Clostridium perfringens and Clostridium acetobutylicum; (iii) the separate position of Streptococcus pneumoniae; (iv) the separate positions of the δ- and ε-Proteobacteria. When comparing the sequence-based trees to the tree based on the organisation of the S10, spc and alpha gene clusters, several differences were noted. Most remarkable were the diversity of the ε-Proteobacteria, and the positions of the clostridia, the spirochaetes, S. mutans, Gloeobacter violaceus, M. gallisepticum and M. penetrans in the tree based on the organisation of the S10, spc and alpha gene clusters (Fig. 1). The overall Pearson product moment correlation coefficients between organisational similarity and 16S rRNA gene, L16, L22 and S14 sequence similarity were high (86.7%, 77.4%, 76.9% and 88.0%, respectively) (Table 2, Fig. 4).
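The comparison of similarity matrices by Pearson product moment correlation can be sketched as follows. This is a generic illustration, not the Bionumerics procedure: it assumes that only the entries below the diagonal of each matrix are compared, and the function names and the two small matrices are hypothetical.

```python
from math import sqrt


def lower_triangle(matrix):
    """Return the entries below the diagonal, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(x, y):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def matrix_correlation(m1, m2):
    """Correlation (in %) between the pairwise similarities of two data sets."""
    return 100.0 * pearson(lower_triangle(m1), lower_triangle(m2))


# Hypothetical similarity matrices for the same three genomes.
organisation = [[100, 80, 60], [80, 100, 70], [60, 70, 100]]
sequence = [[100, 90, 70], [90, 100, 80], [70, 80, 100]]
r = matrix_correlation(organisation, sequence)
```

In this toy case the two sets of pairwise similarities differ only by a constant offset, so the correlation is at its maximum; real data sets such as those in Table 2 give lower values.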

Fig. 3. Phylogenetic tree based on 16S rRNA gene sequences. The scale bar represents 10% sequence dissimilarity.

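Table 2. Pearson product moment correlation coefficients (%) between the similarity matrices of the different data sets

Data set | Organisation | 16S rRNA gene | L16 | L22 | S14
Organisational similarity | 100
Sequence similarity of 16S rRNA gene | 86.7 | 100
Sequence similarity of L16 | 77.4 | 82.3 | 100
Sequence similarity of L22 | 76.9 | 80.7 | 79.5 | 100
Sequence similarity of S14 | 88.0 | 72.7 | 73.9 | 74.5 | 100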
Fig. 4. Concordance between the phylogeny derived from the organisation of the S10, spc and alpha gene clusters, and the phylogenies derived from the 16S rRNA gene sequences and the sequences of ribosomal proteins L16, L22 and S14.

Fig. 5. Phylogenetic tree based on S14 sequences. The scale bar represents 10% sequence dissimilarity.

Conclusions

Although it was previously reported that the S10spcalpha operon, encoding ribosomal proteins and a number of housekeeping genes, was similar in all bacterial genomes [5,9], data from the present study clearly indicate that there is much variation in gene order and content in these gene clusters. Whether the differences in organisation are partially or entirely due to (i) genomic rearrangements in the genome, (ii) lineage-specific gene loss (preceded by gene duplications or not) and/or (iii) horizontal gene transfer is at present not clear. However, evidence for the role of horizontal gene transfer in the evolution of ribosomal proteins was presented before [20–23], and the observed discrepancies between phylogenetic trees based on 16S rRNA gene sequences and trees based on sequences of ribosomal proteins L16, L22 and S14 also suggest that horizontal gene transfer may have played a significant role in the evolution of the S10spcalpha operon. More detailed studies will be required to confirm this. Our data also indicate that differential gene loss has occurred on multiple occasions during evolution. In addition, the determination of the organisation of the S10, spc and alpha gene clusters can provide additional, sequence-independent information that can be used to deduce phylogenetic relationships between prokaryotes.

Acknowledgements

T.C. and P.V. are indebted to the Fund for Scientific Research – Flanders (Belgium) for a position as postdoctoral fellow and research grants, respectively. T.C. also acknowledges the support from the Belgian Federal Government (Federal Office for Scientific, Technical and Cultural Affairs).

References

[1] Tatusov, R.L., Mushegian, A.R., Bork, P., Brown, N.P., Hayes, W.S., Borodovsky, M., Rudd, K.E., Koonin, E.V. (1996) Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli. Curr. Biol. 6, 279–291.
[2] Mushegian, A.R., Koonin, E.V. (1996) Gene order is not conserved in bacterial evolution. Trends Genet. 12, 289–290.
[3] Kolsto, A.B. (1997) Dynamic bacterial genome organisation. Mol. Microbiol. 24, 241–248.
[4] Siefert, J.L., Martin, K.A., Abdi, F., Widger, W.R., Fox, G.E. (1997) Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA. J. Mol. Evol. 45, 467–472.
[5] Watanabe, H., Mori, H., Itoh, T., Gojobori, T. (1997) Genome plasticity as a paradigm of eubacterial evolution. J. Mol. Evol. 44, S57–S64.
[6] Suyama, M., Bork, P. (2001) Evolution of prokaryotic gene order: genome rearrangements in closely related species. Trends Genet. 17, 10–13.
[7] Tamames, J. (2001) Evolution of gene order conservation in prokaryotes. Genome Biol. 2, 0020.1–0020.11.
[8] Tamames, J., Casari, G., Ouzounis, C., Valencia, A. (1997) Conserved gene clusters of functionally related genes in two bacterial genomes. J. Mol. Evol. 44, 66–73.
[9] Dandekar, T., Snel, B., Huynen, M., Bork, P. (1998) Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci. 23, 324–328.
[10] Itoh, T., Takemoto, K., Mori, H., Gojobori, T. (1999) Evolutionary instability of operon structures disclosed by sequence comparisons of complete microbial genomes. Mol. Biol. Evol. 16, 332–346.
[11] Nomura, M., Gourse, R., Baughman, G. (1984) Regulation of the synthesis of ribosomes and ribosomal components. Annu. Rev. Biochem. 53, 75–117.
[12] Lindahl, L., Zengel, J.M. (1986) Ribosomal genes in Escherichia coli. Annu. Rev. Genet. 20, 297–326.
[13] Ohkubo, S., Muto, A., Kawauchi, Y., Yamao, F., Osawa, S. (1987) The ribosomal protein gene cluster of Mycoplasma capricolum. Mol. Gen. Genet. 210, 314–322.
[14] Kaul, R., Gray, G.J., Koehncke, N.R., Gu, L. (1992) Cloning and sequence analysis of the Chlamydia trachomatis spc ribosomal protein gene cluster. J. Bacteriol. 174, 1205–1212.
[15] Barloy-Hubler, F., Lelaure, V., Galibert, F. (2001) Ribosomal protein gene cluster analysis in eubacterium genomics: homology between Sinorhizobium meliloti strain 1021 and Bacillus subtilis. Nucleic Acids Res. 29, 2747–2756.
[16] Sugita, M., Sugishita, H., Fujishiro, T., Tsuboi, M., Sugita, C., Endo, T., Sugiura, M. (1997) Organisation of a large gene cluster encoding ribosomal proteins in the cyanobacterium Synechococcus sp. strain PCC6301: comparison of gene clusters among Cyanobacteria, Eubacteria and Chloroplast genomes. Gene 195, 73–79.
[17] Ianniciello, G., Gallo, M., Arcari, P., Bocchini, V. (1994) Organisation of a Sulfolobus solfataricus gene cluster homologous to the Escherichia coli str operon. Biochem. Mol. Biol. Int. 33, 927–937.
[18] Fujita, T., Itoh, T. (1995) Organisation and nucleotide sequence of a gene cluster comprising the translation elongation factor 1 alpha, ribosomal protein S10 and tRNA(Ala) from Halobacterium halobium. Biochem. Mol. Biol. Int. 37, 107–115.
[19] Jain, R., Rivera, M.C., Lake, J.A. (1999) Horizontal gene transfer among genomes: the complexity hypothesis. Proc. Natl. Acad. Sci. USA 96, 3801–3806.
[20] Brochier, C., Philippe, H., Moreira, D. (2000) The evolutionary history of ribosomal protein RpS14: horizontal gene transfer at the heart of the ribosome. Trends Genet. 16, 529–533.
[21] Makarova, K.S., Ponomarev, V.A., Koonin, E.V. (2001) Two C or not two C: recurrent disruption of Zn-ribbons, gene duplication, lineage-specific gene loss, and horizontal gene transfer in evolution of bacterial ribosomal proteins. Genome Biol. 2, 0033.1–0033.14.
[22] Garcia-Vallvé, S., Simo, F.X., Montero, M.A., Arola, L., Romeu, A. (2002) Simultaneous horizontal gene transfer of a gene coding for ribosomal protein L27 and operational genes in Arthrobacter sp. J. Mol. Evol. 55, 632–637.
[23] Matte-Tailliez, O., Brochier, C., Forterre, P., Philippe, H. (2002) Archaeal phylogeny based on ribosomal proteins. Mol. Biol. Evol. 19, 631–639.
[24] Lecompte, O., Ripp, R., Thierry, J.C., Moras, D., Poch, O. (2002) Comparative analysis of ribosomal proteins in complete genomes: an example of reductive evolution at the domain scale. Nucleic Acids Res. 30, 5382–5390.
[25] Van de Peer, Y., De Wachter, R. (1994) TREECON for Windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. Comput. Appl. Biosci. 10, 569–570.
[26] Saitou, N., Nei, M. (1987) The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
[27] Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
[28] Skovgaard, M., Jensen, L.J., Brunak, S., Ussery, D., Krogh, A. (2001) On the total number of genes and their length distribution in complete microbial genomes. Trends Genet. 17, 425–428.
[29] Brown, J.R., Douady, C.J., Italia, M.J., Marshall, W.E., Stanhope, M.J. (2001) Universal trees based on large combined protein sequence data sets. Nat. Genet. 28, 281–285.
[30] Wolf, Y.I., Rogozin, I.B., Grishin, N.V., Tatusov, R.L., Koonin, E.V. (2001) Genome trees constructed using five different approaches suggest new major bacterial clades. BMC Evol. Biol. 1, 8.