Attenuation regulation of amino acid biosynthetic operons in proteobacteria: comparative genomics analysis

Candidate attenuators were identified that regulate operons responsible for biosynthesis of branched-chain amino acids, histidine, threonine, tryptophan, and phenylalanine in γ- and α-proteobacteria, and in some cases in low-GC Gram-positive bacteria, Thermotogales and Bacteroidetes/Chlorobi. This allowed us not only to describe the evolutionary dynamics of regulation by attenuation of transcription, but also to annotate a number of hypothetical genes. In particular, orthologs of ygeA of Escherichia coli were assigned the branched-chain amino acid racemase function. Three new families of histidine transporters were predicted: orthologs of yuiF and yvsH of Bacillus subtilis, and lysQ of Lactococcus lactis. In Pasteurellales, the single bifunctional aspartate kinase/homoserine dehydrogenase gene thrA was predicted to be regulated not only by threonine and isoleucine, as in E. coli, but also by methionine. In α-proteobacteria, the single acetolactate synthase operon ilvIH was predicted to be regulated by branched-chain amino acid-dependent attenuators. Histidine biosynthetic operons his were predicted to be regulated by histidine-dependent attenuators in Bacillus cereus and Clostridium difficile, and by histidine T-boxes in L. lactis and Streptococcus mutans. © 2004 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.


Introduction
Bacteria use many different regulatory mechanisms to control transcription and translation of genes in response to the concentration of metabolic products. One possible target for regulation is the nascent transcript during transcription elongation. Attenuation or antitermination mechanisms that involve formation of alternative RNA structures have been observed in diverse bacterial groups, with different molecules influencing the choice between these structures [1,2]. In enteric bacteria, many amino acid biosynthetic operons (trp, his, leu, ilvGMEDA, ilvBN, and thr) as well as the phenylalanyl-tRNA synthetase operon pheST are regulated by transcription attenuation [3].
This mechanism is based on coupling between transcription and translation. The nascent leader transcript contains a short open reading frame that encodes the leader peptide. Soon after transcription initiation, a secondary structure element (1:2) forms that causes RNA polymerase to pause (Fig. 1A). This pause allows the ribosome to initiate translation of the leader peptide. The translating ribosome then disrupts the paused complex and transcription resumes, coupled with translation. Two outcomes are then possible, depending on the level of the relevant amino acid in the cell. Under amino acid starvation, the level of charged tRNA is low, which causes the ribosome to stall at codons for this amino acid (regulatory codons). As transcription proceeds, the antiterminator structure (2:3) folds and prevents terminator formation, resulting in transcription readthrough into downstream genes (Fig. 1B). Under amino acid excess, the level of charged tRNA is high and translation proceeds efficiently to the stop codon of the leader peptide. As the ribosome translates the leader peptide, it prevents formation of the antiterminator structure, thereby promoting formation of the terminator (3:4), which causes premature termination of transcription (Fig. 1C).
Thus, the ribosome plays the role of a mediator, sensing the concentration of charged tRNA, which in turn depends on the concentration of the amino acid. Expression of an operon corresponding to a biosynthetic pathway common to several amino acids may be regulated by all of these amino acids; in this case the leader peptide reading frame contains several types of regulatory codons, covering all of these amino acids.
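The two outcomes of the attenuation mechanism can be summarized as a small decision rule. The sketch below is only an illustration of the logic described in the text: the function name, the relative tRNA levels, and the `stall_threshold` cutoff are all hypothetical and are not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class AttenuationOutcome:
    ribosome_stalls: bool   # does the ribosome stall at regulatory codons?
    structure: str          # "2:3" (antiterminator) or "3:4" (terminator)
    readthrough: bool       # is the downstream operon transcribed?


def attenuation_outcome(charged_trna, regulatory_amino_acids, stall_threshold=0.5):
    """Toy model of attenuation.

    charged_trna maps an amino acid name to a relative charged-tRNA level
    (0..1, arbitrary units; an assumption made for this sketch). An amino
    acid missing from the mapping is treated as abundant.
    """
    stalls = any(
        charged_trna.get(aa, 1.0) < stall_threshold
        for aa in regulatory_amino_acids
    )
    if stalls:
        # Starvation: stalled ribosome lets the antiterminator (2:3) fold.
        return AttenuationOutcome(True, "2:3", True)
    # Excess: ribosome reaches the stop codon, terminator (3:4) forms.
    return AttenuationOutcome(False, "3:4", False)


# A thr-like attenuator with threonine and isoleucine regulatory codons.
starved = attenuation_outcome({"Thr": 0.1, "Ile": 0.9}, ["Thr", "Ile"])
replete = attenuation_outcome({"Thr": 0.9, "Ile": 0.9}, ["Thr", "Ile"])
```

In this toy model, shortage of any one regulating amino acid is enough to allow readthrough, which matches the description of leader peptides carrying regulatory codons for several amino acids.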
Comparative analysis of bacterial genomes is a powerful approach to the analysis of regulation at the DNA or RNA level and to the reconstruction of metabolic pathways [4–6]. Using available experimental data as a training set, we developed a program for prediction of attenuators (named LLLM [7,38]) and applied it to the analysis of upstream regions of orthologous amino acid biosynthetic genes. This resulted in identification of candidate attenuators not only in γ-proteobacteria, but also in α- and β-proteobacteria, low-GC Gram-positive bacteria, and bacteria from some other taxa (Table 1). Analysis of regulatory peptide open reading frames allowed for prediction of the regulating amino acids. Finally, analysis of positional clustering of genes and regulatory signals led to identification of new candidate members of the biosynthetic pathways of branched-chain amino acids, histidine, threonine, and aromatic amino acids.
In Escherichia coli, isoleucine, leucine, and valine biosynthetic genes ("ILV genes" below) are clustered in several operons: ilvGMEDA, ilvBN, ilvC, ilvIH, and leuABCD [8]. Three paralogs of acetolactate synthase are encoded by genes ilvBN, ilvIH, and ilvGM from three different transcriptional units. The ilvBN and ilvIH genes are transcribed as separate operons, whereas ilvGM is located within the ilvGMEDA operon. The ilvGMEDA and ilvBN operons are regulated by transcription attenuation, and the leader peptide reading frame of the attenuator contains regulatory codons for all three amino acids, isoleucine, leucine, and valine [9]. The leuABCD operon contains genes for leucine biosynthesis, and expression of this operon is also regulated by transcription attenuation [10]. The leader peptide of the leu transcription attenuator includes regulatory codons for only one amino acid, leucine. These and other operons are also regulated by repressors of transcription: ilvC by IlvY, and the ilvIH and ilvGMEDA operons by LRP [11–14].
The histidine biosynthesis pathway consists of 10 steps and starts from 5-phosphoribosyl diphosphate, a product of the pentose phosphate pathway (Fig. 2B). Histidine biosynthesis in E. coli involves nine enzymes: HisGEIAFHBCD, HisF, and HisH being isozymes [15]. All genes of the histidine pathway are known to form one his operon regulated via transcription attenuation [16]. The leader peptide reading frame of the histidine attenuator includes a run of histidine regulatory codons.
Threonine biosynthesis is linked with biosynthesis of other amino acids: aspartate, lysine, methionine, and branched-chain amino acids (Fig. 2C). A part of the pathway, which is common to threonine, methionine, and lysine biosynthesis, starts from aspartate. E. coli has three aspartate kinase isozymes, ThrA, MetL, and LysC, that catalyze the conversion of aspartate to 4-aspartylphosphate [17,18]. ThrA and MetL have an additional homoserine dehydrogenase (Hom) domain that catalyzes conversion of aspartate 4-semialdehyde to homoserine. The biosynthesis of branched-chain amino acids starts at threonine (Fig. 2C).
In E. coli, expression of the three isozyme genes, thrA, metL, and lysC, is regulated differently. Transcription of the thrABC operon is regulated by a threonine-isoleucine-dependent attenuator [19]. Regulation of the thrABC operon by isoleucine is an interesting example of repression by a distant product (biosynthesis of branched-chain amino acids is known to start from threonine). The aspartokinase activity of ThrA is feedback-inhibited by threonine [17]. The metBL operon is regulated by the repressor MetJ in response to the concentration of S-adenosylmethionine [18]. Finally, lysC is possibly regulated by a lysine riboswitch, the LYS-element, in response to the concentration of lysine (mutations in the leader region of lysC release the lysine repression in E. coli [20] and, moreover, a LYS-element is located upstream of lysC [21–23]), whereas the aspartokinase activity of LysC is feedback-inhibited by lysine. Thus, the expression and activity of the ThrA, MetL, and LysC isozymes are controlled by the concentration of the respective amino acids.
The trp operon of E. coli is regulated both by transcription attenuation and by transcription repression. The transcription repressor TrpR regulates transcription initiation [25], whereas premature termination of transcription is under control of an attenuator containing two tryptophan codons [26]. The pheA gene, encoding chorismate mutase/prephenate dehydratase, and the pheST operon, encoding phenylalanyl-tRNA synthetase, are regulated by phenylalanine attenuation [27,28]. In the α-proteobacterium Rhizobium meliloti, the trp(E/G) gene is known to be regulated by transcriptional attenuation [29]. In Gram-positive bacteria, tryptophan biosynthetic genes are known to be regulated by the T-box antitermination mechanism or by TRAP [30,31]. Previously we analyzed regulation of aromatic amino acid biosynthesis in γ-proteobacteria [32]. Here we extend this analysis, considering newly sequenced genomes from all proteobacteria.

Data and methods
Complete and partial sequences of bacterial genomes were downloaded from GenBank [33]. Preliminary sequence data were also obtained from the WWW sites of The Institute for Genomic Research (http://www.tigr.org), University of Oklahoma's Advanced Center for Genome Technology (http://www.genome.ou.edu), the Sanger Centre (http://www.sanger.ac.uk), the DOE Joint Genome Institute (http://www.jgi.doe.gov), and the ERGO Database [34]. The list of genomes with taxonomy and abbreviations is given in Table 1.
Protein similarity search was done using the Smith-Waterman algorithm implemented in the GenomeExplorer program [35]. Orthologous proteins were initially defined by the best bidirectional hit criterion [36] and, if necessary, confirmed by construction of phylogenetic trees for the corresponding protein families. The phylogenetic trees were constructed by the maximum likelihood method implemented in PHYLIP [37]. Multiple sequence alignments were done using CLUSTALW [38]. Transmembrane segments were predicted using TMpred (http://www.ch.embnet.org/software/TMPRED_form.html). The COG [36] and InterPro [39] databases were used to verify the protein functional and structural annotation.
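The best bidirectional hit criterion can be illustrated with a short sketch. It assumes that pairwise similarity scores between proteins of two genomes are already available as dictionaries; the function names, gene labels, and scores are invented for illustration and do not reproduce GenomeExplorer.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict {(query, subject): similarity score}.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy scores (hypothetical): genome A has two paralogs, genome B has one
# bifunctional gene plus a monofunctional kinase.
a_vs_b = {("thrA_A", "thrA_B"): 900, ("thrA_A", "lysC_B"): 300,
          ("metL_A", "thrA_B"): 500}
b_vs_a = {("thrA_B", "thrA_A"): 900, ("thrA_B", "metL_A"): 500,
          ("lysC_B", "thrA_A"): 300}
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

In this toy case only one pair passes the criterion; paralogs such as the second gene in genome A are exactly the situation in which the text falls back on phylogenetic trees.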
Attenuators of transcription were found using the LLLM program. This program identifies candidate attenuators defined as alternative RNA hairpins such that the upstream hairpin overlaps a short open reading frame (candidate leader peptide) containing runs of regulatory codons, whereas the downstream hairpin is a candidate terminator followed by a run of Us. For details see [7,40,41].
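One ingredient of this definition, a hairpin followed by a run of Us, can be sketched as follows. This is not the LLLM algorithm: it is a deliberately crude scan that accepts only perfect Watson-Crick stems of fixed length, ignores G-U pairs and folding energies, and uses parameter values (`stem`, loop bounds, `min_us`) chosen arbitrarily for the example.

```python
import re

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_terminator_like(rna, stem=6, min_loop=3, max_loop=8, min_us=4):
    """Return (stem_start, stem_end) pairs for perfect hairpins whose 3' arm
    is followed, within two nucleotides, by a run of at least min_us Us."""
    hits = []
    n = len(rna)
    for i in range(n - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop            # start of the 3' arm
            right = rna[j:j + stem]
            if len(right) < stem:
                break
            if right == reverse_complement(left):
                tail = rna[j + stem:j + stem + min_us + 2]
                if re.search("U{%d,}" % min_us, tail):
                    hits.append((i, j + stem))
    return hits


# Invented test sequences: a GC-rich hairpin with and without a U-run.
with_u_run = "AAAAGCCGCCUUCGGGCGGCUUUUUUAA"
without_u_run = "AAAAGCCGCCUUCGGGCGGCACACACAA"
hits = find_terminator_like(with_u_run)
```

A realistic search would also have to find the overlapping antiterminator hairpin and the leader peptide reading frame, and would score stems by stability rather than by exact complementarity.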

Isoleucine, leucine, and valine biosynthesis
Orthologs of the branched-chain amino acid (ILV) genes in genomes of γ-, β-, and α-proteobacteria were identified by similarity search. Positional gene clusters corresponding to possible ILV operons are shown in Table 2. Then, the LLLM program was applied to upstream regions of the predicted ILV operons in all proteobacterial genomes. New candidate transcriptional attenuators were identified.
Attenuator-like signals were found in upstream regions of candidate ilv operons in γ-proteobacteria (Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonadales). In Pseudomonadales and other bacteria, the ilv genes are scattered across the genome, and some of them are also preceded by candidate attenuators. The ilvBN operon, which encodes one of the acetolactate synthase isozymes in Enterobacteria, was also predicted to be regulated by the attenuation mechanism via leucine and valine regulatory codons. Other predicted attenuators include regulatory codons for three amino acids, isoleucine, leucine, and valine, similar to the experimentally studied attenuators of E. coli (Fig. 3).
The structure of the candidate ilv biosynthetic operons varies in the analyzed genomes. For example, the order of genes in the ilv operon is ilvGMEDA in Enterobacteria and Vibrionales, but in Xanthomonadales, the order is ilvCGM-tdcB-leuA. In the latter case, the tdcB gene is possibly co-regulated with the ILV genes. Its product is threonine dehydratase which catalyzes reactions in both serine and ILV metabolism.
Another possible co-regulation event was observed in Pasteurella multocida. A gene with unknown function (orthologous to the hypothetical gene ygeA of E. coli) is located within the ilv operon (ilvGM-ygeA-DA), and a candidate attenuator was found upstream of this operon. YgeA is weakly similar to the amino acid racemase protein RacX from B. subtilis, which converts L-aspartate to D-aspartate [42,43]. Thus, ygeA likely encodes a new kind of racemase, possibly ILV racemase.
The leu operon, which includes only genes for leucine synthesis, is predicted to be regulated by attenuation in some γ-proteobacteria (Enterobacteria, Pasteurellales, Vibrionales, Alteromonadales, and Xanthomonadales), but not in Pseudomonadales and other species. The leader peptide reading frames of all predicted attenuators include runs of leucine codons.
Little is known about regulation of ILV genes in α-proteobacteria. Expression of the ilvIH genes encoding the two subunits of acetolactate synthase has been studied in Caulobacter crescentus, and the region between ilvIH and the transcription initiation site was shown to have the properties of a transcription attenuator [44] (in the cited paper this operon is called ilvBN, not ilvIH, but phylogenetic analysis of all three acetolactate synthases shows that this gene is located on the branch corresponding to ilvIH, data not shown). We analyzed upstream regions of all ILV genes of available α-proteobacterial genomes and found attenuator-like structures (Table 2). α-Proteobacteria have one acetolactate synthase, IlvIH. The ilvIH operon is possibly regulated by transcription attenuation in Rhizobiales (Sinorhizobium meliloti, Agrobacterium tumefaciens, Mesorhizobium loti, Bradyrhizobium japonicum, Rhodopseudomonas palustris, and Brucella melitensis), Rhodobacter spp., Magnetospirillum magnetotacticum, C. crescentus, and a deeply rooted bacterium Deinococcus radiodurans (Deinococcus/Thermus group). The leader peptide reading frames of predicted attenuators include runs of isoleucine, leucine, and valine regulatory codons (Fig. 3). Conversely, in γ-proteobacteria, operons encoding two other acetolactate synthase isoenzymes, ilvBN (present only in Enterobacteria) and ilvGM, but not ilvIH, are regulated by attenuators.
There exist two groups of homologous 2-isopropylmalate synthases, leuA and leuA2 (approx. 30% sequence identity). The leuA genes, orthologs of leuA from E. coli, were observed in γ-proteobacteria, excluding Pseudomonadales, and in some α-proteobacteria, whereas the leuA2 genes, homologs of well-studied 2-isopropylmalate synthases from Actinobacteria and Fungi, in particular Corynebacterium glutamicum [45] and Saccharomyces cerevisiae [46], respectively, were observed in α-proteobacteria, some β-proteobacteria and Pseudomonadales. In α-proteobacteria, both types of 2-isopropylmalate synthase genes have candidate attenuators in upstream regions (Table 2). Although these attenuators have leader peptide reading frames with runs of leucine regulatory codons, the terminator structures are weak and lack runs of uridines (Fig. 3). A similar situation was observed in the case of the trpE and trpGDC operons in Pseudomonas putida, where transcripts were attenuated despite the absence of strong ρ-independent terminator structures [47]. Moreover, we found a possible attenuator with a strong G/C-rich terminator upstream of the leuA gene in D. radiodurans.
Table 2 Predicted operon structures and regulation of the ILV genes. Predicted attenuators are denoted by '&' and '%' (the latter lack terminator-like RNA secondary structures, see the text). Divergently located genes are separated by '<->'. Contig ends are marked by square brackets. Known and possible (REG) regulators from LysR family are shown in bold. Genes with unknown function are denoted by 'x' (with numbers for orthologous genes).

Histidine biosynthesis
Orthologs of the histidine biosynthetic (HIS) genes in bacterial genomes were identified by similarity search. Positional gene clusters corresponding to candidate HIS operons are listed in Table 3. The LLLM program with parameters obtained by analysis of known attenuator structures was used to scan the upstream regions of predicted HIS operons in all analyzed genomes (for details see [7]). New candidate transcriptional attenuators were identified, mainly in γ-proteobacteria. We also identified attenuator-like structures in some low-GC Gram-positive bacteria, the Bacteroidetes/Chlorobi group and Thermotogales.
Positional analysis and analysis of regulation showed that in most γ-proteobacteria (Enterobacteria, Pasteurellales, Vibrionales, and Shewanella oneidensis), all histidine biosynthetic genes are clustered and possibly regulated via the transcription attenuation mechanism (Table 3). All candidate attenuators upstream of the his operons in these bacteria have similar features: a leader peptide reading frame with a run of histidine regulatory codons and terminator/antiterminator structures (Fig. 4). We found no attenuators upstream of HIS genes in Pseudomonadales, Xanthomonadales, and some other γ-proteobacteria.
Analysis of upstream regions of HIS genes in other taxonomic groups revealed attenuator-like structures in the Bacillus/Clostridium group, Bacteroidetes/Chlorobi, and Thermotogales. In those cases, histidine biosynthetic operons, which include most of the HIS genes, are possibly regulated. We observed a diversity of mechanisms for regulation of HIS gene expression. In particular, in Lactococcus lactis and Streptococcus mutans, the his operon is regulated by the T-box antitermination mechanism [48; Vitreschak A, unpublished], whereas in Bacillus cereus and Clostridium difficile, the his operon seems to be regulated via transcription attenuation. Other Streptococcus spp. as well as Enterococcus spp. lack histidine biosynthetic genes. Moreover, B. cereus has two copies of the hisZ gene, which are predicted to be regulated by transcriptional attenuation: one as a part of the his biosynthetic operon; the other as a separate gene with a possible histidine attenuator structure in the upstream region (Table 3). The hisS gene in this bacterium, as well as orthologous hisS genes in Bacillus spp., Listeria spp., Enterococcus spp., and L. lactis, are located separately and predicted to be regulated by the T-box antitermination mechanism [49; Vitreschak A, unpublished].
Table 3 Candidate operon structures and predicted regulation of HIS genes. Notation as in Table 2. Gene fusion of hisI and hisE is denoted by (I/E). Histidine-specific T-boxes are denoted by 'T'. LYS-elements are denoted by 'L'.
Several hypothetical genes were predicted to belong to the histidine regulons. HI0325 from Haemophilus influenzae, which encodes a putative transporter with 10 transmembrane segments, has a candidate histidine attenuator in the upstream region. This gene is widely distributed, but not universal in bacteria. In a number of genomes, in particular in Fusobacterium nucleatum and Bacillus halodurans, this gene is clustered with histidine utilization genes (the hut locus). Thus, HI0325 and its orthologs (yuiF in B. subtilis) possibly constitute a new family of histidine transporters.
Another example is the BC0629 gene from B. cereus, which is also possibly regulated via histidine-dependent attenuation. This gene (yvsH in B. subtilis) is homologous to the arginine:ornithine antiporter arcD from Pseudomonas aeruginosa and the lysine permease lysI from Corynebacterium glutamicum. All these proteins belong to the basic amino acid/polyamine antiporter APA family [http://tcdb.ucsd.edu/tcdb/background.php]. B. cereus has two yvsH paralogs, yvsH1 (BC0865) and yvsH2 (BC0629). The former is a candidate lysine transporter whose expression was predicted to be regulated by lysine via the LYS-element riboswitch mechanism [21]. The upstream region of yvsH2 contains a candidate attenuator whose leader peptide reading frame contains a run of histidine regulatory codons (Fig. 4). Thus, yvsH2 (BC0629) is possibly involved in histidine transport. The predicted specificity of this transporter is consistent with experimental data for the homologous HisJ and LAO transporters, which both bind histidine, arginine, lysine, and ornithine, albeit with different affinities towards these ligands [50].
A very similar situation was observed in the case of two paralogous transporter genes in L. lactis, lysP and lysQ. Both proteins are similar (more than 50% identity) to the experimentally identified lysine permease LysP of E. coli [51]. In the L. lactis genome, lysP was predicted to be regulated by a LYS-element and thus to be involved in lysine transport [21]. On the other hand, the upstream region of lysQ contains a candidate histidine attenuator (Fig. 4). Thus, these two transporters may have different affinities for lysine and histidine, which would explain why one is regulated by lysine and the other by histidine.
All genes required for the histidine biosynthesis were identified in all analyzed bacteria, the only exception being the histidinol-phosphatase domain of HisB in Pseudomonas spp. Neither similarity search nor positional analysis and analysis of regulation provided a candidate for this enzymatic activity.
On the other hand, at least three non-homologous genes of unknown function (shown in Table 3 as vatB, actX2, and actX3 in P. multocida, Mannheimia haemolytica, and Polaribacter filamentus, respectively), encoding putative acetyltransferases, are possibly co-regulated with HIS genes. These candidate acetyltransferases could catalyze conversion of histamine to 4-β-acetylaminoethyl-imidazole. This is one of the steps of histidine modification (http://www.genome.ad.jp/kegg/metabolism.html), for which only the enzymatic activity, EC 2.3.1., is known, but no genes have been assigned yet.

Threonine biosynthesis
We analyzed regulation of the thr biosynthetic operon in proteobacteria. Orthologs of thr genes were identified by similarity search. Candidate thr operons and possible regulation are shown in Table 4. Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonadales have the same gene order thrABC in the threonine biosynthetic loci. In Pseudomonadales and some other genomes, the threonine biosynthetic genes are scattered across the genome. Moreover, in Enterobacteria, Pasteurellales, Vibrionales, S. oneidensis, and Xanthomonadales, thrA encodes a bifunctional protein, aspartate kinase/homoserine dehydrogenase, whereas in Pseudomonadales and some other γ-proteobacteria thrA2 (aspartate kinase) and hom (homoserine dehydrogenase) are located in different loci. Finally, two homoserine kinase genes, thrB2 and thrH [52], neither homologous to thrB of E. coli, were observed in Pseudomonadales (Table 4).
Then, we analyzed upstream regions of the predicted thr operons by LLLM trained on known attenuators. New candidate transcriptional attenuators were identified in γ-proteobacteria (Table 4). They have all properties of threonine attenuators: a short leader peptide reading frame with a run of threonine and isoleucine codons, as well as alternative termination and antitermination RNA structures (Fig. 5). Our results predicted that thr operons are regulated by transcription attenuation in Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonas campestris.
Closer analysis showed that in Pasteurellales (H. influenzae, P. multocida, Actinobacillus actinomycetemcomitans, and M. haemolytica), the leader peptide reading frame contains not only standard regulatory codons for threonine and isoleucine, but also numerous methionine codons (Fig. 5). Thus, the thr operons in Pasteurellales seem to be regulated by the concentration of three amino acids, threonine, isoleucine, and methionine, instead of the former two. Indeed, Pasteurellales have only one copy of the bifunctional aspartate kinase/homoserine dehydrogenase protein, instead of the two isozymes ThrA and MetL in other γ-proteobacteria, where the expression of these isozymes is regulated by threonine/isoleucine and by methionine, respectively. Thus, it makes sense that the single ThrA isozyme of Pasteurellales is regulated not only by threonine and isoleucine, but by methionine as well. One more aspartate kinase, the monofunctional LysC, is present in three of the five Pasteurellales, P. multocida, Haemophilus ducreyi, and M. haemolytica, and the expression of lysC has been predicted to be regulated by lysine via LYS-element riboswitches, as in E. coli [21–23].
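The inference of regulating amino acids from a leader peptide reading frame can be illustrated by translating the frame and counting residues. The sketch below is a minimal illustration, not the procedure used by LLLM: the leader sequence is invented (it is not taken from any genome), and the `min_count` cutoff is an arbitrary assumption.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_leader(orf_dna):
    """Translate a leader ORF (DNA, starting at ATG) up to the first stop."""
    peptide = []
    for i in range(0, len(orf_dna) - 2, 3):
        aa = CODON_TABLE[orf_dna[i:i + 3].upper()]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def candidate_regulators(orf_dna, min_count=2):
    """Count residues after the start codon and keep the frequent ones."""
    peptide = translate_leader(orf_dna)
    counts = Counter(peptide[1:])  # skip the initiator methionine
    return {aa: n for aa, n in counts.items() if n >= min_count}


# Invented leader ORF with Thr (ACC), Ile (ATT), and internal Met (ATG) codons.
toy_leader = "ATGACCATTACCACCATTATGACCATGTAA"
peptide = translate_leader(toy_leader)
regulators = candidate_regulators(toy_leader)
```

For the toy frame, threonine, isoleucine, and methionine all pass the cutoff, mirroring the T, I, and M codons reported for Pasteurellales. A raw count ignores whether the codons form a contiguous run, which the attenuator definition requires.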

Tryptophan and phenylalanine biosynthesis
Orthologs of the trp and pheA genes in γ- and α-proteobacteria were identified by similarity search. Candidate trp, pheA, and pheST operons are shown in Table 5. Candidate attenuators were identified upstream of these operons by the LLLM program (Table 5).
Candidate trp attenuators found in Enterobacteria, Vibrionales, and Shewanella oneidensis have leader peptide reading frames with tryptophan regulatory codons and antitermination/termination-like structures (Fig. 6). The trp(E/G) gene, which encodes fused components of anthranilate synthase responsible for the first step of tryptophan biosynthesis, is possibly regulated by transcription attenuation in all analyzed Rhizobiales (an order of α-proteobacteria) and in Bordetella pertussis (belonging to β-proteobacteria).
The pheA operon may be regulated by candidate phenylalanine-dependent attenuators in Enterobacteria, Vibrionales, and S. oneidensis, whereas the pheST operon seems to be regulated only in Enterobacteria.
Candidate attenuators of the trpE and trpGDC operons in Pseudomonadales have some peculiar properties. There is experimental evidence that transcription of the trpE and trpGDC operons is regulated by attenuation [47], but no strong ρ-independent transcriptional terminators could be found in the leader regions of these operons. We aligned sequences upstream of the trpE and trpGDC operons from five pseudomonads. The region of sequence conservation corresponds to a possible leader peptide, which contains two nearly adjacent tryptophan codons (Fig. 6). It seems that in this case the terminator and antiterminator structures are less pronounced and maybe less stable than those in other attenuators.
Table 4 Candidate operon structures and predicted regulation of THR genes. Notation as in Table 2.
Fig. 3. T, I, and M denote, respectively, threonine, isoleucine, and methionine regulatory codons in the leader peptide reading frame.

Discussion
This analysis allowed us to identify a large number of candidate attenuators and to predict the amino acid(s) responsible for the regulation. It demonstrated variability of regulatory mechanisms for the amino acid biosynthetic pathways even in closely related genomes, and allowed for functional annotation of hypothetical genes encoding transporters and enzymes. In particular, candidate attenuators were found in some taxonomic groups where this mechanism of regulation has been studied little (α-proteobacteria, low-GC Gram-positive bacteria) or not at all (Bacteroidetes/Chlorobi group and Thermotogales).
This analysis, as well as other comparative studies, demonstrates the diversity and evolutionary lability of regulatory mechanisms based on formation of alternative RNA structures, especially in low-GC Gram-positive bacteria. Indeed, we observed candidate histidine attenuators regulating his operons in bacilli and clostridia, but T-boxes in the streptococci that have this operon. It is known that transcription attenuation and T-box antitermination mechanisms are prevalent in Proteobacteria and Gram-positive bacteria, respectively. We demonstrate that these different mechanisms, based on switching between two conformations of the nascent RNA transcript, are both involved in regulation of the his operons in low-GC Gram-positive bacteria. For example, candidate histidine attenuators regulate his operons in B. cereus and C. difficile, but not in L. lactis and S. mutans, where this role is taken by histidine T-boxes. Moreover, in B. cereus both regulatory mechanisms are present: histidine attenuators regulate two operons, his and hisZ2, whereas the third one, hisS, is regulated by a histidine T-box. This situation is similar to the one with the methionine biosynthesis pathway, which is regulated by T-boxes in streptococci, S-box riboswitches in bacilli and clostridia, and by transcription repression in lactobacilli [53].
Table 5 Candidate operon structures and predicted regulation of trp and pheA genes. Notation as in Table 2. Gene fusion of trpE and trpG is denoted by (E/G). Gene fusion of trpC and trpF is denoted by (C/F).
In the case of transcription attenuation, we suppose an ancient origin of this regulatory mechanism. Indeed, we found possible attenuators of amino acid biosynthetic genes not only in proteobacteria, but also in low-GC Gram-positive bacteria, Bacteroidetes/