Coevolution between Stop Codon Usage and Release Factors in Bacterial Species

Three stop codons in bacteria represent different translation termination signals, and their usage is expected to depend on their differences in translation termination efficiency, mutation bias, and relative abundance of release factors (RF1 decoding UAA and UAG, and RF2 decoding UAA and UGA). In 14 bacterial species (covering Proteobacteria, Firmicutes, Cyanobacteria, Actinobacteria and Spirochetes) with cellular RF1 and RF2 quantified, UAA is consistently over-represented in highly expressed genes (HEGs) relative to lowly expressed genes (LEGs), whereas UGA usage is the opposite even in species where RF2 is far more abundant than RF1. UGA usage relative to UAG increases significantly with PRF2 [=RF2/(RF1 + RF2)] as expected from adaptation between stop codons and their decoders. PRF2 is > 0.5 over a wide range of AT content (measured by PAT3 as the proportion of AT at third codon sites), but decreases rapidly toward zero at the high range of PAT3. This explains why bacterial lineages with high PAT3 often have UGA reassigned because of low RF2. There is no indication that UAG is a minor stop codon in bacteria as claimed in a recent publication. The claim is invalid because of the failure to apply the two key criteria in identifying a minor codon: (1) it is least preferred by HEGs (or most preferred by LEGs) and (2) it corresponds to the least abundant decoder. Our results suggest a more plausible explanation for why UAA usage increases, and UGA usage decreases, with PAT3, but UAG usage remains low over the entire PAT3 range.

Because different stop codons may manifest as different signals to the cellular translation termination machinery, both experimental and bioinformatic approaches have been taken to characterize translation termination efficiency in association with their decoders. The experimental studies on translation termination have focused mainly on E. coli (and occasionally in the yeast, Saccharomyces cerevisiae) and addressed two questions: (1) which tRNA species tend to misread a stop codon as a near-cognate sense codon and (2) which release factor tends to misread near-cognate sense codons as stop codons.
All three stop codons can be misread by tRNAs, and UGA appears to be the leakiest of the three, with a readthrough frequency of at least 10 À 2 -10 À 3 in Salmonella typhimurium (Roth 1970) and E. coli (Sambrook et al. 1967;Strigini and Brickman 1973). UAA and UAG can also be leaky in bacteria (Davies et al. 1966;Ryden and Isaksson 1984), although their misreading has not been reported as frequently as UGA. Natural UAG readthrough frequency is mostly within the range of 1.1Â10 À 4 -7Â10 À 3 , depending on the nature of the downstream nucleotides (Bossi and Ruth 1980;Bossi 1983;Miller and Albertini 1983;Ryden and Isaksson 1984). The readthrough of UAA seems to occur at frequencies from 9Â10 À 4 to < 1Â10 À 5 (Ryden and Isaksson 1984). Overall, the available experimental data suggest that in bacteria species, particularly in E. coli, readthrough is most frequent for UGA, less for UAG, and least for UAA (Strigini and Brickman 1973;Geller and Rich 1980;Parker 1989;Jorgensen et al. 1993;Meng et al. 1995;Cesar Sanchez et al. 1998;Tate et al. 1999).
Translation termination error rate depends not only on readthrough by tRNA, but also on the efficiency and relative concentration of RF1 and RF2 (Korkmaz et al. 2014). Increasing RF2 concentration decreased both UGA readthrough and frameshift (reviewed in Tate et al. 1999). The observation that UAA is the most frequently used stop codon in E. coli, Bacillus subtilis, and S. cerevisiae (Sharp and Bulmer 1988) was interpreted in light of the fact that UAA has the largest number of decoders (being decoded by both RF1 and RF2) and that it is the most reliable stop signal of the three as reviewed above. Early studies suggest that RF1 and RF2, given the same concentration, decode their respective stop codons with roughly equal efficiency (Scolnick et al. 1968;Jorgensen et al. 1993;Freistroffer et al. 2000;Ito et al. 2000), and that both are extremely efficient and accurate against nearcognate codons, except for UGG in the case of RF2 and UAU in the case of RF1 (Freistroffer et al. 2000). However, given the same codon context, RF2 decoding UGA is less efficient than RF1 decoding UAG in E. coli (Bjornsson and Isaksson 1996).
The effect of both mutation and selection (mediated by relative concentration of RF1 and RF2) on stop codon usage have been studied. The selection effect is derived as an extension of the well-known codon-anticodon adaptation (Ikemura 1981(Ikemura , 1992Akashi and Eyre-Walker 1998;Xia 1998;van Weringh et al. 2011;Chithambaram et al. 2014aChithambaram et al. , 2014bPrabhakaran et al. 2014Prabhakaran et al. , 2015. As UGA is decoded only by RF2 and UAG only by RF1, one expects UGA to be used more than UAG when RF2 concentration is higher than RF1 (assuming that the two have equal decoding efficiency on their respective codons). This is consistent in E. coli, where RF2 is $5 times more frequent than RF1 (Adamski et al. 1994;Mora et al. 2007) and UGA is used much more frequently than UAG.
The mutation effect on stop codon usage is mainly studied through genomic GC content which has a strong effect on stop codon usage based on data from 736 species (Povolotskaya et al. 2012). An even more comprehensive compilation involving 4,684 genomes (Korkmaz et al. 2014) have revealed strong effect of GC content on the frequencies of UAA and UGA, but little on the frequency of UAG. However, the effect of GC content on stop codon usage depends on gene expression (Korkmaz et al. 2014).
These bioinformatic studies (Sharp and Bulmer 1988;Brown et al. 1993;Cridge et al. 2006;Povolotskaya et al. 2012;Korkmaz et al. 2014) have generally found UAA to be the most frequent stop codon and UAG the least frequent. In particular, Korkmaz et al. (2014) claimed that "TAG is truly a minor stop codon in all aspects". Designating codons as major and minor codons are important not only in understanding the function of the translation machinery, but also in biopharmaceutical industry as many experimental studies have shown that replacing minor codons by major codons increases protein production (Robinson et al. 1984;Sorensen et al. 1989;Haas et al. 1996;Ngumbela et al. 2008).
The term "major (or minor) codon" is often misunderstood. "Major codon" (or optimal codon) originally refers to sense codons preferred by highly expressed genes and decoded by the most abundant tRNA. It is first used by McPherson (1988) in reference to a study (Kurland 1987) showing that highly expressed genes use codons to optimize decoding efficiency of the tRNA pool. A minor codon is the opposite. Major and minor codons are not necessarily the most frequent or least frequent codons when compilation is done for all genes.
Two criteria, one essential and one corroborative, have been used, sometimes implicitly, to identify a minor sense codon. The essential criterion is that a minor codon is the most strongly avoided in highly expressed genes (HEGs, in contrast to lowly expressed genes or LEGs). The corroborative criterion is that a minor codon corresponds to the least abundant tRNA among synonymous codons. Without these two criteria, a minor codon could be identified incorrectly (Xia 2015). For example, if we compile the codon frequencies of Asp codon family for all genes in E. coli (NC_000913), we will get 41,806 GAU and 25,015 GAC, which would mislead us to conclude that GAU is the major codon, and GAC the minor. However, if we rank E. coli genes by the protein abundance data compiled in the integrated data set in PaxDB (Wang et al. 2012) or by the index of translation elongation (I TE , Xia 2015), then LEGs (100 genes at the low end of abundant proteins) uses more GAU than GAC, but HEGs (100 genes at the high end of gene expression) uses more GAC than GAU. Furthermore, these Asp codons are translated by three tRNA Asp genes all with the same GUC anticodon forming perfect base-pair with GAC. Thus, both criteria support GAC as the major (optimal) codon, and GAU as the minor. Korkmaz et al. (2014) made an effort to apply these two criteria in identifying major and minor stop codons in bacteria. They compiled 4,684 bacterial genomes and concluded that "in all these phyla, TAG is the minor stop codon", and that "TAG is truly a minor stop codon in all aspects". The conclusion, however, is wrong because of misapplication of the two criteria, which may be best illustrated by taking Microcystis aeruginosa for example. LEGs use more UGA than UAG as stop codons in this species (P UGA,LEG ¼ 0.2970, P UAG.LEG ¼ 0.2393, table 1), but HEGs use more UAG than UGA (P UGA,HEG ¼ 0.2536, P UAG.HEG ¼ 0.1556, table 1). This stop codon usage pattern is consistent with the relative RF1 and RF2 concentrations compiled in the integrated data set available in PaxDB (Wang et al. 2012). Protein abundance is 33.3 ppm (parts per million) for RF1 and 18.2 ppm for RF2 in that integrated data set. The average concentration of RF1 is also higher than RF2 based on multiple separate measurements (table 1). Thus, UAG has more decoders than UGA and is expected to be more preferred than UGA by HEGs, especially given the experimental evidence (reviewed above) that UAG is a more accurate stop signal than UGA. So UAG clearly is not a minor stop codon in M. aeruginosa, contrary to what Korkmaz et al. have claimed. Korkmaz et al. (2014) used ribosomal protein and translation factor genes (which are generally highly expressed) as HEGs in a subset of genomes studied, but they did not contrast between HEGs and LEGs, so one does not know the difference in relative stop codon preference between HEGs and LEGs.
Wei et al. . doi:10.1093/molbev/msw107 MBE For relative abundance of RF1 and RF2, Korkmaz et al. (2014) only confirmed previous findings that RF2 is several fold more abundant than RF1 in E. coli, but did not have RF1 and RF2 abundance data for the rest of the 4,684 species they studied. For the two other species that they studied in detail, B. subtilis and Mycobacterium smegmatis, they have only mRNA data for prfA (coding RF1) and prfB (coding RF2). However, more prfB mRNA than prfA mRNA does not imply more RF2 than RF1 because RF2 is translationally regulated (Craigen et al. 1985;Donly et al. 1990). Thus, their key conclusion that "UAG is truly a minor stop codon in all aspects" is an unwarranted generalization. Korkmaz et al. (2014) did notice that UAG in some bacterial species is more frequent than UGA. However, they interpreted these observations as likely arising from the process of UGA reassignment to a sense codon. They in particular drew attention to Mollicutes where many lineages use genetic code 4 with only two stop codons (UAA and UAG, with UGA reassigned to tryptophan). However, their Table 2 included bacterial species where UAG is used frequently, with no evidence that UGA is either reassigned or in the process of being reassigned. Korkmaz et al. also speculated that the combination of UAG and RF1 is translationally less efficient and accurate than that of UGA and RF2 which, however, is contrary to available experimental evidence reviewed above.
It may be entirely unnecessary to argue that UAG is a nearly universal minor stop codon in bacteria. Those bacterial species that use more UAG than UGA as stop codons may not at all be in the process of having UGA reassigned to sense codons, but instead may simply have more actively decoding RF1 than RF2 in their cells. This hypothesis, which may be termed codon-decoder adaptation hypothesis, is consistent with many previous experimental and bioinformatic studies, including Korkmaz et al. (2014). In fact, one of the key contributions in Korkmaz et al. (2014) is the confirmation that stop codon usage in E. coli is related to relative abundances of RF1 and RF2.
Proteomic studies have been carried out in many bacterial species, with 14 of them (covering Proteobacteria, Firmicutes, Cyanobacteria, Actinobacteria and Spirochetes) having both RF1 and RF2 quantified and deposited in PaxDB (Wang et al. 2012). Of particular value in these data is that relative abundance of RF1 and RF2 varies widely, which paves the way for evaluating the effect of relative abundance of RF1 and RF2 on stop codon usage. The availability of protein abundance data for thousands of proteins also permits a more objective and comprehensive characterization of HEGs and LEGs and their respective stop codon usage.
We found UAA consistently over-represented in HEGs relative to LEGs, consistent with experimental studies (reviewed above) showing UAA to be the most efficient stop codon. In contrast, UGA is always avoided in HEGs relative to LEGs. This is true even in species where UGA accounts for an overwhelming majority of stop codons and RF2 is far more abundant than RF1. In such species, UAA is mostly found in HEGs. UGA usage relative to UAG increases significantly with relative abundance of RF2, following the expectation that synonymous codons increase in usage with the abundance of their decoders (which are tRNAs in the case of sense codons and release factors in the case of stop codons). RF2 is more abundant than RF1 over a wide range of AT content, but decreases rapidly toward zero at extreme AT-richness. This explains why bacterial lineages with high genomic AT content often have UGA reassigned because the low RF2 would select strongly against UGA. There is no indication that UAG is a minor stop codon in bacteria as claimed by Korkmaz et al. (2014). Our results suggest a more plausible explanation for why UAA usage increases, and UGA usage decreases, with P AT3 , but UAG usage remains low over the entire P AT3 range.

Results and Discussion
We ranked protein-coding genes by the following: (1) protein abundance and (2) index of translation efficiency (I TE ), and the top 25% and bottom 25% of genes are taken as HEGs and LEGs (see Materials and Methods for details). We defined P UAA , P UGA and P UAG as the proportion of the three stop codons, and P2 UGA ¼ N UGA /(N UGA þ N UAG ), where N UGA and N UAG are the number of UGA and UAG codons. Note that P2 UGA is different from P UGA which is N UGA / (N UGA þ N UAA þ N UAG ). P UAA , P UGA , P UAG and P2 UGA based on HEGs or LEGs will be subscripted by "HEG" or "LEG", respectively. We also defined P RF2 as where [X] is the concentration of X. We used AT content at the third codon site (P AT3 ) as a proxy of AT-biased mutation.
To facilitate presentation, we rebranded the conventional codon-anticodon adaptation hypothesis for sense codons as codon-decoder adaptation hypothesis. This generalized hypothesis predicts that a codon, be it sense or stop codon, increases its usage with its decoders, and that such increase is typically more pronounced in HEGs than in LEGs.
UAA is a Major Codon in All 14 Species P UAA does not increase or decrease with the relative availability of RF2 (P RF2 , fig. 1a and table 1) which is expected because RF1 and RF2 can both decode UAA with roughly equal efficiency, at least in E. coli (Scolnick et al. 1968;Jorgensen et al. 1993;Freistroffer et al. 2000;Ito et al. 2000). What is remarkable is that P UAA is always higher in HEGs than in LEGs in all 14 species ( fig. 1), even in extremely GC-biased genomes ( fig. 1b, P AT3 is only 0.1262 for P. aeruginosa and 0.2018 for Mycobacterium tuberculosis). In contrast, UGA is always avoided in HEGs relative to LEGs ( fig. 1), even in species where UGA represents an overwhelming majority of stop codons in all genes. Among the 5,925 annotated protein-coding genes in P. aeruginosa (NC_011770), 4,651 terminate with UGA, 684 with UAG and only 590 with UAA (which are mostly in HEGs). This preponderance of UGA stop codons is associated with greater abundance of RF2 than RF1 (P RF2 ¼ 0.7475 in P. aeruginosa). Given so many UGAs and so few UAAs in P. aeruginosa, one would have expected RF2 to evolve a higher efficiency to decode UGA, perhaps at the cost of reduced efficiency of decoding UAA, so that HEGs would have an increased preference for UGA relative to UAA. However, this expectation is not supported as UGA is used less frequently in HEGs than in LEGs in these two species (fig. 2a, P UGA.HEG ¼ 0.6880 and P UGA.LEG ¼ 0.8160 for P. aeruginosa). Thus, although UAA is rare in P. aeruginosa, it is strongly preferred by HEGs. In contrast, UGA in P. aeruginosa is frequent (and RF2 more abundant than RF1), yet it is avoided by HEGs. The difference in stop codon usage between 500 HEGs and 500 LEGs is highly significant based on chi-squared test with Yates correction for continuity (v 2 ¼ 91.23, DF ¼ 2, P < 0.0001). One possible explanation for this lack of (a) (b) FIG. 1. Stop codon UAA is preferred in highly expressed genes (HEGs) relative to lowly expression genes (LEGs) in all 14 species, regardless of (a) relative abundance in RF1 and RF2, measured by PRF2 as RF2/(RF1 þ RF2), or (b) proportion of AT at third codon site (P AT3 ).
Wei et al. . doi:10.1093/molbev/msw107 MBE expected RF2 evolution is that genomic AT content could change very quickly (Marin and Xia 2008;Nikbakht et al. 2014), whereas functional modification of a key cellular protein is typically a very slow process. In short, GC-biased mutation can increase UGA at the cost of UAA, but does not change the preference of UAA by HEGs in all 14 species studied.
In model organisms such as E. coli, UAA has been shown to be the most efficiently decoded and UGA the least (Strigini and Brickman 1973;Geller and Rich 1980;Parker 1989;Jorgensen et al. 1993;Tate et al. 1999). Highly expressed genes in E. coli were previously observed to prefer UAA as stop codons (Jin et al. 2002). Our result, with 14 species covering a wide taxonomic spectrum, suggests that UAA is a more efficient stop signal than other stop codons in bacteria in general. This implies that a transgenic gene expressed in a bacterial species should be terminated with UAA to enhance termination efficiency.
The other AT-poor species, M. tuberculosis, also exhibit strong difference between HEGs and LEGs (v 2 ¼ 16.23, DF ¼ 2, P ¼ 0.0003), but here both UAG ( fig. 2) and UAA ( fig. 1) are preferred in HEGs relative to LEGs. The strong preference of UAG in HEGs is clearly at odds with the conclusion in Korkmaz et al. (2014) that "UAG is a minor stop codon in all aspects". P UAG.HEG is also higher than P UAG.LEG in M. aeruginosa and Acidithiobacillus ferrooxidans, and the two are equal in Helicobacter pylori (table 1). Thus, UAA is universally preferred in HEGs, UAG is preferred in HEGs in 3 species, and UGA is avoided in HEGs in all 14 species.
If we do not contrast between HEGs and LEGs, and focus on HEGs only or all genes, then we may arrive at a wrong conclusion that UGA is the major codon and UAA the minor codon in M. tuberculosis and P. aeruginosa because UGA is more frequent than UAA or UAG. Take HEGs in M. tuberculosis for example. P UGA.HEG , P UAG.HEG and P UAA.HEG are 0.4775, 0.35375 and 0.16875, respectively (table 1). However, UGA is not the major codon because UGA is even more frequent than UAA or UAG in LEGs, with P UGA.LEG , P UAG.LEG and P UAA.LEG being 0.57625, 0.27125 and 0.1525, respectively (table 1). It is crucially important to contrast codon usage between HEGs and LEGs in identifying codons favoured by decoder-mediated selection (Eyre-Walker and Bulmer 1995;Xia 2015).

Relative Usage of UAG and UGA Depends on Relative Abundance of RF1 and RF2
Because UAG is decoded by RF1 and UGA by RF2, we expect P2 UGA , which is the proportion of UGA within (UGA þ UAG), to increase with P RF2 . The concentration of RF1 and RF2 vary widely among the 14 bacterial species, with P RF2 varying from 0.046 in Bacillus anthracis to 0.9830 in Yersinia pestis CO92. The codon-decoder adaptation hypothesis predicts that species like B. anthracis should use UAG more frequently than UGA in HEGs and species like Coevolution between Stop Codon Usage and Release Factors in Bacterial Species . doi:10.1093/molbev/msw107 MBE Y. pestis CO92 should use UGA more frequently than UAG. We tested this prediction by using regression on the original P RF2 and P2 UGA and on phylogeny-based independent contrasts (Felsenstein 1985). The latter method alleviates the problem of data dependence due to sharing of ancestry.
The stop codon usage among the 14 bacterial species is as predicted by the codon-decoder adaptation hypothesis ( fig. 3). First, both LEGs and HEGs follow the same trend with P2 UGA increasing with P RF2 (P < 0.01 in both LEGs and HEGs, fig. 3). Second, the pattern is stronger in HEGs than in LEGs. For example, in the three species with the highest P RF2 values, P2 UGA.HEG is greater than P2 UGA.LEG (fig. 3). In the three species with the lowest P RF2 values, P2 UGA.HEG is lower than P2 UGA.LEG (fig. 3). Such a pattern is consistent with that observed in sense codons. There is no indication that "UAG is truly a minor stop codon in all aspects" (Korkmaz et al. 2014), and there is consequently no need to invoke the speculations by Korkmaz et al. (2014) that the combination of UAG and RF1 is worse than that of UGA and RF2 in translation termination efficiency and accuracy. A codon becomes rare when its decoder is rare and vice versa. One may say that UAG is a minor codon in E. coli, but it is inappropriate to say that UAG is a universal minor codon and jump to speculate that the combination of UAG and RF1 is inefficient or inaccurate.
Based on the regression line for P2 UGA.HEG on P RF2 , P2 UGA.HEG equals 0.5 when P RF2 ¼ 0.3679 (i.e., when RF2:RF1 is $0.6:1). Thus, if we may make a liberal interpretation of this result from a limited data of 14 species, then UGA will tend to be less frequent than UAG (i.e., P2 UGA.HEG < 0.5) when P RF2 is < 0.3679, but UAG will tend to be less frequent than UGA when P RF2 is > 0.3679 (assuming equal efficiency between RF1 decoding UAG and RF2 decoding UGA). In our study, 3 of the 14 species (Streptococcus pyogenes, B. anthracis, and Staphylococcus aureus) have P RF2 <0.3679 ( fig. 3) and their UGA, instead of UAG, is the less frequent of the two, with their P2 UGA.HEG values being 0.2969, 0.3953, and 0.3279, respectively. It is unnecessary to suggest, as Korkmaz et al. (2014) did, that bacterial species with low UGA usage may be in the process of UGA reassignment.
Strictly speaking, the regression and significance tests of the regression slope in figure 3 are not valid because the P2 UGA and P RF2 values are not independent due to the sharing of ancestry among the bacterial species. For example, E. coli, S. enterica, and Y. pestis are closely related, so are B. subtilis and B. anthracis. In the extreme case when two species are identical, then the two associated data points should really be treated as just one data point. To alleviate this problem, we have built a tree from the small subunit ribosomal RNA from the 14 species ( fig. 4) and computed the independent contrasts (Felsenstein 1985) for P RF2 and P2 UGA based on the tree and the data in table 1. The results for regressing P2 UGA.HEG on P RF2 are slope ¼ 0.3062, r ¼ 0.5693, P ¼0.0336, and those for regressing P2 UGA.LEG on P RF2 are slope ¼ 0.2663, r ¼ 0.5800, P ¼ 0.0297. This result does not depend heavily on the tree in figure 4. We have generated 100 bootstrap trees and repeated independent contrast analysis for each tree. The P value is always <0.05. Thus, P2 UGA depends significantly on P RF2 , following the prediction of codon-decoder adaptation hypothesis. The wide variation in relative concentration of RF1 and RF2 (with P RF2 varies from 0.046 to 0.9830) raises the question of what affects P RF2 . As previously noted (Korkmaz et al. 2014), bacterial species that lack the prfB gene and have UGA reassigned as a sense codon are typically associated with high genomic AT content. It is therefore reasonable to hypothesize that RF2 abundance decreases with AT content and disappears in species with extreme AT-bias so that UGA as a stop codon would be strongly selected against and eventually reassigned.
AT bias, measured by either the third codon position or by inter-gene sequences, indeed is negatively and highly significantly related to P RF2 (fig. 5, the Spearman rank correlation is FIG. 4. Phylogenetic tree built with small subunit ribosomal RNA sequences (ssu rRNA), used for independent contrasts, with leaves denoted by species name and GenBank accession for genomes from which the ssu rRNA sequences are extracted. Only the first annotated ssu rRNA sequence is used. Coevolution between Stop Codon Usage and Release Factors in Bacterial Species . doi:10.1093/molbev/msw107 MBE À0.6659, P ¼ 0.0093, where P AT3 is the proportion of AT at third codon sites, and is similar to the proportion of AT in intergenic sequences). The relationship can be fitted well by the following equation: The fitted curve ( fig. 5), which accounts for 46.94% of the variation in P RF2 , implies that P RF2 will rapidly approach 0 when P AT3 approaches 0.81566 or higher. This trend that P RF2 would approach 0 with increasing P AT3 explains why extremely AT-rich bacterial genomes frequently lose prfB and have stop codon UGA reassigned. The equation also explains why RF2 is more likely lost than RF1 because the concentration of RF1 does not approach 0 with changes in P AT3 (fig. 5). These results offer empirical substantiation of previous models on stop codon reassignment Jukes 1989, 1995;Andersson and Kurland 1991;Sengupta and Higgs 2005;Sengupta et al. 2007).
We have previously mentioned that P2 UGA.HEG tends to be < 0.5 (i.e., more UAG than UGA) when P RF2 is < 0.3679. According to equation (1), P RF2 will be < 0.3679 when P AT3 > 0.70835. This result, if interpreted liberally, suggests that UAG will tend to be more frequent than UGA only when P AT3 is >0.70835, and explains why UAG tends to be the least frequent in most bacterial species because relatively few bacterial genomes have P AT3 > 0.70835.

Dynamic Changes of Stop Codons with AT Content
One conspicuous pattern observed previously (Korkmaz et al. 2014) is that UAA usage increases, and UGA usage decreases, with AT content, but UAG usage remains low and hardly changes with AT content. This pattern is also visible in the 14 species here ( fig. 6). Korkmaz et al. (2014) interpreted this pattern as consistent with UAG being a minor codon that has translation termination efficiency and accuracy problems and is therefore nearly universally avoided. This interpretation by Korkmaz et al. (2014) is somewhat far-fetched for two reasons. First, as we have mentioned earlier, experimental evidence suggests that UAG is typically more efficient and accurate than UGA as a termination signal. Second, UAG is favored by HEGs in 3 of the 14 species whereas UGA is avoided by HEGs in all 14 species. Furthermore, the interpretation does not explain why UGA becomes less frequent than UAG at high AT content which is particularly visible in Figure 2B in Korkmaz et al. (2014) for highly expressed genes.
Our results on the change of P RF2 and P AT3 offer an alternative explanation for the observation of the following: (1) low UAG usage and (2) little change in UAG usage over the entire range of AT content in bacterial genomes. At the low P AT3 range, mutation would have favoured both UAG and UGA at the cost of UAA. However, P RF2 is high with low P AT3 ( fig. 5) which would favor UGA and select against UAG, keeping the latter at low frequency. At high P AT3 , mutation would  6. UAA usage increases, and UGA usage decreases, with P AT3 , but UAG usage is low and changes little with P AT3 . The pattern is consistent in both highly expressed genes (a) and lowly expressed genes (b). Wei et al. . doi:10.1093/molbev/msw107 MBE favor UAA against UGA and UAG and we expect the latter two to decrease. However, P RF2 approaches 0 at high P AT3 (fig. 5), which selects strongly against UGA codons, but little against UAG codons (as RF1 becomes the dominant release factor at high P AT3 ). This explains why, at high P AT3 , UAG does not decrease as much as UGA in figure 6a and tend to have its usage higher than that of UGA. This pattern is also visible in Figure 2B in Korkmaz et al. (2014). In the mid-range of P AT3 , UAA is overused because of the following: (1) it is favoured by selection and (2) there is no mutation bias against it. Also in this range, P RF2 is still much >0.5 ( fig. 5), favoring UGA against UAG and keep the latter at low frequency. So UAG usage is kept low and changes little over the entire range of P AT3 .

Materials and Methods
Classifying Genes According to Gene Expression We have used protein abundance and Index of Elongation Efficiency (I TE , Xia 2015) as proxies of gene expression. Protein abundance data were downloaded from PaxDB (Wang et al. 2012). For species with multiple proteomic studies, only the integrated data set is downloaded and used to rank the coding sequences. The protein ID in PaxDB is often the Uniprot ID and needs to be mapped to gene names (or GI or GeneID) in a GenBank file for individual species (e.g., B. subtilis). We downloaded the paxdb-uniprot-links file relevant to the species (e.g., 224308-paxdb_uniprot.txt for B. subtilis), saved the Uniprot ID (the last column) to a file (e.g., BsUniprotID.txt), browsed to http://www.uniprot.org/uploadlists/ (last accessed May 31, 2016), under "Provide your identifiers" uploaded the BsUniprotID.txt file, under "Selection options" selected the mapping from "UniProtKB AC/ID" to "Gene name" (or GI or GeneID), and clicked "Go". The resulting mapping file was generated with two columns (original input Uniprot IDs and the mapped gene name (or GIs GeneID) corresponding to gene name or other IDs in a GenBank file. Unmapped ID is stored in a separate file, also available for downloading.
An alternative proxy for gene expression is I TE which require codon usage data from both HEGs and LEGs. For each species, we ranked the genes by protein abundance, took the top 40 ribosomal protein genes as HEGs and bottom 40 genes with nonzero values as LEGs, and compiled codon usage table for HEGs and LEGs separately. These codon usage tables were then used to compute I TE with DAMBE. The resulting I TE is then used as a proxy of gene expression. The advantage of using I TE is that it can be used for all genes and that it is less affected by differential mRNA abundance and protein degradation.
After genes were ranked by either protein abundance or I TE , We have used top and bottom 25% of genes as HEGs and LEGs, respectively, to compile stop codon usage, so the actual number of genes taken as HEGs and LEGS differ among species. If 25% of genes is >1,000, then only 1,000 genes were used. The two ways of ranking genes by their expression (i.e., by protein abundance or by I TE ) lead to similar results. The results presented are based on the ranking by protein abundance. The results from ranking by I TE have slightly stronger patterns with slightly smaller P values.

RF1 and RF2 Concentration
We compiled RF1 and RF2 concentration from proteomic data at PaxDB. Only 14 species have both RF1 and RF2 measured and were included. An average is used when multiple values available. Our values are therefore not always the same as those RF1 and RF2 values in the integrated data sets in PaxDB because the latter includes studies in which either RF1 or RF2 is measured.

Phylogenetic Reconstruction
For computing phylogeny-based independent contrasts, we extracted small subunit ribosomal RNA (ssu rRNA) sequences from genomic sequences in GenBank (with accession included in fig. 4). For species with multiple ssu rRNA genes, only the first one is used for phylogenetic reconstruction. The sequences were aligned by MAFFT (Katoh et al. 2009) with the slow but accurate "-localpair" and "-maxiterate ¼ 1,000" options.
Two phylogenetic reconstruction methods were used. The first was PhyML (Guindon and Gascuel 2003) with GTR (or HKY85). The tree improvement option "-s" was set to "BEST" (best of NNI and SPR search). The "-o" option (optimize starting tree) was set to "tlr" which optimizes the topology, the branch lengths and rate parameters. The other was a distance-based FastME method Gascuel 2002, 2004) implemented in DAMBE (Xia 2013), with the simultaneously estimated maximum composite likelihood distance (Tamura et al. 2004;Xia 2009) based on the TN93 model (MLCompositeTN93). The two trees from the two methods have identical topology and almost perfectly correlated branch lengths. The independent contrasts were generated by using the CONTRAST program in the PHYLIP package (Felsenstein 2014).