Idiosyncratic Purifying Selection on Metabolic Enzymes in the Long-Term Evolution Experiment with Escherichia coli

Abstract Bacteria, Archaea, and Eukarya all share a common set of metabolic reactions. This implies that the function and topology of central metabolism has been evolving under purifying selection over deep time. Central metabolism may similarly evolve under purifying selection during long-term evolution experiments, although it is unclear how long such experiments would have to run (decades, centuries, millennia) before signs of purifying selection on metabolism appear. I hypothesized that central and superessential metabolic enzymes would show evidence of purifying selection in the long-term evolution experiment with Escherichia coli (LTEE). I also hypothesized that enzymes that specialize on single substrates would show stronger evidence of purifying selection in the LTEE than generalist enzymes that catalyze multiple reactions. I tested these hypotheses by analyzing metagenomic time series covering 62,750 generations of the LTEE. I find mixed support for these hypotheses, because the observed patterns of purifying selection are idiosyncratic and population-specific. To explain this finding, I propose the Jenga hypothesis, named after a children’s game in which blocks are removed from a tower until it falls. The Jenga hypothesis postulates that loss-of-function mutations degrade costly, redundant, and non-essential metabolic functions. Replicate populations can therefore follow idiosyncratic trajectories of lost redundancies, despite purifying selection on overall function. I tested the Jenga hypothesis by simulating the evolution of 1,000 minimal genomes under strong purifying selection. As predicted, the minimal genomes converge to different metabolic networks. Strikingly, the core genes common to all 1,000 minimal genomes show consistent signatures of purifying selection in the LTEE.


Introduction
Researchers have learned much about adaptive processes by conducting evolution experiments with large populations of microbes. By contrast, researchers have rarely, if ever, used experimental evolution to study purifying selection, because it may take hundreds or even thousands of years to see evolutionary stasis in action. My colleagues and I have recently started to address this research gap, by examining detailed time series of genetic change in Lenski's long-term evolution experiment with Escherichia coli (LTEE). Our research addresses the following question: what processes are evolving under purifying selection, during adaptation to a novel laboratory environment in which abiotic conditions are held constant?
The LTEE has studied the evolution of 12 initially identical populations of E. coli in carbon-limited minimal medium for more than 30 years and 60,000 generations of bacterial evolution (Lenski et al. 1991;Lenski 2017). The LTEE populations are named by the presence of a neutral phenotypic marker: populations Ara+1 to Ara+6 grow on arabinose, while populations Ara−1 to Ara−6 cannot (Lenski et al. 1991). The LTEE populations have diverged in their mutation rates and biases, as several LTEE populations have evolved elevated mutation rates due to defects in DNA repair (Sniegowski et al. 1997;Tenaillon et al. 2016;Good et al. 2017;Maddamsetti and Grant 2020). These hypermutator populations tend to adapt faster than non-mutator populations that have retained the ancestral mutation rate (Wiser et al. 2013), even though their more rapid genomic evolution largely reflects the accumulation of nearly-neutral mutations through genetic hitchhiking (Couce et al. 2017).
Because so many mutations are observed in the hypermutator populations, my colleagues and I hypothesized that sets of genes that have few mutations, in comparison to typical sets of genes drawn from the genome, may be evolving under purifying selection. To test this hypothesis, we developed a randomization test that we call STIMS (Simple Test to Infer Mode of Selection) to detect selection on pre-defined sets of genes, using metagenomic time series covering 62,750 generations of the LTEE (Good et al. 2017). STIMS can detect positive as well as purifying selection in hypermutator populations, and is able to control for both temporal variation in mutation rates over time, as well as mutation rate variation over the chromosome (Maddamsetti and Grant 2020;Maddamsetti and Grant 2022). Using STIMS, we found evidence of purifying selection on aerobic-specific and anaerobic-specific genes (Grant et al. 2021b) and essential genes (Maddamsetti and Grant 2022) in the LTEE. In addition, we found evidence of purifying selection on protein interactome resilience (Maddamsetti 2021a), and purifying selection on genes encoding highly expressed and highly interacting proteins (Maddamsetti 2021b), using different methods.
Here, I use STIMS to test four sets of metabolic genes for purifying selection in the LTEE: 1) those that encode enzymes that catalyze the core reactions of E. coli central metabolism, as curated in the BiGG Models knowledge base (King et al. 2015); 2) those encoding enzymes that catalyze "superessential" reactions that have been found to be essential in all tested bacterial metabolic network models (Barve et al. 2012); 3) and 4) those that encode specialist and generalist enzymes, respectively, in the E. coli metabolic reaction network, as determined by their substrate and reaction specificity (Nam et al. 2012). Specialist enzymes catalyze one reaction on a unique primary substrate. Generalist enzymes, by contrast, either catalyze reactions on a variety of substrates, or catalyze multiple classes of reactions.
I tested two specific hypotheses. First, I hypothesized that the BiGG core metabolic enzymes and the superessential metabolic enzymes would show evidence of purifying selection. Second, I hypothesized that specialist enzymes would show stronger purifying selection than generalist enzymes, because Nam et al. (2012) reported that specialist enzymes are frequently essential and maintain higher metabolic flux than generalist enzymes. While this analysis focuses on the hypermutator LTEE populations due to substantially greater power to detect purifying selection (Maddamsetti and Grant 2022), I also report the result of running STIMS on the non-mutator populations for the sake of completeness.
My results show mixed support for these hypotheses. I find that patterns of purifying selection on metabolism, at least after 60,000 generations of experimental evolution, are largely idiosyncratic and population-specific. To explain this finding, I propose the Jenga hypothesis, in which mutation accumulation causes replicate populations to follow idiosyncratic trajectories of lost redundancies, despite purifying selection on network function. I tested several predictions of the Jenga hypothesis by simulating the evolution of 1,000 minimal genomes under strong purifying selection. Strikingly, the core genes common to all minimal genomes show consistent signatures of purifying selection in the LTEE.

Results
Overlap Between Pre-Defined Sets of Metabolic Genes I first examined the overlap between the four gene sets of interest ( fig. 1). If these sets are disjoint, then running STIMS on each would represent four independent statistical tests. At the other extreme, I would be conducting the same test four times.
Specialist and generalist enzymes are mutually exclusive by definition. While I expected the BiGG core and superessential enzymes to overlap, due to their importance to E. coli metabolism, they are mutually exclusive ( fig. 1). Out of 136 BiGG core enzymes, 75 are specialist enzymes, while 52 are generalist enzymes. The remaining 9 were not classified as either by Nam et al. (2012). Out of 96 superessential enzymes, 59 are specialist enzymes, and 37 are generalist enzymes. Therefore, two pairs of comparisons -BiGG core versus superessential, and specialist versus generalist-are statistically independent.

Idiosyncratic Purifying Selection on Superessential Metabolic Genes
Only two populations show evidence of purifying selection on superessential metabolic genes: Ara−1 (STIMS bootstrap: P < 0.001) and Ara+6 (STIMS bootstrap: P = 0.01). The strength of purifying selection on superessential genes therefore seems to be population-specific Altogether, these results somewhat support the hypothesis that specialist enzymes evolve under stronger purifying selection than generalist enzymes, with Ara−1 serving as a notable exception.

The Jenga Hypothesis Can Explain Idiosyncratic Variation in the Strength of Purifying Selection on Metabolism Across Replicate Populations
Ara−1 often shows a stronger signal of purifying selection than either Ara+3 or Ara+6, even though it has spent less time as a hypermutator and has fewer mutations than either (figs. 2 and 3). This observation implies that differences in statistical power cannot account for idiosyncratic variation in the strength of purifying selection across LTEE populations. Therefore, some biological explanation is needed.
One possibility is that the strength of purifying selection on metabolic enzymes depends on how mutation accumulation has eroded and reshaped metabolism in each population (Cooper and Lenski 2000;Leiby and Marx 2014). Ara +6 in particular has lost the ability to grow on many sugars that the ancestral LTEE clone can use (Leiby and Marx 2014). Those losses of function may cause stronger purifying selection on the metabolic pathways that have remained intact. One way to conceptualize this hypothesis is as follows: at first, many metabolic pathways may evolve nearly neutrally, such that many loss-of-function mutations accumulate, especially in the hypermutator LTEE populations. Eventually, the accumulated effects of these losses of function reach a critical point, after which further gene losses become deleterious. Idiosyncratic differences in the strength of purifying selection on metabolic enzymes across populations then reflects idiosyncratic changes in the pathways that metabolic flux can follow in each population.
I call this conceptual model the Jenga hypothesis, after a children's game in which wooden blocks are progressively The intersections between four pre-defined sets: superessential enzymes, BiGG core metabolic enzymes, specialist enzymes, and generalist enzymes were examined (see text for further details about these gene sets). The cardinality of each of these sets is shown in the horizontal bar graph on the left. Every non-empty subset of the four sets is visualized as a set of dots connected by lines. The cardinality for each nonempty subset (set intersections) is shown in the vertical bar graph.
Genome Biol. Evol. 14(12) https://doi.org/10.1093/gbe/evac114 Advance Access publication 17 August 2022 removed from a tower until it falls. The Jenga hypothesis generalizes to any phenotype that is mechanistically caused by a molecular network with many redundancies (Albergante et al. 2014). Such a network may have many nodes or edges that can be lost without any loss of function. At some point however, the network has lost most of its redundancies, causing strong purifying selection on the particular nodes that if lost would compromise network function. The specific nodes that end up evolving under strong purifying selection strongly depends on the history of nodes that were previously lost in the network, causing idiosyncratic purifying selection ( fig. 4A).
The Jenga hypothesis makes four specific and testable predictions involving the reductive evolution of genomes encoding network phenotypes under purifying selection. First, replicate minimal genomes that have evolved under strong purifying selection should show variation in minimal genome content (i.e. the specific genes found in each genome). Second, replicate minimal genomes should have variable numbers of genes. Third, replicate minimal genomes For comparison, random sets of genes (with the same cardinality as the gene set of interest) were sampled 1,000 times, and the cumulative number of mutations in those random gene sets, normalized by gene length, was calculated. The middle 95% of this null distribution is shown as shaded points. When a solid line falls below the shaded region, then the gene set of interest shows a significant signal of purifying selection (P < 0.025 for a one-tailed test for purifying selection). For further details, see figure 2 of Maddamsetti and Grant (2022) should encode more fragile networks; in other words, they should have higher proportions of essential genes compared to their ancestor. Finally, replicate minimal genomes should show idiosyncratic variation in which particular genes are essential for network function. To test these predictions, I played 1,000 rounds of Genome Jenga, a zeroplayer game (like Conway's game of Life) in which genes are randomly selected from a genome and are removed if doing so has a negligible effect on fitness. I implemented the algorithm described by Pál et al. (2006), who used this technique to study historical contingency in the evolution of minimal metabolic networks. Furthermore, I build upon the analytical framework described by Pál et al. (2006) to study the Jenga Hypothesis. As Pál et al. (2006) designed their computational experiments to model the evolution of endosymbiotic bacteria, they played Genome Jenga with very small effective population sizes (1 × 10 2 ) and in simulated rich media to mimic the intracellular environment of a host cell. By contrast, and for comparison to the LTEE, I played Genome Jenga in simulated minimal  4E). All of the evolved metabolic networks encoded by the minimal genomes contain more essential genes-in absolute numbers-than the ancestral metabolic network ( fig. 4C and fig. 4F). Strikingly, the distribution of the number of essential genes in the 1,000 minimal genomes is bimodal ( fig. 4F), and the distribution of the number of metabolic reactions per genome is multimodal ( fig. 4G). This indicates that several of the minimal genomes have converged to alternative network topologies with alternative sets of essential genes, which is further supported by how the minimal genomes cluster by gene content, essential gene content and metabolic reaction content

Discussion
I tested four sets of metabolic genes for purifying selection in the LTEE: those encoding reactions in central metabolism; those catalyzing "superessential" reactions, and those that encode specialist and generalist enzymes in the E. coli metabolic reaction network. Evidence of purifying selection was found only in specific populations. This finding indicates that the strength of purifying selection on metabolism in each population depends on its particular evolutionary history during the LTEE. In this regard, it is important to note that two hypermutator populations may be considered as outliers for the purpose of this paper. First, Ara−2 lost its hypermutator phenotype following the fixation of a reversion mutation at 42,250 generations (Maddamsetti and Grant 2020). It may be challenging to detect a cumulative signal of purifying selection in later generations of this population owing to the lower mutation rate (Maddamsetti and Grant 2022). Second, Ara−3 evolved the ability to use citrate as a carbon source at ∼31,000 generations (Blount et al. 2008;Blount et al. 2012). Its phenotypic evolution is now on a different trajectory from the other populations (Blount et al. 2018;Blount et al. 2020;Grant et al. 2021a).
Overall, Ara+6 shows the strongest signal of purifying selection on metabolic enzymes. This finding is consistent with previous results that show purifying selection on aerobicspecific genes, anaerobic-specific genes, and essential genes in Ara+6 (Grant et al. 2021b;Maddamsetti and Grant 2022). Ara+6 has accumulated the most mutations over the course of the LTEE, which may explain its especially strong signal of purifying selection. In these previous studies, we attributed any idiosyncratic differences in the signatures of selection found by STIMS to differences in statistical power caused by differences in mutation accumulation across populations.
By contrast, the findings in this work show that mutation accumulation is not sufficient to explain the varying signal of purifying selection across populations. If mutation accumulation were a dominant factor, then the populations with the most mutations should show the strongest signal of purifying selection on metabolic enzymes. However, like Ara+6, Ara+3 has also accumulated many mutations (Tenaillon et al. 2016;Couce et al. 2017;Maddamsetti and Grant 2020), and yet it shows much weaker evidence of purifying selection on metabolic enzymes. Moreover, Ara−1 shows by far the strongest signal of purifying selection on superessential genes, even though it evolved a hypermutator phenotype much later than Ara+3 and Ara+6 (Barrick and Lenski   FIG. 6.-The result of running STIMS on core genes in the 1,000 minimal genomes (hypermutator populations only). Each panel shows the cumulative number of mutations in the gene set of interest normalized by the combined length of that gene set (solid line), in the six hypermutator LTEE populations. For comparison, random sets of genes (with the same cardinality as the gene set of interest) were sampled 1,000 times, and the cumulative number of mutations in those random gene sets, normalized by gene length, was calculated. The middle 95% of this null distribution is shown as shaded points. When a solid line falls below the shaded region, then the gene set of interest shows a significant signal of purifying selection (P < 0.025 for a one-tailed test for purifying selection). For further details, see figure 2 of Maddamsetti and Grant (2022), which illustrates how STIMS works.
To explain these idiosyncratic differences in purifying selection on metabolic enzymes, I propose the Jenga hypothesis. The Jenga hypothesis is closely related to the hypothesis that interconnections between aerobic and anaerobic metabolism prevent the loss of anaerobic function despite evolving under relaxed selection in LTEE (Grant et al. 2021b). The main difference between the buttressing pleiotropy model proposed by Grant et al. (2021a,b) and the model considered here is that I propose that purifying selection on metabolic enzymes may only become evident after passing a critical threshold of network evolution.
I tested the plausibility of the Jenga hypothesis by asking whether distinct reaction networks would evolve from an ancestral genome-scale metabolic model of the REL606 genome. By simulating genome evolution under strong purifying selection for growth on minimal glucose media, I found clear evidence for multiple predictions that follow from the Jenga hypothesis. While the minimal genomes share a core set of genes and reactions, they broadly vary in terms of gene content, essential genes, and metabolic reactions. More genes became essential in all 1,000 minimal genomes, and each minimal genome has an idiosyncratic set of genes that is required for growth in simulated minimal glucose media. Pál et al. (2006) found these same general patterns in their pioneering computational study of reductive genome evolution during endosymbiosis.
In addition, the core genes found across the minimal genomes show clear and consistent signatures of purifying selection across the LTEE populations. This finding suggests that the subset of metabolic enzymes that are absolutely required for growth on glucose is evolving under purifying selection across the LTEE. This finding is also concordant with the results of Pál et al. (2006), who were able to accurately predict endosymbiont genome content by playing Genome Jenga with the E. coli genome.
Despite these important commonalities, the overall message of this work strongly contrasts with the overall message of Pál et al. (2006), who focused on the predictability of gene content evolution. By playing Genome Jenga, I find multimodal distributions of essential genes per genome and metabolic reactions per genome. Together with my initial findings on core metabolism, superessential genes, and specialist and generalist enzymes, my results imply that the large number of metabolic redundancies present in the E. coli genome can result in divergent network evolution, despite strong selection in controlled laboratory conditions, during an ongoing process of genome streamlining.
Future research could study the implications of the Jenga hypothesis using evolution experiments with organisms with minimal genomes, including endosymbionts and engineered organisms (Gil et al. 2002;Hutchison et al. 2016). Specifically, organisms with minimal genomes should show strong purifying selection against losses of metabolic function, while adding redundancies should weaken purifying selection.
The Jenga hypothesis makes a further, specific prediction with regard to the LTEE. The function of many metabolic genes, or their regulation, may already be knocked out in many populations given the extensive losses of metabolic function that have been observed (Leiby and Marx 2014). In most cases, the genetic causes of these losses of function are unknown. If idiosyncratic purifying selection is caused by the progressive loss of metabolic function, then one should be able to predict which enzymes have come under the strongest purifying selection in the network, based on which enzymes have the greatest gains in metabolic flux over adaptive evolution, or the least variability in (or equivalently, the most constraint on) metabolic flux. Testing this prediction will require metabolomics experiments to describe how metabolic flux has evolved in the LTEE, and targeted knockout experiments to see whether evolved bottlenecks in each population's metabolic network can account for idiosyncratic purifying selection on metabolic enzymes in the LTEE.
The simulations reported in this work demonstrate that the Jenga Hypothesis is a sufficient cause for idiosyncratic purifying selection on metabolic enzymes. However, other causes may be as important or more important, in the context of the LTEE. Epistasis and historical contingency may generally cause such patterns, as beneficial mutations, including gain-of-function and fine-tuning mutations (Maddamsetti et al. 2017) could dynamically intensify or relax purifying selection pressures on various molecular systems of the cell. This hypothesis generalizes the Jenga hypothesis, which specifically focuses on the effect of loss-of-function mutations. In addition, many LTEE populations have evolved into simple communities with crossfeeding subpopulations (Le Gac et al. 2012;Plucain et al. 2014;Maddamsetti et al. 2015;Good et al. 2017) due to excreted metabolites (Harcombe et al. 2013;Leiby and Marx 2014;Turner et al. 2015). Resources produced by cell lysis and death in the LTEE may also create opportunities for ecological specialization (Rozen et al. 2009;Blount et al. 2020;Grant et al. 2021a). It is not known whether, or how, differences in the evolved communities in each replicate flask of the LTEE could cause the patterns of idiosyncratic purifying selection reported here. More generally, understanding how the evolution of ecological complexity shapes selection pressures on metabolism remains a fertile field for future research.
Genes in the BiGG E. coli core metabolic model were downloaded from: http://bigg.ucsd.edu/models/e_coli_ core. E. coli genes encoding superessential metabolic reactions were taken from supplementary table S6 of Barve et al. (2012). E. coli genes encoding specialist and generalist metabolic enzymes were taken from supplementary table S1 of Nam et al. (2012). These tables were then merged with NCBI Genbank gene annotation for the ancestral LTEE strain, E. coli B str. REL606. The UpSet visualization for the overlap between the four sets of metabolic enzymes was produced using the UpSetR R Package (Conway et al. 2017).
Data Analysis with STIMS STIMS is fully described in Maddamsetti and Grant (2022). Briefly, STIMS counts the cumulative number of mutations occurring over time in a gene set of interest, and compares that number to a null distribution that is constructed by subsampling random gene sets of equivalent size. The number of observed mutations in a gene set is normalized by the total length of that gene set in nucleotides. Bootstrapped P-values were calculated separately for each population. P-values for one-sided tests for purifying selection were calculated as the lower-tail probability, assuming the null distribution, of sampling a normalized cumulative number of mutations that is less than the normalized cumulative number of mutations in the gene set of interest. In the visualizations shown in the figures, the top 2.5% and bottom 2.5% of points in the null distribution are omitted, such that each panel can be interpreted as a two-sided randomization test with a false-positive (type I error) probability α = 0.05. For further details see figure 2 of Maddamsetti and Grant (2022), which illustrates how STIMS works.

Genome Jenga
In Genome Jenga, genes are randomly selected from a genome, and removed if doing so has a negligible effect on fitness. The game is continued until no further genes can be removed from the genome. I used a genome-scale metabolic model of the ancestral LTEE REL606 strain (King et al. 2015), and I used the algorithm for minimal genome construction reported by (Pál et al. 2006). I generated 1,000 minimal genomes from the metabolic model of REL606, evaluating changes in fitness as the change in the predicted biomass objective optimized by flux balance analysis after the removal of a given gene (Pál et al. 2006;Hosseini and Wagner 2018). I modeled the growth conditions of the LTEE by using glucose as the sole limiting carbon source, and adding an excess of thiamine to model the Davis-Mingioli medium used in the LTEE (Lenski et al. 1991). I modeled strong purifying selection by setting the fitness threshold s for gene removal as 1/N e , where N e = 3.3 × 10 7 , which is the effective population size of the LTEE (Izutsu et al. 2021). This models nearly-neutral evolution in which mutations that disrupt a gene's function can hitchhike to fixation if they either confer a beneficial effect, or confer a sufficiently small fitness defect such that |N e s| < 1, where s is the selective effect of the gene disruption (Pál et al. 2006). My design deviates from that in Pál et al. (2006) in that I model reductive genome evolution in minimal glucose media rather than rich media, and I used a highly stringent selective threshold to model the large population size of the LTEE in contrast to the small effective population sizes of intracellular endosymbionts (N e ∼ 10 2 -10 3 ) modeled by Pál et al. (2006). I used COBRAPy software to conduct flux balance analyses, including calculations of which genes in the metabolic model were essential for growth (Ebrahim et al. 2013).