No Evidence That Protein Noise-Induced Epigenetic Epistasis Constrains Gene Expression Evolution

Changes in gene expression can affect phenotypes and therefore both its level and stochastic variability are frequently under selection. It has recently been proposed that epistatic interactions inﬂuence gene expression evolution: gene pairs where simultaneous knockout is more deleterious than expected should evolve reduced expression noise to avoid concurrent low expression of both proteins. In apparent support, yeast genes with many epistatic partners have low expression variation both among isogenic individuals and between species. However, the speciﬁc predictions and basic assumptions of this verbal model remain untested. Using bioinformatics analysis, we ﬁrst demonstrate that the model’s predictions are unsupported by available large-scale data. Based on quantitative biochemical modeling, we then show that epistasis between expression reductions (epigenetic epistasis) is not expected to aggravate the ﬁtness cost of stochastic expression, which is in sharp contrast to the verbal argument. This nonintuitive result can be readily explained by the typical diminishing return of ﬁtness on gene activity and by the fact that expression noise not only decreases but also increases the abundance of proteins. Overall, we conclude that stochastic variation in epistatic partners is unlikely to drive noise minimization or constrain gene expression divergence on a genomic scale.


Introduction
Gene expression is the first step in translating genetic information into phenotypes, hence expression level is expected to be the subject of natural selection. Indeed, both increased and decreased expression can impose a fitness burden (Papp et al. 2003;Dekel and Alon 2005;Deutschbauer et al. 2005;Sopko et al. 2006) and comparative analyses revealed that transcript levels of most genes evolve under stabilizing selection (Lemos et al. 2005;Gilad et al. 2006). Accordingly, stochastic variation in protein abundance between isogenic individuals (gene expression noise) causes a substantial fitness defect (Wang and Zhang 2011), and increased stochasticity has been implicated in disease (Kemkemer et al. 2002). Therefore, natural selection is generally expected to reduce expression noise (for counterexamples, see Zhang et al. 2009;Eldar and Elowitz 2010). Several lines of evidence support this argument. First, a recent population genetic case study provided direct empirical evidence that selection acts on reducing expression noise of a yeast enzyme (Metzger et al. 2015). More generally, a global measurement of gene expression noise in yeast (Newman et al. 2006) revealed that genes important for cell growth and those sensitive to altered dosage tend to display an especially low noise (Newman et al. 2006;Batada and Hurst 2007;Lehner 2008). Thus, noise in gene expression appears to be shaped by natural selection to minimize the deleterious consequences of fluctuations in important and dosage-sensitive proteins.
Could there be other ways by which reducing expression noise confers a fitness benefit? It has recently been proposed that functional interaction between genes imposes an extra selection pressure to reduce noise (Park and Lehner 2013). The model applies to gene pairs that show a negative epistatic (genetic) interaction, that is when loss-of-function mutations in the two genes enhance each other's deleterious effect (i.e., the double mutant has a lower fitness than expected based on the combined effect of the two single mutants). Such fitness interactions often define redundant gene functions and alternative molecular pathways. As gene expression noise can mimic mutations by decreasing protein abundances, concurrent low expression of both gene products owing to stochastic fluctuations may cause an especially severe fitness defect for gene pairs that are in negative epistasis, a notion termed "epigenetic epistasis". According to the verbal model, noise-induced epigenetic epistasis should select for lower gene expression noise and this effect should be stronger for genes with many epistatic interactions due to an "increased likelihood that at least one interaction partner will have low expression in a particular individual" (Park and Lehner 2013). Capitalizing on a recent comprehensive map of epistatic interactions between null mutations in the yeast Saccharomyces cerevisiae (Costanzo et al. 2010), Park and Lehner reported that genes with many epistatic interactions tend to show low protein expression noise and diverge slowly between species, patterns that are consistent with the model. Whereas the above verbal argument appears straightforward, its specific predictions and basic assumptions have remained untested so far. First, according to Park and Lehner's model, negative epistatic interactions should have an important influence on gene expression noise even when other, previously identified determinants of noise are taken into account. Second, the model should specifically apply to negative, but not positive genetic interactions as genes with many positive interactions (i.e., when two gene deletions alleviate each other's detrimental effect) are not expected to evolve low noise. Third, selection to minimize noise should be especially strong for genes with high-noise epistatic partners. Finally, the verbal model assumes that the presence of negative epistasis between two genes should aggravate the fitness cost associated with stochastic expression compared with noninteracting gene pairs. Crucially, testing the validity of this assumption requires a quantitative understanding of how protein abundances map to cellular fitness. In the present study, we address these issues by analyzing available functional genomic data in yeast and by investigating the interplay between noise and epistasis using a quantitative metabolic pathway model.
Our work found no empirical support for the specific predictions of Park and Lehner's model, not least because the correlation between genetic interaction degree and protein noise becomes minuscule when controlling for other determinants of noise. Furthermore, and in contrast with the simple verbal argument, biochemical modeling revealed that the presence of negative epigenetic epistasis between stochastically expressed genes is actually not expected to incur a noticeable extra fitness cost compared with nonepistatic genes. We demonstrate that this nonintuitive finding can be readily explained by the typical shape of the gene dose-fitness curve in multi-gene networks and by the fact that expression noise not only decreases but also increases the abundance of proteins. We conclude that epigenetic epistasis induced by stochastic variation in gene expression levels is unlikely to drive noise minimization or constrain expression evolution on a genomic scale.

Negative Genetic Interaction Degree Is a Minor and Non-Specific Predictor of Expression Noise and Divergence
As an empirical support of their model, Park and Lehner (2013) reported that yeast genes with many genetic interactions, as inferred from null mutations, show a reduced expression noise and slow evolutionary divergence in expression. We first revisited this claim by asking whether the correlation also holds when potential confounding factors are taken into account. Previous works identified several key genomic and physiological features that influence both expression noise and genetic interaction degree in S. cerevisiae (Fraser et al. 2004;Newman et al. 2006;Batada and Hurst 2007;Lehner 2008;Zhang et al. 2009;Park and Lehner 2013). Among others, these features include (a) gene expression level, (b) single gene deletion and overexpression fitness defects, (c) haploinsufficiency, (d) the presence and number of protein-protein interactions, (e) gene functionality, and (f) gene localization. Thus, a claim that any feature is important in determining protein noise should control for these other predictors. To identify the most parsimonious set of features that best explain gene expression noise, we carried out a stepwise model selection procedure in a multiple-regression statistical framework (Hastie and Pregibon 1992) on 20 candidate predictors ("Materials and Methods" section). We did not include the presence of TATA boxes in gene promoters as a potential predictor variable, as these motifs are thought to be causally involved in determining noise levels and expression conservation (Tirosh et al. 2006;Choi and Kim 2009). Throughout our analyses, we used the same original datasets on protein noise, expression evolution and genetic interactions as in (Park and Lehner 2013). We report that a set of 10 gene features can explain 19.5% of variation in gene expression noise when considered jointly (table 1). For example, metabolic genes tend to exhibit especially high noise levels, and the converse is true for genes involved in translation. These trends may simply reflect an uneven distribution of TATA-box containing genes across functional classes (Tirosh et al. 2006). Importantly, whereas the number of negative genetic interactions explains $4% of the variation in noise when considered in single, it explains only a negligible (0.5%) part of the total variance in the most parsimonious multivariate model (table 1). In a similar analysis, we found only a very weak partial Spearman correlation between noise and the number of negative genetic interactions when controlling for all other explanatory variables in table 1 (partial Spearman's rho ¼ À0.07, P ¼ 0.01). Thus, genetic interaction degree has only a minor independent explanatory effect when other genomic features are taken into account. Similar conclusions were reached for gene expression divergence (supplementary table S1, Supplementary Material online). It remains to be seen whether such weak statistical associations represent genuine biological effects or rather stem from potential biases in high-throughput data.
Second, the argument of Park and Lehner specifically applies to negative genetic interactions, i.e., when "reduced activity of two genes in combination often has a more detrimental effect than expected." Crucially, genes with many positive epistatic interactions are not expected to be constrained to maintain stable expression. Given the strong correlation between positive and negative genetic interaction degrees (Spearman's rho ¼ 0.72, P < 10 À15 ), one might wonder whether the observed low expression variability is restricted to negative epistatic hubs. If positive interaction hub genes were also constrained in their expression noise, that would suggest that the observed correlation between negative genetic interaction degree and gene expression noise is not explained by the verbal model of Park and Lehner, but is likely caused by confounding variables (table 1). Indeed, whereas we could reproduce the previously observed correlations of noise and divergence with the number of negative epistatic interactions, these relationships also hold for the number of positive epistatic interactions (supplementary fig. S1, Supplementary Material online, see also Park and Lehner).
Protein Noise-Induced Epigenetic Epistasis . doi:10.1093/molbev/msw236 MBE To test whether low expression variability is associated with negative interaction degree even when the number of positive interactions is controlled for, we performed partial correlation analyses. We report that the associations between negative genetic interaction degree and protein noise becomes extremely weak and statistically nonsignificant when the effect of positive interaction degree is removed (partial Spearman's rho ¼ À0.036, P ¼ 0.198). We note that positive interaction degree showed a significant partial correlation with noise even after controlling for negative interaction degree (partial Spearman's rho ¼ À0.12, P ¼ 2eÀ5). In a similar vein, the association between negative genetic interaction degree and evolutionary divergence in gene expression became extremely weak after removing the effect of positive interaction degree (partial Spearman's rho ¼ À0.065, P ¼ 0. 0013). To visualize how the two forms of genetic interaction degrees are associated with protein noise and expression divergence, we binned yeast genes into five equally sized groups ranked according to the total number of genetic interactions. Next, within each bin, we categorized genes according to their enrichment in positive or negative genetic interactions, respectively (see fig. 1 legend). We found that within each bin with similar total numbers of genetic interactions, the sign of dominating interactions had no influence on either protein noise or inter-species expression divergence ( fig. 1). Specifically, hub genes with predominantly negative genetic interactions do not show significantly lower expression variation (noise, divergence) than positive interaction hubs (onetailed Mann-Whitney U test, P ¼ 0.98 and 0.45 for gene expression noise and divergence, respectively). These results are robust to changes in bin number (i.e., 2-10) or in the definition of enrichment in the type of interaction. Taken together, it appears that both negative and positive epistatic "hubs" display low expression variations within isogenic individuals and between species. We conclude that, in sharp contrast to the verbal model, negative genetic interactions cannot specifically explain the peculiar expression properties of genetic interaction hubs.
Finally, we suggest that the observed pairwise correlation between positive genetic interaction degree and expression noise is likely caused by confounding variables (e.g., gene function, single mutant fitness) and is therefore largely a by-product of other mechanisms. As a support, we report that the correlation becomes much weaker (i.e., explaining $1% of variance) when other determinants of gene expression noise are taken into account in a multivariate model (supplementary table S2, Supplementary Material online).

Noise Level of Epistatic Partners Does Not Constrain Gene Expression Evolution
A further untested prediction of the verbal theory is that genes epistatically interacting with high-noise genes should evolve especially low noise levels and diverge slowly in expression because they are more likely to induce a fitness drop as their expression levels vary across individuals. In contrast to this expectation, we failed to find a negative correlation between either the noise level or the expression divergence of the focal gene and the average noise level of its negative epistatic partners (P ¼ 0.97 and 0.50, for expression noise and divergence, respectively, one-sided Spearman correlations, fig. 2A). Notably, these results remained when only focal genes with many negative epistatic interaction partners (epistatic "hubs") were considered (P ¼ 0.76 and 0.61, see "Materials and Methods" section).

Dosage Sensitivity Is Not Influenced by the Number of Epistatic Partners
The verbal model further predicts that genes with many genetic interaction partners are expected to be especially sensitive to a reduction in gene dosage, as variation in the activity of many other genes will now have more severe fitness effects (Park and Lehner 2013). To test whether a direct relationship exists between a gene's genetic interaction degree and its haploinsufficiency phenotype (i.e., whether or not a 50% dosage reduction causes a significant growth defect), we first have to control for the fitness defect incurred by complete Enzyme-encoding gene 9.6 4.1*** GO process: translation 2.6 1.9*** Expression level (log) 1.4 1.5*** GO cellular component: Mitochondria 2.7 1.1*** Environmental genetic degree (log) 1.3 1.1*** GO cellular component: Golgi apparatus 1.4 0.7*** Negative genetic interaction degree (log) 4.1 0.5** Single deletion fitness defect 3.3 0.5** GO process: response to stress 0.4 0.3* Plasma-membrane transporter 1.1 0.2 NOTE.-The set of gene features that can parsimoniously explain protein noise (as measured by DM, see "Materials and Methods" section) were identified by applying a bothway (forward and backward) stepwise model selection algorithm based on Akaike information criterion within a linear regression modeling framework (Hastie and Pregibon 1992). Variables marked with "(log)" and protein noise were log-transformed in the analyses. P-values of explanatory variables in the multivariate model are shown as */**/***, representing P < 0.05/0.01/0.001. Only genes with both protein noise and genetic interaction degree data available were included (N ¼ 1,247 MBE deletion of the gene. Null mutation fitness defect is expected to be a confounding variable as it correlates strongly both with the number of genetic interactions a gene has (Costanzo et al. 2010) and its haploinsufficiency (Deutschbauer et al. 2005). Indeed, whereas genes with more negative genetic interaction partners are more likely to be haploinsufficient (Park and Lehner 2013), this association disappears when we restrict the analysis to those genes that exhibit a slow growth phenotype upon complete gene deletion ( fig. 2B, Furthermore, the gene expression noise of the epistatic partners does not influence the haploinsufficiency of the focal gene ( fig. 2A). Thus, neither the number nor the expression noise of negative genetic interaction partners influences the dosage sensitivity of genes.

Epigenetic Epistasis Is Not Expected to Aggravate the Fitness Cost of Expression Noise
Having demonstrated that available large-scale data does not support the verbal model of Park and Lehner, we next asked why gene expression noise might not be constrained by epistatic interactions. We propose that the verbal model fails because gene expression noise not only decreases, but also increases protein concentrations over the wild-type level. Whereas a simultaneous drop in the expression levels of two interacting genes results in a greater-than-expected fitness defect, the opposite might be true when one gene is overexpressed whereas the other is simultaneously underexpressed as a result of stochastic variation. Indeed, a general mathematical model in which epistasis is simply determined by the shape of the gene activity-fitness curve supports this notion ( fig. 3). In particular, if cellular fitness is a diminishing return function of the sum expression level of the two genes (i.e., a hyperbolic curve) then fluctuations in the level of the two proteins is expected to yield close-to-zero average epistasis. For a geometrical explanation of this phenomenon, see figure 3. As argued in the "Discussion" section, a diminishing return function between gene activity and fitness appears to be a general property of genetic systems (Kacser and Burns 1981;Fell 1992;Deutschbauer et al. 2005).
To examine whether this theoretical expectation also applies to a realistic biological system, we next turned to metabolic modeling. As we aimed to study the fitness impact of quantitative perturbations to protein levels, we employed a detailed kinetic model of a well-characterized pathway, yeast glycolysis. Specifically, a recent work systematically measured the kinetic parameters of glycolytic enzymes under the same standardized condition and built a self-consistent and physiologically relevant mathematical model of this pathway (Smallbone et al. 2013). The model captures metabolite  where the X-axis is the sum of expression levels of two proteins that perform the same function with identical activities (i.e., full redundancy), and Y-axis shows the fitness corresponding to a given expression state.
Gray arrows indicate fitness loss upon reducing the expression level of one of the two proteins in a state where the other protein is expressed at the wild-type level (WT). When expression is reduced in a state where the other protein is underexpressed (UE), the observed fitness will be lower than expected based on the wild-type state resulting in negative epistasis (red arrow). On the other hand, reducing expression from an overexpression state (OE) causes positive epistasis (blue arrow), i.e., when the observed fitness is higher than expected. (B) When expression of both proteins is altered around the wild-type level in the model, both negative and positive epistatic interactions occur in a symmetric manner. (See "Materials and Methods" section for more details.) Color scale shows epistasis score values. Boross and Papp . doi:10.1093/molbev/msw236 MBE concentrations with good accuracy (Smallbone et al. 2013) and is able to predict the fitness impact of single-gene deletions (Spearman correlation between experimental and in silico fitness values: rho ¼ 0.78, P < 0.003, see "Materials and Methods" section). In line with the assumption of our general mathematical model ( fig. 3) and with previous empirical observations (Kacser and Burns 1981;Fell 1992), the in silico calculated gene activity-fitness curves of all genes in the glycolytic pathway model are far from linear and can be well-approximated by hyperbolic functions (supplementary fig. S2, Supplementary Material online).
Using the glycolytic model, we scrutinized whether negative epigenetic epistatic interactions impose a significant fitness burden in the presence of gene expression noise, which is the basic assumption underlying the verbal model of Park and Lehner. As expression noise is expected to incur a marked fitness cost even in the absence of epistatic interactions (Wang and Zhang 2011), we were specifically interested in the extent to which epigenetic epistasis further aggravates this baseline cost. To this end, we first identified gene pairs displaying negative epigenetic epistasis by simulating the fitness effect of a fixed amount of decrease in gene dosage in either one or both of the genes (see "Materials and Methods" section for details). For each such interacting enzyme pair (table 2), we then monitored the fitness of a population of 10 6 cells in which the two genes displaying negative epigenetic epistasis were stochastically expressed (for tractability, the expression of all other genes was deterministic, see "Materials and Methods" section). Noise levels were set to experimentally determined values based on Newman et al. (2006). We first computed the fitness advantage of reducing noise in one of the stochastically expressed genes under the assumption that it does not act epistatically with the other noisy gene (i.e., we enforced zero epistasis in the simulations, see "Materials and Methods" section). We reduced noise by 30% below the wild-type level as this value represents an upper estimate of noise reduction that can typically be achieved without altering the mean expression level (see "Materials and Methods" section and supplementary fig. S3, Supplementary Material online). We note that single mutations in the promoter of a yeast metabolic gene reduced protein noise to a much lower extent than this value (Metzger et al. 2015). We investigated the effect of noise reduction in each gene in turn, whereas holding constant the noise of its partner gene (for a toy example, see fig. 4A). As expected, decreasing noise even in the absence of epistasis caused a fitness advantage (bootstrap test, P < 10 À4 in 90% of all cases, see table 2 and fig. 4A and B). Remarkably, repeating the calculations without enforcing zero epistasis between the gene pairs revealed a statistically indistinguishable fitness advantage of noise reduction (see fig. 4A and B; bootstrap test, P ! 0.1 in all cases, table 2). Thus, according to the mathematical model, the presence of negative epigenetic epistasis between stochastically expressed genes does not incur a noticeable extra fitness cost compared with nonepistatic genes.
To illustrate the mechanistic basis of this result, we focused on TDH1 and TDH3, which show negative epistasis when their dosage is simultaneously reduced. Figure 4C shows the in silico calculated epistasis scores at different expression values around the wild-type level for these two genes. Considering all possible combinations of expression changes in TDH1 and TDH3 (--,Àþ,þÀ, þþ) demonstrated that not just negative, but also positive epistasis scores occur frequently and, on average, cancel each other out. This finding is consistent with the prediction of our general mathematical model ( fig. 3) and appears universal as it applies to all gene pairs of the glycolytic model showing negative epigenetic epistasis (supplementary fig. 4, Supplementary Material online). In sum, noise-induced epigenetic epistasis is not expected to incur an extra fitness cost because it has a net value close to zero as the expression level of the two genes stochastically varies up and down around the wild-type value. Based on these considerations, we do not expect to find adaptively reduced expression noise in genes with many NOTE.-The mean in silico fitness advantage of reducing expression noise in a partner gene was calculated for all gene pairs displaying negative epigenetic epistasis (i.e., a negative epistasis value upon dosage decrease). The absence of epistasis (to calculate fitness advantage owing to single locus effect) was simulated by eliminating the additional fitness effect of the epistatic interaction between the two genes as described in the "Materials and Methods" section. Reducing expression noise in one member of the pair, whereas keeping the noise of the other member at wild-type level significantly improved fitness in almost all cases. However, this fitness improvement was not enhanced by epistatic effects (see also fig. 4B). Simulations were performed using a population of 10 6 cells, and P-values were calculated by bootstrapping (see "Materials and Methods" section). Significant fitness improvements are marked by asterisks: */**/*** corresponding to P < 0.05/P < 0.01/P < 0.001, respectively.
Protein Noise-Induced Epigenetic Epistasis . doi:10.1093/molbev/msw236 MBE genetic interaction partners. In addition, epigenetic epistasis is also unlikely to constrain expression divergence on an evolutionary timescale. The latter is because gene pairs that show negative epistasis when their expression is concurrently lowered generally show positive epistasis when their expression levels are altered in opposite directions (figs. 3B and 4C, sup plementary fig. S4, Supplementary Material online). As a consequence, whereas negative epigenetic epistasis between genes might slow down expression divergence towards concurrent downregulation, it can possibly speed up divergence towards other combinations of expression changes. Taken together, we do not expect epistatic interactions to constrain expression divergence between species.

Discussion
Epistatic interactions have a fundamental influence on various evolutionary phenomena, including the accessibility of evolutionary trajectories (Weinreich et al. 2005), the evolution of sexual reproduction (de Visser and Elena 2007), the accumulation of harmful mutations (Kondrashov 1994) and the genetic divergence between species (Wu and Palopoli 1994). A recent verbal model proposed that epistasis also shapes the evolution of gene expression noise and the conservation of gene expression between species (Park and Lehner 2013). For gene pairs where simultaneous mutation is especially harmful (i.e., negative or synergistic genetic interaction), a diminished gene expression noise should be beneficial as it helps to avoid concurrent low expression of both gene products in the same individual. Thus, genes involved in numerous negative epistatic interactions are expected to evolve lower stochastic variation between individuals. In a similar vein, such genes should display highly conserved gene expression patterns across species as the opportunity for neutral expression changes should be more constrained. Here, we revisited this verbal model using bioinformatics analysis of available functional genomic data and biochemical modeling. First, in contrast to the specific predictions of Park and Lehner's model, we found that (i) not only negative, but also positive epistatic "hubs" are associated with lower expression variation, (ii) genes with high-noise epistatic partners do not show more constrained expression variation, and (iii) genes with many epistatic partners are not especially sensitive to dosage reduction. Furthermore, we revealed that both gene expression noise and expression conservation are largely determined by previously described gene properties (Newman et al. 2006;Tirosh and Barkai 2008), such as its functional role, and not by the number of its epistatic partners. Thus, the original verbal model is unsupported by available large-scale functional genomic data. However, why do the predictions of this conceptually simple and intuitive model fail? We propose that whereas selection is expected to reduce expression noise, this effect is unlikely to be enhanced for genes with negative epigenetic epistatic interactions. Using mathematical modeling, we showed that if fitness is a diminishing-return function of gene activity, then the presence of negative epigenetic epistasis does not generally aggravate the fitness burden of gene expression noise (for possible counterexamples see below). There is ample evidence for the widespread occurrence of  (table 2). By eliminating the effect of epistasis when calculating fitness (see "Materials and Methods" section), we were able to separate the effect of reducing gene expression noise on fitness into two distinct categories: fitness advantage caused by single locus effect (without epistasis) and epistasis. (B) Distribution of in silico fitness benefits due to single locus (blue) and epistatic effects (brown). Whereas reducing noise causes a significant increase in fitness (see also table 2), this effect is merely caused by single locus effects and is not enhanced by epistatic effects (fitness advantage due to epistasis is minuscule and more often negative, than positive). Boross and Papp . doi:10.1093/molbev/msw236 MBE diminishing-return gene activity-fitness relationships in genetic systems, including nonmetabolic genes. First of all, it has long been established that most loss-of-function and strongly deleterious mutations are recessive (Fisher 1928;Wright 1934;Simmons and Crow 1977;Orr 1991), that is a 50% decrease in gene dosage often has only a negligible phenotypic effect. This classical observation has been reinforced by recent quantitative genome-wide screens in S. cerevisiae (Deutschbauer et al. 2005;Delneri et al. 2008) and S. pombe  demonstrating that the vast majority of heterozygous null mutations have no measurable fitness effect. Furthermore, detailed quantitative studies of single proteins, including various enzymes (Kacser and Burns 1981;Dykhuizen et al. 1987;Fell 1992) and a heat shock protein (Jiang et al. 2013), have demonstrated that the activity-fitness relations fit well hyperbolic curves. A recent systematic analysis quantifying the effect of expression level on fitness for 81 yeast genes also reported a high frequency of diminishing return relationships (Keren et al. 2016). Based on the prevalence of such gene activity-fitness functions, we expect that our results inferred from metabolic pathway simulations should be generally applicable. We note that although our mathematical analysis focused on individual epistatic interactions, we expect that even the cumulative effect of multiple epistatic interactions should not aggravate the fitness cost of noise in epistatic hubs. This is because the extra cost of noise associated with a single epistatic partner fluctuates around zero and often has a negative value ( fig. 4B).
Whereas our results imply that epigenetic epistasis does not generally constrain the evolution of gene expression, it could still have an important physiological or evolutionary impact in specific genes. First, a further theoretical analysis showed that when fitness is not a hyperbolic but rather a sigmoid function of gene activity (i.e., switch-like behavior) then the presence of negative epistasis does indeed aggravate the fitness burden of gene expression noise (supplementary fig. 5, Supplementary Material online). Such S-shape relationships between gene activity and phenotype values are mostly thought to occur in developmental, signaling and regulatory circuits (Veitia et al. 2013) [but see Keren et al. (2016) for metabolic examples]. Notably, the best examples of how stochastic variation in the activity of genetic interaction partners determines the phenotypic outcome of a mutation (Lehner 2013) come from developmental regulator genes of C. elegans (Raj et al. 2010;Burga et al. 2011) and Bacillus subtilis (Eldar et al. 2009). A second possible scenario in which epigenetic epistasis could enhance the effect of stochastic expression variation is when the noise of epistatically interacting genes is correlated. Correlated fluctuation in gene expression is expected to result in a net negative value of epigenetic epistasis even if the shape of the activity-fitness curve is hyperbolic. Whereas certain gene pairs show correlated expression noise in eukaryotes (i.e., noise regulons, Stewart-Ornstein et al. 2012), this phenomenon is likely to be more frequent in bacteria owing to the widespread occurrence of operonic organization. Further studies are needed to examine whether noise-induced epigenetic epistasis plays a significant role in shaping gene expression evolution in bacteria.
Finally, we emphasize that all currently feasible large-scale empirical tests of Park and Lehner's model suffer from the same shortcoming. Notably, all tests assume that epistasis between partial reductions in the activity of two genes can be well approximated by epistasis between null mutations of the very same genes. However, completely abolishing gene expression is not equal to reducing expression level by $20%-a typical value of stochastic variation in protein levels in yeast (Newman et al. 2006). As mentioned above, even halving the dosage causes a measurable fitness defect for only a minority of genes in yeast (Deutschbauer et al. 2005). Thus, it remains to be seen whether epistasis between mild alterations in gene activity show any correlation with epistasis between null mutations. Undoubtedly, an ultimate empirical test of the model demands systematic genetic interaction data between dosage decrease mutations.

Mathematical Modeling of the Yeast Glycolytic Pathway
To quantitatively study the fitness impact of protein level perturbations, we used a detailed kinetic model of a metabolic pathway. We focused on yeast glycolysis as a model system for several reasons: (i) it is an important area of metabolism and flux through the pathway can be directly related to cellular fitness, (ii) a consistent kinetic characterization of its enzymes has recently been carried out and made available as a mathematical model (Smallbone et al. 2013) (i.e., all enzymes were assayed under standard conditions designed to mimic the intracellular environment), (iii) it contains several isoenzymes that exhibit negative genetic interactions. Modeling was performed using the widely applied COPASI software package (version 4.12.81, Hoops et al. 2006) and steady-state fluxes were predicted. The model encompasses 21 enzymes with k cat values, associated with 12 biochemical reactions and 17 metabolites.
In silico fitness was approximated by the steady state flux through the pyruvate kinase reaction as this represents the glycolytic flux. Changes in expression levels were modeled by changing the turnover number (k cat ) values of the corresponding enzymes whereas null mutations were simulated by setting k cat to 1% of their original values. The value of 1% was chosen arbitrarily as the simulation would not run with a k cat value set to zero. However, the null mutation simulation results were robust to the exact value of kcat (from 0.5% to 3%). Epistasis scores were calculated according to the multiplicative model (Mani et al. 2008): where is the epistasis score, F AB is the observed fitness of the double mutant, whereas F A and F B are the observed fitness of the single mutants. F A Â F B is the expected fitness of the double mutant assuming a multiplicative model. When simulating fitness with enforcing zero epistasis between a pair of genes, fitness was obtained as F A Â F B . We used a cutoff of < À0.0015 for negative epigenetic epistasis (see supplemen To calculate epigenetic epistasis between dosage-decrease mutations, we reduced gene expression by 21% (the same value was applied for all genes). This is the mean of the coefficient of variations (CV) of experimentally measured protein levels of metabolic genes based on Newman et al.'s (2006) dataset. Thus, we applied a dosage reduction that is one standard deviation below the mean expression level based on the noise properties of an average enzyme.
To study in silico the fitness effect of gene expression noise, we simulated a population of 10 6 individuals as follows. For each virtual individual, we assigned a gene expression level to the gene of interest by drawing from a normal distribution with the mean set to the original expression level in the model and standard deviation (SD) set to the experimentally measured SD of protein levels for the given gene based on (Newman et al. 2006). Then the fitness of each individual was simulated deterministically and the fitness effect of noise was averaged across the population of 10 6 individuals.
To calculate the fitness advantage of reducing expression noise in one gene in the presence of wild-type noise level in its partner gene, we again simulated a population of 10 6 individuals as above with the following modification. In one member of the gene pair expression noise was set to its experimentally measured value, whereas it was reduced to 70% in the other member. This value was determined based on the amount of variation in noise levels that is observed across yeast genes that have approximately the same expression level (see sup plementary fig. S3, Supplementary Material online for details). We assume that this value is representative of the level of noise reduction that can possibly be achieved during evolution. We note that this is likely to be an overestimate of the amount of noise reduction that can be achieved by single mutations: according to a recent study (Metzger et al. 2015), the extent to which a single point mutation in the promoter region affected expression noise was substantially smaller than 30%. In these simulations, apart from the gene pair in question, the expression level of all other genes was kept constant (i.e., zero noise) to be able to separate the fitness impact of stochastic expression in an epistatic pair from the rest of the system, and also, to be able to simulate fitness with enforcing zero epistasis between the gene pair (see above).

Bootstrap Analyses to Assess the Significance of Fitness Differences
To statistically assess whether noise reduction is advantageous in a gene pair which either exhibits negative epistasis upon dosage reduction or exhibits no epistasis, we carried out resampling-based bootstrap tests as follows. The statistical significance of the fitness increment caused by reducing noise in one of the two proteins (with or without epistasis) was calculated by resampling 10,000 times both from the 10 6 fitness values simulated with wild-type noise and also from the 10 6 fitness values simulated with the noise level of one gene reduced to 70%. We defined P-values as the relative frequency of paired bootstrap samples in which reducing the noise did not increase fitness. We used a similar procedure to assess whether the presence of epistasis aggravates the advantage of noise reduction: we calculated the fitness effect of noise reduction by pairing the fitness values of individuals with wild-type noise for both genes and reduced noise for one of the genes. We performed this for fitness values inferred with and without epistasis (see "Materials and Methods" section on pathway modeling), resulting in 10 6 fitness difference values in the presence and 10 6 fitness difference values in the absence of epistasis. Finally, we calculated the P-value of the difference between these two sets using the bootstrap analysis described above.

A Conceptual Model of Epigenetic Epistasis When Fitness Is a Hyperbolic Function of Gene Activity
In the analysis shown on figure 3, epistasis scores for expression value pairs are calculated using the dose-response function f(x) ¼ (1 þ e À 5 * (x À 0.2) ) À1 , where f(x) is fitness and x is the sum of expression levels of the protein pair. We assume here the simplest case, two proteins executing the same function with the same activity being expressed at the same level. Model parameters were chosen so that a strong deletion epistasis is present between the two genes and wild-type expression level is on the plateau of the curve, but our results are robust to changes in the exact parameter values.

Large-Scale Datasets Employed in the Bioinformatics Analyses
Experimental data on single mutant fitness, positive and negative genetic interactions between null mutations were taken from the genome-wide screen of Costanzo et al. (2010). For each statistical test or plot, genetic interaction hubs were defined as genes in the highest quintile (highest 20%), when ranked by the number of epistatic interaction partners. Data on protein expression noise was obtained from the highthroughput single-cell proteomic measurements of Newman et al. (2006), which provides information on >2,200 yeast proteins. When determining noise levels of proteins, the raw coefficient of variation values were used (CV). To control for the influence of protein abundance on expression noise when identifying genomic variables that explain noise, the distance of the CV value from a running median of CVs was used instead [DM, see Newman et al. (2006) for details]. Experimental data on heterozygous deletion fitness in diploid yeast, the list of haploinsufficient genes and the list of slow-growing homozygous deletions were obtained from the fitness measurements of Deutschbauer et al. (2005). Expression divergence data comparing gene expression time series measurements in four yeast species under identical environments and stresses was taken from Tirosh et al. (2006) (with parameter c ¼ 2, standard by Tirosh et al.). GO terms for genes were downloaded from the Saccharomyces Genome Database (Dwight et al. 2002). As expression level, we used mRNA expression level as measured by RNA sequencing (Nagalakshmi et al. 2008). Environmental genetic degree was defined as the number of unique conditions in which the removal of the gene resulted in a fitness defect according to the large-scale chemogenomic screens of (Hillenmeyer et al. 2008). Information on protein complex Boross and Papp . doi:10.1093/molbev/msw236 MBE memberships were taken from two datasets: a manually curated dataset based on tandem affinity purification/mass spectrometry studies (Pu et al. 2009) and a consensus dataset of core protein complexes compiled by Benschop et al. (2010). The list of enzyme-encoding genes was taken from the yeast genome-scale metabolic network reconstruction of Mo et al. (2009).