Heterosis of Fitness and Phenotypic Variance in the Evolution of Diploid Gene Regulatory Network

Heterosis describes the phenomenon whereby a hybrid population has higher fitness than an inbred population, and has previously been explained by either Mendelian dominance or overdominance, where it is generally assumed that one gene controls one trait. How-ever, recent studies have demonstrated that genes interact through a complex gene regulatory network (GRN). Furthermore, phenotypic variance due to noise is reportedly lower for heterozygotes, whereas the origin of such variance-related heterosis remains elusive. There-fore, a theoretical analysis linking heterosis to GRN evolution and stochastic gene expression dynamics is required. Here, we investigate heterosis related to fitness and phenotypic variance in a system with interacting genes, by numerically evolving diploid GRNs. According to the results, the heterozygote population exhibited higher fitness than the homozygote population, that is, fitness-related heterosis resulting from evolution. In addition, the heterozygote population expressed lower noise-related phenotypic variance in expression levels than the homozygous population, implying that the heterozygote population is more robust to noise. Furthermore, the distribution of the ratio of heterozygote phenotypic variance to homozygote phenotypic variance exhibited quantitative agreement with previous experimental results. By applying dominance and over-dominance to the gene expression pattern rather than only a single gene expression, we confirmed the correlation between heterosis and overdominance. We explain our results by proposing that the convex high-fitness region is evolutionarily shaped in the genetic space to gain noise robustness under genetic mixing through sexual reproduction. Significance Statement Heterosis, that is higher fitness in hybrid populations than inbred populations, is a long-standing problem in genetics, evolution, and breeding studies. Heterosis is not necessarily the result of a trait uniquely determined by a single gene, but likely involves stochasticity and interactions among genes. Through numerical evolution of the gene regulatory network, we demonstrate that heterosis is shaped by evolution to achieve robustness against noise and genetic mixing by sexual recombination in gene expression dynamics. That is, a mixed population is more robust to noise than inbred populations, as revealed experimentally by reduced phenotypic variance. The observed link between heterosis and phenotypic robustness, Mendelian dominance, and the convex single-humped fitness landscape represents a novel avenue in evolution and genetics research.


Introduction
Sexual reproduction generally involves the mixing of genetic 2 information from multiple parents. Most diploid organisms 3 undergo meiosis, which recombines genomes between parents 4 to produce gametes, then combines the gametes to produce the 5 offspring's genome. One of the most remarkable phenomena in 6 sexual reproduction among populations is heterosis, whereby 7 hybrid populations exhibit higher fitness than inbred popu-  gene Aa expresses the trait over 'A' or 'a', it is called over- 17 dominance. For decades, the relationship between dominance 18 (or overdominance) and heterosis has been recognized as a 19 significant genetics problem; The first is the dominance hy-20 pothesis, which explains heterosis by assuming that Mendelian 21 dominance cancels out the phenotypes that reduce fitness in 22 heterozygotes. In contrast, the overdominance hypothesis pro- 23 poses that heterozygotes can have a novel trait that exhibits 24 higher fitness than both parental homozygotes. However, no 25 solid conclusions have yet been reached(6, 7). 26 In the standard context of heterosis, the fitness or trait is 27 compared between inbred and hybrid populations. In contrast, 28 Phelan and Austad examined data relating to the traits of 29 over a hundred species, and demonstrated that the variance 30 of traits also shows heterosis, in the sense that the variance 31 of traits for hybrid populations is smaller than that of inbred 32 populations (8). They also discussed the potential relevance 33 of genotype-environment interactions for this variance-related 34 heterosis; however, its theoretical explanation remains elusive. 35 To discuss the variance-related heterosis, we need to con-36 sider the noise or environmental variation, and robustness 37 of traits against such variation(9-13). However, the possible 38 link between noise and heterosis has rarely been discussed 39 (14), despite the fact that stochasticity in phenotypes (gene 40 expression) has recently received substantial attention in the 41 context of homeostasis and cell state selection (15)(16)(17)(18). 42 Moreover, the evolution of gene regulatory networks has 43 been extensively studied (19)(20)(21)(22)(23). To date, arguments related 44 to heterosis are based on the simple genotype-phenotype re-45 lationship, that is, the case in which one gene determines 46

Significance Statement
Heterosis, that is higher fitness in hybrid populations than inbred populations, is a long-standing problem in genetics, evolution, and breeding studies. Heterosis is not necessarily the result of a trait uniquely determined by a single gene, but likely involves stochasticity and interactions among genes. Through numerical evolution of the gene regulatory network, we demonstrate that heterosis is shaped by evolution to achieve robustness against noise and genetic mixing by sexual recombination in gene expression dynamics. That is, a mixed population is more robust to noise than inbred populations, as revealed experimentally by reduced phenotypic variance. The observed link between heterosis and phenotypic robustness, Mendelian dominance, and the convex single-humped fitness landscape represents a novel avenue in evolution and genetics research. KO and KK conceived and designed the studies. KO conducted a simulation. KO and KK analyzed the data and wrote the manuscript. All authors have read and approved the final version of the manuscript.
None of the authors have any competing interests to declare. 1 To whom correspondence should be addressed. E-mail: kaneko@complex.c.u-tokyo.ac.jp

D R A F T
ij (xj(t)−θ)]) as follows: Genetic changes progress according to the following pro-107 cedure. First, fitness is determined by the distance of the 108 expression pattern of genes {xi(T )} from a prescribed target 109 pattern for i = 1, 2, ..., M (< N ). Two parents are chosen with 110 a probability proportional to the fitness. Two genomes from 111 each parent, J (1) ij and J (2) ij , were mixed by recombination to 112 provide a gamete. Next, two gametes from the two parents 113 provide a new J (1) ij and J  We first measured the fitness of the homozygote and het-125 erozygote populations, where the mean values for the popu-126 lations are given by W homo and W hetero , respectively. Next, 127 phenotypic variance according to noise Vnoise is defined as 128 the magnitude of the variance of xi(T ) for individuals with 129 the same genotype but with different noise added to the gene 130 expression dynamics (see Methods for the definition). Vnoise 131 is computed as the average of the variance over genes i for 132 homozygote and heterozygote populations, giving V homo noise and 133 V hetero noise , respectively.

135
Fitness-related heterosis. First, we discuss heterosis related 136 to fitness. The change in the degree of fitness-related heterosis 137 W hetero /W homo during evolution is shown in Fig.1A. The 138 ratio of W hetero /W homo increased for earlier generations, and 139 then decreased. After evolution (at the 999th generation), 140 inequality W hetero /W homo > 1 was maintained; therefore, 141 fitness-related heterosis was generally observed. (See Fig. S1 142 for the evolution of the distributions of W homo and W hetero .) 143 In Fig.1B, we plotted W hetero versus W homo for each sam-144 ple after evolution. Except for three of the 100 samples, 145 W hetero > W homo was satisfied, indicating evolution of het-146 erosis. Then, across each sample, we computed the average 147 ratio of W hetero /W homo and examined its dependence on the 148 mutation rate. As shown in Fig.1C, with µ, indicating that a greater degree of heterosis evolved 150 at a higher mutation rate. Additionally, at higher µ, the ratio 151 of heterosis W hetero /W homo was larger at a larger σ.

245
Similarly, variance-related heterosis was negatively corre-246 lated with pattern dominance but positively correlated with 247 pattern overdominance; however, the correlation coefficient 248 was smaller than that for fitness-related heterosis (see Fig. 249 S3).

250
Explanation of the results according to the convex fitness re-251 gion and single-humped fitness function. In this section, we 252 suggest an intuitive explanation of our results using fitness 253 landscape pictures. Nevertheless, further mathematical eluci-254 dation is required in the future. 255 We consider a fitness landscape over a genotype space (35). 256 In particular, we discuss the high-fitness region. After evolu-257 tion, offspring are produced from the two parents by meiosis 258 within this high-fitness region. The genotypes of the diploid 259 genome (J ij )/2 will be located approximately at the 260 internal division of the genotype of the parents. The offspring 261 must have high fitness; otherwise, the offspring will not be 262 selected. For example, when the higher-fitness region is not 263 convex (Fig.4A), the offspring may not be selected because 264 of their lower fitness compared to the parents. Therefore, the 265 high-fitness region for the population under sexual reproduc-266 tion and selection shapes the convex region in the genotype 267 space (Fig.4B).

268
The fitness landscape, then, is expected to take a single-269 humped structure around the center of the region that de-270 creases as the genotype moves toward the periphery of the 271 high-fitness region (Fig.4B). When we consider the robustness 272 of the fitted state, the fitness is expected to be less sensitive to 273 genetic change around the center than at the periphery. This 274 suggests that the fitness function is flat around the center and 275 Fig. 4. Schematic of the fitness landscape and high-fitness region. (A)When the high-fitness region is not convex, the offspring may have lower fitness than the parents; therefore, the linage will not be selected. (B)When the high-fitness region is convex over the genotype space, the offspring's fitness is higher than or equal to that of the parents. Thus, the fitness of the heterozygote is higher than that of the homozygote, leading to heterosis.  higher fitness than homozygote populations, indicating hetero-319 sis. Heterosis exhibited a negative correlation with dominance 320 and a positive correlation with overdominance. Moreover, phe-321 notypic variance due to noise was lower in the heterozygote 322 population than in the homozygote population, indicating 323 variance-related heterosis. Furthermore, the numerically ob-324 tained distribution of the ratio of homozygote population 325 variance to heterozygote population variance agreed quantita-326 tively with that of previous measurement data (8).

327
We then proposed a possible explanation for these results 328 according to the convex fitness landscape. Specifically, the 329 observed landscape formation may arise if high fitness is main-330 tained under the conditions of sexual reproduction and noise. 331 Therefore, to increase or maintain fitness during sexual re-332 production, it is necessary to create a population of genomes 333 in which the fitness of the offspring is equal to or greater 334 than that of the parents. In addition, the population located 335 inside the high-fitness region will evolve to avoid the periphery 336 of the region in which the phenotype (fitness) is vulnerable 337 to noise. This fitness landscape implies epistasis. However, 338 epistasis alone is not sufficient to support heterosis. A convex 339 high-fitness region with a fitness function that flattens around 340 the top is a stronger hypothesis than epistasis, as it concerns 341 D R A F T the global shape of the fitness landscape for the entire genome 342 population. The validity of this hypothesis requires further ex-343 amination in theoretical models with evolving gene expression 344 dynamics, as well as experimental data. 345 We also studied the relationship between overdominance 346 and heterosis. Instead of a single locus, the dominance and 347 overdominance of expression patterns at multiple loci were 348 defined and measured in this study. Note that the previous 349 explanation for heterosis by overdominance was based on 350 the locus-specific overdominant effect in (36), whereas its molecular mechanism is still unclear (37). In contrast, pattern

399
Let us define y m i (t) as the concentration of mRNA i from each chromosome (m = 1, 2) and x i (t) as the concentration of the corre-sponding protein in a cell. Note that the proteins are synthesized from both chromosomes, where mutations or recombinations are mainly introduced in the promoter or enhancer region, so that the superscript m in x i (t) is not required. By replicating the regulation in the diploid cell by J (m) ij (m = 1, 2) for each gene, we obtain where θ(= 0.5) is the threshold of expression, which makes the states x = 0 and x = 1 symmetric for simplicity. As protein i is synthesized from the corresponding mRNA, we then obtain Evolutional simulation. We take the initial condition  AA and BB. Next, the expression pattern is computed by making the heterozygote AB. We then calculate the percentage of the expression pattern obtained from the heterozygote that matches only one of AA or BB for the differentially expressed