The Statistical Scale Effect as a Source of Positive Genetic Correlation Between Mean and Variability: A Simulation Study

The selection objective in animal production is to obtain the highest income at the lowest production cost while ensuring the highest animal welfare. A selection experiment for environmental variability of birth weight in mice showed a correlated response in the mean after 20 generations, starting from a crossed panmictic population. The relationship between birth weight and its environmental variability explained this correlated response, and the scale effect represents a potential cause of the correlation. Under the scale effect, the higher the mean, the higher the variability. The aim of this study was to quantify by simulation the genetic correlation between a trait and its environmental variability that can be attributed to the scale effect, for coefficients of variation (CV) and heritabilities between 0.05 and 0.50. The resulting genetic correlation ranged from 0.1335 to 0.7021, being highest for the highest heritability and the lowest CV. The scale effect for a trait with heritability between 0.25 and 0.35 and CV between 0.15 and 0.25 generated a genetic correlation between 0.43 and 0.57. The genetic coefficient of variation (GCV) affecting residual variability was modulated by the strength of the scale effect, which reduces its impact. GCV ranged from 0.0050 to 1.4854. The strength of the scale effect might lie anywhere in the range between 0 and 1. The scale effect would explain many reported genetic correlations and the additive genetic variance for the variability. This is relevant when the aim is to increase the mean of a trait jointly with a reduction of its variability.

Genetic correlation is explained by linkage, pleiotropy or a combination of these factors (Falconer and Mackay 1996). In addition, the skewness of the residual distribution provides information about the genetic correlation between a trait and its variability. Regarding the possible causes of the mean-variability genetic correlation, Sorensen and Waagepetersen (2003) demonstrated that the skewness of the distribution of the residuals determines the sign of the genetic correlation, given that points in the skewed tail are more variable when they are located farther from the mean. Gutiérrez et al. (2006) observed this effect in litter traits in mice. Yang et al. (2011) demonstrated how a Box-Cox transformation (Box and Cox 1964) facilitates working with a non-skewed variable. Previous studies demonstrated that the mean-variability genetic correlation for litter size changed from -0.73 to 0.28 in rabbits and from -0.64 to 0.70 in pigs when the variable was transformed. However, the transformed variable is difficult to interpret (Pun et al. 2013). Although skewness can explain this genetic correlation to an extent, it is not present in traits often enough to account for significant and relevant genetic correlations. Therefore, there must be other reasons for the correlation to appear in the data. The magnitude and sign of the genetic correlation between the trait mean and its environmental variability is a concern given the correlated genetic response in one of them when selecting for the other. A wide range of genetic correlation estimates, from -0.93 to 0.97, has been reported in the literature (Hill and Mulder 2010; Formoso-Rafferty 2017). Formoso-Rafferty et al. (2016a) carried out a successful divergent selection experiment for birth weight environmental variability in mice.
The birth weight trait was normally distributed (Formoso-Rafferty et al., 2016a), and a positive but low genetic correlation between birth weight and its environmental variability was observed. The statistical scale effect can be defined as the relationship between the mean and the variability of a trait in the sense that the higher the mean of a variable, the higher its variability. The genetic correlation estimated by Formoso-Rafferty et al. (2016a) could be attributed to the statistical scale effect, which was also suggested as a possible cause of the correlated trend in variability in another experiment selecting for weight gain in mice (Moreno et al. 2012).
The scale effect represents a potential cause for the observed correlation between a trait mean and its variance. The scale effect is caused by a direct relationship between the mean and variance, for instance when a trait has a constant coefficient of variation (CV) such that an increase in the mean also increases the variability. For most distributions the variance is directly connected to the mean, where the normal distribution is an important exception. In a Poisson distribution the variance is equal to the mean and in a Gamma distribution there is a direct relationship between the mean and CV (Rönnegård and Valdar 2012).
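As a rough numerical illustration of these mean-variance relationships, the following Python sketch draws Poisson and Gamma samples at three mean levels. The sample size, the mean levels and the Gamma shape parameter are hypothetical choices made only for this illustration; with a fixed shape k, a Gamma variable has CV = 1/sqrt(k) whatever its mean, so its standard deviation grows in proportion to the mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000      # hypothetical sample size
shape = 4.0      # hypothetical Gamma shape; implies CV = 1/sqrt(4) = 0.5

summary = {}
for mean in (5.0, 20.0, 80.0):  # arbitrary mean levels
    pois = rng.poisson(lam=mean, size=n)
    # Gamma with fixed shape: mean = shape * scale, so scale = mean / shape
    gam = rng.gamma(shape=shape, scale=mean / shape, size=n)
    summary[mean] = {
        "poisson_var": pois.var(ddof=1),
        "gamma_sd": gam.std(ddof=1),
        "gamma_cv": gam.std(ddof=1) / gam.mean(),
    }

for mean, s in summary.items():
    print(f"mean={mean:5.1f}  Poisson var={s['poisson_var']:7.2f}  "
          f"Gamma SD={s['gamma_sd']:6.2f}  Gamma CV={s['gamma_cv']:.3f}")
```

In this sketch the Poisson variance tracks the mean, while the Gamma standard deviation increases with the mean at a roughly constant CV, which is the pattern referred to here as the scale effect.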
The scale effect is then a consequence of a man-made way of measuring heterogeneity, which makes it difficult to express the real underlying natural process (Sun et al., 2013). Among the implications of this mean-variability relationship are the concerns raised in genome-wide association studies, in which "gene by gene" and "gene by environment" interactions are confounded with marker effects on variability. Several methodologies have been developed to correct for the effect (Rönnegård and Valdar 2012), particularly the use of the CV (Mackay and Lyman 2005). This and other transformations, such as monotonic (Sun et al., 2013) or Box-Cox (Yang et al., 2011) transformations, have also been tried, but because of the ongoing debate on their feasibility it is advised to avoid them (Shen and Rönnegård 2013).
The presence of an additive genetic variance for a trait in a population would imply the possibility of changing the mean by selection of that trait which would automatically generate the appearance of an additive genetic variance for the variability originated by the scale effect.
The CV is a statistical parameter defined as the ratio of the standard deviation ($\sigma$) to the mean ($\mu$), which quantifies variability in a dimensionless manner. Thus, for a given CV, the higher the mean, the higher the standard deviation. Considering the CV to be fixed simplifies the scope of the study, as there is no natural direct relationship between mean and variability in a normally distributed variable. For a fixed CV, increasing the mean would automatically increase the standard deviation and vice versa. Since it is not common to have a constant CV across the whole range of a trait, a scenario in which the scale effect is modulated was also considered. Houle (1992) reported that the value of the CV for measuring variability depends on the degree to which it corrects for the relationships that exist between means and variances. Consequently, modifying the mean by selection would not automatically change the variability of the trait to the same extent. This would reduce the strength of the scale effect accordingly and, therefore, the additive genetic variance generated for the residual variance by the scale effect.
The objective of this study was to quantify by simulation to what extent the genetic correlation between a trait and its environmental variability could be attributed to the scale effect. The reduction of the scale effect due to an incomplete relationship between mean and variability, understood as the strength of the scale effect, was also assessed through the resulting additive genetic variance affecting the residual variance.

MATERIALS AND METHODS
The heteroscedastic model (HE) proposed by SanCristobal-Gaudy et al. (1998) was assumed to derive the additive genetic value $v_i$ affecting the residual variability, which has a Gaussian distribution, i.e., $v_i \sim N(0, \sigma_v^2)$. Under this model, the residual variance is heterogeneous and partially under genetic control. The simplest HE model was used:

$$y_i = \mu + u_i + \exp\left(\tfrac{1}{2}(\eta + v_i)\right)\varepsilon_i$$

where $y_i$ is the record $i$, $\mu$ is the mean of the trait, $u_i$ the additive genetic effect, $\exp(\eta)$ is the residual variance ($\sigma_e^2$) in the model HO, and $\varepsilon_i$ is a non-scaled residual with a Gaussian distribution, $\varepsilon_i \sim N(0, 1)$. It is assumed in this model that the corresponding vectors of additive genetic effects $\mathbf{u}$ and $\mathbf{v}$ can be correlated as follows:

$$\begin{pmatrix} \mathbf{u} \\ \mathbf{v} \end{pmatrix} \sim N\left( \begin{pmatrix} \mathbf{0} \\ \mathbf{0} \end{pmatrix}, \begin{pmatrix} \sigma_u^2 & \rho\,\sigma_u\sigma_v \\ \rho\,\sigma_u\sigma_v & \sigma_v^2 \end{pmatrix} \otimes \mathbf{A} \right)$$

where $\mathbf{A}$ is the additive genetic relationship matrix, $\rho$ is the genetic correlation and $\otimes$ is the Kronecker product. Note that the average value of $v$ does not correspond with the mean residual variance due to the exponential nature of the model (Mulder et al. 2007).

The scale effect describes the relationship between the mean and the variability such that the higher the mean of a variable, the higher its variability. Thus, for instance, when a variable is multiplied by a constant $k$, its standard deviation is in turn multiplied by $k$, with its CV unaltered. This relationship between the level of a trait and its variability was assumed here to hold for the performance level of each animal. Two simulation analyses were performed to explore the influence the scale effect can have on the genetic correlation between a trait and its residual variability. The first analysis was performed to compute the genetic correlation between a trait and the variability that arises as a consequence of the scale effect. The second analysis was performed to assess how the strength of the scale effect, arising from an incomplete determination of the variability by the CV, modulates the magnitude of the additive genetic variance generated for the variability by the scale effect.
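The HE model can be illustrated with a minimal Python sketch. It assumes the usual exponential form of this model, $y_i = \mu + u_i + \exp((\eta + v_i)/2)\,\varepsilon_i$, and unrelated individuals, so that the relationship matrix A is the identity; the function name `simulate_he` and all parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_he(n, mu, sigma2_u, sigma2_v, rho, eta, seed=0):
    """Simulate records under an exponential heteroscedastic model.

    Assumes unrelated individuals (A = identity), so each (u_i, v_i) pair
    is an independent draw from a bivariate normal distribution.
    """
    rng = np.random.default_rng(seed)
    cov_uv = rho * np.sqrt(sigma2_u * sigma2_v)
    G = np.array([[sigma2_u, cov_uv],
                  [cov_uv, sigma2_v]])
    uv = rng.multivariate_normal(np.zeros(2), G, size=n)
    u, v = uv[:, 0], uv[:, 1]
    eps = rng.standard_normal(n)            # non-scaled residual, N(0, 1)
    resid_var = np.exp(eta + v)             # individual residual variance
    y = mu + u + np.sqrt(resid_var) * eps
    return y, u, v, resid_var

# Illustrative (made-up) parameter values
y, u, v, resid_var = simulate_he(
    n=200_000, mu=100.0, sigma2_u=33.75, sigma2_v=0.09,
    rho=0.4, eta=np.log(191.25),
)
print("corr(u, v):            ", np.corrcoef(u, v)[0, 1])
print("exp(eta):              ", 191.25)
print("mean residual variance:", resid_var.mean())
```

Because $v$ enters through an exponential, the mean residual variance in this sketch exceeds $\exp(\eta)$ by a factor close to $\exp(\sigma_v^2/2)$, the mean of a log-normal variable, which is the kind of discrepancy the note on the exponential nature of the model refers to.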
(i) Mean-variability genetic correlation generated by the scale effect

A total of 100 scenarios were simulated for values of CV and $h^2$ ranging from 0.05 to 0.50 in increments of 0.05. Single records were simulated for a trait with a mean value ($\mu$) of 100. The values for $\mu$, CV and $h^2$ were fixed, and all the other parameters were derived from them. For the sake of simplicity, residual and additive genetic effects were assumed to be the only random effects in the model. Simulations were performed in such a manner that higher phenotypes would have higher variance while keeping the CV constant, as expected under the scale effect. Phenotypic variance ($\sigma_p^2$), residual variance ($\sigma_e^2$) and additive genetic variance ($\sigma_u^2$) were initially defined as follows:

$$\sigma_p^2 = (CV \cdot \mu)^2, \qquad \sigma_u^2 = h^2\,\sigma_p^2, \qquad \sigma_e^2 = (1 - h^2)\,\sigma_p^2$$

These reference variances were initially considered to be homogeneous to simulate the trait level for each individual, and then they were considered heterogeneous and modulated by the trait level as a direct consequence of the scale effect. The simulations were performed in three steps:

1. First, each record $y_i$ of an animal $i$ was simulated assuming a classical homoscedastic model (HO):

$$y_i = \mu + a_i + e_i$$

where $a_i$ is the additive genetic effect and $e_i$ the residual effect, which were randomly obtained from the following Gaussian distributions with unique variances:

$$a_i \sim N(0, \sigma_u^2), \qquad e_i \sim N(0, \sigma_e^2)$$

2. Second, the simulated record $y_i$ described the level of the phenotype of the individual $i$ and determined the equivalent residual standard deviation in this individual ($\sigma_{e_i}$), with CV and $h^2$ remaining constant. The scaled phenotypic standard deviation was no longer unique but dependent on the magnitude of $y_i$ describing the performance level:

$$\sigma_{p_i} = CV \cdot y_i$$

The phenotypic standard deviation was, therefore, considered heterogeneous, and this heterogeneity was transferred proportionally to both the residual and the additive genetic standard deviations, which also became heterogeneous.
The scaled residual standard deviation was then derived from the scaled phenotypic standard deviation and the heritability:

$$\sigma_{e_i} = \sigma_{p_i}\sqrt{1 - h^2}$$

The additive genetic variance also became heterogeneous in the simulation and dependent on the scale effect. This is contrary to the definition of the HE model above, but necessary to avoid obtaining residual variances higher than the phenotypic variance of the HO model used as reference. Therefore, the additive genetic standard deviation for the individual $i$ ($\sigma_{u_i}$) also became scaled:

$$\sigma_{u_i} = \sigma_{p_i}\sqrt{h^2}$$

A small number of negative phenotypes can accidentally be simulated for high CV values, resulting in negative values for $\sigma_{e_i}$. When this occurred, $\sigma_{e_i}$ was changed to a positive value. This happened for CV values higher than 0.25, with a maximum of 2.3% of the records in some replicates for CV = 0.5. It was checked empirically that the effect on the genetic correlation was negligible.
3. Based on the HE model equation described above, the additive genetic value of individual $i$ affecting variability ($v_i$) would proportionally modify the residual variance in model HO ($\sigma_e^2$), which is used as a reference model, leading to $\sigma_{e_i}^2$. Using an exponential model, the corresponding environmental additive genetic value of the individual $i$ ($v_i$) was then obtained from the following equation:

$$v_i = \ln\left(\frac{\sigma_{e_i}^2}{\sigma_e^2}\right)$$

To accommodate the additive genetic value affecting the trait, the $a_i$ simulated by the HO model was rescaled:

$$u_i = a_i\,\frac{\sigma_{u_i}}{\sigma_u}$$

The genetic correlations between the additive genetic values for the trait ($u$) and for the variability ($v$) were computed directly from the simulated values, since these were available from the simulation. Each scenario comprised 10 independent replicates of 100,000 individuals, and the mean genetic correlation across replicates was computed within each scenario.
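The three simulation steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the code used in the study: it assumes the scaled phenotypic standard deviation is CV times the simulated record, that negative scaled standard deviations are made positive by taking the absolute value, and that $v_i$ is the logarithm of the ratio of the scaled to the reference residual variance. The function name `scale_effect_scenario` and its defaults are hypothetical.

```python
import numpy as np

def scale_effect_scenario(cv, h2, mu=100.0, n=100_000, seed=0):
    """One replicate of the scale-effect simulation (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    # Reference (homogeneous) standard deviations derived from mu, CV and h2
    sigma_p = cv * mu
    sigma_u = np.sqrt(h2) * sigma_p
    sigma_e = np.sqrt(1.0 - h2) * sigma_p

    # Step 1: records under the homoscedastic (HO) model
    a = rng.normal(0.0, sigma_u, n)
    e = rng.normal(0.0, sigma_e, n)
    y = mu + a + e

    # Step 2: individual standard deviations scaled by the trait level.
    # Assumption: negative values (from negative y) are flipped to positive.
    sigma_pi = np.abs(cv * y)
    sigma_ei = np.sqrt(1.0 - h2) * sigma_pi
    sigma_ui = np.sqrt(h2) * sigma_pi

    # Step 3: genetic value for variability and rescaled genetic value for the trait
    v = np.log(sigma_ei**2 / sigma_e**2)
    u = a * sigma_ui / sigma_u

    return {
        "genetic_correlation": float(np.corrcoef(u, v)[0, 1]),
        "gcv": float(np.sqrt(v.var(ddof=1))),  # square root of the variance of v
    }

# Mean over 10 replicates for one scenario
reps = [scale_effect_scenario(cv=0.15, h2=0.15, seed=s) for s in range(10)]
print("mean genetic correlation:", np.mean([r["genetic_correlation"] for r in reps]))
print("mean GCV:                ", np.mean([r["gcv"] for r in reps]))
```

Looping the same function over the 10 x 10 grid of CV and heritability values would give one mean correlation per scenario.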
(ii) Additive genetic variance of variability $\sigma_v^2$ generated by the scale effect strength

The simulation described above assumes that the scale effect is proportional to the value of the trait, but this assumption could be unrealistic. A direct determination of the variability from the level of the trait and the CV would then seem to be unrealistic, leading in addition to unreliable values for $\sigma_v^2$, particularly when the CV is higher than 0.25 (Hill and Mulder 2010). This second analysis was performed to study the strength of the scale effect due to an incomplete determination of $v$ from the scale effect, by defining a new parameter $r$ ($0 < r \le 1$). This new parameter $r$ weakens the scale effect if it is less than 1; as a consequence, the scale effect strength reduces the absolute value of the environmental additive genetic effect of the individual $i$ as follows:

$$v_i^{*} = r\,v_i$$

Neither the value of the heritability ($h^2$) nor the value of $r$ will affect the genetic correlation between mean and residual variance, but the additive genetic variance of the variability $\sigma_v^2$ will be reduced by $r^2$. In addition, this variance is dependent on the magnitude of the CV of the trait. This is a simple way to model the incomplete determination of the variability from the level of the trait, but many other models are possible.
Again, 10 replicates of 100,000 individuals were simulated per scenario. In this case, 200 scenarios were considered according to the values of CV and $r$. Here, CV ranged from 0.05 to 0.50, and $r$ ranged from 0.05 to 1, both in increments of 0.05. The genetic coefficient of variation for environmental variance (GCV) is a measure of evolvability (Houle 1992). Therefore, for explanatory purposes, instead of presenting $\sigma_v^2$, the average of its square root was computed across replicates within a scenario, given that this value roughly represents the genetic coefficient of variation of the variability, $GCV \approx \sqrt{\sigma_v^2}$ (Hill and Mulder 2010).
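A minimal Python sketch of this second analysis is given below. It assumes that the strength parameter simply multiplies $v_i$, and that under a fully proportional scale effect the ratio of scaled to reference residual variance reduces to $(y_i/\mu)^2$, so that $v_i = 2\ln(|y_i|/\mu)$ and the heritability drops out. The function name `gcv_with_strength` and its defaults are hypothetical.

```python
import numpy as np

def gcv_with_strength(cv, r, mu=100.0, n=100_000, seed=0):
    """Approximate GCV (square root of the variance of r * v) for one replicate."""
    rng = np.random.default_rng(seed)
    # Phenotypes with mean mu and phenotypic SD cv * mu
    y = rng.normal(mu, cv * mu, n)
    # Fully proportional scale effect: v = ln(sigma_ei^2 / sigma_e^2) = 2 ln(|y| / mu)
    v = 2.0 * np.log(np.abs(y) / mu)
    # Incomplete determination: shrink v by the strength r
    v_r = r * v
    return float(np.sqrt(v_r.var(ddof=1)))

# GCV across the grid of strengths for one CV value
for r in (0.05, 0.25, 0.50, 0.75, 1.00):
    print(f"CV=0.15  r={r:4.2f}  GCV={gcv_with_strength(0.15, r):.4f}")
```

With the same seed, the GCV in this sketch is exactly proportional to the strength, which is why the variance of the variability scales with the square of the strength.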
The present work was motivated in the light of the correlated trend observed in the real scenario provided by the mice divergent selection experiment for birth weight environmental variance carried out by Formoso-Rafferty et al. (2016b). The information from this experiment was then used to discuss some aspects of the simulations regarding a real scenario. Estimated genetic parameters in this population were used as a reference to compare with simulations. In addition, the mean, standard deviation and CV were computed within five intervals of the data after the records were sorted for two different traits: birth weight and litter size. Based on this study design, the consistency of a unique CV value across the range of several variables will be discussed.
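The within-interval summary described above can be sketched in Python as follows. The sketch assumes the five intervals are equal-sized groups of the sorted records; the function name `summarize_sorted_groups` and the example data are hypothetical, and the data are randomly generated rather than taken from the experiment, so the output only illustrates the bookkeeping.

```python
import numpy as np

def summarize_sorted_groups(values, n_groups=5):
    """Mean, SD and CV within equal-sized groups of the sorted records."""
    ordered = np.sort(np.asarray(values, dtype=float))
    rows = []
    for g, chunk in enumerate(np.array_split(ordered, n_groups), start=1):
        mean = chunk.mean()
        sd = chunk.std(ddof=1)
        rows.append({"group": g, "n": chunk.size,
                     "mean": mean, "sd": sd, "cv": sd / mean})
    return rows

# Made-up records, only to show the output format
rng = np.random.default_rng(1)
fake_records = rng.normal(1.5, 0.22, 5_000)
for row in summarize_sorted_groups(fake_records):
    print(f"group {row['group']}: n={row['n']}  mean={row['mean']:.3f}  "
          f"sd={row['sd']:.3f}  cv={row['cv']:.3f}")
```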

RESULTS
The averages of the genetic correlations between the trait and its environmental variability within scenarios combining CV and heritability are reported in Table 1. The results ranged from 0.1335 to 0.7021. Standard errors of the means are not presented but ranged from 0.0009 to 0.0085. Figure 1 shows the genetic correlations generated by the scale effect across heritability (Figure 1a) and across CV (Figure 1b). The genetic correlation increased with heritability and decreased as the CV increased. The maximum value of the genetic correlation was attained when heritability was maximal (0.50) and CV was minimal (0.05), and the minimum value resulted when heritability was minimal (0.05) and CV was maximal (0.50) (Table 1). Values obtained for CV higher than 0.25 might be slightly biased by the anomalous appearance of some negative scaled residual standard deviations, as suggested by the lines approaching one another for high CV values (Figure 1a), while the other scenarios were not affected by this artifact. Similarly, the negative trend of the genetic correlation across the CV became smoother beyond this CV threshold of 0.25 (Figure 1b).
The mean GCV values obtained from 10 replicates under each of the different simulated CV values considering all the ranges of scale effect strength, are presented in Table 2. They increased linearly across the values considered for the strength of the scale effect defined by r, with growing slopes according to the growing CV. The values ranged from 0.0050 to 1.4854. Standard errors are not shown but also increased as simulated values of CV and r increased. The maximum was 0.0305 for CV = 0.50 and r = 1. Some of the GCV values obtained were higher than 0.69 and were inconsistent with the parameters reported in the literature.
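A first-order approximation may help to read these trends. It is only a sketch, not a result taken from the tables, and it assumes that the scale effect makes each individual's residual standard deviation proportional to its record, so that $v_i = 2\ln(y_i/\mu)$ and $u_i = a_i\,y_i/\mu$, with $x_i = (a_i + e_i)/\mu$ small.

```latex
\begin{align*}
\frac{y_i}{\mu} &= 1 + x_i, \qquad \operatorname{Var}(x_i) = \frac{\sigma_p^2}{\mu^2} = CV^2 \\
v_i &= 2\ln(1 + x_i) \approx 2x_i
  \;\Rightarrow\; \sqrt{\operatorname{Var}(r\,v_i)} \approx 2\,r\,CV \\
u_i &= a_i(1 + x_i) \approx a_i
  \;\Rightarrow\; \operatorname{corr}(u_i, v_i) \approx
  \frac{\operatorname{Cov}(a_i, 2x_i)}{\sigma_u \cdot 2\,CV}
  = \frac{2\sigma_u^2/\mu}{2\,\sigma_u\,\sigma_p/\mu}
  = \frac{\sigma_u}{\sigma_p} = \sqrt{h^2}
\end{align*}
```

Under these assumptions, GCV would grow linearly with the strength at a slope of about twice the CV, and the genetic correlation would be close to the square root of the heritability. For a heritability of 0.15 and a CV of 0.15 the approximation gives about 0.39 and 0.30. It is expected to degrade as the CV increases, because the higher-order terms of the logarithm are no longer negligible.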

DISCUSSION
The results show that the scale effect can at least partly explain the positive genetic correlations estimated between the mean and the environmental variability for some traits in some populations. As a consequence, modifying the mean of a trait by selection can bring about a change in the variability and, conversely, modifying the variability might imply a modification of the mean in the same direction. The genetic correlation generated by the scale effect assuming a constant CV would range from 0.13 to 0.70 depending on the heritability and CV, in a range from 0.05 to 0.50 for both parameters. The determination of the additive genetic variance for the environmental variability by the scale effect could be considered incomplete, thus reducing its strength.
The genetic correlation between the mean and the variability is a matter for concern given that the environmental variability can be modified by correlated selection (Damgaard et al. 2003; Huby et al. 2003). In fact, correlated responses in variability appear when selecting to increase the mean (Moreno et al. 2012), and changes in the trait mean are also present when selecting for variability (Formoso-Rafferty et al.  (Marjanovic et al. 2016); milk yield, 0.60 in dairy cattle; teat count, 0.80 in pigs (Felleki and Lundeheim 2015); litter size, 0.49 in pigs (Sell-Kubiak et al. 2015a). Lower positive values were observed for other traits, such as 0.11-0.37 for morphology traits in tilapia (Marjanovic et al. 2016) or 0.06 for conformation scores in cattle (Neves et al. 2011). Other traits exhibited different signs and magnitudes, ranging from -0.06 to 0.43 for egg color. Alternatively, lower magnitude values for weight gain were observed in different periods: 0.17, 0.02 and -0.09 (Neves et al. 2011). Finally, negative values were also identified, such as -0.52 for litter size in pigs (Felleki et al. 2012), -0.23 to -0.45 for chicken birth (Mulder et al. 2009; Wolc et al. 2009) and -0.16 for adult body weight in rainbow trout (Janhunen et al. 2012). The sign and magnitude of any genetic correlation have classically been explained by linkage, pleiotropy or a combination of both (Falconer and Mackay 1996). Regarding the relationship between the mean level and the variability, a type of pleiotropic correlation would arise from the skewed distribution of the trait (Yang et al. 2011; Sorensen and Waagepetersen 2003; Ros et al. 2004; Gutiérrez et al. 2006). From this brief review, positive correlations between the mean of a trait and its variability are clearly more frequent, suggesting that the scale effect could be generating them, but the effects seem to differ between types of traits.
For example, individual weight traits or dairy traits seem to be related to positive correlations, while negative values appear more frequently in reproductive traits. Some populations with an important selection success would have had those animals with a prospect of low performance culled. In litter size, for instance, with repeated measures, some animals with low performance records would affect the mean litter size and these animals would not be selected, thus reducing the variability while increasing the mean value. Therefore, the final genetic correlation in each scenario would depend on the population status concerning selection, the trait and the influence of other causes affecting genetic correlation, but the scale effect seems to be present in many of them. In addition, the assumptions established in these simulations, even if they look realistic, are only theoretical and avoid complexity. In fact, the convergence of the lines in Figure 1a suggests that high values of variability are not easily supported by nature. This simulation study revealed that the genetic correlation generated by the scale effect would range from 0.13 to 0.70 depending on the heritability and CV values of the trait in a particular population. The mouse experiment motivating this research showed an undesired correlated genetic response in mean birth weight when selecting for environmental variability (Formoso-Rafferty et al.  Table 1, the expected genetic correlation generated by the scale effect for a heritability of 0.15 and a CV of 0.15 was 0.3736. Thus, the scale effect would explain the estimated genetic correlation in this case, in which the relationship between mean and variability is assumed to be constant as it depends directly on the CV. In general, for a weight-type trait with heritability between 0.25 and 0.35 and a CV between 0.15 and 0.25, the genetic correlation generated by the scale effect would range from 0.43 to 0.57.
In contrast, for a litter-size-type trait with heritability under 0.10 and CV between 0.10 and 0.15, the expected genetic correlation generated by this effect would be between 0.21 and 0.31. The lowest expected genetic correlations generated by this effect would be found for low-heritability traits with high CV and, conversely, the highest for high-heritability traits with low CV. The simulation performed in this research, which defined the phenotypic standard deviation as directly proportional to the mean and the CV, led to the appearance of an additive genetic variance affecting the environmental variability. Using the square root as an approximation to GCV, a simulated scenario with $h^2$ and CV equal to 0.15 resulted in a GCV of 0.3094, but the value estimated by Formoso-Rafferty et al. (2016a), updated to the 18th generation, was 0.1929, which is significantly different from that obtained by simulation. This indicated a lower sensitivity, suggesting that the assumed proportionality between mean and standard deviation would be incomplete. For example, Figure 2 presents the evolution of the mean, standard deviation and CV of birth weight (Figure 2a) and litter size (Figure 2b) across five sorted groups with the same number of records, ordered by the trait value in this experiment. A visual scale effect can be observed for birth weight, with an increased standard deviation accompanying the increase in the mean value. However, the evolution of the standard deviation is not proportional to the mean, leading to a reduction in the CV. The pattern of the trend for litter size is different. The scale effect has a greater effect for low values of the trait, with an increase in the standard deviation. The effect stabilizes for intermediate values and decreases for high values of litter size.
The scale effect would be surpassed by other causes of genetic correlation between the trait and variability; thus, for highly selected populations for litter size in prolific species, a negative genetic correlation would exist between this trait and its environmental variability as reported by Sorensen and Waagepetersen (2003) and Felleki et al. (2012) in pigs and Ibáñez-Escriche et al. (2008) in rabbits. Part of these genetic correlations can be attributed to the skewness of the distribution of residuals. Yang et al. (2011) estimated negative genetic correlations in untransformed litter size in pigs and rabbits that became positive after a Box-Cox transformation. However, the genetic correlation would remain positive in the previously mentioned mice population (Formoso-Rafferty et al. 2016a).
To account for the impact of the scale effect, the second simulation was performed to explore different strengths of the scale effect ($r$). Different positive $r$ values were employed to determine how GCV would be affected. For the sake of simplicity, a constant value for the entire range of the trait was assumed, although the value of $r$ can differ for different values of the interval (Figure 2a) or even be asymmetric. Thus, $r$ could be positive for some values of the trait and negative for others (Figure 2b). Table 2 presents the GCV observed for the simulated scenarios with different scale effect strength ($r$) and CV values. The GCV values ranged from 0.0050 ($r$ = 0.05 and CV = 0.05), increasing with the value of the simulated parameters, up to 1.4854 ($r$ = 1.00 and CV = 0.50). Considering the values reviewed by Hill and Mulder (2010) and the estimates from the later studies reviewed above (Neves et al. 2011; Felleki et al. 2012; Janhunen et al. 2012; Rönnegård et al. 2013; Fina et al. 2013; Sae-Lim et al. 2015; Felleki and Lundeheim 2015; Sell-Kubiak et al. 2015a; Sell-Kubiak et al. 2015b; Mulder et al. 2016; Marjanovic et al. 2016), and with the exception of the anomalous, unreliable estimate for birth weight in mice by Gutiérrez et al. (2006), as justified by Pun et al. (2013), GCV values greater than 0.69 have never been reported and might even be considered meaningless (Hill and Mulder 2010). A high scale effect strength is not typical and would not be possible in the context of high CV values. A value of 0.1929 was estimated for GCV in the mice experiment (Formoso-Rafferty et al. 2016a) based on a CV value of 0.1470; this value lies between 0.1857 ($r$ = 0.60) and 0.2008 ($r$ = 0.65) from Table 2 for a CV = 0.15. This strength would be reasonable in the light of the trends of the standard deviation and CV presented in Figure 2a.
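As a rough worked check of where the estimated GCV falls between the two tabulated strengths, a linear interpolation can be used. It assumes that GCV varies linearly with the strength between neighbouring entries of Table 2, and it ignores the small difference between the observed CV of 0.1470 and the tabulated CV of 0.15.

```latex
\[
r \approx 0.60 + 0.05 \times \frac{0.1929 - 0.1857}{0.2008 - 0.1857}
  = 0.60 + 0.05 \times \frac{0.0072}{0.0151}
  \approx 0.62
\]
```

Under these assumptions the implied strength is about 0.62, between the two tabulated values and close to 0.60.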
Then, the correlated response in the mice selection experiment can be completely explained by the scale effect modulated in this case by the strength 0.60 of the scale effect. According to the GCV values in the literature referred above, common values of GCV are in the range between 0.15 and 0.50. Assuming a CV less than 0.25, the strength of the scale effect is expected to be any point of the space defined for the parameter between 0 and 1.
To conclude, this simulation study demonstrates how the scale effect can explain the genetic correlation that often appears between the mean and the environmental variability of some traits in some populations, as well as the additive genetic variance for the variability, which might simply be generated by this scale effect. Breeders should be well aware of this fact when their selection objective is to increase the mean of a trait jointly with a reduction of its variability, in order to develop an adequate selection index for this trait or to find an adequate transformation of the trait that removes the scale effect. The issue is also important in the search for vQTL, in which it would help to make the mean and the variability independent (Rönnegård and Valdar 2012). Further studies are needed in each case, given that the scale effect could vary across the range of traits and could even be counterbalanced by other factors affecting the genetic correlation.