Average semivariance directly yields accurate estimates of the genomic variance in complex trait analyses

Abstract Many important traits in plants, animals, and microbes are polygenic and challenging to improve through traditional marker-assisted selection. Genomic prediction addresses this by incorporating all genetic data in a mixed model framework. The primary method for predicting breeding values is genomic best linear unbiased prediction, which uses the realized genomic relationship or kinship matrix (K) to connect genotype to phenotype. Genomic relationship matrices share information among entries to estimate the observed entries’ genetic values and predict unobserved entries’ genetic values. One of the main parameters of such models is genomic variance (σg2), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms, and genomic heritability (hg2); however, the seminal papers introducing different forms of K often do not discuss their effects on the model estimated variance components despite their importance in genetic research and breeding. Here, we discuss the effect of several standard methods for calculating the genomic relationship matrix on estimates of σg2 and hg2. With current approaches, we found that the genomic variance tends to be either overestimated or underestimated depending on the scaling and centering applied to the marker matrix (Z), the value of the average diagonal element of K, and the assortment of alleles and heterozygosity (H) in the observed population. Using the average semivariance, we propose a new matrix, KASV, that directly yields accurate estimates of σg2 and hg2 in the observed population and produces best linear unbiased predictors equivalent to routine methods in plants and animals.

Genomic variance (r 2 g )-the variance explained by genomewide associations between the underlying quantitative trait locus and DNA markers genotyped in the training population-is often estimated in genetic experiments (Visscher et al. 2007;Gao et al. 2012;Lee et al. 2012Lee et al. , 2013Lipka et al. 2014;Rutkoski et al. 2014;Kumar et al. 2015;Piaskowski et al. 2018;Rice and Lipka 2019;Krause et al. 2019;Pincot et al. 2020;Petrasch et al. 2021;Yadav et al. 2021) using genomic relationship matrices (GRMs, K), which measure the relatedness among entries (Yang et al. 2010;Habier et al. 2013). The selection of K is used directly in solutions to the mixed model equations and is central to estimating the correct variance components in LMM analyses (Henderson 1953;Searle et al. 1992;Lynch and Walsh 1998;Mrode 2014). The phenotypic variance-covariance (V) is V ¼ G þ R, where R ¼ Ir 2 e is the residual variance-covariance, and G ¼ Kr 2 g is the genomic variancecovariance (Henderson 1953;Searle et al. 1992;Lynch and Walsh 1998;Piepho 2019). The genomic variance r 2 g is a scalar and, thus, any change in K will impact r 2 g estimates. Genomic variance is found in many ratios throughout modern quantitative genetic research, including genomic heritability, prediction accuracy, selection reliability, prediction error variance, and response to genomic selection (Goddard 2009;Hickey et al. 2009;Gorjanc et al. 2015). Of these ratios, genomic heritability has been the most frequently reported in public research (Speed et al. 2012(Speed et al. , 2017Speed and Balding 2015;de los Campos et al. 2015;Legarra 2016;Lehermeier et al. 2017;Yang et al. 2017).
Genomic heritability is where r 2 g is the genomic variance and r 2 e is the residual variance on an entry-mean basis. Genomic heritability is often estimated by substituting restricted maximum likelihood (REML) variance component estimates into (1). We studied how different forms of K affect variance component estimates. We found that, even for large data sets, there are systematic differences in the genomic variance component estimates arising from different forms of K (VanRaden 2008; Astle et al. 2009;Yang et al. 2010;Forni et al. 2011;Endelman and Jannink 2012) and that the resulting variance component estimates may not always be correct when directly substituted into (1), as is routine practice. Despite this, researchers often simultaneously use the same approaches for genomic prediction and variance component estimation and may consequently report incorrect genomic heritability estimates.
Here, using the average semivariance (ASV), introduced by Piepho (2019) and expanded by Feldmann et al. (2021), we derive a new form for K, referred to as K ASV , which is the product K ¼ Z Z T from the mean-centered marker matrix Z ¼ PZ, where P ¼ I n À n À1 1 n 1 T n is the idempotent, mean-centering n Â n-matrix. The ASV relationship matrix is This matrix is scaled to the residual variance-covariance matrix and directly yields accurate estimates of r 2 g and h 2 g regardless of population constitution, population size, or true heritability. It is possible to scale other forms of the via a division by ðn À 1Þ À1 trðKÞ or to scale estimates of genomic variance from any form of K by multiplying ðn À 1Þ À1 trðKÞ byr 2 g to obtain ASV estimates of variance component. We explore the practical implications of K ASV for estimating r 2 g and h 2 g in a wild population of Arabidopsis thaliana (Atwell et al. 2010), a wheat (Triticum aestivium) breeding population (Crossa et al. 2010), a laboratory mouse (Mus musculus) population (Valdar et al. 2006), an apple (Malus Â domestica) breeding population (Kumar et al. 2015), and a pig (Sus scrofa) breeding population (Cleveland et al. 2012). The ASV approach that we propose can be used to estimate variance components in genetic evaluation studies in plants, animals, microbes, and humans.

The Average Semivariance
The ASV estimator of the total variance (Piepho 2019) is half the average total pairwise variance of a difference between entries and can be decomposed into independent sources of variance, e.g. genomic and residual. There are two alternative ASV derivations, both leading to the same definitions of the estimators. The first derivation originated in geostatistics and estimated the semivariance as half of the variance among all pairwise differences among genotypic values (g), i.e. 2 À1 varðg i À g j Þ (Webster and Oliver 2007;Piepho 2019). Piepho (2019) derived ASV from a study's observations, worked out the semivariance and took the average across all pairs of observations. In our context, there is an equivalent alternative derivation based on the sample variance of the genotypic values Estaghvirou et al. (2013). The sample variance among genotypic values is ðn À 1Þ À1 P n i¼1 ðg i À gÞ 2 . That is to say that the expected values of the sample variance of genotypic values are the ASV, i.e. Eðs 2 g Þ ¼ h ASV g . ASV can be used to estimate and partition the total variance in LMM analyses into parts; such as the total variance, as in Piepho (2019), the variance explained by large effect markers and marker-marker interactions, as in Feldmann et al. (2021), and genomic variance, as shown below.

ASV definitions of genomic variance and heritability
In complex traits analyses, there is a crucial difference in the treatment of genotypes and effects in statistical models used for data analysis vs the quantitative genetics theory (Yang et al. 2010;Speed et al. 2012Speed et al. , 2017de los Campos et al. 2015;Speed and Balding 2015;Legarra 2016). In quantitative genetics theory, between entry differences in genetic values and genomic variance are attributed to the a random sampling of marker genotypes (Bulmer et al. 1980;Lande and Thompson 1990;Falconer and Mackay 1996;Lynch and Walsh 1998) and, in an LMM framework, variation stems from a random sampling of the marker effects. Despite differences in derivation and assumptions regarding the source of randomness, the resulting variance-covariance structure between the two coincides under specific experimental, population, and marker sampling conditions (de los Campos et al. 2015;Legarra 2016). With this in mind, we derived an approach using the ASV that relies on the assumptions of LMM analyses, e.g. random marker effects, but yields correct estimates of genomic variance.
The analyses shown throughout this paper assume the dependent variables are least squared means (LSMs) or other adjusted means for entries (y). R ¼ I n r 2 e gives the residual variance of the LSMs. The ASV can efficiently deal with more general forms of variance-covariance matrices in generalized LMMs (Piepho 2019). The LMM for this analysis is where y is the vector of phenotypic LSMs of for n entries, n is the number of entries, 1 n is an n-element vector of ones, l is the population mean, I n is the identity matrix of size n, g is an n-element vector of random effect values for entries with g $ Nð0; Kr 2 g Þ, and e is the residual for each entry with e $ Nð0; I n r 2 e Þ. The ASV definition of variance from LMM (3) is (4) where h ASV y is the phenotypic variance, V ¼ Kr 2 g þ I n r 2 e is the variance-covariance among observations, h ASV g is the genomic ASV, and h ASV e is the ASV of the residuals. If we assume G ¼ Kr 2 g , where G is the variance-covariance of the best linear unbiased predictors (BLUPs) of the genotypic values g, it can be inferred that the magnitude of r 2 g in directly inverse to trðKÞ because The ASV definition of the genomic variance is where Z ¼ PZ is the mean-centered marker matrix, and K ¼ Z Z T is the realized genomic relationship or kinship matrix described by VanRaden (2008), omitting the scaling constant 2 P j p j ð1 À p j Þ, where p j is the allele frequency of the jth SNP, which requires Hardy-Weinberg equilibrium (HWE) to hold (de los Campos et al. 2015), and trðZZ T PÞ ¼ trðZ Z T Þ. The trace of Z Z T is a function of heterozygosity in the observed population (Vitezica et al. 2013(Vitezica et al. , 2017Legarra et al. 2018). When the observed population is in HWE, n À1 trðKÞ ¼ 1, and when the population is not in HWE due to inbreeding, the n À1 trðKÞ ¼ 1 þ f , where f is the in coefficient of inbreeding (Endelman and Jannink 2012;Legarra et al. 2018). In the general case, h ASV g ¼ ðn À 1Þ À1 trðKPÞ r 2 g , where K is any form of the GRM calculated from Z, without centering, or Z, with centering, because trðKÞ ¼ trðKPÞ.
The ASV definition of the residual variance is Notably, the genomic variance h ASV g is on the same scale as the residual variance h ASV e , and both are defined such that (4) is accurate. REML estimates of the residual variance are equivalent to ASV estimates when best linear unbiased estimators or LSMs are the response variable y.
Two equivalent methods yield accurate h 2 g estimates There are two equivalent ways to obtain accurate estimates of genomic variance and subsequently genomic heritability. The first method, our recommended approach, utilizes K ASV (2) in the LMM analysis and directly yields accurate estimates of the genomic variance components from the model by rescaling the GRM. The first method works because V ¼ Kr 2 g þ I n r 2 e is a true statement regardless of K, but different choices of K change the scaling and interpretation of r 2 g . Thus, variance components estimated by ASV can then be substituted directly into (1) without any adjustment.
The second method is to adjust the genomic variance component estimates from any form of K by multiplying them by a scaling factor (ðn À 1Þ À1 trðKÞ) defined by the population size (n) and the diagonals of the chosen GRM (trðKÞ). Through substitution of (5) and (6) into (1), the ASV estimator of genomic heritability h ASV This formulation can be used directly with any form of K or K by substituting REML variance component estimates. Note that ðn À 1Þ À1 trðKÞ is the same as the scaling coefficient used in (2). The second strategy is analogous to the post hoc adjustment approach Feldmann et al. (2021) proposed.

Genomic relationship matrices
We calculated and applied seven relationship matrices for each population, simulated or case example, including K ASV . We used The form proposed by VanRaden (2008) is where Z is the marker matrix centered on column means (2p j ), and p j is the minor allele frequency (MAF) for the jth SNP. This form assumes HWE and obtains p j from a historical reference population, not the observed population. When p j originates from the observed population, the centering by 2p j is equivalent to column centering and K VR only differs from K ASV by a scaling factor. The normalized relationship matrix, K GN , was explicitly introduced as the normalized relationship matrix by Forni et al. (2011) as This form is the most numerically similar to K ASV and only differs by a single denominator degree of freedom.
The form of the relationship matrix proposed by Endelman and Jannink (2012) is where d % ðn=mÞCV À2 is a shrinkage factor, CV 2 is the coefficient of variation of the eigenvalues of S, S ¼ m À1 Z Z T À hZ •k ihZ T •k i; hS ii i is the mean of diagonal elements of S. Notably, at high marker densities, when d ¼ 0, Endelman and Jannink (2012) is equivalent to VanRaden (2008).
The method proposed by Yang et al. (2010) also centers the columns of Z by subtracting 2p j where z ij is the jth SNP in the ith individuals, z jk is the jth SNP in the kth individual when j 6 ¼ k, and m is the number of markers. The diagonals are treated differently than the off-diagonals in this form. The method proposed by Astle et al. (2009) is where z j is the i-element vector of the jth SNP. The classical identity-by-state definition is (Astle et al. 2009): Note that this is the only calculation that is not scaled or centered by any function of p j .
For each model and each simulation, we estimated two variance components (r 2 g and r 2 e ) using sommer::mmer() and took the ratio of variance components in R v4.1.0 (R Core Team 2020). We estimated genomic heritability using the standard form by substituting REML estimates from (3) into (1).

LMM analysis in R
In the sommer R package (Covarrubias-Pazaran 2016), LMM (3) is expressed as where data is an n Â 2 matrix with Y as a column of LSMs, Entry is a column of factor-coded entries, and K is one of the seven GRMs in this study given.

Simulated data
We generated 36 experiment designs with different heterozygosity H ¼ 0.0, 0.25, 0.5, and 0.75 and different trait heritability h 2 g ¼ 0.2, 0.5, and 0.8 and for population sizes of n ¼ 250, 500, and 1,000. In all examples, 1,000 populations genotyped at m ¼ 5,000 causal loci were used to generate the genetic traits. We simulated all m ¼ 5,000 marker effects following a normal distribution l ¼ 0 and r ¼ 1. When multiplied by the marker genotypes and summed, the score is an individual's true genetic value, g. Residuals were simulated with l ¼ 0 and r 2 e ¼ ð1 À h 2 Þ=ðh 2 s 2 g Þ to obtain a trait with the desired genomic heritability (Endelman 2011) and s 2 g ¼ ðn À 1Þ À1 P n i¼1 ðg i À gÞ 2 is the sample variance among genotypic values (Estaghvirou et al. 2013). In this study, the true value of h 2 g ¼ 0:2, 0.5, or 0.8. All plots were made with the ggplot2 package (Wickham 2016) in R 4.1.0 (R Core Team 2020).
We performed cross-validation to determine predictive ability rðĝ; yÞ, or the correlation between BLUPs and LSM, which is a measure of success commonly reported in genome prediction studies that indicates how informative the phenotype is as a measure of the genomic value. We also estimated the prediction accuracy rðĝ; yÞ= ffiffiffiffiffi f h 2 g r , which is a measure of success that scales the predictive ability to the upper limit ( An ideal situation for genomic prediction is a low value of predictive ability and a high value of prediction accuracy. When the predictive ability is high, genomic selection is unlikely to outperform phenotypic selection. When the prediction accuracy is low, the model is bad at capturing the variation in genomic values. We first split each population into 80% train and 20% test and estimated genomic BLUPs and then calculated the accuracy as the correlation between the estimated LSM y and the BLUPĝ for all entries in the test set. We performed this cross-validation scheme 100 times for each population and each trait.

Analysis of simulated data confirms that ASV yields accurate estimates of genomic variance
The ASV relationship matrix yielded suitable estimates of genomic variance and genomic heritability in the observed populations, while the other methods varied with the level of heterozygosity. When heterozygosity H < 0.5, the genomic variance tends to be underestimated, and when H > 0.5, the genomic variance tends to be overestimated (Fig. 1) by methods excluding (2) and (9). This pattern was realized regardless of the population size, e.g. n ¼ 250, 500, and 1,000. All methods tend to produce accurate estimates when H ¼ 0.5, in which case the inbreeding coefficient f ¼ 0 and HWE is not violated.
The precision (variance) improved by increasing the population size (n), but the accuracy (bias) did not improve. It has been demonstrated ad nauseam that increasing n increases precision or lowers the sampling variance of the estimates but does not eliminate bias (Laird and Ware 1982;Searle et al. 1992;Lynch and Walsh 1998;Legarra 2016). Notably, the entire parameter space of h 2 g was observed when the population size is small (Fig. 1). Only K ASV and K GN yielded stable precision as H increased (Fig. 2). Other methods that we examined have variable precision and variable accuracy depending on the sample size, heterozygosity, and the true value of h 2 g (Figs. 1 and 2). Interestingly, we observed an interaction between h 2 g and H that impacted the precision of genomic heritability estimation did not affect K GN or K ASV . Precision improved as H increased for high heritability traits and precision worsened as H increased for low heritability traits. For traits where h 2 h ¼ 0:5, precision was constant.

Analysis of simulated and empirical data confirms that ASV does not impact BLUPs or prediction accuracy
Neither the predictive ability (rðĝ; yÞ) nor the BLUPs from genomic best linear unbiased predictor are affected by ASV. In our simulated populations, the predictive ability was equal across all seven GRMs that we tested (Fig. 3), but the prediction accuracy (ĥ À1 g rðĝ; yÞ) varies with the choice of GRM and therefore the heterozygosity in the sampled populations. In 22 empirical trait Â population examples we evaluated, the differences in the prediction accuracy, when present, appeared to be negligible and do not lend themselves clearly to "better" or "worse" categories (Figs. 4 and 5). While the choice of K does not impact BLUP, it does impact estimates of genomic variancer 2 g , genomic heritabilityĥ 2 g , prediction accuracyĥ À1 g rðĝ; yÞ (Fig. 5), average prediction error variance PEV, and selection reliability 1 À r À2 g PEV, which all rely onr 2 g . Differences in Fig. 5 are more pronounced for the fully inbred populations, e.g. Arabidopsis and wheat, than the partially inbred populations, e.g. pig, mouse, and apple. ASV allows users to understand how well GS is performing relative to phenotypic selection and to predict how reliable genomic selection can be for certain traits in specific populations more accurately than other methods since it directly yields accurate estimates of r 2 g and h 2 g .

The relationship between K ASV and K GN
We found that the normalized K, i.e. K GN (9), proposed by Forni et al. (2011) and further described by Legarra (2016), yields estimates of K that only deviated from K ASV by a single degree of freedom in the denominator of the matrix scaling factor. Although these estimators were derived through different approaches and with different concepts in mind, they are numerically similar, apart from a single degree of freedom difference in the divisor of the GRM: Forni et al. (2011) used the number of entries (n), whereas we used df g ¼ n À 1 for calculating the sample variance (Bulmer 1979). K GN , instead of being biased by a factor of 1=ð1 þ f Þ; K GN is biased by a factor of ðn À 1Þ=n. Our simulations confirm this deviation and the median genomic variance estimates using K GN were slightly larger than K ASV , which was equal to the true value in the simulations (Fig. 1). This work, Forni et al. (2011), and Legarra (2016) all arrive at numerically similar solutions through conceptually different derivations, which we feel is indicative of the value of these approaches for the plant, animal, and human genetic studies that rely on genomic relatedness, e.g. GWAS, genomic prediction, or inferring population structure and ancestry.

K ASV yields genomic variance estimates that naturally account for inbreeding
Inbreeding changes the patterns of among and within entry genomic variance and drives deviations from HWE (Bernardo 2002;Wricke and Weber 2010;Legarra 2016;Isik et al. 2017). A challenge of partial inbreeding is that researchers may not know or infer the reference population, making unadjusted genomic variance estimates hard to interpret (Legarra 2016). In genomic evaluations in plants and animals, the current population is often interpreted as the reference population, but this is an inaccurate Fig. 1. Effect of heritability (h 2 g ), population size (n), and heterozygosity (H) on the accuracy of genomic heritability estimates. Phenotypic observations were simulated for 1,000 samples with n ¼ 250, 500, and 1,000 (left to right) genotyped for m ¼ 5,000 SNPs and the average heterozygosity H ¼ 0%, 25%, 50%, and 75%. The accuracy of genomic heritability estimates (ĥ 2 g ) from LMMs fit using the seven relationship matrices is shown for true genomic heritability (h 2 g ) ¼ 0:2 (upper panel), 0.5 (middle panel), and 0.8 (lower panel). The upper and lower halves of each box correspond to the first and third quartiles (the 25th and 75th percentiles). The notch corresponds to the median (the 50th percentile). The upper whisker extends from the box to the highest value that is within 1.5 IQR of the third quartile, where IQR is the interquartile range, or distance between the first and third quartiles. The lower whisker extends from the first quartile to the lowest value within 1.5 IQR of the quartile. The dashed line in each plot is the true value from simulations.  Phenotypic observations were simulated for 1,000 samples with n ¼ 250, 500, and 1000 (left to right) genotyped for m ¼ 5,000 SNPs and the average heterozygosity H ¼ 0%, 25%, 50%, and 75%. rðĝ; gÞ estimates from LMMs fit using the seven relationship matrices is shown for true genomic heritability h 2 g ¼ 0:2 (upper panel), 0.5 (middle panel), and 0.8 (lower panel). Each box's upper and lower halves correspond to the first and third quartiles (the 25th and 75th percentiles). The notch corresponds to the median (the 50th percentile). The upper whisker extends from the box to the highest value within 1.5 IQR of the third quartile, where IQR is the interquartile range or distance between the first and third quartiles. The lower whisker extends from the first quartile to the lowest value within 1.5 IQR of the quartile.
interpretation unless the population is at HWE and H ¼ 0.5 by design or happenstance. It may be that the only reference population that is concretely defined is the sample population. In connection to Legarra (2016), our work will allow researchers to directly obtain accurate estimates of the genomic variance in the sample population regardless of whether the assumptions of HWE are met.
When the study populations are entirely, or partially, inbred as in wheat, Arabidopsis, or inbred per se evaluations in hybrid crops, such as maize, tomato, rice, the covariance among marker effects increases. Lehermeier et al. (2017) proposed a novel method (termed method M2) to account for the covariance of marker effects, which increases the genomic variance estimates in recombinant inbred line populations. Our analyses of the same flowering time data with For the Arabidopsis data set (second row), K Y systematically produced singular systems in sommer::mmer() and prediction accuracy was not estimated for either FT10 l or FT16 l . Each box's upper and lower halves correspond to the first and third quartiles (the 25th and 75th percentiles). The notch corresponds to the median (the 50th percentile). The upper whisker extends from the box to the highest value within 1.5 IQR of the third quartile, where IQR is the interquartile range or distance between the first and third quartiles. The lower whisker extends from the first quartile to the lowest value within 1.5 IQR of the quartile.
ASV yielded equivalent results and patterns to Lehermeier et al. (2017), suggesting that K ASV may be providing an estimate of genomic variance that naturally accounts for linkage disequilibrium (LD) and the covariance of marker effects (Table 1). We believe that the similarity in results is because LD is associated with the offdiagonal elements of K, which is taken into account using ASV. For the Arabidopsis data set (second row), K Y systematically produced singular systems in sommer::mmer() and prediction accuracy was not estimated for either FT10 l or FT16 l . Each box's upper and lower halves correspond to the first and third quartiles (the 25th and 75th percentiles). The notch corresponds to the median (the 50th percentile). The upper whisker extends from the box to the highest value within 1.5 IQR of the third quartile, where IQR is the interquartile range or distance between the first and third quartiles. The lower whisker extends from the first quartile to the lowest value within 1.5 IQR of the quartile. Pincot et al. 2020;Petrasch et al. 2021;Fan et al. 2021), and for accounting for population structure and relatedness in marker-trait association analyses (Kang et al. 2010;Yang et al. 2010Yang et al. , 2011Tian et al. 2011;Peiffer et al. 2014;Spindel et al. 2016;Alqudah et al. 2016;Pincot et al. 2018;Ferguson et al. 2021;Freebern et al. 2020). As advocated by Speed and Balding (2015) and Legarra (2016), the ragged diagonal elements of K ASV equal 1, on average, and the off-diagonal elements equal 0, on average. ASV directly yields accurate estimates of genomic heritability in the observed population and can be used to adjust deviations that arise from other commonly used methods for calculating genomic relationships regardless of the population constitution, such as inbred lines and F 1 hybrids, unstructured GWAS populations, and animal herds or flocks (Fig. 1).

Discussion
The interpretation of genomic variance and heritability estimates was systematically affected by the available methods used to estimate K. The bias that we show in this paper is independent of sampling error (large data sets mitigate sampling error) and exists even for enormous data sets. We derived a new relationship matrix, K ASV , using the ASV that yielded consistent variance component estimates. We also derived a correction factor ðn À 1Þ À1 trðKÞ that allowed accurate estimates of genomic heritability in the observed population from LMM analyses using various software packages (Clifford and McCullagh 2006;Endelman 2011;Zhou and  Adopting experiment designs that enable screening of a greater number of entries n yield more precise estimates of key variance components in research programs (Smith et al. 2006;Moehring et al. 2014;Borges et al. 2019;Mackay et al. 2019;Hoefler et al. 2020) and ASV can ensure that those estimates are accurate and comparable across populations. In many plant quantitative genetic studies, the population sizes are n % 500, which may pose a general problem for variance component and ratio estimation as those variance components can have high sampling variability between replicated experiments (Fig. 1). For large populations, common in human and domesticated animal studies, it is possible to precise (low variance) but inaccurate (high bias) estimates of r 2 g and h 2 g resulting from different relationship matrices, unless the assumptions of HWE happen to be perfectly met in the study population.
We did not explore differences that arise from population structure or rare alleles, which is a limitation to our simulation approach (Astle et al. 2009;Lee et al. 2012Lee et al. , 2013Speed et al. 2012). We believe, but have not demonstrated, that our ASV approach could be applied to many of the existing methods that have been proposed to handle these real-world situations. For example, Lee et al. (2012) propose that K be calculated among different sets of SNPs with similar MAFs and then the genomic variance for each MAF bin are jointly estimated and summed to account for unique variation attributable to common vs rare alleles. Speed et al. (2012) proposed a scaling factor for each SNP based on its own sample variance (varðx l Þ s ), where s ranges from À2 to 2 and x l is a vector of marker genotypes at the lth locus (Speed et al. 2012;Lee et al. 2013). This means that SNPs are either being centered and scaled (s ¼ -1), which is equal to K GN , or that SNPs are being centered but not scaled (s ¼ 0). While Speed et al. (2012) indicate that s ¼ -1 yields more stable estimates of h 2 g , it is not entirely clear how to optimally select a value of s for each locus. Our simulations exposed systematic differences between (2) and other forms of K. Our simulation and empirical experiments also suggested limited, if any, differences between genomic variance estimates from five other commonly cited GRMs ( Fig. 1; Table 1). The lack of significant differences is perturbing. In every case, there are multiple reasons given for using one relationship Table 1. Genomic heritability (ĥ 2 g ) estimates for the 22 traits from five case studies, including six traits in an apple population with n ¼ 247 entries genotyped at m ¼ 2,829 SNPs (Kumar et al. 2015), four traits in an wheat population with n ¼ 599 entries genotyped at m ¼ 1,278 SNPs (Crossa et al. 2010), four traits in an Arabidopsis population with n ¼ 1,057 entries genotyped at m ¼ 193,697 SNPs (Atwell et al. 2010), and three traits in an mouse population with n ¼ 1,814 entries genotyped at m ¼ 10,346 SNPs (Valdar et al. 2006), and five traits in a pig population with n ¼ 3,534 entries genotyped at 52,843 SNPs (Cleveland et al. 2012) using the seven GRMs compared in this article.

Case study
Trait ASV Forni et al. matrix over any other that do not seem to play any role in either bias (accuracy) or variance (precision) of the genomic variance component estimates. Both (2) and (9) have the necessary numeric properties advocated by Speed and Balding (2015) that enable the variance components from LMM (3) to be interpreted directly as the genomic variance in the sampled population. We recommend that the ASV approach be considered for adoption by genetic researchers working in humans, microbes, or (un)domesticated plants and animals.

Data availability
The input and output data from simulations and analyses have been deposited, along with the code for the simulations, in a public Zenodo repository (https://doi.org/10.5281/zenodo.6211739).