A fast algorithm to factorize high-dimensional tensor product matrices used in genetic models

Abstract Many genetic models (including models for epistatic effects as well as genotype-by-environment interactions) involve covariance structures that are Hadamard products of lower-rank matrices. Implementing these models requires factorizing large Hadamard product matrices. The available algorithms for factorization do not scale well to big data, making some of these models infeasible with large sample sizes. Here, based on properties of Hadamard products and (related) Kronecker products, we propose an algorithm that produces an approximate decomposition that is orders of magnitude faster than the standard eigenvalue decomposition. In this article, we describe the algorithm, show how it can be used to factorize large Hadamard product matrices, present benchmarks, and illustrate the use of the method by presenting an analysis of data from the northern testing locations of the G × E project from the Genomes to Fields Initiative (n ∼ 60,000). We implemented the proposed algorithm in the open-source "tensorEVD" R package.


Introduction
Hadamard products of positive definite matrices appear in many genetic models, including gene-by-gene (e.g. additive-by-additive or additive-by-dominance, Henderson 1985) and gene-by-environment interactions (Crossa et al. 2006), as well as in hybrid prediction models (Bernardo 1998). In this article, we focus on high-dimensional Hadamard products derived from 2 positive semidefinite matrices, each with a dimension considerably smaller than the resulting Hadamard product.
To motivate this problem, consider a reaction norm infinitesimal model (Falconer and Mackay 1996) for n_G genotypes tested over n_E locations (environments). Following Jarquín et al. (2014), interactions between genetic and environmental factors can be modeled using a Gaussian random effect with a covariance matrix K that is the product of a genetic (K_G, derived from DNA or pedigree data) and an environmental (K_E, typically derived from environmental covariates) relationship matrix. If all genotypes are tested in all environments, K is a Kronecker product, K = K_G ⊗ K_E, of dimension n = n_G × n_E. However, usually not all genotypes are tested in all environments, and genotypes may be replicated. In these cases, the K matrix takes the form

K = (Z_1 K_G Z_1′) • (Z_2 K_E Z_2′),

where Z_1 and Z_2 are incidence matrices connecting phenotypes with the rows (and columns) of K_G and K_E, respectively, and "•" denotes the Hadamard product. A very similar problem arises when modeling hybrids' effects, where K_G and K_E are replaced by additive relationship matrices between the female and male parental lines (Bernardo 1998).
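For concreteness, the following R sketch (simulated toy inputs; all object names are illustrative, not from the original article) shows that a covariance matrix of this form can be built by indexing K_G and K_E with ID vectors, which is equivalent to pre- and post-multiplying by the incidence matrices Z_1 and Z_2:

    # Toy example: building K = (Z1 KG Z1') * (Z2 KE Z2') by row/column indexing
    set.seed(1)
    nG <- 5; nE <- 3; n <- 12                      # small toy dimensions
    X  <- matrix(rnorm(nG * 10), nG)               # toy marker matrix
    KG <- tcrossprod(scale(X)) / 10                # toy genomic relationship matrix
    W  <- matrix(rnorm(nE * 4), nE)                # toy environmental covariates
    KE <- tcrossprod(scale(W)) / 4                 # toy environmental relationship matrix
    rownames(KG) <- colnames(KG) <- paste0("G", 1:nG)
    rownames(KE) <- colnames(KE) <- paste0("E", 1:nE)
    ID1 <- sample(rownames(KG), n, replace = TRUE) # genotype of each record
    ID2 <- sample(rownames(KE), n, replace = TRUE) # environment of each record
    K <- KG[ID1, ID1] * KE[ID2, ID2]               # n x n Hadamard product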
Fitting Gaussian models with dense covariance structures such as the one presented above requires factorizing K using, for example, the eigenvalue decomposition (EVD) of K. The EVD has an O(n³) computational complexity; therefore, a standard decomposition of K does not scale well to large sample sizes. To tackle this problem, we use results about the EVD of Kronecker products, and the fact that Hadamard products are submatrices of Kronecker products, to propose an algorithm that derives a basis for K that only requires factorizing K_G and K_E, matrices that usually are much smaller than K. We show that the proposed approach provides a very good approximation to the target matrix (K) and that, in large-n problems, it can be orders of magnitude faster than performing the EVD on K directly. Finally, we provide real data analyses showing that the proposed approach yields variance component estimates very close to, and a prediction accuracy in cross-validation almost identical to, those obtained with an exact EVD. The methods described in this article are implemented in the open-source "tensorEVD" R package, which is available through CRAN and the GitHub repository.

Methods
Recall that the EVD of an N × N positive semidefinite matrix K has the form K = V D V′, where V is the matrix whose columns are the eigenvectors of K and D is the diagonal matrix of eigenvalues. Consider the Kronecker product ("⊗") of 2 symmetric positive semidefinite matrices, K_1 and K_2,

K = K_1 ⊗ K_2.   (1)

Let the EVDs of the 2 matrices on the right-hand side be K_1 = V_1 D_1 V_1′ and K_2 = V_2 D_2 V_2′, respectively. Replacing these matrices with their EVDs, we get

K = (V_1 D_1 V_1′) ⊗ (V_2 D_2 V_2′).

Using properties of Kronecker products (e.g. Searle 1982, p. 265), it can be shown that the eigenvectors of K are Kronecker products of the eigenvectors of K_1 and K_2. Likewise, the eigenvalues of K are Kronecker products of the eigenvalues of K_1 and K_2 (see Supplementary Note 1 for a proof). Specifically, we have that

K = (V_1 ⊗ V_2)(D_1 ⊗ D_2)(V_1 ⊗ V_2)′

(a numerical example of the above results is presented in Supplementary Note 2). A Hadamard product ("•") of 2 matrices is a submatrix of the corresponding Kronecker product. For example, the n × n matrix

K = (Z_1 K_1 Z_1′) • (Z_2 K_2 Z_2′)   (2)

is a submatrix of K_1 ⊗ K_2 in Equation (1). Therefore, the linear space spanned by Equation (2) is a subspace of the linear space spanned by K_1 ⊗ K_2. This suggests that we can find a basis for a Hadamard product from the EVD of the corresponding Kronecker product. The tensorEVD algorithm is inspired by this idea.
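As a quick numerical check of these identities (not from the original article), the following R snippet verifies on small random matrices that the eigenvectors and eigenvalues of K_1 ⊗ K_2 are Kronecker products of those of K_1 and K_2:

    # Numerical check: EVD of a Kronecker product from the EVDs of its factors
    set.seed(2)
    A  <- crossprod(matrix(rnorm(9), 3))    # K1: 3 x 3 positive semidefinite
    B  <- crossprod(matrix(rnorm(4), 2))    # K2: 2 x 2 positive semidefinite
    eA <- eigen(A); eB <- eigen(B)
    V  <- kronecker(eA$vectors, eB$vectors) # eigenvectors of A %x% B
    d  <- kronecker(eA$values,  eB$values)  # eigenvalues of A %x% B
    max(abs(kronecker(A, B) - V %*% diag(d) %*% t(V)))  # ~0 (numerical error)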

Tensor EVD algorithm
We assume that the input data consist of the following:

• Covariance structures: K_1 and K_2, of dimensions n_1 × n_1 and n_2 × n_2, respectively. For example, K_1 may be a genomic relationship matrix and K_2 may be an environmental relationship matrix describing similarity between testing environments.

• IDs: ID_1 and ID_2 are n-vectors (n here is the sample size) mapping from observations to the rows and columns of K_1 and K_2, respectively. (The row and column names of K_1 and K_2 are the unique entries of ID_1 and ID_2, respectively.) These IDs are used to form the incidence matrices Z_1 and Z_2 in Equation (2). For instance, the matrix Z_1 K_1 Z_1′ can be obtained by indexing rows and columns of K_1 by ID_1; in R's (R Core Team 2021) notation, this is K1[ID1, ID1].

Using the above-described inputs, our algorithm (which we named tensorEVD) proceeds as follows:

1) Compute the EVDs of the 2 input matrices, K_1 = V_1 D_1 V_1′ and K_2 = V_2 D_2 V_2′.

2) For each pair (i, j) of eigenvectors of K_1 and K_2, form the candidate eigenvalue d̃_k = d_1i × d_2j, the product of the i-th eigenvalue of K_1 and the j-th eigenvalue of K_2.

3) Form the corresponding eigenvector ṽ_k = (Z_1 v_1i) • (Z_2 v_2j), that is, the entry-wise product of the eigenvectors of K_1 and K_2 after expanding them with ID_1 and ID_2.

4) For unbalanced or replicated data, the eigenvectors in Ṽ may not have a norm equal to 1; thus, the sum of the eigenvalues d̃_k will no longer be equal to trace(K). Therefore, we normalize each eigenvector ṽ_k to have unit norm.
5) Order the eigenvalues d̃_k and eigenvectors ṽ_k in decreasing order of d̃_k.
The tensorEVD algorithm described above renders orthonormal vectors only in the balanced case (i.e. for the Kronecker product of K_1 and K_2). For unbalanced cases, the eigenvectors are not guaranteed to be mutually orthogonal; however, they provide a basis for the Kronecker product. Therefore, the eigenvectors are also a basis for Hadamard products that span a subspace of the corresponding Kronecker product.
Note that the tensorEVD algorithm produces the complete basis containing N = n_1 × n_2 eigenvectors for the Kronecker matrix product K_1 ⊗ K_2. As a consequence, this basis can include more vectors than are needed to span the Hadamard product in Equation (2). This is particularly relevant when the size of the Hadamard product is considerably smaller than that of the corresponding Kronecker product. Furthermore, most of those vectors will have a very small eigenvalue (resulting from the product of a small eigenvalue of K_1 and a small eigenvalue of K_2). Therefore, instead of forming all possible eigenvectors, we allow the user to specify a proportion of variance explained (0 < α ≤ 1, e.g. α = 0.95) and build only the eigenvectors needed to achieve that proportion of variance.
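To make the steps concrete, here is a minimal R sketch of the algorithm under the assumptions above; it is an illustration, not the package's implementation, and the eigenvalue rescaling applied when normalizing is one reasonable convention (the package may differ):

    # Minimal sketch of the tensorEVD algorithm (illustration only).
    # K1, K2 must have row names matching the entries of ID1, ID2.
    tensorEVD_sketch <- function(K1, K2, ID1, ID2, alpha = 0.95) {
      e1 <- eigen(K1, symmetric = TRUE)                # step 1: factorize inputs
      e2 <- eigen(K2, symmetric = TRUE)
      rownames(e1$vectors) <- rownames(K1)
      rownames(e2$vectors) <- rownames(K2)
      n2 <- length(e2$values)
      d  <- kronecker(e1$values, e2$values)            # step 2: d1_i * d2_j
      ord <- order(d, decreasing = TRUE)               # step 5: sort by eigenvalue
      m  <- which(cumsum(d[ord]) / sum(d) >= alpha)[1] # keep enough to reach alpha
      idx <- ord[seq_len(m)]
      i <- (idx - 1) %/% n2 + 1                        # map Kronecker index k
      j <- (idx - 1) %%  n2 + 1                        # back to factor indices (i, j)
      # step 3: eigenvectors as entry-wise products of 'expanded' factor vectors
      V <- e1$vectors[ID1, i, drop = FALSE] * e2$vectors[ID2, j, drop = FALSE]
      # step 4: normalize to unit norm; rescaling d by the squared norm keeps
      # each term d_k * v_k v_k' unchanged (an assumed convention)
      nrm2 <- colSums(V^2)
      dk   <- d[idx] * nrm2
      V    <- sweep(V, 2, sqrt(nrm2), "/")
      o <- order(dk, decreasing = TRUE)
      list(values = dk[o], vectors = V[, o, drop = FALSE])
    }
    # e.g., with the toy objects from the earlier sketch:
    # EVD <- tensorEVD_sketch(KG, KE, ID1, ID2, alpha = 0.95)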
The "tensorEVD" R package can be installed from CRAN, or from the GitHub platform via, for instance, the "remotes" R package (Csárdi et al. 2023). The instructions below also include a minimal call to the tensorEVD function (see Supplementary Note 3 for an actual numerical example).
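In the snippet below, the CRAN installation call is standard; the GitHub repository path and the exact return structure of tensorEVD are assumptions that should be checked against the package documentation (?tensorEVD):

    # Install the released version from CRAN:
    install.packages("tensorEVD")

    # Install the development version from GitHub via "remotes".
    # NOTE: the repository path is an assumption; see the package's CRAN page.
    # remotes::install_github("MarcooLopez/tensorEVD")

    # Minimal call: factorize K in Equation (2) without computing its EVD directly
    library(tensorEVD)
    EVD <- tensorEVD(K1 = KG, K2 = KE, ID1 = ID1, ID2 = ID2, alpha = 0.95)
    str(EVD)  # eigenvalues and eigenvectors spanning K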

Results and discussion
We benchmarked the tensorEVD routine against the eigen function of the "base" R package (R Core Team 2021) in terms of the computation time used to derive eigenvectors, the accuracy of the approximation provided by tensorEVD, and the dimension of the resulting basis. All analyses were performed in R v4.2.0 (R Core Team 2021) run on Michigan State University's High Performance Computing Center (HPCC, https://icer.msu.edu/hpcc/hardware), using nodes equipped with Intel Xeon Gold 6148 CPUs at 2.40 GHz and 84 GB of RAM, on a single computing thread.
The data used in these benchmarks were generated by the Genomes to Fields (G2F) Initiative (Lima et al. 2023). This data set was used to derive a genetic (GRM) and an environmental (ERM, from the environmental covariates; see Supplementary Note 4) relationship matrix for 4,344 maize hybrids and 97 environments (year-locations), respectively, corresponding to the northern testing locations. We formed Hadamard products [K in Equation (2)] of various sizes between the GRM (as K_1) and the ERM (as K_2) by sampling hybrids (n_G = 100, 500, and 1,000) and environments (n_E = 10, 30, and 50), with the level of replication needed to complete a total sample size ranging from n = 10,000 to 30,000. Then, we factorized the resulting Hadamard product matrix using the R base function eigen (R Core Team 2021) as well as tensorEVD, deriving as many eigenvectors as needed to explain 90, 95, and 98% of the total variance.
The tensorEVD method was consistently orders of magnitude faster than eigen (see Supplementary Figs. 1-3). The difference in computation time was particularly large (e.g. tensorEVD ∼10,000 times faster than eigen) when the product of the dimensions of the relationship matrices (n_G × n_E) was smaller than the sample size (n); compare the left, middle, and right columns of Fig. 1.
The Cholesky decomposition is an alternative factorization that can be used to implement the models discussed in this study. This factorization has a computational complexity of O(n³/3), which is smaller than that of the EVD [O(n³)]. However, the Cholesky decomposition can be numerically unstable for matrices that are (near) singular, a situation that is not uncommon in G × E models. Another approach is to use partial EVD methods that compute only a fraction of the eigenvectors. To evaluate these alternatives, we benchmarked tensorEVD against the Cholesky decomposition as implemented in the chol function from the "base" R package (R Core Team 2021), and against partial EVDs computed using the trlan.eigen and eigs_sym functions from the "svd" (Korobeynikov et al. 2023) and "RSpectra" (Qiu and Mei 2022) R packages, respectively. As expected, the partial EVD methods were faster than eigen only when the product n_G × n_E was smaller than n (see Supplementary Figs. 1-3); however, tensorEVD was much faster than the partial EVD methods (Supplementary Fig. 4). Likewise, tensorEVD was faster than chol only in cases where n_G × n_E < n.
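For reference, a toy-scale sketch of how such a comparison can be set up (timings at this size are not meaningful; the functions shown are the ones benchmarked, and the objects come from the earlier sketches):

    # Illustrative timing comparison on a toy K ('RSpectra' must be installed)
    K <- KG[ID1, ID1] * KE[ID2, ID2]
    system.time(eigen(K, symmetric = TRUE))       # full EVD, O(n^3)
    system.time(chol(K + 1e-8 * diag(nrow(K))))   # Cholesky; small jitter added
                                                  # because K may be singular
    system.time(RSpectra::eigs_sym(K, k = 5))     # partial EVD: 5 leading eigenpairs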

Approximation accuracy
We measured the accuracy of the approximation of the basis derived by the eigen and tensorEVD routines for each α-value by evaluating the Frobenius norm (i.e. a matrix generalization of the Euclidean norm; Golub and Van Loan 1996) of the difference between the Hadamard product matrix (K) and the approximation K̃_α = Ṽ_α D̃_α Ṽ_α′, where Ṽ_α and D̃_α are the eigenvectors and eigenvalues derived by each method and α-value (see Supplementary Note 5 for more details). In general, both methods provided a very good and similar approximation (Fig. 2). As expected, the value of the norm decreased as α increased (a smaller norm indicates a better approximation). The values of the norm for different sample sizes cannot be compared because the Frobenius norm is a cumulative sum over n × n elements. Therefore, we also computed the correlation matrix distance (CMD; Herdin et al. 2005) between the Hadamard product matrix (K) and the approximation (K̃_α) provided by each method and α-value (see Supplementary Note 5 and Supplementary Fig. 5). CMD values are always between 0 and 1. In all cases, the CMD was very small (<0.006), indicating that both approximations were very good. As with the Frobenius norm metric, the CMD shows that both methods provide similar approximations (Supplementary Fig. 5); however, there is evidence that whenever n_G × n_E becomes larger than the sample size n, tensorEVD provides a slightly better approximation than the eigen method (e.g. bottom-right panel in Fig. 2 and Supplementary Fig. 5; see also Supplementary Fig. 6).

Fig. 1. Computation time ratio (log10 scale, average across 20 replicates) of the EVD of the matrix K using the eigen method relative to the tensorEVD method, by sample size (n = 10,000, 20,000, and 30,000 in the x-axis) and proportion α of variance of K explained (α = 0.90, 0.95, and 0.98). Each panel represents a combination of number of hybrids (n_G) and number of environments (n_E).

Fig. 2. Frobenius norm (average ± SD across 20 replicates) of the difference between the Hadamard matrix K and the approximation (K̃_α) provided by the eigen and tensorEVD procedures, by sample size (n = 10,000, 20,000, and 30,000) and proportion α of variance of K explained (α = 0.90, 0.95, and 0.98). Each panel represents a combination of number of hybrids (n_G) and number of environments (n_E). A smaller norm indicates a better approximation.
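Both metrics are simple to compute in R; a sketch, using the CMD definition of Herdin et al. (2005), CMD = 1 − trace(K K̃)/(‖K‖_F ‖K̃‖_F), and assuming K and an EVD-based approximation from the earlier sketches:

    # Approximation-accuracy metrics between K and an approximation Ka
    frob <- function(K, Ka) norm(K - Ka, type = "F")     # Frobenius norm
    cmd  <- function(K, Ka) {                            # correlation matrix distance
      1 - sum(K * Ka) / (norm(K, "F") * norm(Ka, "F"))   # trace(K Ka) = sum(K * Ka)
    }                                                    # for symmetric matrices
    # Rebuild the approximation from an EVD, then evaluate:
    # Ka <- EVD$vectors %*% (EVD$values * t(EVD$vectors))
    # frob(K, Ka); cmd(K, Ka)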

Dimension reduction
We also compared eigen and tensorEVD in terms of the number of eigenvectors provided by each method for a given α-value, relative to the rank of K (i.e. the number of eigenvectors of K with a positive eigenvalue). By construction, eigen is very efficient at maximizing the proportion of variance explained in the derivation of eigenvectors. The tensorEVD function is as effective as the eigen method at dimension reduction only in cases where n_G × n_E < n, for example n_G = 100 and n_E = 10 (top-left panel in Fig. 3). However, tensorEVD becomes less effective at dimension reduction when n_G × n_E exceeds the sample size n (e.g. bottom-right panel in Fig. 3; see Supplementary Fig. 7).

Application in genomic prediction
Finally, we evaluated the performance of the approximation of K provided by the tensorEVD method in Gaussian linear models, in terms of variance component estimates and cross-validation prediction accuracy. For this evaluation, we used all the G2F data from the northern testing locations included in the data set presented by Lopez-Cruz et al. (2023). For the northern testing locations, this data set includes n = 59,069 records for 4 traits (grain yield, anthesis, silking, and anthesis-silking interval) from n_G = 4,344 hybrids and n_E = 97 environments.
We analyzed these data with a Gaussian reaction norm model (Jarquín et al. 2014) in which the response (y_ijk) is modeled as the sum of the main effect of hybrid (G_i), the main effect of environment (E_j), and the hybrid × environment interaction (GE_ij) term, as follows:

y_ijk = μ + G_i + E_j + GE_ij + ε_ijk.   (3)

Above, μ is an intercept and i, j, and k are indices for hybrids, environments, and replicates, respectively. The term ε_ijk is an error term assumed to be Gaussian distributed, ε_ijk ∼ iid N(0, σ²_ε), with σ²_ε the variance parameter associated with the error. Hybrid, environment, and interaction effects were assumed to be multivariate normally distributed with zero mean and effect-specific covariance matrices, specifically G ∼ N(0, σ²_G K_G), E ∼ N(0, σ²_E K_E), and GE ∼ N(0, σ²_GE K), where K is the Hadamard product in Equation (2) and σ²_G, σ²_E, and σ²_GE are variance parameters associated with G, E, and GE, respectively. We fitted the model in Equation (3) to each trait in a Bayesian fashion using the "BGLR" R package (Pérez-Rodríguez and de los Campos 2022), with the decomposition of the GE kernel (K) computed using the eigen and tensorEVD methods for different percentages of variance of K explained (α = 0.90, 0.95, and 0.98). For these analyses, we used computing nodes equipped with Intel Xeon E5-2680 v4 CPUs at 2.40 GHz and 96 GB of RAM, using 3 computing threads.
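As an illustration (not the exact analysis script), a model of this form can be specified in BGLR by passing eigenvectors and eigenvalues to RKHS terms; the sketch below assumes objects y, ID1, ID2, KG, and KE exist and that tensorEVD returns an eigen()-like list:

    # Sketch: fitting Equation (3) with BGLR, using tensorEVD for the GE kernel
    library(BGLR)
    library(tensorEVD)
    EVD_GE <- tensorEVD(K1 = KG, K2 = KE, ID1 = ID1, ID2 = ID2, alpha = 0.95)
    ETA <- list(
      G  = list(K = KG[ID1, ID1], model = "RKHS"),   # hybrid main effect
      E  = list(K = KE[ID2, ID2], model = "RKHS"),   # environment main effect
      GE = list(V = EVD_GE$vectors, d = EVD_GE$values,
                model = "RKHS")                      # interaction, via its EVD
    )
    fm <- BGLR(y = y, ETA = ETA, nIter = 50000, burnIn = 5000, thin = 10,
               verbose = FALSE)
    fm$ETA$GE$varU   # posterior mean of the GE variance component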
As one would expect, reducing α from 1 to 0.98, 0.95, and 0.90 led to a slight reduction in the proportion of variance explained by the GE term and a small increase in the error variance (Fig. 4). In general, for the same α-value, the reduction in the proportion of variance explained and the increase in the error variance were smaller with tensorEVD than with eigen. We obtained similar patterns for anthesis, silking, and anthesis-silking interval.
To evaluate prediction performance, we conducted a 10-fold cross-validation with hybrids (rather than individual records) assigned to folds (this mimics the CV1 scheme of Burgueño et al. 2012). For any given α-value, the models fitted using the factorizations derived with tensorEVD and eigen produced almost identical predictions (Fig. 5). Furthermore, there was a negligible reduction in prediction accuracy associated with lower values of α. For instance, for grain yield, the prediction correlations with the tensorEVD method were 0.387, 0.386, and 0.384 for α-values of 0.98, 0.95, and 0.90, respectively (Fig. 5).
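A sketch of the fold assignment (object names are hypothetical; in BGLR, test-set records are predicted by setting their phenotypes to NA and refitting):

    # CV1 sketch: whole hybrids, not individual records, assigned to 10 folds,
    # so every record of a held-out hybrid falls in the test set
    set.seed(123)
    hybrids <- unique(ID1)
    fold_of <- setNames(sample(rep_len(1:10, length(hybrids))), hybrids)
    folds   <- fold_of[ID1]          # fold membership of each record
    yNA <- y; yNA[folds == 1] <- NA  # mask fold-1 records; refit and use fm$yHat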
In the analysis presented above, the Hadamard product matrix had a dimension of n = 59,059 and a rank (number of eigenvalues greater than 0) of 38,187. Table 1 gives the number of eigenvectors returned by eigen and tensorEVD by α-value. As previously noted, tensorEVD is less efficient than eigen at dimension reduction when n_G × n_E ≫ n, a condition met in the analysis just described (4,344 × 97 ≫ 59,059). However, with α = 0.95, tensorEVD already provides substantial dimension reduction that translates into a shorter total computation time (decomposition + model fitting; Table 1; Supplementary Fig. 11).

Conclusion
The tensorEVD method can be used to factorize large Hadamard product matrices that are submatrices of Kronecker products of smaller positive semidefinite matrices. Examples where such matrices are key components of genomic models include hybrid prediction and G × E models involving genetic and environmental relationship matrices. The proposed algorithm can be several orders of magnitude faster than a standard EVD, with a relatively negligible effect on variance component estimates and prediction performance. The proposed method is most advantageous, in terms of both speed and dimensionality reduction, when the dimensions of the low-rank matrices are very small relative to the sample size.

Fig. 3. Number of eigenvectors (average ± SD across 20 replicates) produced by the eigen and tensorEVD methods, relative to the rank of matrix K, by α-value (i.e. proportion of variance explained, α = 0.90, 0.95, and 0.98) and sample size (n = 10,000, 20,000, and 30,000). Each panel represents a combination of number of hybrids (n_G) and number of environments (n_E).

Fig. 4. Proportion of the phenotypic variance (average ± SD across 5 replicates) of grain yield explained by each model term (G, E, GE, and Error) in Equation (3). The EVD of the Hadamard matrix K (covariance matrix of GE) was performed using the eigen and tensorEVD methods for different α-values (α = 1.00, 0.98, 0.95, and 0.90). Numbers on the top represent the percentage of change (%) relative to the variance explained by each term in the model that uses the full information in K (i.e. α = 1.00) obtained with the eigen method (horizontal dotted line).

Fig. 5. Prediction correlations from the 10-fold cross-validation for models fitted using the eigen and tensorEVD factorizations, by α-value.

Table 1. Number of eigenvectors and the times required to perform the EVD and to generate 50,000 posterior samples, by method and target proportion of variance explained.

Method (α) | No. of eigenvectors | Time in the EVD (min)^a | Time in the Gibbs sampling^b: per sample (sec)^c; per 50,000 samples (min)^d | Total time (min)^e

a Average (SD) across 10 replicates of the decomposition.
b The Gibbs sampler was implemented using BGLR for the model in Equation (3) fitted to each trait (grain yield, anthesis, silking, and anthesis-silking interval). Each model was run with 50,000 MCMC iterations (discarding 5,000 as burn-in and using a thinning of 10) and replicated 5 times.
c Average (SD), across 4 traits and 5 replicates, of the time per iteration (median value across iterations).
d Average (SD), across 4 traits and 5 replicates, of the time to perform the Gibbs sampling, which includes the initial overhead (matrix preparation and hyperparameter setting) plus the time to complete all iterations.
e Estimated total computing time (EVD computation + Gibbs sampling). All computations were carried out on MSU's HPCC on nodes with Intel processors and 96 GB of RAM, using 3 computing threads.