Genomic Predictive Ability for Foliar Nutritive Traits in Perennial Ryegrass

Forage nutritive value impacts animal nutrition, which underpins livestock productivity, reproduction and health. Genetic improvement for nutritive traits in perennial ryegrass has been limited, as they are typically expensive and time-consuming to measure through conventional methods. Genomic selection is appropriate for such complex and expensive traits, enabling cost-effective prediction of breeding values using genome-wide markers. The aims of the present study were to assess the potential of genomic selection for a range of nutritive traits in a multi-population training set, and to quantify contributions of family, location and family-by-location variance components to trait variation and heritability for nutritive traits. The training set consisted of a total of 517 half-sibling (half-sib) families, from five advanced breeding populations, evaluated in two distinct New Zealand grazing environments. Autumn-harvested samples were analyzed for 18 nutritive traits and maternal parents of the half-sib families were genotyped using genotyping-by-sequencing. Significant (P < 0.05) family variance was detected for all nutritive traits and genomic heritability (h2g) was moderate to high (0.20 to 0.74). Family-by-location interactions were significant and particularly large for water soluble carbohydrate (WSC), crude fat, phosphorus (P) and crude protein. GBLUP, KGD-GBLUP and BayesCπ genomic prediction models displayed similar predictive ability, estimated by 10-fold cross validation, for all nutritive traits with values ranging from r = 0.16 to 0.45 using phenotypes from across two locations. High predictive ability was observed for the mineral traits sulfur (0.44), sodium (0.45) and magnesium (0.45) and the lowest values were observed for P (0.16), digestibility (0.22) and high molecular weight WSC (0.23). Predictive ability estimates for most nutritive traits were retained when marker number was reduced from one million to as few as 50,000. The moderate to high predictive abilities observed suggests implementation of genomic selection is feasible for most of the nutritive traits examined.

Compared to other forage grass species, perennial ryegrass is regarded as having relatively high nutritive value, providing a cost effective, nutrient rich feed for ruminant livestock (Wilkins 1991;Baert and Muylle 2016). Breeding for improved nutritive value in this species has focused principally on higher in vitro dry matter (DM) digestibility to enhance energy availability and voluntary intake from grazed pasture (Jung and Allen 1995). This is a key selection criterion in many ryegrass breeding schemes (Casler and Vogel 1999;Easton et al. 2002;Muylle et al. 2013), particularly in Europe, where Wilkins and Humphreys (2003) reported genetic improvement of approximately 10g kg -1 per decade for DM digestibility. Breeding to increase watersoluble carbohydrate (WSC) content in ryegrass herbage, one of few reported studies of successful breeding for a nutritive trait in perennial ryegrass (Humphreys 1989a;Jones and Roberts 1991;Smith et al. 1997), has been a major contributor to genetic improvement of digestibility (Wilkins and Humphreys 2003;Muylle et al. 2013). More recently, there has been increased emphasis on addressing digestibility through the improvement of fiber degradability per se, by targeting changes in the biochemical composition of the cell wall (Faville et al. 2010;Van Parijs et al. 2018).
Minerals and trace elements are essential elements for plant growth and are critical to various biological functions of the plant. In forages, these macro-and micronutrients are also important components of nutritive quality, critical for maintaining livestock health (Waghorn 2007). For example, metabolic disorders can be caused or contributed to by mineral imbalances in the diet, such as hypomagnesaemia (grass tetany) which is caused by insufficient magnesium and calcium in the diet. Earlier studies have identified genetic variation among families (Easton et al. 1997;Smith et al. 1999) or genotypic variation among cultivars (Crush et al. 2018a;Crush et al. 2018b) for micro-and macronutrients, indicating that breeding for mineral content is a realistic opportunity.
The reduced emphasis on breeding for nutritive traits in forages is affected by a number of factors, including a lack of consensus on specific breeding targets (Wheeler and Corbett 1989;Chapman et al. 2015), ambiguous evidence for the impact of specific nutritive traits on animal production outcomes (Easton et al. 2002;Edwards et al. 2007;Mcevoy et al. 2011), the confounding influence of environment and genotype x environment (G x E) interactions, and the significant additional resources needed in a breeding program to undertake nutritive trait measurements in large panels of selection candidates (Smith et al. 1997).
Genomic selection (GS), where breeding value for a trait may be costeffectively predicted for selection candidates using genome-wide markers, was initially proposed for animal breeding by Meuwissen et al. (2001). In GS, a training population combining phenotypic and genotypic information is used to develop a model that can subsequently be used to predict genomic estimated breeding values (GEBVs) for individuals in a test or selection population that have been genotyped only. In essence, GS replaces the need to phenotype the target trait in selection candidates for one or more cycles of recurrent selection. GS has been demonstrated in dairy cattle breeding, where the rate of genetic gain was doubled by reducing generation interval from 7 to 2.5 years or from 4 to 2.5 years, depending upon selection strategy (García-Ruiz et al. 2016). Over the last decade the declining cost of genotyping single nucleotide polymorphisms (SNPs), largely through reduced representation sequencing approaches such as genotyping-bysequencing (GBS) (Elshire et al. 2011), has made this tool feasible for plant breeding. GS is now being applied in major crop species, including wheat (Rutkoski et al. 2011;Poland et al. 2012;Lopez-Cruz et al. 2015;Hayes et al. 2017), maize (Zhao et al. 2012;Fristche-Neto et al. 2018) and barley (Zhong et al. 2009;Lorenz et al. 2012) and is under adoption in forage species, including perennial ryegrass (Fè et al. 2016;Grinberg et al. 2016;Byrne et al. 2017;Arojju et al. 2018;Faville et al. 2018;Pembleton et al. 2018), and alfalfa (Annicchiarico et al. 2015;Li et al. 2015;Biazzi et al. 2017;Jia et al. 2018), principally in terms of assessing the influence of training set size and composition, trait characteristics and genotyping approaches on predictive ability. In perennial ryegrass, so far only two studies have evaluated the possibility of using genome-wide markers to predict GEBVs for nutritive quality traits. Grinberg et al. (2016) reported predictive ability of 0.45 and 0.32 for WSC and N and Fè et al. (2016) reported predictive ability for NDF (0.68) and HMW WSC (0.45). A comprehensive study on a range of nutritive traits is still lacking.
GS can accelerate genetic gain particularly for complex traits, which are controlled by many genes with small effects and for traits which are difficult to measure and expensive (Heslot et al. 2015). GS is therefore a very attractive tool for nutritive traits, given the barriers, described above, to routine integration of nutritive traits into forage breeding programs. The success of GS primarily depends on predictive ability, which is influenced by trait heritability (h 2 n ), training population size, marker density, extent of linkage disequilibrium (LD) and relatedness between training and test population . While the heritability of a trait and the extent of LD in a training population cannot be easily optimized, the density of markers and the size and composition of the training population are two factors that can be controlled. Several methods have been developed for genomic prediction and can be broadly classified as whole-genome regression methods (discussed by De Los Campos et al. (2013)) or machine learning methods (outlined by González-Camacho et al. (2018)). Based on simulation and empirical results, Daetwyler et al. (2013) concluded that genomic best linear unbiased predictor (GBLUP) and Bayesian variable selection methods (BayesB and BayesCp) were the benchmark for genomic prediction, as these methods are appropriate for a range of genetic architectures, from traits which are controlled by many genes with small effects (infinitesimal model) to traits with large SNP effects (variable selection model).
The principle aim of the current study was to assess genomic predictive ability for 18 nutritive quality traits, measured in a large multi-population training set in two key New Zealand grazing environments, and to investigate the impact of marker density and of genomic prediction models with different prior assumptions regarding the distribution of SNP effects. The study also provided an opportunity to assess the magnitude of genetic variation and to estimate heritability for a large range of nutritive traits under New Zealand grazing environments.

Plant material and experimental design
The half-sibling (half-sib) families used in this study were derived from five different advanced breeding populations (Pop I -Pop V) which are part of the Grassland Innovation Ltd breeding program. From each population, 102 to 117 plants that tested positive for endophyte infection (Epichloё festucae var lolli) by immunoblotting (Hahn et al. 2003), were polycrossed in isolation during spring 2012 in Palmerston North, New Zealand (Faville et al. 2018). Polycrosses were performed separately for each population, without admixing, and seeds from the maternal parents were harvested and cleaned. In total 543 half-sib families were harvested for seed, however only 517 families had sufficient seed ($ 3.6g) for sowing field trials. Further details regarding population development are presented in Table S1.
A total of six trials were sown for measuring DMY (Faville et al. 2018), of which two were used for the current study. These were trials established at Lincoln (Canterbury region, southern New Zealand, 43.38°S 172.62°E; Wakanui silt loam) and Aorangi (Manawatu region, central New Zealand, 40.34°S 175.46°E; Kairanga sandy loam), during the autumn of 2013. The experimental design at each site was rowcolumn with three replicates. Within each trial, populations were blocked, and families randomized in three replicates within these blocks. Multiple repeated checks were also randomly allocated within and across the replicated blocks. Half-sib families were evaluated as a 1m row of plants (referred to from now as plots), by sowing 0.2 g of seed (which is equivalent to 14 kg ha -1 , if a sward was sown at 7 rows m -1 ). Space between the plots was 25-30 cm and 30-50 cm separated the ends of neighboring plots. Nitrogen and phosphate fertilizer was applied at the rate of 15-30 kg N ha -1 and 8.8 kg P ha -1 , in late autumn each year (Faville et al. 2018). Climate data regarding rain fall, sunshine hours and air temperature of the two locations during the trial period was is provided as Figure S1.

Phenotypic measurements
Plot harvests were undertaken at Lincoln starting April 14, 2014 (24 days post-defoliation by sheep grazing) and at Aorangi starting April 29, 2014 (25 days post-defoliation), during the southern hemisphere autumn. At each site a single harvest was undertaken over three days, between 10:30 AM and 3:00 PM on each day to minimize the influence of diurnal variation on levels of measured constituents. Split harvesting of populations or replicate blocks over two days was avoided. Plots were cut to a height of approximately 5 cm, above the pseudostem, meaning that only leaf lamina material was harvested. Harvested foliage was placed into micro-perforated plastic bread bags and immediately snap frozen in liquid nitrogen. Samples were subsequently maintained at ca. -80°on frozen CO 2 to preserve labile components and then freeze-dried at one of two commercial facilities -Genesis Bio-Laboratory Ltd (Christchurch, New Zealand) or Horowhenua Freeze-Dry (Levin, New Zealand). Freeze-dried samples were milled to powder through a 1mm sieve and thoroughly mixed to homogenize the sample. Subsamples were weighed out and transferred to Hill Laboratories (Hamilton, New Zealand) for near-infrared spectroscopy (NIRS) and minerals analysis and to AgResearch (Palmerston North, New Zealand) for analysis of water-soluble carbohydrate (WSC). A total of 3082 samples (n = 1476 from Lincoln and n = 1606 from Manawatu) were provided for analysis. Hill Laboratories provided NIRS data for a range of nutritional traits, as outlined in Table 1. Data for mineral concentrations (Table 1) were based on inductively coupled plasma-optical emission (ICP-OES) analysis of plant material digested with nitric acid: hydrogen peroxide (2:1). Grass tetany ratio was calculated as [K/(Mg + Ca)] using the data provided for the individual minerals. WSC was extracted and quantified as described by Hunt et al. (2005). Briefly, 25 mg of milled leaf material was extracted twice with 1mL of 80% ethanol (low-molecular-weight fraction; LMW WSC) and then twice with 1 mL water (high-molecularweight fraction; HMW WSC), for 30 min at 65°. Extracts were centrifuged, and supernatants of the respective fractions were analyzed using anthrone as a colorimetric reagent (Jermyn 1956).

Statistical models and variance components
Data analyses were performed across the five populations, for individual locations and across the two locations, using the restricted maximum likelihood (REML) method, by fitting a linear mixed model in GenStat (Payne et al. 2009). Analyses were also performed on the five populations individually, by fitting linear mixed models in DeltaGen (Jahufer and Luo 2018). Family, family-by-location interaction, replicates, rows and columns were considered as random effects, whereas location, population and repeated checks were considered as fixed effects. Three different mixed linear models were used: (i) Model 1, to estimate family variance components, pooling all five populations, all 517 families together, within individual locations; (ii) Model 2, for estimating family variance components and interactions of family and location, pooling all five populations, across locations; and (iii) Model 3, for estimating family variance and family-by-location interactions, among half-sib families within individual populations, across locations.
Model 1: Mixed model for individual locations.
n■ Table 1 Trait family (s 2 g ), family-by-location interaction (s 2 gl ) and residual error (s 2 e ) variance components and their associated standard errors (SE), repeatability (R) and genomic heritability (h 2 g ), estimated for the range of nutritive traits, among 517 half-sib families of perennial ryegrass evaluated across the two locations (Lincoln and Aorangi). All s 2 g for nutritive traits were significant (P < 0.05). y ijklno is the phenotypic value measured on half-sib family i in row j and column k of replicate l nested within population n, and i ¼ 1; . . . ; n g ; j ¼ 1; . . . ; n r ; k ¼ 1; . . . ; n c ; l ¼ 1; . . . ; n b ; m ¼ 1; . . . ; n s ; n ¼ 1; . . . ; n p , where g; r; c; b; and p are half-sib families, rows, columns, replicates and populations respectively. Where, m is the overall mean; g i is the random effect of half-sib family i, Nð0; Is 2 g Þ; p n is the fixed effect of population n; b nl is the random effect of replicate l in population n, Nð0; Is 2 b Þ; r nlj is the random effect of row j within replicate l of population n; Nð0; Is 2 r Þ; c nlk is the random effect of column k within replicate l of population n, Nð0; Is 2 c Þ; e ijkln is the residual effect of half-sib family i in row r and column c of replicate b of population n, Nð0; Is 2 e Þ. Model 2: Mixed model for across locations.
y ijklmn is the phenotypic value measured on half-sib family i in row j and column k of replicate l nested in location m within population n, and i ¼ 1; . . . ; n g ; j ¼ 1; . . . ; n r ; k ¼ 1; . . . ; n c ; l ¼ 1; . . . ; n b ; m ¼ 1; . . . ; n s ; n ¼ 1; . . . ; n p , where g; r; c; b; s and p are halfsib families, rows, columns, replicates, locations and populations respectively. In the equation, m is the overall mean; g i is the random effect of half-sib family i; Nð0; Is 2 g Þ; s m is the fixed effect of location m; ðgsÞ im is the random effect of interaction between half-sib family i and location m, Nð0; Is 2 gs Þ; p n is the fixed effect of population n; b nml is the random effect of replicate l within location m in population n; Nð0; Is 2 b Þ; r nmlj is the random effect of row j within replicate l in location m of population n, Nð0; Is 2 r Þ; c nmlk is the random effect of column k within replicate l in location m of population n, Nð0; Is 2 c Þ; e ijklmn is the residual effect of half-sib family i in row r and column c of replicate b in location m of population n, Nð0; Is 2 e Þ. Model 3: Mixed model for individual populations.
y ijklm is the phenotypic value measured on half-sib family i in row j and column k of replicate l nested in location m. In the equation, m is the overall mean; g i is the random effect of half-sib family i, Nð0; Is 2 g Þ; s m is the fixed effect of location m; ðgsÞ im is the random effect of interaction between half-sib family i and location m, Nð0; Is 2 gs Þ; p n is the fixed effect of population n; b ml is the random effect of replicate l within location m, Nð0; Is 2 b Þ; r mlj is the random effect of row j within replicate l in location m; Nð0; Is 2 r Þ; c mlk is the random effect of column k within replicate l in location m, Nð0; Is 2 c Þ; e ijklmn is the residual effect of half-sib family i in row r and column c of replicate b in location m, Nð0; Is 2 e Þ. The variance components estimated based on the mixed model analysis were used to calculate repeatability (Model 2) (Falconer 1960) and narrow sense heritability (Model 3) for each trait. Repeatability was based on family variance estimated across five populations, whereas narrow-sense heritability is based on additive genetic variance among half-sib families within each population. Repeatability and narrow sense heritability, on a family mean basis, were estimated using the equation: Where, R and h 2 n are repeatability and narrow-sense heritability. For repeatability, s 2 g was the family variance among all the 517 half-sib families. In the estimation of narrow-sense heritability, s 2 g was the estimated additive genetic variation among halfsib families within a specific population, s 2 gs is the variance associated with family-by-location interaction and s 2 e is the variance of residuals.

Genotypic and phenotypic correlation
The genotypic correlation among traits was estimated as proposed by Falconer (1960). Multivariate analysis of variance (MANOVA) was performed in DeltaGen (Jahufer and Luo 2018), using the multilocation trait data from half-sib families, to estimate variance and covariance among traits: Where, Cov gðx;yÞ is the genotypic covariance between trait x and y; s 2 ðxÞ is the variance associated with trait x, and s 2 ðyÞ is the variance associated with trait y. Phenotypic correlation was performed in DeltaGen (Jahufer and Luo 2018) using the best linear unbiased predictors (BLUPS) estimated based on Model 2.

Genotyping and genomic heritability
All maternal parents of the 517 half-sib families were genotyped using a GBS approach described in Faville et al. (2018), following the protocol proposed by Elshire et al. (2011). Briefly, a reference ryegrass genome assembly was constructed using scaffolds from a published ryegrass assembly (Byrne et al. 2015). Scaffolds were aligned to the barley genome using Lastz version 7.0.1 (Harris 2007) from Geneious 8 (https:// www.geneious.com/, (Kearse et al. 2012)) with default parameters.
Demultiplexing of sequencing reads was performed using the TASSEL 5.0 GBS pipeline (Glaubitz et al. 2014) and initial quality control was based on read count statistics. The quality GBS tags were aligned to the reference genome using Bowtie2 (Langdon 2015). Genotype calling was performed using the TASSEL GBS pipeline to obtain 1,093,464 SNPs, after filtering for maximum missing SNPs per site (50%), minor allele frequency (. 0.05) and read depth (. 1) using VCF tools (Danecek et al. 2011). Genotyped 1,093,464 SNPs were exported and filtered for Hardy-Weinberg disequilibrium (HWdiseq . -0.05). The resulting 1,023,011 SNPs, with a mean read depth of 2.98, were used to compute a genomic relationship matrix (GRM) based on the kinship using depth adjustment (KGD) protocol proposed described by Dodds et al. (2015) and implemented by Faville et al. (2018). The KGD method enables the unbiased estimation of relatedness using low depth sequence data. In the KGD approach, the data are considered to be a set of allele reads rather than genotypes per se. In addition, the elements of the GRM are calculated using only those SNPs which have reads in both individuals. This allows the GRM to be calculated without needing to impute missing values. The KGD matrix was used for genomic predictive modeling. The 1,093,464 genotyped SNPs were also mean imputed using A.mat function in rrBLUP package (Endelman 2011) and the GRM was calculated using these imputed SNPs based on method proposed by Vanraden (2008). Genomic heritability (h 2 g ) was calculated using Equation 4, based on variance components estimated using the mixed model proposed in Equation 2. In the model, the KGD matrix was fitted as variancecovariance among genotypes (De Los Campos et al. 2015) and the genetic variance was calculated as proportion of variance explained by regressing markers on phenotypes. The model was fitted in ASreml-R (Butler et al. 2009).

Genomic prediction modeling
Three whole-genome regression methods, with two different prior assumptions regarding the distribution of marker effects, were used for generating GEBVs. The first method was a univariate linear mixed model, called GBLUP (Goddard et al. 2011) in which markers effects were assumed to have equal variance. The linear model can be expressed follows: Where y is the vector of BLUP values of the trait, b is the vector of grand mean, Z is the design matrix associated with random marker effects m, with m Nð0; s 2 m GÞ, in which G is the additive genetic relationship matrix, and e Nð0; s 2 e IÞ, in which I is the identity matrix. The G matrix was calculated based on the method proposed by Vanraden (2008); Endelman and Jannink (2012) using A.mat function in rrBLUP package (Endelman 2011).
The second method is a variant of GBLUP method with KGD matrix as G in the linear mixed model. The GBLUP and KGD-GBLUP models were fitted using the rrBLUP package (Endelman 2011), implemented through R programming language (R Core Team 2017). The equations used to calculate the KGD G matrix are detailed in Dodds et al. (2015).
The third method was BayesCp (Habier et al. 2011), in which markers effects can depart from normality, that is, large variances are allowed for markers with larger effects.
The model is expressed as follows: Where y is the vector of BLUP values of the trait, b is the vector of grand mean, k is the number of makers, Z k is the vector of genotypes at marker k; a k is the additive effect of the marker, and e is the vector of residual effects with a normal distribution Nð0; s 2 e Þ. The BayesCp model was implemented through R programming using the BGLR package (Pérez and De Los Campos 2014), with the number of burn-ins set to 2,000, total number of iterations set to 10,000, and other parameters set to default (Pérez and De Los Campos 2014).
The predictive ability of the models based on data from the composite training population was assessed by a ten-fold cross validation approach. For each cross validation, randomized data were divided into ten equal parts, of which nine parts (training set) were used to train the model and to predict GEBVs in the remaining one part of the data (test set). Randomization of the complete data set was repeated five times and the mean of the five iterations was reported as the predictive ability of the model (Faville et al. 2018).

Evaluating predictions in individual populations
As the overall training population is a composite of 517 individuals and their corresponding half-sib families from five discrete breeding populations, the predictive ability of the prediction models was also assessed within each individual population using KGD-GBLUP. A random 50% of individuals was selected from within each population (Pop I -Pop V; total = 255 individuals) as a training set in order to represent each population equally. Using this set of 255 individuals to train the model, GEBVs were then predicted in the remaining 50% of Pop I and the mean correlation of 500 iterations was considered as the predictive ability for this population. This approach was likewise extended to each of the other four populations.

Optimizing marker density
To evaluate the minimum number of markers needed to achieve maximum predictive ability for each nutritive trait, a random set of markers ranging from 1,093,464 (100%, unfiltered) to 1,093 (0.1%) in 10 steps were obtained from the training population. Using each set of randomly selected markers, a G matrix was computed based on the method proposed by Vanraden (2008) using the rrBLUP package (Endelman 2011). Considering the computational load, KGD method was not extended to randomly selected markers, to construct G matrix. Faville et al. (2018) reported broadly similar predictive ability for DMY in this training population, when G matrices based on Dodds et al. (2015) and Vanraden (2008) were compared. The G matrix was used in a GBLUP model to estimate predictive ability for each randomly chosen marker set. The predictive ability was assessed via Monte-Carlo cross validations with 500 iterations, where 80% of the data were used to train the model (training set) and 20% to predict the GEBVs (test set).

Data availability
Supplementary material file contains Tables S1 -S11 and Figure S1 generated in the current study to draw conclusions. File S1 contains genome-wide SNP data, File S2 is the genomic relationship matrix calculated using the method proposed by VanRaden (2008), File S3 is the KGD relationship matrix calculated using method proposed by Dodds et al. (2015) and File S4 contains phenotypic data used for the analysis. Supplemental material available at figshare: https://doi.org/ 10.25387/g3.10074323.

RESULTS
Variance components, repeatability, and genomic heritability There was significant (P , 0.05) variance among 517 half-sib families from five populations for all traits, based on mean performance across the two locations, Lincoln and Aorangi (Table 1, Table S2, S3 and S4). There were also significant (P , 0.05) family-by-location interactions for all the traits, indicating a relative change in ranking among the 517 half-sib families between the two locations. There was a high genotypic correlation (r = 0.93) between R and h 2 g in the across-location dataset and these ranged from a low of 0.26 (R) and 0.22 (h 2 g ) for traits N and P to a high of 0.75 (R) and 0.74 (h 2 g ) for Na (Table 1) across the two locations. Genotypic correlation between R and h 2 g was slightly lower in Aorangi (r = 0.85) compared with Lincoln (r = 0.93). Because of the high correlation between R and h 2 g and because h 2 g captures markerbased additive variance, from here on results for h 2 g only are reported and discussed. Overall, h 2 g estimated within a location was substantially higher at the Aorangi site than Lincoln (mean of all traits h 2 g = 0.62 and 0.43, respectively) (Table S3 and S4), with values from the acrosslocation analysis (h 2 g = 0.42) lying between those of Lincoln and Aorangi. Traits with low h 2 g tended to have relatively large family-by-location interactions, whereas those with high h 2 g had a low family-by-location interactions (Table 1). Variance component analysis within the two locations (Lincoln and Aorangi) indicated significant (P , 0.05) family variance for all 18 nutritive traits. Differences in family variance were observed for the same trait among the five populations in the across location dataset (Table S5-S9). For example, family variance was nonsignificant (P . 0.05) for ADF, NDF and DOMD in Pop I & II, but was significant for these traits in Pop III -V (Table S5-

Correlation among traits
Genotypic and phenotypic correlation coefficients for all nutritive quality traits are shown in Tables 2 and S10, respectively. Strong, positive genotypic correlation was observed between fiber measures ADF and NDF and these in turn were negatively correlated with energy traits including ME, DOMD and WSC (Tables 2 and Table S10). A positive genotypic correlation was estimated for both LMW WSC and total WSC with DOMD, however, a weak positive correlation was found between HMW WSC and DOMD. A strong negative genotypic correlation was observed for both ADF and NDF with both LMW WSC and total WSC. A moderate genotypic correlation was observed between fiber traits (ADF and NDF) and minerals traits including K, Mg and Mn (positive), P and Ca (negative).

Genotyping-by-sequencing
The results of GBS were described in detail by Faville et al. (2018). Population structure was analyzed using multi-dimensional scaling based on genomic relationship matrix (see Figure 1 in Faville et al. (2018)), revealing structure describing the five individual populations used in the training set and, with two exceptions, no structure within each population. The exceptions were the presence of eight apparently outlying individuals associated with Pop V and six with Pop II; these were confirmed as being true to their respective populations (Faville et al. 2018) and so retained for genomic prediction analyses. The percentage of missing SNPs in each population were between 22-24%, with no particular population skewed toward higher missing rate. The missing SNPs were further investigated to see if they were population specific and found that it was not the case in the current study.

Predictive ability for nutritive traits
Predictive ability for all nutritive traits was evaluated using GBLUP, KGD-GBLUP and BayesCp genomic prediction models, and the results are summarized in Figure 1 as the Pearson correlation coefficient between observed (adjusted means) and predicted values. There were no significant differences (P . 0.05) in terms of predictive ability between GBLUP, KGD-GBLUP and BayesCp across all nutritive traits (Figure 1). Although slight differences can be noted from the Figure 1, no single statistical approach consistently gave higher predictive ability for all nutritive traits. Because the results from the three models were largely indistinguishable, from here on results from KGD-GBLUP only are reported and discussed. Using the adjusted phenotypic trait means (BLUPs) estimated across both locations, predictive ability for all traits was positive and was strongly correlated with h 2 g (r = 0.65). The highest predictive ability observed was for Na and S (both r = 0.45), followed by CFAT (0.38) (Figure 1). The lowest predictive ability was noted for P (0.16), followed by DOMD with a value of 0.22 (Figure 1). The slope of the regression model (bias) for all nutritive traits was around 1, meaning unbiased estimates were obtained by regressing GEBVs on adjusted means (BLUPs) ( Table S11).
Predictive ability of models based on phenotypic means from Lincoln only (location-specific predictive ability) was negative to low and showed a very high correlation with h 2 g (r = 0.93) ( Table S3). The highest predictive ability was obtained for Na (0.35), similar to the across locations analysis, and the lowest predictive ability was for ADF with a negative accuracy of -0.06. Predictive ability of models using phenotypic data from Aorangi were generally higher than both the Lincoln and acrosslocation models (Table S4) and the correlation between h 2 g and predictive ability was 0.67. In this dataset the highest predictive ability was for HMW WSC (0.56) and lowest predictive ability was for Ca (0.16) (Table S3).
In terms of different trait categories, for the measures of fiber content, ADF and NDF, predictive ability of the across-location models was moderate, at 0.24 and 0.36 respectively. There was a strong effect of location on these traits, with moderate predictive ability at Aorangi (ADF = 0.29 and NDF = 0.35) whereas at Lincoln, the predictive ability was almost zero for NDF (0.02) and negative for ADF (-0.06) (Table S3 and S4).
The traits DOMD, CFAT, WSC (LMW, HMW and total) and ME were grouped as energy traits in this study. Predictive ability for energy traits in the across location analysis was generally low to moderate, with CFAT (0.38) and LMW WSC (0.34) the highest, and DOMD (0.22) and n■ Table 2 Genotypic correlations among a range of nutritive quality traits measured from 517 half-sib families, estimated using data from across two locations (Lincoln and Aorangi) HMW WSC (0.23) low ( Figure 1). As with the fiber traits, the ranking of predictive ability for CFAT varied by environment and was highest in Lincoln and in across-location analysis, whereas predictive ability for CFAT ranked fourth highest in Aorangi. By contrast, the predictive ability estimated for DOMD was ranked similarly (fifth highest) for Lincoln and Aorangi.
The predictive ability of genomic prediction models for mineral traits assessed in this study was generally high, with Mg, Na and S consistently ranked highest in terms of predictive ability within the two locations (Lincoln and Aorangi) and in across-location analysis. The lowest h 2 g was observed for P, which was reflected in the predictive ability of prediction models for Lincoln and across-location analysis. Models for tetany ratio ([K/(Ca+Mg)]), a predictor of hypomagnesaemia risk in livestock, had a predictive ability of 0.34 across locations, 0.29 at Lincoln and 0.18 at Aorangi.
The measures CP and N are both indicative of protein content, with crude protein a derivate of measured N, obtained by multiplying N by a conversion factor of 6.25 (Waghorn 2007), hence predictive ability estimated within and across locations was highly similar for both the traits. Predictive ability for these traits was low to moderate, at 0.28 (CP) and 0.26 (N) in the across location analysis, 0.14 for both traits at Lincoln and 0.20 and 0.21 for CP and N at Aorangi.
Genotyping efficiency impacts the design and overall cost of implementing GS in a breeding program. For all nutritive traits, a steady decline in predictive ability was observed from 100% (1,093,464) to 0.5% (5,467) markers and a rapid decrease in predictive ability was noted from 0.5 to 0.1% (1,093) ( Figure 2 and Table S11). Overall, reducing the marker number to 5% (54,673) of the total available SNPs had minimal impact on overall predictive ability (Figure 2 and Table S11). Further reductions in marker number resulted in losses in predictive ability, the extent of which varied by trait (Table S11). For example, with 10,934 markers (1% of the total dataset) the predictive ability for LMW WSC, HMW WSC and total WSC decreased by 3%, 7% and 4%, respectively compared to the total dataset (100%) (Figure 2). At 1,093 markers (0.1%) the predictive ability for these traits declined further although the absolute values were still positive, at 0.31 for LMW WSC, 0.18 for HMW WSC and 0.26 for total WSC (Figure 2). The decay in predictive ability was typically highest for those traits which had low h 2 g and low predictive ability under the full SNP dataset. For example, between the highest and lowest marker number datasets there was a 36% decrease in predictive ability for P (h 2 g = 0.22), while for S (h 2 g = 0.53) there was a 14% decrease in predictive ability (Table S11).
The training population used in this study is a composite of five different breeding populations, with differing genetic relationships (see Figure 1 in Faville et al. (2018)). The predictive ability of a model constructed based on a composite training set, for each of the individual populations is therefore an important consideration. Cross-validations were conducted within the individual populations using the protocol reported by Faville et al. (2018). Predictive ability varied among the populations (Figure 3). For example, predictive ability for ADF ranged from 0.13 to 0.24 among the five populations (Figure 3). The majority of predictions were positive across all populations, with the exception of K for Pop I, and only LMW WSC and P in Pop II had notably poor predictive ability (Figure 3). No population was superior for genomic prediction of all nutritive traits. However, Pop V returned the highest predictive ability overall (mean predictive ability of Pop V = 0.29, compared with 0.30 in the training set, TP), followed by Pop III, Pop I, Pop IV, and Pop II (Figure 3).

DISCUSSION
Nutritive quality traits in forages are important for animal productivity and for maintaining livestock health and are therefore important targets for genetic improvement in perennial ryegrass. Nutritive traits can be expensive to measure and are labor-intensive, hindering the improvement of these traits by conventional breeding methods. Genomic selection (GS), the use of genome-wide molecular markers for the prediction of breeding values in selection candidates, is well suited for traits that are costly and difficult to phenotype (Heffner et al. 2009;Jannink et al. 2010) and therefore represents a promising approach for enabling cost-effective improvement of nutritive traits in forages. In this study we demonstrate that GS is a strong prospect for improvement of nutritive quality traits as assessed by cross-validation predictive abilities estimated for 18 nutritive traits in a multi-population training set. Furthermore, the extensive phenotypic dataset, collected from two contrasting environments, has enabled the contribution of family, location, and family-by-location variance components to be estimated across a large range of nutritive traits.
Several methods for GS have been proposed for both plant and animal breeding, including GBLUP, Bayesian alphabets (BayesA, BayesB and BayesCp), Ridge Regression (RR) BLUP, Random Forest, Support Vector Machine and deep learning through Multilayer Perceptron's and Convolutional Neural Networks (De Los Campos et al. 2013;Crossa et al. 2017). Both simulations and empirical data suggests that linear models are superior in terms of predicting GEBVs at higher accuracy (Daetwyler et al. 2010;De Los Campos et al. 2013;Byrne et al. 2017;Bellot et al. 2018;Faville et al. 2018). In this study, we compared three linear models characterized by two different assumptions with respect to the distribution of variance for marker effects. In GBLUP and KGD-GBLUP all marker effects are shrunk equally, assuming the predicted trait is controlled by many markers with small effect (Goddard et al. 2011), whereas BayesCp assumes that the trait is a mixture of distributions with large and small effect markers (Habier et al. 2011). Even with different prior assumptions, Figure 1 illustrates the similarity in predictive ability among the three methods for all nutritive traits, with only minor differences (Figure 1). Through simulation and empirical data, De Los Campos et al. (2013) pointed out that the superiority of Bayesian variable selection models can be illustrated when applied to a trait with large effect quantitative trait loci (QTL). The lack of improvement in predictive ability under the BayesCp model observed here may reflect a complex genetic architecture for the nutritive traits studied, which are likely controlled by many genes with small effects. For instance, QTL studies in perennial ryegrass reported 25 loci for WSC (Cogan et al. 2005;Turner et al. 2006;Shinozuka et al. 2012;Gallagher et al. 2015), however genetic variation explained by the multiple QTL was no more than 20%, suggesting that genetic control of WSC may tend toward an infinitesimal model.
The success of GS primarily depends on the predictive ability of the genomic prediction model, which is influenced by h 2 n , training population size, linkage disequilibrium (LD), genetic diversity within the training population and relatedness between training and test set Crossa et al. 2017;Arojju et al. 2018). Traits with low h 2 n need a larger training population to achieve the same level of predictive ability as a trait with higher h 2 n . Results from our study indicate that predictive ability estimated by cross-validation and h 2 n will not be a limiting factor for implementing GS for nutritive traits in perennial ryegrass, as predictive ability and various measures of heritability (R, h 2 g and h 2 n ) were moderate to high for most traits (Table 1,  Table S3-S9 and Figure 1). A strong positive correlation was observed between predictive ability and h 2 g for traits at the individual locations (Aorangi and Lincoln) and in the across-location analysis, confirming previous findings  and suggesting that genomic prediction can be more accurate for highly heritable traits. A positive correlation between predictive ability and heritability was also previously observed for nutritive traits in switchgrass (Fiedler et al. 2018) and alfalfa (Jia et al. 2018), as well as for crown rust and heading date in perennial ryegrass (Arojju et al. 2018) and for fruit quality traits in apple (Muranty et al. 2015).
For most traits, h 2 g at Aorangi was consistently higher compared to Lincoln, and consequently higher predictive abilities were observed. This difference between locations was due to a combination of the family variance component estimated at Aorangi being higher and estimates of trait-associated experimental error being higher at Lincoln (Table S3 and S4). While it is not possible to conclusively determine the basis of this disparity in experimental error, it may be explained by greater within-environment variability at Lincoln, due to factors such as climatic variations over the sampling period ( Figure S1) soil heterogeneity or operator-to-operator variations. The predictive ability for a given trait also varied among each of the individual populations. For some traits, a positive correlation was observed between family variance and predictive ability but for the majority of traits there is no such relationship (Figure 3 and Table S5 -S9). Another factor that may have affected prediction outcomes was the experimental design. While every practical effort was made to control for diurnal variation in levels of measured constituents, the relatively wide daily sampling window (4.5 hr) used will affected the accurate estimation of breeding values (BLUP's). The application of a two-phase design (Smith et al. 2006) accommodating both field and laboratory sources of variation in the experiment, may also have been beneficial in terms of the accuracy of the estimated breeding values and should be considered for future studies of this nature. Finally, another contributing factor could be the inability of the marker dataset to capture differences in allelic effects between the populations. A recent study has shown that across cattle breeds (Holstein and Jersey) genomic predictive ability can be improved, using a pre-selected marker set based on GWAS performed on combined breeds using whole-genome sequencing data (Raymond et al. 2018). This approach could be extended to improve across population prediction ability in perennial ryegrass. Finally, in the current study, a two-stage GS approach was undertaken, whereby in stage one adjusted means (BLUPs) were calculated from field trial data and, in stage two, these BLUPs were used as a dependent variable in the model to predict GEBVs based on markers. The two-stage GS approach is a commonly-used procedure, in part because it is a computationally -efficient method for analyzing large datasets from multi-environment trials (Piepho et al. 2012). This approach has consequently been widely used across many plant species for GS application (Lipka et al. 2014;Annicchiarico et al. 2015;Fernandes et al. 2018;Sun et al. 2019). However, the double shrinkage imposed by the two-stage GS system is considered a major limitation, which may lead to less reliable GEBVs if not de-regressed (Ostersen et al. 2011). As opposed to two-stage, in a single-stage GS method the adjusted means are shrunk only once and this is hence considered the 'gold standard' approach (Schulz-Streeck et al. 2013). Implementing single-stage GS has its own limitations, as it cannot be adopted for models beyond general and generalized linear mixed models, and these models are computationally demanding with increased complexity of the dataset. In two-stage GS, to overcome the limitation of double shrinkage, Garrick et al. (2009) used de-regressed BLUPs in the genomic prediction model. However, Sun et al. (2017); Sun et al. (2019) suggested that, when lines are balanced across replicates in each environment (which was the case in our study), differential shrinkage of BLUPs will not be an issue, when used as a dependent variable in genomic prediction models. The use of single-stage approaches should be considered for future application of GS in ryegrass, with trials carefully designed to accommodate this.
In contrast to switchgrass (Fiedler et al. 2018) and alfalfa (Jia et al. 2018), prior to this study genomic predictive ability for nutritive traits has been evaluated in perennial ryegrass for only a limited set of traits. Fè et al. (2016) reported high predictive abilities of 0.68 for NDF and 0.45 for fructan in a large training set of 1918 F 2 families, evaluated at multiple environments. In another study, Grinberg et al. (2016) reported similarly high predictive abilities for WSC (0.59), DMD (0.41) and N (0.31) from prediction models applied in F 14 generation families after training using a set of 364 families from earlier generations, phenotyped at a single location. Predictive ability for nutritive traits in the present study were overall lower compared to those reported by Fè et al. (2016) and Grinberg et al. (2016) with predictive abilities of 0.35, 0.29 and 0.22 for NDF, total WSC and HMW WSC (fructan), respectively. The lower predictive ability was likely affected by the smaller training population used in this study compared to Fè et al. (2016), as well as its composite nature. Overall, the values in the current study, based on a relatively small, composite training set were sufficiently high to support prediction of GEBVs and implementation of GS to accelerate genetic gain for nutritive traits across environments in perennial ryegrass.
Determining the magnitude and genetic basis of G x E interactions for a trait is important, as it can assist in making appropriate breeding design decisions for the development of cultivars that are adapted to a broad range of target environments. In the current study family-bylocation interactions were significant for all nutritive quality traits. The majority of traits displayed a family-by-location interaction that was small in comparison to family variance, when nutritive traits were evaluated at two distinct locations (Table 1). This was reflected in the ratios of s g to s gs , which was . 1 for 60% of the traits, indicating that the family variance was predominant. However, the ratio for CFAT, total WSC, LMW WSC, HMW WSC, P and N were , 1, indicating a greater influence of family-by-location interactions. The identification of high G x E interactions for WSC contrasts with results reported by Easton et al. (2009) and are at variance with propositions by Casler and Vogel (1999) and Jafari (2012), that G x E for WSC is minimal to negligible. Our results are based on relatively large populations of half-sib families, compared to previous studies and may therefore be a more accurate reflection of the influence of family-by-location interactions on these traits, particularly in New Zealand environments. However, it should be noted that the family-by-location interactions estimated here were based on only two locations, and a more robust estimation would be derived if based on a larger number of locations, representing the full target population of environments. The presence of G x E interactions may negatively influence ability to improve these traits for broad adaptation and represents a challenge during selections (Holland et al. 2003).
Where family-by-location interaction effects are large and significant, genetic improvement for a trait may only be achieved through selection based on multi-year, multi-environment evaluation. Considering the relatively high costs associated with phenotyping of nutritive quality traits, this approach might not always be feasible, and decisions will be based on available resources. Genomic selection, however, represents a promising approach to more directly tackle family-bylocation interactions. Models such as marker-by-environment interactions proposed by Lopez-Cruz et al. (2015) and further developed by Crossa et al. (2016), can be used to identify genomic regions that are stable across environments and other regions that are associated with specific environments that contribute to G x E interactions. These marker effects can be fixed in GS models to assist the selection of stable genotypes. However, these models were primarily developed for wheat, and a detailed investigation is needed to assess models perform in outcrossing species such as perennial ryegrass.
Traits with high family-by-location interactions displayed both lower h 2 g and comparatively low predictive abilities (Table 1, Table S1-S2 and Figure 1). For such traits multi-trait genomic prediction models (Jia and Jannink 2012) may be one way of improving predictive ability and thereby genetic gain. The concept of multi-trait genomic prediction approaches is to improve the predictive ability of a primary target trait (which may be difficult and expensive to phenotype) by utilizing the genetic correlation with a secondary trait which is highly heritable and significantly less expensive to phenotype. Heritability and genotypic correlation data generated in the current study may assist in designing multi-trait prediction models for key nutritive traits. For example, a negative genetic correlation was observed between fiber and WSC traits, as reported previously in Italian ryegrass (Wang et al. 2015), and a positive genetic correlation was observed between DOMD and WSC traits as described previously by Humphreys (1989b); Jafari et al. (2003b) (Table 4). These secondary traits (ADF, NDF and DOMD) are measured routinely and relatively inexpensively by NIRS and may therefore be useful in multi-trait genomic prediction models to more accurately predict WSC traits that are most accurately measured using more expensive wet chemistry methodologies.
Mineral composition of forages is of interest from a perspective of livestock health and, as with nutritive traits overall, there has been little or no emphasis on selection for mineral composition in forage breeding programs (Masters et al. 2019). Significant family variation was observed for all minerals in this study, with relatively low influence of family-by-location interactions, moderate to high heritability and genomic prediction models with predictive abilities high in comparison to the other nutritive quality traits assessed (Figure 1). This indicates that selective breeding for levels of micro-and macro-minerals is feasible and that genomic selection represents a strong option for pursuing improvement in these traits. In general, ryegrass cultivars that grow well under low soil P will compete less for P in the sward, increasing P availability for uptake to support legume growth (Easton et al. 1997;McDowell et al. 2011). For instance, Crush et al. (2006), reported that in a mixed sward of ryegrass and clover (18% clover content), net annual flux of P into ryegrass was 4.7 times higher compared to clover. A small improvement in ryegrass phosphate use efficiency (PUE), can significantly change these proportions and may have large environmental and economic benefits (Crush et al. 2018a). In the current dataset predictive ability for P was very low (0.13), underpinned by a significant familyby-location interaction component to total phenotypic variation. This indicates that breeding for this P levels in perennial ryegrass foliage needs to be designed to account for family-by-location interaction effects. Alternatively, moderate to high genetic correlation with high h 2 g traits, such as Mg (genotypic correlation -0.62), might support an indirect multi-trait genomic selection strategy, as discussed earlier.
Hypomagnesaemia or grass tetany is a metabolic disorder in ruminants, caused by inadequate supply of Ca and Mg. This is often described in terms of a tetany index ([K/Ca+Mg]), for which values exceeding 2.2 (Kemp and T Hart 1957) are associated with increased risk of the disorder. We observed a moderate predictive ability for the ratio and the magnitude of family-by-location interaction was low compared to family variation, suggesting that tetany ratio could be used successfully as a selection criterion for developing cultivars with reduced potential for the incidence of hypomagnesaemia. This is in contrast to the results of Smith et al. (1999), who reported large G x E variance for the tetany ratio evaluated at two locations in Australian environments and suggested the use Mg alone as a selection criteria to improve tetany ratio. Results from the current study showed a high predictive ability for Mg, making GS a viable strategy for this trait. Although, increasing Mg concentration alone may be sufficient to decrease the incidence of hypomagnesaemia, the presence of a positive correlation between Mg and K observed in the current study (Table 2) and reported by Smith et al. (1999), suggests that selections based on Mg concentration alone should be monitored and might not always give the expected outcome.
Using approximately 50k random markers the predictive ability of genomic prediction models for all nutritive traits was similar to using the full dataset of ca. 1M markers (Figure 2 and Table S11). Previously, Faville et al. (2018) showed that reducing SNP number, by filtering out sites based on different missing data thresholds (1%, 10% and 50%), did not significantly affect genomic predictive ability for the traits herbage accumulation or days-to-heading in this training set. Similarly, filtering GBS data based on a read depth threshold (.7) did not improve predictive ability for those traits. To investigate the minimum number of SNP markers needed to achieve maximum predictive ability within the current dataset, without introducing bias in terms of data missing-ness, random marker sets with varying numbers of SNPs were used to build genomic prediction models for all nutritive traits, using the across locations dataset. Below the 50k marker number there was a decrease in predictive ability, and this was particularly evident for traits with low h 2 g . Considering the low levels of LD (r 2 decaying to 0.25 after 366-1750 base pairs (Faville et al. 2018)) observed in the component populations of the training set, the major proportion of predictive ability is likely a result of capturing relationship among individuals, rather than historical LD with QTL. In perennial ryegrass, to capture genetic variance associated with all causative QTL a very large number of markers and a large training population are needed, due to rapid decay of LD as a result of a very large past effective population size (N e ) (Hayes et al. 2013;Fiedler et al. 2018). Predictive ability based on relatedness between training and selection population can deteriorate after a few selection cycles (Habier et al. 2007), and to maintain adequate predictive ability, either the training population should be very large and highly diverse or some form of relatedness should exit between training and selection population (Hayes et al. 2013;Norman et al. 2018).
In conclusion, family variation and family-by-location interactions were significant for all nutritive quality traits evaluated in two distinct New Zealand environments. The predictive ability of genomic prediction models reported in this study for most of the traits would be sufficient to implement GS for nutritive traits in perennial ryegrass. Although a major proportion of this predictive ability is the result of capturing relatedness among individuals, maintaining relatedness between training and selection population would be an option to implement GS in perennial ryegrass. Predictive ability for most of the nutritive traits was retained even with as few as 50,000 markers. A next step would be to simulate a cost-benefit analysis to study the implications of manipulating marker number for cost-effective GS. For traits with low G x E interactions, single-trait genomic prediction models can be considered and for traits with large G x E, and consequently lower heritability, multi-trait approaches may be useful to explore as a method for obtaining high levels of prediction accuracy. This appears to be particularly important for WSC, which is considered to be one of the primary constituents of nutritive value for forages.