Single-step genomic BLUP enables joint analysis of disconnected breeding programs: an example with Eucalyptus globulus Labill

Abstract Single-step GBLUP (HBLUP) efficiently combines genomic, pedigree, and phenotypic information for holistic genetic analyses of disjunct breeding populations. We combined data from two independent multigenerational Eucalyptus globulus breeding populations to provide direct comparisons across the programs and indirect predictions in environments where pedigreed families had not been evaluated. Despite few known pedigree connections between the programs, genomic relationships provided the connectivity required to create a unified relationship matrix, H, which was used to compare pedigree-based and HBLUP models. Stem volume data from 48 sites spread across three regions of southern Australia and wood quality data across 20 sites provided comparisons of model accuracy. Genotyping proved valuable for correcting pedigree errors and HBLUP more precisely defines relationships within and among populations, with relationships among the genotyped individuals used to connect the pedigrees of the two programs. Cryptic relationships among the native range populations provided evidence of population structure and evidence of the origin of landrace populations. HBLUP across programs improved the prediction accuracy of parents and genotyped individuals and enabled breeding value predictions to be directly compared and inferred in regions where little to no testing has been undertaken. The impact of incorporating genetic groups in the estimation of H will further align traditional genetic evaluation pipelines with approaches that incorporate marker-derived relationships into prediction models.


Introduction
Quantitative genetics has utilized pedigree-derived relationships to improve selection efficiencies of breeding programs for many years (Henderson 1975). Well-defined relationships within the populations improve the accuracy of breeding value estimates, the partitioning of genetic variance, and the estimation of environmental effects. Forest tree breeding has delivered greater gains over time as the relationships amongst experimental units have increased in precision from open-pollinated (OP) families to control-pollinated (CP) families (McKeand and Bridgwater 1998), to progeny themselves ("individual model"; Borralho 1995), to clonally replicated progeny (Costa e Isik et al. 2005;Callister and Collins 2008;Baltunis et al. 2009).
Genetic markers such as single nucleotide polymorphisms (SNP) provide empirical rather than expected relationships by using estimates of allele similarity between pairs of individuals (VanRaden 2008). Genotyping provides direct estimates of relationship coefficients that may replace or be combined with expected values derived from the documented pedigree.
Relationships determined by genotyping have been used to recover nonadditive genetic variances from OP tree families (Klá p st e et al. 2014;Gamal El-Dien et al. 2016) and remove inflation in half-sib-based additive genetic variance estimates (Klá p st e et al. 2014; Ratcliffe et al. 2017). Large-scale genotyping in OP families has also produced substantial improvements to breeding value accuracy (Gamal El-Dien et al. 2016;Thavamanikumar et al. 2020) and accounted for inbreeding depression (Klá p st e et al. 2017depression (Klá p st e et al. , 2018. In hybrid tree populations, genotypic relationships have revealed the magnitude of epistatic variation and partitioned genetic variances more precisely (Bouvet et al. 2016;Tan et al. 2018). Genotyping also resolved errors in pedigree information and improved model fit in a clonally replicated full-sib population of loblolly pine (Munoz et al. 2014).
Although genotyping has demonstrated great promise for improving relationship estimation, industrial breeding programs are comprised of thousands of individuals in multigenerational populations and genotyping entire populations is not possible or impractical. Animal breeders faced with a similar challenge developed the "single-step genomic BLUP" procedure, in which empirical relationship estimates from a genotyped subset (G) are merged into the traditional pedigree-derived matrix of expected relationship coefficients (A) of the entire population Misztal et al. 2009;Christensen and Lund 2010). The resulting relationship matrix (H) is then used in place of A in linear mixed models (LMM) to produce Best Linear Unbiased Predictions (BLUP) of genetic value. This "HBLUP" approach has been combined with efficient computing procedures ) to allow for genetic evaluations that integrate hundreds of thousands of genotypes with phenotypes from millions of animals (Misztal et al. 2013a). The approach is regularly used for genomic evaluation and selection in many livestock species (Lourenco et al. 2020).
HBLUP has been applied to the analysis of growth and wood quality traits for OP populations of Picea glauca (Ratcliffe et al. 2017), Eucalyptus grandis , Eucalyptus nitens (Klá p st e et al. 2018), and Eucalyptus pellita (Thavamanikumar et al. 2020) and for OP and CP families of Pinus contorta (Ukrainetz and Mansfield 2020). These studies involved genotyping of progeny and unanimously revealed the presence of relationships amongst parents that had not been previously known. Assumed relationships between parents and progeny (expected 0.5) and between half-sib progeny (expected 0.25) are more precisely estimated when blended with genomic similarities in G to create the H matrix . A greater density of relationship information in H compared with the sparse relationships in A led to improved theoretical accuracy of estimated breeding values (EBVs), in some cases only for genotyped individuals (Ratcliffe et al. 2017) and in other cases for ungenotyped individuals and parents as well Thavamanikumar et al. 2020). Phenotypic information from trees that have not been genotyped was also shown to improve genomic predictions (Cappa et al. 2019). In forest tree populations, HBLUP efficiently combines genomic, pedigree, and phenotypic information and offers substantial advantages over conventional pedigree-based analysis. On the other hand, the published studies applying HBLUP to forest tree populations have all been limited to small experimental populations, ranging upwards to 5742 genotyped and ungenotyped progeny (Thavamanikumar et al. 2020). This is one to two orders of magnitude smaller than required for long-running commercial tree improvement programs. Ukrainetz and Mansfield (2020) noted computational limitations extending their methodology to larger phenotypic and marker datasets while recognizing that HBLUP has been applied to millions of individuals in animal breeding applications (e.g., Tsuruta et al. 2021). Scaling up the HBLUP approach to a complete multigenerational breeding population is required to integrate this technology into applied tree breeding programs.
Eucalyptus globulus Labill. is a commercially important plantation species in Mediterranean climates around the world and is particularly favored for high-yielding short-fiber pulpwood crops. Approximately 460,000 hectares of E. globulus plantations are managed in Australia (Downham and Gavran 2019), principally in three major planting regions: (1) Southwest Australia (50% of area), (2) the "Green Triangle" of western Victoria and southeast South Australia (32% of area), and (3) central and eastern Victoria (10% of area). The species' native range in Southeast Australia has been classified into 13 races and 20 subraces (Dutkowski and Potts 1999;Potts et al. 2004). Landraces are also recognized where E. globulus has been naturalized in areas such as Portugal, Spain, and Chile. Racial variation in evolutionary and commercial traits has been well described (Lopez et al. 2001). Narrow-sense heritability estimates from E. globulus progeny trials are generally low for growth and moderate to high for wood basic density and pulp yield (Costa e Silva et al. 2009;Callister et al. 2011;Araujo et al. 2012). Nonadditive genetic effects on E. globulus growth can be significant (Araujo et al. 2012). Genotype-environment interaction has been reported for E. globulus at the regional scale across Australia. Costa e Silva et al. (2006) reported an average intersite correlations amongst four Australian growing regions of 0.73 for subrace effects and 0.76 for OP family effects and regional correlations are accommodated in routine analyses of Australian breeding populations (e.g., Dutkowski et al. 2015).
Most commercial tree improvement activities are practiced within the confines of an individual program. However, important exchanges of information and genetic material also occur amongst programs to support breeding and the establishment of planted forests. A major limitation to such exchanges exists when pedigrees have been coded differently and little connectivity is available between programs to provide unbiased comparisons among populations. This situation provided the stimulus for our work to quantify the impact that combining two distinct pedigrees into a joint relationship matrix (H) has on genetic parameter estimates and the accuracy of breeding value predictions. Replacement of A with H as the numerator relationship matrix (NRM) involves two types of changes to the information used in the analysis. Not only do genomic data inform the relationships amongst individuals in H, but the formulation of H applied thus far in tree breeding does not allow for races as genetic groups. Therefore, we aimed to disentangle the impact of including genomic information from that of removing race groups by comparing results from three types of models: ABLUP with race genetic groups (ABLUP þrace ; benchmark model), ABLUP without groups (ABLUP Àrace ), and HBLUP.
A second motivation was to demonstrate a methodology for combining large datasets in anticipation of a new generation of analytical tools. The newly emerging ability to model genetic responses to environmental attributes presented as continuous surfaces (Resende et al. 2020), to predict unobserved phenotypes through environmental relationship matrices (Jarquín et al. 2014), and to synthesize large 'omics datasets into coherent models of plant function (Weighill et al. 2019) rely on very large experimental populations distributed across the landscape. The most informed results from these alternative analytical approaches may be derived from datasets that include structured genetic effects, such as those managed in formal breeding programs evaluating diverse sets of germplasm across a wide range of environments.

Experimental populations
This study was conducted using two completely separate multigenerational breeding populations of E. globulus in southern mainland Australia, EG1 (Australian Bluegum Plantations) and EG2 (HVP Plantations). These commercial tree improvement programs progeny test full-sib families predominantly derived from first-and second-generation selections. Forty-eight field trials established between 1998 and 2015 provided phenotypic data from 126,467 full-sib progeny (1973 families), 7051 half-sib progeny (263 families), and 24,767 unpedigreed trees of commercial entries (193 "checklots"; Table 1 and Supplementary Table S1A describes progeny trial network).
Field trials were located in three regions of southern Australia: Western Australia (WA), the Green Triangle (GT), and Gippsland (GIPPS). EG1's breeding program was focused in WA (25 trials) with five trials in GT and none in GIPPS. EG2 trials were located predominantly in GT (nine trials), with six trials in GIPPS and three in WA (Figure 1, Supplementary Table S1B describes trial distribution across regions).
The EG1 population of full-sib families was founded on 112 base-population trees and the EG2 population was founded on 83 base-population trees (Supplementary Table S1C describes populations). These "founders" are trees in native forests or international landraces that provided the seed for testing and domestication. Half the founders of the EG2 program originated from Strzelecki Ranges and Western Otways races, whereas the EG1 program is more strongly influenced by Furneaux, King Island, Portugal (landrace), and NE Tasmania. The classification of races follows the results of Dutkowski and Potts (1999). A total of 347 EG1 parents and 107 EG2 parents were represented by progeny included in these analyses.
Inadequate interprogram relationships were available for joint analyses using pedigree derived relationships as only four founders were common to both programs. Pedigree-derived relationship coefficients amongst the two parent cohorts were rare, with two relationships of 0.25, 152 relationships of 0.125, and one expected relationship coefficient of 0.0625.
Trials were primarily established in randomized incompleteblock designs (two EG1 trials were alpha-cyclic incomplete rowcolumn designs) with four to eight replications of each family established in four-to five-trees in contiguous row-plots. Two EG1 trials were single-tree plot designs. Incomplete blocks were arranged into replicates that were generally contiguous to enable the resolution of spatial trend effects within each trial.

Phenotyping of field trials
Diameter at breast height (DBH) was measured with a tape for each tree at 3 years (7 sites), 4 years (9 sites), 5 years (29 sites), or 8 years (3 sites; see Supplementary Table S1A) after planting. Tree height (HT) was measured with a hypsometer for each tree in the majority of trials, although a subset of 3-17% (mean 10%) of trees were assessed in 11 of the EG2 trials. In these cases, unmeasured HT data were predicted from the DBH-HT relationship established amongst measured trees for each trial. Stem volume (VOL) was calculated for each tree using DBH, HT (measured or predicted), and the region-specific volume function provided by each program. DBH of all stems forking below breast height was measured in 36 trials and in these cases, VOL was aggregated to the tree level before analysis. Given the expense of wood property assessments, a subset of trees in 20 of the 48 trials were selected for phenotyping (Supplementary Table S1D describes subset of populations assessed for wood properties). Basic density (DENS) data were collected for a subset of trees between 5 and 8 years. EG2 measured DENS directly by the displacement method using the outer 50 mm of wood cores extracted at breast height. EG1 used pilodyn penetration as an indirect measure of wood density (Muneri and Raymond 2000). Pilodyn data were converted to outer wood DENS using the equation from Callister and England (2010). Cellulose content (CELL) was predicted by Forest Quality Pty Ltd in Tasmania, Australia using near-infrared spectrometry. Samples of either whole-wood cores or shavings taken from the outer 50 mm of the stem between 5 and 8 years of age were ground and scanned to produce spectra for cellulose prediction ).

Genotyping
Leaves were sampled from 164 parents and 74 others in the EG1 program and from 93 parents and 51 others in the EG2 program, where "others" refers to phenotypic selections that have not been progeny tested. DNA extraction and genotyping were conducted at Gondwana Genomics Pty Ltd, Canberra (Thavamanikumar et al. 2020). The E. globulus marker panel consisted of 2579 SNP and small biallelic insertion/deletion (INDEL) markers within candidate genes associated with diameter, density, and cellulose yield that were identified using association analyses in previous studies (Southerton et al. 2011). After filtering out markers with minor allele frequency less than 5%, 2444 markers were available from 382 parents for the derivation of relationship matrices.

Relationship matrices
Relationship matrices were produced using preGSf90, which is a module of the Fortran-based BLUPF90 suite (Misztal et al. 2014). The default settings of preGSf90 were used for quality control, imputation of missing markers, and calculation of the relationship matrices (Aguilar et al. 2014). The EG1 G matrix (G EG1 ) was formed for 238 genotyped individuals, the EG2 G matrix (G EG2 ) was formed for 144 genotyped individuals, and the joint G matrix (G JOINT ) was formed amongst all 382 genotyped individuals. G was calculated in preGSf90 following the first method of VanRaden (2008): where Z is a matrix of centered genotype scores calculated as Z ¼ M À 2 P; M is the n Â m matrix of n genotypes and m markers scored 0 for homozygous reference allele, 1 for heterozygous, and 2 homozygous alternative allele, P is a matrix of frequency for the alternative allele, and p j is the reference allele frequency of the jth marker. The pedigree relationship matrix A was divided into four submatrices: A 11 representing relationships amongst nongenotyped individuals, A 22 representing relationships amongst genotyped individuals, and A 12 and A 21 representing relationships between genotyped nongenotyped individuals: Differences in the constitution of contemporary and historical populations can lead to different mean breeding value and genetic variance between G and A 22 (Forni et al. 2011;Vitezica et al. 2011). We therefore rescaled G to make it compatible with A 22 following the approach detailed in Christensen et al. (2012): (3) where a and b were determined by solving the following system of equations: The rescaled matrix G a was then weighted to avoid difficulties with inversion (Aguilar et al. 2010): which was then used to calculate H Christensen and Lund 2010): or equivalently (Lourenco et al. 2020): with inverse H was then resorted to align with the pedigree relationship matrix for comparisons with A. The section of H relating to genotyped individuals, their parents, and ancestors were compared with A in a preliminary step to highlight identity errors and mistakes in the documented pedigree. These errors were rectified in the pedigree and the process of creating the H À1 matrix was then repeated with the correct pedigree information.

Genetic analyses
All genetic analyses were conducted using ASReml v. 4.1 (Gilmour et al. 2015). Models incorporating among individual relationship matrices derived from pedigree (ABLUP) or markers and pedigree (HBLUP) were fit following Henderson (1984) with a general LMM: where y is the vector of phenotypic values, X is the incidence matrix for fixed effects, b is the vector of fixed effects, Z is the incidence matrix for random effect, u is the vector of random effects with E(u) ¼ 0, and e is the vector of residual effects expected to be independently normally distributed with E(e) ¼ 0. The relationship matrix that differentiates ABLUP and HBLUP connects each individual in u, and this change in connectivity impacts genetic parameter estimates and the accuracy of breeding value predictions.
Single-site ABLUP analyses were first conducted for each trial to provide spatially adjusted volume (VOL adj ) data that were used for preliminary cross-site analyses (following Ye and Jayawickrama 2008). Fixed effects were included for the overall mean, replicates, and checklots. Random effects included additive genetic effects, familyspecific genetic effects specifying the combination of parents evaluated by control pollinated progeny, incomplete block effects, and row-plot effects. The races listed in Supplementary Table S1C were included as fixed genetic group effects within the pedigree (Westell et al. 1988). Genetic parameter estimates for each trial did not vary substantially from expectation and all trials were retained for across-site analyses. Preliminary cross-site analysis of VOL adj provided additive variance estimates for each trial, which were used to standardize each trial to have a mean of zero and an additive standard deviation of one. Partitioning of additive and nonadditive effects depended on the structure of the families evaluated in different trials and this heterogeneity was accommodated by estimating variance four groups of trials with similar ratios of family to additive variance. A more detailed description of the single site and preliminary analyses is provided in Supplementary Methods as SM.M1 and SM.M2.
Multivariate models were fit to VOL adj , DENS, and CELL separately using an unstructured variance model that provided additive genetic variance estimates for each region and inter-region correlations. This model produced genetic parameter estimates as well as breeding value predictions with their associated accuracy for all trees in each region. Comparisons of genetic parameters and accuracy estimates when different models are used for different sets of the populations are detailed in Table 5, Supplementary Tables S2, A and B. Site-specific effects for incomplete blocks, plots, and multistem form were modeled as random effects specific to each individual site. OP family means were fitted separately and did not contribute to estimates ofr 2 a . Checklots were fitted as fixed effects and additional within-plot error was fitted to checklots at each site separately. The crosssite modeling progressed for each program using A with genetic groups, A without groups, and H without genetic groups as the NRM. Finally, the joint H was used with data from both programs in combined analyses.
Narrow-sense heritability and dominance proportions for VOL were estimated at regional scales as follows: wherer 2 a region i is the cross-site additive genetic variance estimate for the ith region,r 2 f region i is the cross-site family-specific variance estimate for ith region, and P n k¼1r 2 b k þr 2 p k þr 2 e k is the sum of variance due to incomplete blocks, plots, and residuals used to estimate the mean across n trials in each region. Region-specific estimates of variance components were used to estimateĥ 2 regional andd 2 regional separately for VOL in WA, GT, and GIPPS. Trials contributing to the estimate of family variance were identified as the Group 2 trials in Supplementary Methods SM.M2.
Narrow-sense heritability and dominance proportions for DENS and CELL were estimated across all sites basis as follows: wherer 2 a is estimated across sites,r 2 f is the family variance approximated across groups of trials with differentr 2 f :r 2 a ratios as described in Supplementary Methods SM.M2, so thatr 2 f is a weighted mean estimate of family variance across all n sites. Other terms are defined as above for VOL.
Following the recommendation of Putz et al. (2018), the accuracy of breeding value predictions was used to provide empirical comparisons of the accuracy of traditional and HBLUP models for different sets of the population. The accuracy (r gĝ ) of predictions was calculated as:r PEV estimates for each individual are derived from the inverse of the relationship coefficient matrix. The method for PEV approximation in ASReml is described by Welham et al. (2004) and Gilmour et al. (2004), which utilize methods suggested by Misztal and Wiggans (1998) and Kackar and Harville (1984) to approximate prediction errors as suggested by Henderson (1976).

Relationship matrices
Realized relationship coefficients amongst genotyped individuals in G EG1 and G EG2 were distributed around values expected from pedigree relationships (Figure 2). Note that these relationship matrices contain covariances between individuals estimating similarities in allele frequencies rather than probability estimates of shared identity by descent derived from pedigree. A small number of realized relationship coefficients that diverged excessively from expected values, highlighting individuals with erroneous pedigree information, were corrected by reassigning parentage before further analyses.
G JOINT revealed unknown relationships amongst parents in the experimentally disconnected breeding programs. Although the majority of the 34,272 interprogram relationships in G JOINT were near zero, 2071 relationship coefficients were greater than 0.1, 428 were greater than 0.2, and 70 were greater than 0.3, with a maximum of 0.54 (Figure 3). Using genomic rather than pedigree-derived relationships provided sufficient cross-program linkage within G JOINT to proceed with forming H JOINT .
Relationships that were quantified in G were projected into H to reveal relationships assumed to be zero in A. This was observed within relationship matrices for EG1 and EG2 separately and jointly to provide comparisons of relatedness estimates among different sets of the breeding population. Within-race relationship coefficients amongst founders (none of which were genotyped) in H matrices were greater than zero and provide further support for differentiation of races and population structure ( Table 2). The Strzelecki Ranges population displayed the largest mean within-race relationship coefficient amongst founders (H JOINT 0.055; Table 2). Between-race or landrace relationships amongst founders in H JOINT varied significantly from zero in both directions, with 17 inter-race/landrace combinations significantly less than zero, 16 significantly greater than zero, and 12 not significantly different to zero (Table 3). Cryptic relationships among the native range populations provided evidence of population structure and evidence of a Tasmanian origin rather than the mainland or Bass Straight origin of third-party populations (Portugal and Californian landraces). The Furneaux race was significantly dissimilar to all other races/landraces apart from the nearby Northeast Tasmania, while Western Tasmania showed the greatest similarity to other races/landraces with five interrace/landrace combinations significantly greater than zero (Table  3). Relationships amongst ungenotyped founders in H JOINT were around one-quarter the value of those amongst their genotyped OP offspring.

Genetic parameter estimates from traditional ABLUP models and HBLUP models
While there are consistent increases in additive variance estimates when moving from ABLUP þrace to both ABLUP Àrace and HBLUP, the changes in heritability, dominance variance, and genetic correlations are similar to the standard error estimates. These populations do not provide strong evidence that genetic parameter estimates derived from ABLUP differ dramatically from estimates derived from HBLUP models.
As expected, removing the race/landrace genetic groups from the EG2 pedigree caused significant increases inĈV a andĥ 2 from individual-site analyses of VOL, whiled 2 was not significantly   One-way t-test result for H 1 : mean significantly greater than zero: * P < 0.05, ** P < 0.01, *** P < 0.001. affected (Table 4). The removal of fixed race effects resulted in consistently poorer model fit to single-site VOL data [Akaike information criteria (AIC) on average 41.9 higher; Table 4]. HBLUP models fit the single-site VOL data marginally better than the ABLUP Àrace models (AIC on average was 2.3 lower). TheĈV a and h 2 of HBLUP models were intermediate between respective values for ABLUP models that included and excluded race (Table 4) providing a reduction in bias from excluding population structure.
Similar patterns were observed for the EG2 cross-site analyses of VOL. AIC indicated poorer model fit (115.2 greater) for the ABLUP model when race effects were removed from the pedigree, which was slightly improved by using HBLUP instead of ABLUP Àrace (AIC difference of 13.5; Table 5). Removing race effects led to a similar decline in model fit for the cross-site EG1 VOL analysis (AIC 256.8 larger; Table 5). However, the HBLUP model fit the EG1 cross-site VOL data better than the benchmark model of ABLUP þRace (AIC 26.5 lower; Table 5). The reason for a distinctively better model fit using HBLUP on EG1 data is unknown-it could be due to the greater number of genotyped trees in the EG1 program or different patterns of parent-within-race structure between the programs.
Removal of race effects from the ABLUP model reallocated variance from fixed genetic groups to variance among parents, inflatingr 2 a and the derivedĥ 2 estimates. Ther 2 a for VOL increased by an average of 21% across regions and programs (Table  5) when race effects were excluded from the model. Some of these increases were reduced when progressing to the cross-site HBLUP models for each program, althoughr 2 a andĥ 2 remained significantly larger than ABLUP þrace models (Table 5). HBLUP cross-siteĥ 2 estimates for VOL were 0.13 6 0.01 and 0.11 6 0.02 in the EG1 program in WA and GT, respectively, and 0.10 6 0.02, 0.20 6 0.03, and 0.10 6 0.02 in the EG2 program in WA, GT, and GIPPS, respectively. The joint-program HBLUP producedĥ 2 estimates for VOL of 0.14 6 0.01, 0.12 6 0.02, and 0.05 6 0.01 in WA, GT, and GIPPS, respectively (Table 5). Estimates ofd 2 for VOL were not affected by choice of NRM and estimates from the joint analysis were 0.14 6 0.01, 0.10 6 0.02, and 0.12 6 0.02 in WA, GT, and GIPPS, respectively (Table 5). Additive genetic correlations amongst regions were slightly increased by the removal of race effects and ranged from 0.69 6 0.08 forr aðGT;GIPPSÞ to 0.83 6 0.08 for r aðWA;GIPPSÞ from the joint HBLUP model (Table 5). Inter-region dominance correlation estimates from joint HBLUP were 0.31 for r dðWA;GTÞ , 0.10 forr dðWA;GIPPSÞ , and 0.62 forr dðGT;GIPPSÞ (not tabulated).
Estimated heritability in DENS was greater in the EG2 program (0.58 6 0.08) based on core samples than the EG1 program (0.28 6 0.03) based on penetrometer data, with an intermediateĥ 2 Xsite of 0.34 6 0.02 from the joint analysis (Table 6). The programs each displayed cross-siteĥ 2 Xsite around 0.30 for CELL, with a joint estimate of 0.29 6 0.03 (Table 6). Dominance effects were significant for both wood quality traits withd 2 Xsite of 0.10 from joint HBLUP analysis.

Breeding value accuracy from traditional ABLUP models and HBLUP models
Overall, the greatest breeding value accuracy estimates (r gĝ ) were observed for genotyped parents represented in the prediction region, followed by ungenotyped parents represented in the prediction region (Figure 4 and see Supplementary Tables S2, A and B for more details). HBLUP models produced smaller PEV estimates than the benchmark ABLUP þrace models for the same parents (see Supplementary Table S2), resulting in generally higher estimates ofr gĝ . An exception was for EG2 parents, where even though PEVs were significantly smaller,r gĝ from joint HBLUP were similar tor gĝ from ABLUP þrace due to the substantially smallerr 2 a of the joint HBLUP model (see Table 5). Prediction accuracy for parents of progeny tested in other regions improved markedly in the joint HBLUP analyses ("parents Two-way t-test results for H 1 : mean significantly different to zero: * P < 0.05, ** P < 0.01, *** P < 0.001. Table 4 Mean and standard error of Akaike information criterion (AIC), coefficient of additive variation (ĈV a ), narrow-sense heritability (ĥ 2 ), and dominance proportion (d 2 ) for stem volume at single sites analysed by ABLUP with races included as genetic groups (ABLUP þrace ), ABLUP without groups (ABLUP -race ), and HBLUP without groups  Figure 4). For EG1 parents of this class in GT, accuracy increased by 12% above the ABLUP þrace reference value (0.086 6 0.023 above 0.713) when parents were genotyped and accuracy increased by 44% (0.218 6 0.029) when they were not genotyped. For EG2 parents that were unrepresented by progeny in WA, r gĝ increased by 10% (0.070 6 0.011) when parents were genotyped and by 23% (0.117 6 0.114) when they were not genotyped (Figure 4 and Supplementary Table S2). Accuracy estimates of genotyped individuals not represented as parents but related through the pedigree ("genotyped related"; Figure 4) were also generally greater when genotype and pedigree were blended. In this class, joint HBLUP produced improvements of 11% and 7% (0.072 6 0.011 and 0.048 6 0.017) in WA and GT for EG1, respectively, and 19% for both GIPPS and WA (0.100 6 0.036 and 0.104 6 0.029, respectively) for EG2 (Figure 4 and Supplementary Table S2). Genotyped individuals unrelated to parents through the pedigree were included in HBLUP models, and their breeding values were estimated with an accuracyr gĝ of approximately 0.5 from the joint model (Figure 4). Accuracy of progeny EBVs was the least impacted by choice of relationship matrix. Average PEVs for progeny were marginally higher for EG1 (by 0.050 and 0.008 SD a units in WA and GT, respectively, under the joint HBLUP model), although averager gĝ was increased by 3% (0.019 and 0.020) in WA and GT, respectively, due to larger values ofr 2 a (Figure 4, Table 5, and Supplementary  Table S2). Average PEVs for progeny were 37%, 32%, and 10% smaller (by 0.173, 0.169, and 0.051 SD a units) for EG2 in GIPPS, WA, and GT, respectively, when estimated with the joint HBLUP model (Supplementary Table S2) . However, average accuracy from joint HBLUP was 3-5% lower for EG2 progeny across regions due to the lower values ofr 2 a relative to the EG2 cross-site models ( Figure 4, Table 5, and Supplementary Table S2).

Discussion
The single-step GBLUP (HBLUP) approach was used to integrate genomic, pedigree, and phenotypic information and provide unbiased predictions of genetic merit for breeding programs with few pedigree connections. Genotyping the parents that were evaluated in separate progeny trial networks created sufficient connectivity between the programs to provide predictions of genetic merit for parents in regions where they were not evaluated and also improved the accuracy of predictions. Although the benefits of HBLUP to tree improvement have been demonstrated in a number of studies Ratcliffe et al. 2017;Klá p st e et al. 2018Klá p st e et al. , 2020Cappa et al. 2019;Thavamanikumar et al. 2020;Ukrainetz and Mansfield 2020), this is the first published report of an HBLUP application that connects large, independent, multigenerational tree breeding programs so that breeding value predictions may be directly compared and inferred in regions where little to no testing has been undertaken.
A clear advantage of HBLUP is that relationships amongst genotyped individuals are quantified precisely, rather than represented by the expected average relationship derived from the pedigree. This advance has been well documented in tree breeding contexts (Munoz et al. 2014;Gamal El-Dien et al. 2016;Ratcliffe et al. 2017;Klá p st e et al. 2018) and implied that genotyping may be used to discover errors in the pedigrees, reveal unknown connections within pedigrees, as well as to improve the precision of relationship estimates. While the transfer of relationship information from genotyped to ungenotyped individuals is also well Table 5 Estimates of AIC, additive variance (r 2 a ), heritability (ĥ 2 ), and dominance proportion (d 2 ) for each region, and between region additive genetic correlations (r a Þ for different multisite models within each program and across programs jointly for stem volume   Aguilar et al. 2010;Ratcliffe et al. 2017;Thavamanikumar et al. 2020), our method of genotyping progeny tested parents from large breeding populations provides a novel approach to utilizing cryptic relationships amongst population founders. The approach provides breeding programs with information to guide the exchange of germplasm as well as predictions of how individual trees with genotype data will perform in regions where the breeding program has not established trials.
It also provides a more powerful platform for quantifying reaction norms to understand which climate and soil variables lead to changes in breeding value predictions and where specific parents should be deployed. Positive associations amongst founders within races are expected due to coevolution and estimates of genetic distance among races implicit in G are incorporated into H. The true average relationship coefficient amongst randomly selected individuals within races is expected to be substantially larger than the estimates in the H matrices and shown in Table 2. Evidence supporting this comes from observing that relationships in H amongst ungenotyped founders that were unrelated though the pedigree were typically around 25% of those directly estimated in G amongst their OP offspring. Relationships in G and H are based on the probability that alleles are identical by state rather than by descent, and moreover, the adjustment in H made by projecting G onto relationships in A 11 is relatively small as determined by the coefficient A 12 A À1 22 in Equation 7. Further evidence that relationships amongst founders were underestimated in H is that HBLUP models did not fully correct the inflation inr 2 a andĥ 2 that was observed by removing fixedeffect race groups from the pedigree. This may alternatively indicate that the inclusion of genetic groups provides an overly strong assumption of within-group relatedness. The practice of fitting races or provenances as genetic groups (Quaas 1988;Westell et al. 1988) is well established in tree breeding (examined by Dutkowski et al. 1997) and ensures that heritable genetic variation is estimated at the within-race or within-provenance level. When genetic groups were removed from the pedigree, we observed the expected increases inr 2 a andĥ 2 in ABLUP models, as genetic variance that had been apportioned to race effects was pooled withr 2 a . ABLUP models that excluded genetic group effects provided a poorer fit as indicated by larger AIC estimates. The greater prediction accuracy associated with ABLUP models that excluded genetic groups is therefore associated with the inflation of genetic variance and does not indicate an improvement in overall model fit. Progression from ABLUP Àrace to HBLUP models moderated the inflation of r 2 a estimates for both populations, using both singlesite and more complex multisite models. It is expected thatr 2 a andĥ 2 will be more similar between HBLUP and ABLUP þrace models when within-race relationships are correctly accounted for with H (Lourenco et al. 2020). Alternatives for incorporating groups in HBLUP are currently being explored (Bradford et al. 2019) to reduce the bias of genomic predictions without reducing accuracy (Garcia-Baccino et al. 2017). Options include following the approach of Quaas (1988) and Westell et al. (1988) to represent groups in A or H (Misztal et al. 2013b) and treating groups as "metafounders," which are quasi-individuals with inbreeding values representing the degree of similarity within the group and relationship coefficients representing associations with other metafounders (Legarra et al. 2015). Extending HBLUP models to accommodate ancestral race effects is a priority for forest tree breeding populations that are often derived from diverse wild populations.
Genetic parameter estimates are similar to those previous published for E. globulus. These results are generally comparable with our joint HBLUP results for VOL, although thed 2 estimates from these trials are greater (see Table 5). Cross-site pooled analyses of DENS have not been previously reported for E. globulus, so average single-site results are referenced for comparisons. Costa e Silva et al. (pooled estimate of 0.34 6 0.02). Comparable univariate, cross-site, heritability estimates have not been published for full-sib families of E. globulus for cellulose content or pulp yield; however, half-sib families have yielded higher heritability estimates (e.g., 0.40 6 0.06 reported by Stackpole et al. 2010).
Our approach to modelingr 2 f across sites represents a pragmatic treatment of specific combining ability (SCA) in the context of industrial breeding programs deploying CP seed. Althoughr 2 f is large enough in populations of E. globulus evaluated as full-sib families to contribute toward improvements in growth, Araujo et al. (2012) reported an among-site or type-B correlation estimate for E. globulus family-specific effects of 0.41 6 0.13, indicating significant re-ranking of families in 40 Portuguese trials. Our results provide divergent estimates ofr 2 f by site, which adds to the difficulty of utilizing SCA. Thed 2 estimates ranged between 0.10 and 0.14 for VOL at 34 of the 48 sites evaluated ("Group 2"). This provides motivation for continuing to evaluate full-sib families to identify those with advantageous SCA for deployment.
Combining data from two breeding programs in southern Australia allowed for the estimation of inter-region genetic correlations that were unavailable or based on few trials in certain regions. For example, the program that was based in WA established many trials in that region and had no trials in GIPPS. The estimates of additive genetic correlations between these regions were made available by the incorporation of relationships among the parents of the disjunct breeding programs. Inter-region genetic correlation estimates are slightly larger than those presented by Dutkowski et al. (2015) for Tree Breeding Australia's E. globulus program (0.75 v. 0.58 forr aðWA;GTÞ , 0.83 v. 0.80 for r aðWA;GIPPSÞ , and 0.69 v. 0.49 forr aðGT;GIPPSÞ ). This supports the continuation of research to understand the factors underlying the significant genotype by environment interactions evident at the regional scale in Australia.
One reason to recommend HBLUP to tree breeders is that most of the traditional mixed-model analysis approaches applied to traditional genetic analyses may be retained. Once the H 21 matrix is constructed, it can be read directly into software such as ASReml and BLUPF90 with subsequent analyses conducted with all the power and flexibility of the mixed-model platform. HBLUP is therefore applicable to any trait and it has been demonstrated, for example, through analysis of Dothistroma needle blight on clonally replicated full-sib families of Pinus radiata (Klá p st e et al. 2020) and for blight resistance in American chestnut back-cross populations (Westbrook et al. 2020). Although we demonstrated the approach here by genotyping a mere 382 individuals, it is applicable to far larger genotyped numbers using the same methodology. For example, Tsuruta et al. (2021) recently conducted an HBLUP analysis with 2.3 M genotyped individuals and a complete pedigree of 13.6 M Holsteins.
Our results illustrate that HBLUP conferred little benefit to the prediction accuracy of parents and progeny within each program, possibly due to the very low proportion of genotyped individuals. PEV andr gĝ for parents and progeny were generally similar between ABLUP Àrace and within-program HBLUP. On the other hand, PEV values for parents were generally reduced and accuracies improved by progressing from within-program HBLUP to joint HBLUP (see Supplementary Table S2), which generally resulted in larger values ofr gĝ for parents. Parent EBVs produced by HBLUP and ABLUP were well correlated (results not presented). It could be argued that HBLUP would produce more informed breeding value predictions due to the greater precision in relationship definitions provided by H compared with A, and this is a subject of ongoing exploration and validation.
HBLUP offers a clear advantage over ABLUP for predicting the value of genotyped individuals with no phenotype data and no pedigree linkages with individuals evaluated in trial networks. Breeding values were estimated for 52 EG1 individuals and 5 EG2 individuals that were genotyped but had no connections via pedigree. While accuracy estimates were lower than parental estimates, at around 0.4 and 0.5 for within-program and joint HBLUP, respectively, indirect estimates of breeding values were produced. Joint HBLUP enabled the prediction of a complete set of EBVs for EG1 parents, genotyped individuals with no phenotype data, and progeny in the GIPPS region where no progeny trials were established. Extending genomic prediction models to connect distinct pedigrees allows breeding programs to leverage information from one another and infer performance in environments where populations have not been evaluated. This study provides empirical evidence that may be used to promote collaboration among tree improvement programs to better characterize the genetic merit of individuals in environments where they have not been evaluated. The need for breeding value predictions in untested environments is expected to increase as forestry organizations examine options for adapting to climate change (Pinkard et al. 2015;Keskitalo et al. 2016), including breeding for altered climatic conditions (Gray et al. 2016). The strength of prediction across environments is dependent not only on joint genomic information but also on the precision of environmental definition and the treatment of GxE in modeling.

Conclusions
Using genotype data to blend disconnected pedigrees and phenotype data from separate breeding programs into a unified analysis produces unbiased breeding values for direct comparisons between programs and indirect predictions of merit in environments where individuals may not have been evaluated. The joint HBLUP analysis significantly improved prediction error variance of parents and genotyped individuals, provided similar estimates of genetic parameters, more accurate EBV predictions, and offered the profound advantage of EBV prediction for genotyped individuals in regions they had not been evaluated.
Overall, the genotyping proved useful for correcting pedigree errors, more precisely defining relationships within and among populations, identifying the source of landrace populations, and integrating genotyped individuals with no phenotype or pedigree connections into the prediction framework. Understanding the impacts of incorporating genetic groups in the estimation of H will further align the traditional genetic evaluation pipelines that underpin tree breeding programs with approaches that incorporate marker-derived relationships into prediction models.