The roles of pleiotropy and close linkage as revealed by association mapping of yield and correlated traits of wheat (Triticum aestivum L.)

Multiple-trait approaches dissected the pleiotropic architecture of wheat yield and correlated traits more as a function of pleiotropy rather than close linkage. Factors affecting yield were investigated using simulations.

MT methods can formally test if the co-location of QTLs for more than one trait is due to pleiotropy (i.e. a single genetic factor simultaneously controls several traits) or because of close linkage between more than one genetic factor influencing each trait separately (Jiang and Zeng, 1995;Almasy et al., 1997;Williams et al., 1999;Knott and Haley, 2000;Calinski et al., 2000;Varona et al., 2004;Da Costa E Silva et al., 2012). Pleiotropy and linkage are the basis of genetic correlations (Falconer and Mackay, 1996), and the ability to differentiate between them will determine the optimum breeding strategy (Chen and Lübberstedt, 2010). For instance, if undesired trait correlations are caused by linkage, breeders have to put effort into breaking this correlation by means of recombination. Some linkage studies aiming to disentangle linkage from pleiotropy in crop plants can be found, for instance on GY and GY-syndrome traits in maize (Calinski et al., 2000;Balestre et al., 2012) and on average ear and leaf lengths under saline stress in wheat (Lebreton et al., 1998). Nevertheless, to the best of our knowledge, this kind of study is lacking so far in the context of GWAS in crop plants.
The main goal of our study was to dissect the pleiotropic architecture of GY-syndrome by means of GWAS using MT statistical approaches in a bread wheat population of varieties adapted to European environments. The specific objectives were: first, to find genomic regions simultaneously associated with GY and GY-syndrome traits using MT-GWAS; secondly, to elucidate if the existence of these regions is a consequence of pleiotropy or due to closely linked non-pleiotropic QTLs; and, thirdly, to understand by means of simulations some factors that could have driven the ability to distinguish between these two phenomena in the current study. Our findings bring a better understanding of the pleiotropic architecture of GY-syndrome and its implications in applied MT wheat breeding.

Plant material and phenotypic data
Our study is based on the GABI-WHEAT population which is composed of 358 European winter plus 14 spring wheat varieties. A detailed description of the varieties can be found in Kollers et al. (2013a). The GABI-WHEAT population was tested in up to eight environments throughout Germany (Seligenstadt 2009and 2010, Wohlde 2009and 2010 and France (Andelu 2009and 2010, Saultain and Janville 2010. The experimental design for field trials was an alpha-lattice design with two replications. Plot sizes ranged from 5 m 2 to 6.8 m 2 . Sowing rates and dates were according to local agricultural practices. PH (cm), HD (days since 1 January), and TGW (g) data from these experiments were already available from past studies (Zanke et al., 2014a(Zanke et al., , b, 2015. Moreover, a sample of 10 ears was taken from each plot and then used to calculate GPE and EW (g). Plots were combine-harvested and grains were dried to reach 14% humidity. Later, GY was calculated in Mg ha −1 . TW was measured from a clean grain sample using a 250 ml cylinder and subsequently expressed in kg hl −1 . After a quality check, data for GY, PH, and HD remained available for all environments, while TGW and TW were unavailable for Saultain, and only one of two replicates was considered for these two traits in Andelu 2010 and Janville 2010. GPE was available for Andelu 2010 and Wohlde 2009, whereas EW from Wohlde 2009 and 2010 was taken into consideration for further phenotypic analyses.

Molecular data
In a previous study (Zanke et al., 2014a), the GABI-WHEAT population was genotyped by using a 90K Infinium single nucleotide polymorphism (SNP) chip . Quality control of SNP markers was assessed as stated by Jiang et al. (2015), making a total of 18 832 SNP markers available for GWAS. In addition, the GABI-WHEAT population was genotyped using a set of 24 functional markers (Kollers et al., 2013a, b;Zanke et al., 2015) associated in past studies with agronomic traits, disease resistance, or grain quality (Supplementary Table S1 at JXB online). Missing values for functional markers with <10% missing data were imputed according to allele frequencies.

Phenotypic analyses
Best linear unbiased estimators (BLUEs) and variance components were computed based on an unweighted two-stage univariate mixed model analysis approach (Möhring and Piepho, 2009). First, BLUEs of genotypes were separately computed for each trait-byenvironment combination by considering the following model:

Trait
Genotypes Replicates Blocks Replicates Error µ + + + ( )+ (1) where µ (i.e. the common mean), and genotypes were considered as fixed factors, whereas replicates, blocks nested within replicates, and error effects were considered as random. After this stage, BLUEs of all environments and unreplicated data were combined together. During the second stage, BLUEs of genotypes across environments for each single trait were computed using the following model:

Trait
Genotypes Environments Error µ + + + where µ and genotype effects were assumed as fixed factors, whereas environments and error terms were considered as random. In parallel, a model considering genotypes as random factor was fitted for estimation of variance components during the first and second stages. The genotypic variance (σ 2 G ) was estimated during the second step, whereas the error (σ 2 e ) and genotype×environment interaction (σ 2 G×E ) variance components are mixed at this stage. Therefore, an error variance (σ 2 Error ), namely the average of σ 2 e in single environments with replications, was derived during the first step and subsequently subtracted from σ 2 e during the second step for σ 2 G×E estimation. Estimates of variance components were later considered for the computation of heritability (h 2 ) on a plot basis as: For the estimation of genetic correlations among traits, we considered an MT model according to Henderson and Quaas (1976): where y kji is the phenotype of the ith genotype at the jth environment for the kth trait (i.e. each ST BLUE or unreplicated data after the first stage of phenotypic analysis), μ k corresponds to the overall mean of the kth trait, l kj denotes the effect of the jth environment on the kth trait, g ki represents the breeding value of the ith genotype for the kth trait, and e kji is the error term for the y kji observation according to the model. Terms μ k and l kj were designated as fixed, while g ki and e kji were considered random. In matrix nomenclature, the random terms g ki and e kji are assumed to be normally distributed in the way g~N(0 (t×n)×1 , G) and e~N(0 N×1 , R), respectively, where 0 (t×n)×1 and 0 N×1 are null vectors of length t×n and N, respectively, with t and n being the number of traits and genotypes, respectively, whereas N=t×No. Env.×n. In addition, G denotes the variance-covariance structure between the t×n MT breeding values included within g, while R represents the variance-covariance matrix for the N error terms contained in e. Provided that Y is the vector containing the N different y kji values and that these elements are conveniently sorted so that genotypes and environments are nested within traits, G and R can be further decomposed as G=G 0 ⊗A and R=R 0 ⊗I, respectively, where G 0 is the t×t additive genetic variance-covariance matrix among traits, ⊗ denotes the Kronecker product operator between matrices (Searle, 2006), A represents a relationship matrix between genotypes, R 0 corresponds to the t×t error variance-covariance matrix among traits, and I is an identity matrix of order No.Env.×n. The A matrix was estimated by 2×(J-RD), where J denotes an n×n matrix whose every element is 1 and RD is a matrix containing the Rogers' distances (Rogers, 1972) among n genotypes previously calculated from the SNP profiles in Jiang et al. (2015). As shown by Melchinger et al. (1991), these measurements of genetic similarity between homozygous inbred lines are, under simplifying assumptions, linearly related to the Mallecots' (1948) coefficient of co-ancestry calculated from pedigree records. The pre-specification of the A matrix allowed the estimation of G 0 by means of the restricted maximum likelihood algorithm. During analyses, t was fixed to 2 so that a bivariate model was fitted for each pair of traits. Denoting each element of G 0 as s kk' , the genetic correlation between any pair of traits

Multiple-trait genome-wide association mapping
We firstly considered a bivariate GWAS approach for each of the six GY plus one GY-syndrome trait combinations following the suggestion of Stich et al. (2008). In principle, the model underlying this approach is an extension of Equation 3, allowing now the inclusion of a marker with pleiotropic effects on k traits. Then, the MT-GWAS model corresponds to: where α k represents the effect of a particular marker on the kth trait, m i is a scalar taking values 0, 1, or 2 for individuals homozygous for the most frequent allele, heterozygous and homozygous for the second allele at the tested marker, respectively, whereas (αl) kj denotes the interaction of this marker with the jth environment for the kth trait. Both, α k and (αl) kj were assumed as fixed effects within the linear mixed model. In the case of multi-allelic functional markers, we treated each allele as a bi-allelic marker for simplicity. The genetic diversity of the GABI-WHEAT population has been extensively studied using different molecular marker platforms (Kollers et al., 2013a;Jiang et al., 2015;Zanke et al., 2015), and it was jointly concluded that there is no recognizable population structure in it. Therefore, only family structure correction using the relationship matrix between genotypes was considered in Equation 4. After estimation of effects and variance components, different Wald statistics were considered for significance test of α k and (αl) kj . A general Wald statistic is defined as ˆ' (ˆ)θ θ θ Var −1 , where θ is a vector containing the estimates for the p fixed effects being tested and Var(θ) -1 corresponds to the inverse of the variance-covariance matrix among tested effects estimators. The Var(θ ) matrix is a submatrix obtained by extracting the p corresponding elements from the upper-left quadrant (C 11 ) of the generalized inverse for the left-hand-side coefficient matrix in the mixed model equations (Henderson, 1975). For trait combinations with an unequal number of environments, R as well as the design matrices for fixed and random effects were modified as suggested by Henderson and Quaas (1976). Wald statistics follow a χ 2 p distribution. We performed a global Wald test for the α k effects (H 0 : α k1 =α k2 =0), followed by an individual test for each α k in the bivariate model. Markers were declared as potentially pleiotropic for two traits only when the α k effects were significant at the global and at the two post-hoc individual tests. R 2 values were computed by fitting all significant markers for each trait combination using a multiple-regression model with one of the traits as dependent variable. Markers with the lowest global Wald test P-values entered first in the model. Subsequently, R 2 values were divided by the respective entry mean-based heritability; thus, they measure the proportion of genetic variance explained by a marker. This last procedure allowed direct comparisons among R 2 values for different traits. A global Wald test for the (αl) kj effects of both traits together and one test for each trait separately were performed to study marker×environment interactions. In addition to false discovery rate (FDR; Benjamini and Hochberg, 1995) and Bonferroni methods, we considered the effective number of independent markers (M eff ) approach proposed by Gao et al. (2008) for genome-wide multipletest correction. Basically, this method is similar to the Bonferroni approach, but the whole number of markers is replaced by the M eff when correcting the nominal level of significance of tests. For this purpose, pairwise linkage disequilibrium (LD) matrices in the form of r 2 were computed for each of the 21 wheat chromosomes using SNPs with unique genetic map positions  along with functional markers. Later, principal component analysis was applied to each matrix and the number of eigenvalues needed to explain 95% of matrix variation was recorded. By adding these 21 values together, an M eff =3257 was obtained. Furthermore, because of the high collinearity between associated markers mapping in the same region, the effective number of bivariate associations was computed using the same principle as for M eff . A complete genome scan using Equation 4 and performing Wald tests with more than two traits would imply a very high computational load. Therefore, in cases where markers with significant associations for more than one two-trait combination, hereafter called higher order GY-syndrome markers, were found, only this subset of markers was subsequently evaluated for more than two traits. Since genome-wide distributions of P-values for the Wald test were unknown in these cases, only M eff and the Bonferroni correction methods were considered for higher order GY-syndrome markers.
A bivariate two-dimensional scan based on the method originally proposed by Jiang and Zeng (1995) for linkage studies was implemented in the context of GWAS. Briefly, this method contrasts H 0 : p(1)=p(2), stating that the QTLs for two different traits are located at the same m locus, with H 1 : p(1)≠p(2), in which these two QTLs have different positions, i.e. m 1 and m 2 . In principle, a pleiotropic model and a linkage model are compared in terms of their likelihoods. Since BLUEs across environments were considered as phenotypic data for the bivariate two-dimensional scan, Equation 4 is reduced to: in the case of the pleiotropic model, whereas: respectively, where 1 n is an n-size vector containing only ones, whereas m, m 1 , and m 2 denote n-size vectors of marker profiles. In both f 0 (Y) and f 1 (Y), the covariance matrix is where L 1 is the maximum of the likelihoods for a linkage model in the two-dimensional space (H 1 ), and L 0 is the maximum of the likelihoods on the diagonal of the two-dimensional space that corresponds to H 0 . Then, the test statistic under H 0 follows a χ 2 1 distribution for which a nominal significance level of 0.05 was applied (Jiang and Zeng, 1995). The twodimensional space was defined by considering an r 2 window with values >0.5 between the surrounding markers and each potential pleiotropic marker. Only markers with unique genetic map positions were taken into account for this purpose. Moreover, two traits were simulated under close linkage or pleiotropy scenarios to investigate the influence of minor allele frequencies (MAFs) along with balanced and unbalanced percentages of genetic variance explained by QTLs (QTL size) on the power and Type I error of close linkage detection in the GABI-WHEAT population. In balanced scenarios, QTL sizes were the same for both simulated traits, whereas QTL sizes were larger (or smaller) for one of the traits under the unbalanced scenario. The influence of r 2 values between QTL positions was considered as an additional factor in linkage scenarios. Marker information (Supplementary Table S2) and some general statistics from phenotypic analyses of the GABI-WHEAT population were considered during simulations. Methods for simulations are described in detail in the Supplementary data. Linear mixed models were solved using the ASReml-R package (Butler et al., 2009), and all computational methods were implemented within R environment (R Core Team, 2016).

Grain yield was significantly associated with various yield-syndrome traits
On a plot basis, GY was moderately heritable (h 2 Plot =0.44), while most GY-syndrome traits, except EW, had higher heritabilities than GY (Table 1). A broad phenotypic variation was observed for all traits (Supplementary Table S3), and GY ranged from 7.4 Mg ha −1 to 11.1 Mg ha −1 . Phenotypically, GY was positively correlated (P-value <0.0001) with GPE and EW but was negatively associated (P-value <0.0001) with PH and TW (Table 1). Almost all phenotypic correlations among GY-syndrome traits were significant (P-value <0.05), except between EW and the two traits PH and HD. At the genetic level, GY-PH and GY-TW correlations shifted towards zero (Table 1), whereas GY-TGW, GY-GPE, and GY-EW correlations were positive (P-value <0.05).

Bivariate genome-wide association mapping revealed several marker-trait-trait associations
Among all available markers, a total of 345 had significant (FDR <0.05) bivariate associations with GY and at least one of the GY-syndrome traits ( Fig. 1; Supplementary Tables S4-S9). With the exception of GY-GPE associations of SNP markers Ku_c12191_1123, Tdurum_contig13646_225, wsnp_ Ex_c22913_32130617, and wsnp_Ex_rep_c66274_64426834, all the remaining bivariate associations presented significant marker×environment interactions for at least one trait. Marker-trait-trait associations were found on all wheat chromosomes ( Fig. 1; Supplementary Fig. S1). Genome A concentrated a great proportion of them, with 144 markers presenting a total of 188 bivariate associations, whereas 106 and 37 associations were observed for B and D genomes, respectively, which were attributed to 87 and 28 markers for each genome, respectively. The minimum number of bivariate associations per chromosome was presented by chromosome 2D, with only one joint association for GY and TW, while 3A showed the maximum number of associations per chromosome, with up to 56 associations attributed to 29 markers. Depending on the trait pair, associated markers fitted in a multiple-regression model explained together 12.2-58.1% of GY phenotypic variation, while accounting for between 9.3% and 60.4% of GY-syndrome trait variation (Supplementary  Table S10). Nonetheless, all distributions of single marker R 2 values were L-shaped ( Supplementary Fig. S2) and, in most cases, associated markers explained <1% of genetic variation for at least one of the traits (Supplementary Tables S4-S9).
A total of 18 significant SNP markers (FDR <0.05) were jointly associated with GY and PH (Supplementary Table  S4). These markers represented a total of eight effective GY-PH associations (Supplementry Fig. S1) and half of them induced negative GY-PH co-variation (Fig. 2). Markertrait-trait associations explained on average almost a three times larger proportion of genetic variation for PH than for GY (2.1% versus 0.8%), with the highest R 2 for PH attributed to marker RAC875_rep_c105718_304. This marker accounted for 15.3% and 1.2% of PH and GY variation, respectively (Supplementary Table S4).
For GY and HD, 93 SNP markers were jointly associated with both traits at FDR <0.05 (Supplementary Table S5), which depicted an estimated total of 40 GY-HD effective associations ( Supplementary Fig. S1). Most GY-HD associations (89% of effective associations) induced negative co-variation between these two traits (Fig. 2). Markers with GY-HD associations explained on average a comparable percentage of variation for GY and HD (0.9% of genetic variation in both cases). Nevertheless, effect sizes on each trait were often unbalanced at the individual marker level (Supplementary Table S5). For example, marker wsnp_Ku_c16432_25320146 explained 9.2% of variation in GY but only 0.6% of HD variation.
Significant (FDR <0.05) bivariate associations for GY and TGW were observed for 28 SNP markers (Supplementary  Table S6), representing an estimated total of 11 effective GY-TGW associations (Supplementary Fig. S1). All these bivariate associations induced positive co-variation between GY and TGW (Fig. 2). Interestingly, significant markers explained on average three times as much of the genetic variance for GY (1.4%) than for TGW (0.5%), with the highest GY R 2 attributed to SNP BS00028033_51. This marker explained 4.6% of variation in GY but accounted for 0.1% of TGW variation (Supplementary Table S6).
Bivariate GWAS for GY and TW revealed 116 SNP markers simultaneously associated with these traits at FDR <0.05 (Supplementary Table S7). GY-TW-associated markers showed an estimated total of 40 effective associations ( Supplementary Fig. S1), with the majority of them (80% of effective associations) inducing positive trait-trait co-variation (Fig. 2). Regarding R 2 values, markers explained on average a comparable proportion of GY and TW variation (0.6% and 0.7% of genetic variation, respectively). Nonetheless, R 2 values for GY and TW were again often asymmetric at the single SNP level (Supplementary Table S7).
GY-GPE associations were found to be significant (FDR <0.05) for 70 markers (Supplementary Table S8), which represented an estimated total of 34 effective bivariate associations ( Supplementary Fig. S1). Most of these effective associations (88%) induced positive GY-GPE co-variation (Fig. 2). Although the percentages of variation explained by these markers were on average comparable between GY and GPE (1.1% and 0.8% of genetic variation, respectively), effect sizes on each trait were again often asymmetric at the single marker level (Supplementary Table S8). The GY-GPE association found on chromosome 4B corresponded to the Rht-B1 locus (Ellis et al., 2002). Rht-B1 explained 0.3% and 0.1% of GY and GPE variation, respectively, and the dwarfing allele presented positive effects on both traits.
The number of markers with significant (FDR <0.05) GY-EW joint associations amounted to 114 (Supplementary  Table S9), which showed an estimated total of 35 effective associations ( Supplementary Fig. S1). More than a half of them (67% of effective associations) induced positive covariation between GY and EW (Fig. 2). Similar to previous cases, even though average R 2 values were comparable for GY and EW (0.7 and 0.8% of genetic variance, respectively), R 2 values for each trait were often unequal at a single marker   Table S9). The unique GY-EW association found on chromosome 6B corresponded to a marker developed for the TaGW2-6B locus (Qin et al., 2014). This particular marker explained 0.3% and 0.2% of GY and EW variation, respectively, with the Hap-6B-1 haplotype having positive effects on both traits.

Simulation study on the power to distinguish close linkage from pleiotropy
The power to distinguish close linkage from pleiotropy ranged in our simulation study from 3% to 58% and increased with larger QTL sizes and more intermediate allele frequencies, but was hampered by tight LD between QTL positions (Table 2). Similar trends were found when simulating two QTLs with asymmetric sizes (Supplementary Table S11). Some exceptions to these main effects, such as power increments observed when the LD increased from r 2 ~0.55 to 0.70 at MAF ~0.13 with QTLs explaining 10% of genetic variation for both traits, could be attributed to interactions among these factors along with other variables ignored during simulations. Regarding Type I error; that is, wrongly rejecting pleiotropy, no clear trends related to the different levels of MAF and QTL sizes were observed (Supplementary Table S12). Across all tested conditions, the average Type I error was 0.07, with an standard deviation of 0.04.

Two-dimensional scan to distinguish close linkage from pleiotropy
From the total of 345 markers with significant bivariate associations, a subsample of 251 markers had unique genetic map positions (Fig. 3A). A further 27 markers were discarded because of the lack of surrounding linked markers (r 2 >0.5). Thus, 224 markers were considered for the tests of close linkage versus pleiotropy ( Fig. 3B; Supplementary Tables S4-S9). For these targeted markers, the number of surrounding linked markers ranged from 1 to 283, with an average of 13.6. According to the pairwise LD decay ( Supplementary Fig.  S3), surrounding markers are expected to map between 0 cM and 2 cM away from each targeted marker. From all two-dimensional scans performed, we selected two extreme examples to illustrate the landscape of likelihoods for close linkage versus pleiotropy ( Fig. 4; Supplementary Fig. S4). The two-dimensional scan for the GY-GPE-associated marker Tdurum_contig10194_765 revealed that the maximum bivariate likelihood is clearly placed outside of the diagonal and, thus, the hypothesis of pleiotropy can be rejected (P-value <0.05). In contrast, for marker IACX8108 jointly associated with GY and HD, the maximum bivariate likelihood was found on the diagonal; therefore, the hypothesis of pleiotropy cannot be rejected ( Supplementary Fig. S4). Interestingly, the two-dimensional   Jiang and Zeng (1995) to differentiate close linkage from pleiotropy in linkage-simulated scenarios considering different levels of minor allele frequency (MAF), balanced percentage of explained genetic variation (QTL size) for each of the two simulated traits, linkage disequilibrium (r 2 ), and marker profiles of the 358 European winter plus 15 spring wheat varieties (GABI-WHEAT population). Each value corresponds to the proportion of times in which H 0 : p(1)=p (2) Table S8). Likelihoods pertaining to pleiotropy models were maximized at marker wsnp_Ku_c16432_25320146 (denoted with Δ), whereas likelihoods for linkage models were maximized at the combination of wsnp_Ku_c16432_25320146 and Tdurum_contig13240_523 (highlighted in green), with these last two markers carrying the effects on GY and GPE, respectively. The log-likelihood ratio test of Jiang and Zeng (1995) using maximized likelihoods rejected H 0 : p(1)=p(2) of pleiotropy at the nominal significance level of 0.05.  (Ellis et al., 2002;Qin et al., 2014;Wang et al., 2014). (B) Frequency distribution of the number of surrounding markers in the vicinity (with linkage disequilibrium r 2 >0.5) of associated markers with unique genetic positions. (C) Distribution of frequencies for the minimum r 2 value observed between markers within each vicinity window.

Higher order grain yield-syndrome markers
A total of 78 markers presented significant bivariate associations in more than one GY plus GY-syndrome trait combination ( Fig. 1; Supplementary Tables S4-S9), suggesting that these markers are simultaneously influencing GY and two or more GY-syndrome traits. To elucidate this, marker effects were tested for their significance by sequentially adding more traits to an MT model according to bivariate results. First, three-trait models were considered; however, after multipletest corrections, only one of the 78 markers remained significant for all traits in the model (data not shown). In detail, the SNP marker IACX6214, located at 33.7 cM on chromosome 3B, was significantly and simultaneously associated with GY, TGW, and TW (M eff -corrected P-value <0.1). This marker has negative minor allele effects on each trait, and could explain 0.02, 0.4, and 1.8% of genetic variation in GY, TGW, and TW, respectively, when included alone within a regression model for each trait. Nonetheless, the lack of surrounding linked markers (r 2 >0.5) for IACX6214 precluded us from disentangling the pleiotropic nature of its simultaneous effects on GY, TGW, and TW.

Discussion
The power to distinguish linkage from pleiotropy is driven by minor allele frequencies, QTL size, and linkage disequilibrium Across all simulated close linkage scenarios, it was observed that increasing MAF, along with QTL sizes and decreasing r 2 levels between QTL positions, have a positive effect on the power to differentiate linkage from pleiotropy (Table 2;  Supplementary Table S11). Our findings agree with past simulation studies considering various QTL sizes or different genetic distances/LD values between two QTLs (Lebreton et al., 1998;Knott and Haley, 2000;Varona et al., 2004;Da Costa E Silva et al., 2012;David et al., 2013). To the best of our knowledge, studies directly assessing the influence of MAF on the power and Type I error rate to differentiate close linkage from pleiotropy are lacking. Nevertheless, increments in MAF improve QTL detection power in GWAS (for a review, see Myles et al. 2009) and, from our results, this positive influence will also apply to the ability to distinguish pleiotropy from linkage. In general, past simulation studies have shown that Type I error of close linkage versus pleiotropy testing is unaffected or, at most, only marginally influenced by balanced changes in QTL sizes (Lebreton et al., 1998;Knott and Haley, 2000;Varona et al., 2004;David et al., 2013); an observation also found in our study (Supplementary Table S12). In addition, Lebreton et al. (1998) reported that unequal QTL sizes could increase Type I error by using a confidence interval approach to test pleiotropy versus linkage; however, this was not consistently observed by us. Nonetheless, since pleiotropy and linkage models of Jiang and Zeng (1995) are not exactly nested, asymptotical distributions of the LR test under H 0 may deviate from χ 2 1 ; hence, alternative procedures have been suggested for significant threshold derivation in linkage studies (Almasy et al., 1997;Williams et al., 1999;Calinski et al., 2000;Knott and Haley, 2000;Varona et al., 2004;Da Costa E Silva et al., 2012). Nevertheless, the average Type I error rate of 0.07 ± 0.04 observed during our simulations is comparable with the nominal level of 0.05 and also with the 0.1 value originally observed by Jiang and Zeng (1995), thus reflecting an acceptable test performance by considering standard significant thresholds, at least with our simulated scenarios.

Grain yield is genetically correlated with thousand grain weight, grains per ear, and ear weight
Correlations at the phenotypic level would not necessarily resemble those at the genetic level, and vice versa. This is mainly because phenotypic correlations are not a direct function of genetic correlations, while being also dependent on the environmental correlation along with trait heritabilities Table 3. Marker-trait associations, chromosome location (Chr.), genetic positions (Pos., cM), number of surrounding markers (N), linkage disequilibrium (r 2 ), and genetic distance (Dist., cM) pertaining to cases in which the log-likelihood ratio test of Jiang and Zeng (1995)  Traits involved in bivariate associations: grain yield (GY), thousand grain weight (TGW), grains per ear (GPE), and ear weight (EW). c Genetic positions according to the reference map published by Wang et al. (2014). d Number of markers in the vicinity of markers with bivariate associations (r 2 >0.5). e Genetic distance as well as r 2 values were calculated between M1 and M2. (Falconer and Mackay, 1996). Such discrepancies were also observed in the present study (Table 1). For instance, while GY and PH were negatively correlated at the phenotypic level, they were not genetically correlated. Positive effects of reduced PH by improving harvest index and lodging resistance of high-yielding semi-dwarf wheat plants adapted to intensive agricultural practices have been long recognized (Borlaug, 1968;Austin et al., 1980;Börner et al., 1993;Brancourt-Hulmel et al., 2003;Kuchel et al., 2007;Casebow et al., 2016;Kowalski et al., 2016). The null genetic correlation between GY and PH can be attributed to the fact that varieties in the GABI-WHEAT population were already adapted by wheat breeders to perform in European agroecosystems. Moreover, since HD plays an important role in wheat adaptation as a crop (Worland, 1996), breeding efforts for adaptation can also explain the null correlation between HD and GY (Table 1). In relation to TW, this trait was negatively correlated with GY at the phenotypic level, while the genetic correlation was not significant between these traits. TW is a trait related to milling yield because it gives an indication of the soundness of wheat grains (Matsuo and Dexter, 1980). Nonetheless, its association with GY is still controversial (Huang et al., 2006;Rattey et al., 2009;Kamran et al., 2014;Bordes et al., 2014). Interestingly, despite the fact that GY-PH, GY-HD, and GY-TW genetic correlations were practically zero (Table 1), we could find pleiotropic associations for these trait pairs (Fig. 2). Since pleiotropic QTLs can induce positive/negative genetic co-variance among two traits if QTL effects on these traits have the same/opposite sign, these observed null genetic correlations could be simply the net effect of loci whose effects cancelled each other out. Regarding significant genetic correlations among GY and traits TGW, GPE, and EW, their positive associations confirm their main roles as GY components (e.g. Huang et al., 2006;Kuchel et al., 2007;Kumar et al., 2007;Cuthbert et al., 2008;Rattey et al., 2009;Neumann et al., 2011;Bennett et al., 2012;Simmonds et al., 2014;Sukumaran et al., 2015).
The pleiotropic architecture of wheat yield-syndrome was dissected more as a function of pleiotropy rather than close linkage Past linkage studies on crop plants have shown that disentangling linkage from pleiotropy is quite difficult and sometimes even remains unsolved (Lebreton et al., 1998;Calinski et al., 2000;Balestre et al., 2012). In particular, Calinski et al. (2000) and Balestre et al. (2012) observed that although there was some evidence favoring linkage, this was not enough to discard pleiotropy completely as the underlying mechanism for maize GY-syndrome. Similarly, among all bivariate associations found in our study (Supplementary Tables S4-S9; Supplementary Fig. S1), only three of them were due to close linkage (Table 3). Nonetheless, we presume that the importance of pleiotropy could be overestimated, at least for the traits considered in this study. First, the majority of associated markers explained <1% of trait variation (Supplementary  Tables S4-S9; Supplementary Fig. S2), which, as previously discussed, is expected to lead to low power to differentiate linkage from pleiotropy. Secondly, the distribution of r 2 values between the most distant markers within each vicinity window showed in general values >0.5 (Fig. 3C), thus jeopardizing the fitting of linkage models using pairs of more distant markers. Although relaxing the LD window for the inclusion of less linked markers would be a simple solution, this would increase the risk of falsely declaring linkage in the presence of true pleiotropy, because one main prerequisite underlying our methodology is that each bivariate-associated marker is simultaneously detecting signals of two QTLs in the surroundings. Alternatively, a more reliable solution would be to increase marker density and population size, since these two factors may have limited the ability to differentiate close linkage from pleiotropy in the current study (Lebreton et al., 1998;Varona et al., 2004;David et al., 2013).
Rht-B1 and TaGW2-6B showed pleiotropic effects on yield-syndrome From the 24 assessed functional markers (Supplementary  Table S1), Rht-B1 and TaGW2-6B were significantly associated with GY-syndrome. One and 66 surrounding linked markers (r 2 >0.5) were found for Rht-B1 and TaGW2-6B, respectively (Supplementary Tables S8, S9). Within each vicinity window, the LR test could not reject the hypothesis of pleiotropy for these loci. Therefore, we conclude that the pipeline of MT analyses used in the present study confirmed the pleiotropic nature of Rht-B1 and TaGW2-6B effects on GY-syndrome of wheat. The pleiotropic roles of these two loci are discussed in the following paragraphs.
Dwarfing loci Rht8, Rht-B1, and Rht-D1 have been extensively associated with variation in GY, along with other GY-syndrome traits (Austin et al., 1980;Börner et al., 1993;Miralles and Slafer, 1995;Korzun et al., 1998;Ellis et al., 2002;Brancourt-Hulmel et al., 2003;Kuchel et al., 2007;Zanke et al., 2014bZanke et al., , 2015Casebow et al., 2016;Kowalski et al., 2016;Mohler et al., 2016). Functional markers linked to these genes are available (Korzun et al., 1998;Ellis et al., 2002) and were used to characterize the GABI-WHEAT population in past works (Kollers et al., 2013a, b). A bivariate association was found for Rht-B1, with the dwarfing allele (Rht-B1b) simultaneously increasing GY and GPE (Supplementary Table S8). As jointly concluded from past studies, it seems that although genotypes carrying Rht-B1b can produce more GPE, its positive association with GY would highly depend on effects of Rht-B1b on other GY-syndrome traits, along with counterbalancing effects of higher GPE values on grain weight per se and Rht-B1b interactions with environments (Börner et al., 1993;Kuchel et al., 2007;Casebow et al., 2016). Apparently, increases in GPE produced by Rht-B1b in the GABI-WHEAT population were strong enough to compensate for any negative feedback on GY, resulting in a positive net effect on these two traits.
Based on 11 SNPs, Qin et al. (2014) described four different haplotypes, Hap-6B-1 to Hap-6B-4, for locus TaGW2-6B. In their study Hap-6B-1 was associated with higher average TGW, this superiority being attributed to increased grain length, width, and/or thickness. The GABI-WHEAT population was characterized in Zanke et al. (2015) using a marker designed by Qin et al. (2014) capable of distinguishing Hap-6B-1 from the other three haplotypes. As in the present study, Zanke et al. (2015) found no association of TaGW2-6B with TGW. Recently, Mohler et al. (2016) found a minor TGW QTL at TaGW2-6B in a biparental doubled haploid population, but Hap-6B-2 presented a higher TGW compared with Hap-6B-1. Moreover, in the current study, Hap-6B-1 simultaneously increased GY and EW (Supplementary Table S9). Future studies should clarify the mechanisms that allowed TaGW2-6B to influence GY and EW without causing significant changes in TGW in the GABI-WHEAT population. Nevertheless, although a GY component different from TGW was involved, our findings confirm the positive role of Hap-6B-1 on GY-syndrome.
Pleiotropic effects are more environmentally stable for plant height and heading date than for grain yield In past studies, GY QTLs have been often reported as environmentally unstable (e.g. Kuchel et al., 2007;Kumar et al., 2007;Cuthbert et al., 2008;Reif et al., 2011;Bennett et al., 2012;Zhang et al., 2014). In parallel, Kowalski et al. (2016) showed that while pleiotropic effects of Rht8 are highly stable on PH, this was not the case on GY. In contrast, Simmonds et al. (2014) found two co-located QTLs on chromosome 6A influencing GY and TGW, respectively, which presented different environmental instability patterns for each trait. We further compared the environmental stability of marker pleiotropic effects on traits measured in the same number of environments (i.e. GY, PH, and HD). For this, we fitted a univariate multiple regression model including main environmental and marker effects plus their interactions considering the same data used for MT-GWAS. For GY-PH-and GY-HD-associated markers, the proportion of GY variation explained by main marker effects together corresponded to 1.6 and 0.6 times that attributed to interactions, respectively, while these ratios were 29.4 and 9 for PH and HD, respectively. This clearly indicates that marker×environment interactions played a more important role on GY than on PH or HD, and points to an increased environmental stability of pleiotropic effects on PH or HD than on GY.

Disentangling close linkage from pleiotropy underlying yield-syndrome: implications in applied multiple-trait breeding
Markers Tdurum_contig10194_765, BS00003586_51, and Tdurum_contig30930_184, presenting bivariate GY-GPE, GY-TGW, and GY-EW associations, respectively, were disentangled as cases of close linkage (Table 3). GY and GPE QTLs of Tdurum_contig10194_765 were displaced to markers having an r 2 of 0.12, whereas repositioning of GY and EW QTLs of BS00003586_51 and Tdurum_contig30930_184, respectively, resulted in close linkage cases of markers having r 2 values of 0.54 and 0.58, respectively. We further investigated the frequencies of the four potential haplotypes generated by combining different alleles at these loci ( Supplementary   Fig. S5). For GY-GPE and GY-TGW disentangled cases, haplotypes maximizing both GY along with GPE or TGW were the most frequent, representing 76.5% and 73.8% of the GABI-WHEAT population, respectively, and suggesting that these haplotypes have been positively selected by wheat breeders. In contrast, the most frequent haplotype (52.6% in the GABI-WHEAT population) for the GY-EW disentangled case decreased both traits simultaneously; thus, efforts should be allocated to reverse this situation in MT improvement of these traits. In addition, blindly performing MT-MAS using markers BS00003586_51 and Tdurum_con-tig30930_184 would allow the selection of undesired haplotypes, corresponding to 11.6% and 7.7% of the selected fraction, respectively. Although these last magnitudes are relatively small, they illustrate the potential problems caused by blindly using spurious pleiotropic associations and the importance of disentangling close linkage from pleiotropy in MT-MAS. Moreover, most bivariate associations remained as pleiotropic in our study (Supplementary Tables S4-S9) and, provided they are truly pleiotropic, MT-MAS using markers whose trait induced co-variation (Fig. 2) agrees with breeding goals should be straight forward. Nonetheless, we also found bivariate associations for which this was not the case. For instance, 33% of GY-EW effective associations induced negative trait co-variations. For these cases, using additional markers that complement or mitigate undesired effect(s) of pleiotropic markers could be a solution (Chen and Lübberstedt, 2010). Modeling pleiotropic epistasis as done recently in humans (Zhang et al., 2016) could bring a better understanding of how this complementarity works.

Conclusion
We assessed the GY-syndrome of bread wheat through MT-GWAS approaches in a population of 372 varieties adapted to European environments and genotyped with 18 832 SNPs plus 24 polymorphic functional markers. We conclude that distinguishing pleiotropy from close linkage underlying the pleiotropic architecture of GY-syndrome is very challenging, especially because of the small sizes of QTLs influencing this complex trait. Our results should be considered as a starting point for further complementary biological validations.

Supplementary data
Supplementary data are available at JXB online.
Simulation methods. Details for simulations on power and Type I error of close linkage detection. Table S1. Details of functional molecular markers used to characterize the GABI-WHEAT population. Table S2. Details of markers used for pleiotropy and close linkage simulations. Table S3. General statistics for BLUEs across environments. Table S4. Details for markers with significant GY-PH associations and test of pleiotropy versus close linkage. Table S5. Details for markers with significant GY-HD associations and test of pleiotropy versus close linkage. Table S6. Details for markers with significant GY-TGW associations and test of pleiotropy versus close linkage. Table S7. Details for markers with significant GY-TW associations and test of pleiotropy versus close linkage. Table S8. Details for markers with significant GY-GPE associations and test of pleiotropy versus close linkage. Table S9. Details for markers with significant GY-EW associations and test of pleiotropy versus close linkage. Table S10. Percentage of phenotypic and genetic variation explained in total by associated markers Table S11. Power to differentiate close linkage from pleiotropy in the case of unequal QTL sizes for two simulated traits. Table S12. Type I error of pleiotropy rejection. Fig. S1. Number of effective bivariate associations. Fig. S2. Frequency distributions of R 2 values. Fig. S3. Decay of intrachromosomal pairwise r 2 . Fig. S4. Two-dimensional scan in the surroundings of marker IACX8108. Fig. S5. Haplotype frequencies for cases disentangled as close linkage.