Marker-Trait Associations for Enhancing Agronomic Performance, Disease Resistance, and Grain Quality in Synthetic and Bread Wheat Accessions in Western Siberia

Exploiting genetically diverse lines to identify genes for improving crop performance is needed to ensure global food security. A genome-wide association study (GWAS) was conducted using 46,268 SNP markers on a diverse panel of 143 hexaploid bread and synthetic wheat to identify potential genes/genomic regions controlling agronomic performance (yield and 26 yield-related traits), disease resistance, and grain quality traits. From phenotypic evaluation, we found large genetic variation among the 35 traits and recommended five lines having a high yield, better quality, and multiple disease resistance for direct use in a breeding program. From a GWAS, we identified a total of 243 significant marker-trait associations (MTAs) for 35 traits that explained up to 25% of the phenotypic variance. Of these, 120 MTAs have not been reported in the literature and are potentially novel MTAs. In silico gene annotation analysis identified 116 MTAs within genes and of which, 21 MTAs were annotated as a missense variant. Furthermore, we were able to identify 23 co-located multi-trait MTAs that were also phenotypically correlated to each other, showing the possibility of simultaneous improvement of these traits. Additionally, most of the co-located MTAs were within genes. We have provided genomic fingerprinting for significant markers with favorable and unfavorable alleles in the diverse set of lines for developing elite breeding lines from useful trait-integration. The results from this study provided a further understanding of genetically complex traits and would facilitate the use of diverse wheat accessions for improving multiple traits in an elite wheat breeding program.

Both Kazakhstan and Russia are major wheat producers, selfsufficient in wheat grain (http://faostat.fao.org/), and increasingly ambitious in occupying the export markets. Unlike, modern cultivars grown in similar environments of North America, the spring wheat varieties grown in this region are tall and sensitive to day length (Trethowan et al. 2006). Even without the introduction of Rht (reduced height) and Ppd (photo period sensitivity) genes, or the 1B.1R translocation, Siberian varieties have achieved annual genetic gains of 0.7% since 1900 (Morgounov et al. 2010). In the late 1990s, CIMMYT began a collaborative research and breeding program with Northern Kazakhstan and Western Siberia, and by 2000 this had evolved into the Kazakhstan-Siberia Network for Spring Wheat Improvement (KASIB), comprising 18 breeding and research programs. The network focuses on two major activities: KASIB-YT (multi-location cooperative yield trials to evaluate disease resistance, grain yield, quality, and other traits) and shuttle breeding between the region and CIMMYT to incorporate stem rust (incited by Puccinia graminis f.sp. tritici) and leaf rust (incited by P. triticina) resistance while maintaining local adaptation, drought tolerance, and grain quality. However, to achieve these goals require inclusion of diverse novel source of genetic resources, characterization of their genetic diversity and traits, and identification of beneficial genes for their utilization in a wheat breeding program.
It is well known that bringing rich source of genetic resources can be used for enhancing the beneficial genes/variation in modern wheat. However, plant breeders relying on a few varieties led to the loss of genetic diversity in modern wheat (Lage et al. 2003;Zhang et al. 2005;Bhatta et al. 2018c), resulting in a genetic bottleneck. Therefore, wheat breeders have been recovering the lost genetic diversity by utilizing landraces, wild relatives such as Aegilops spp., internationally exchanged germplasm, or synthetic hexaploid wheat (Ogbonnaya et al. 2013;Dempewolf et al. 2017;Bhatta et al. 2018c;Morgounov et al. 2018). Among these, synthetic hexaploid wheat (2n = 6x = 42, AABBDD: a cross between cultivated T. turgidum L.: 2n = 4x =28, AABB, and wild Aegilops tauschii Coss.: 2n = 2x = 14, DD) represent a particularly important group of novel genetic resources because of their wide genetic diversity, resistance/tolerance to multiple biotic and abiotic stresses, easy crossing compatibility with wheat, and large number of novel favorable alleles compared to wild relatives (Lage et al. 2003;Zhang et al. 2005;Zegeye et al. 2014;Bhatta et al. 2018cBhatta et al. , 2018dBhatta et al. , 2019aMorgounov et al. 2018).
Unraveling and mining novel genes from rich genetic resources are possible from a genome-wide association study (GWAS). Several studies have used GWAS to unravel novel loci associated with agronomic performance, disease resistance, and quality traits in bread wheat (Neumann et al. 2011;Edae et al. 2014;Sukumaran et al. 2015;Velu et al. 2017;Sehgal et al. 2017;Lozada et al. 2017) and synthetic hexaploid wheat (Ogbonnaya et al. 2013;Zegeye et al. 2014;Jighly et al. 2016;Das et al. 2016;Morgounov et al. 2018;Bhatta et al. 2018aBhatta et al. , 2018bBhatta et al. , 2018dBhatta et al. , 2019a. However, there are many genomic regions/genes for these traits yet to be identified using diverse group of wheat germplasm and new wheat genome sequence information. Therefore, this study was conducted to evaluate phenotypic variation in diverse wheat accessions for agronomic performance, disease resistance, and grain quality; and to identify genomic regions and potential candidate genes from a GWAS. The results from this study provided further insights on genetic characterization of complex traits and assisted in wheat genetic improvement.

Experimental materials and design
A total of 143 diverse wheat germplasm used in this study represent four major groups. The first group comprised of 13 spring synthetic lines developed by Kyoto University (Japan) from crosses between Langdon durum (T. turgidum L.) and 13 diverse accessions of Ae. tauschii. The second group comprised of 39 winter synthetics developed by CIMMYT from crosses between five winter durums (Aisberg, Leuc. 84693, Pandur, Ukr-Od. 1530.94, and Ukr-Od. 952.92) and 10 different Ae. tauschii accessions. The third group comprised of 14 bread wheat lines originated from USA breeding programs. The fourth group comprised of 77 bread wheat lines developed by KASIB (Kazakhstan and Siberian) breeding program, where14 lines were from Kazakhstan and 63 from Siberia, Russia (Table S1). Genome-wide diversity analysis conducted in these lines showed considerable genetic diversity among lines from different breeding programs, and the population structure analysis identified that these lines can be divided into three subgroups based on the type of wheat and their geographical origin (Bhatta et al. 2019b). The idea of selecting diverse germplasm from different breeding programs was to assess their adaptation and performance to a new area (Siberia) as well as to better understand the growing region from already adapted lines in the specific region.
Two years of field experiments (2017 and 2018) were conducted at the research farm located at Omsk State Agriculture University (55°0 19122$ N, 73°18946$ E) under rainfed conditions. The experimental design used in this study was a randomized complete block design with four replications.

Phenotypic data collection and analysis
Diverse wheat accessions used in this study were evaluated for four diseases [leaf rust, stem rust, powdery mildew (incited by Blumeria graminis f. sp. tritici), and Septoria tritici blotch (sepf; Mycosphaerella graminicola)]; area under disease progress curve (AUDPC) for leaf rust and stem rust; 27 agronomic (grain yield and yield related traits including root traits) and two grain quality (grain protein and gluten content) traits under rainfed conditions. Grain yield related traits were emergence, days to heading, grain length, grain area, grain circularity, grain perimeter, grain weight per plant, grain weight per spike, grains per spike, number of spikes, spike length, spike weight, spike density, and spikelet numbers at maturity, spike harvest index, peduncle length, plant height at maturity, leaf number, number of productive tillers and plants at maturity, harvest index, thousand kernel weight, dry plant weight with roots, root-traits including root diameter, root volume, and total root length. These traits were measured using standard protocol described previously (Bhatta et al. , 2018e, 2018b(Bhatta et al. , 2019aMorgounov et al. 2018). Root traits were measured under field conditions 3-4 days after flowering using WinRhizo software (WinRhizo reg. 2009c, Regent Instruments Inc., Quebec City, QC, Canada) as described previously (Bhatta et al. 2018b).
An individual analysis of variance for each year was performed due to the significant genotype x year interactions for most of the traits under study (Table 1). Best linear unbiased estimates (BLUEs) were obtained by assuming genotype as a fixed effect and replication as a random effect using PROC MIXED in SAS 9.4 (SAS Institute Inc., Cary, NC). Principal component biplot analysis was performed to understand the association among agronomic, disease resistance, and grain quality traits based on correlation matrix using BLUEs. Variance components were estimated by the restricted maximum likelihood (REML) method assuming a full random model, and entry-mean based broad-sense heritability (H 2 ) was calculated using following model.
where s 2 g , s 2 yr , s 2 gxyr , and s 2 e are the variance components for genotype, year, genotype · year, and error, respectively, and y and r are the numbers of years and replications, respectively.
Genotyping and SNP discovery Genotyping and SNP discovery procedures for all 143 lines were described previously (Bhatta et al. 2019b). In brief, genomic DNA was extracted from 2-weeks old leaves using Biosprint 96 Plant Kit (Qiagen) protocol. Genotyping was performed at the Wheat Genetics and Germplasm Improvement Center at Kansas State University, Manhattan, KS using genotyping-by-sequencing (GBS) method. The GBS libraries were constructed in 96-plex following digestion of DNA with two restriction enzymes (PstI and MsPI), and pooled libraries were sequenced using Illumina HiSeq (Illumina, Inc., San Diego). SNP discovery was performed using TASSEL v.5.2.40 GBS v2 Pipeline (Glaubitz et al. 2014) with a physical alignment to the Chinese spring genome sequence (RefSeq v1.0) (International Wheat Genome Sequencing Consortium 2018). The identified SNPs with minor allele frequency (MAF) of more than 5% and missing data frequency of less than 20% were retained for the analysis (Bhatta et al. 2018c(Bhatta et al. , 2019b.

Genome-wide association study
The multi-locus mixed linear model using FarmCPU (Fixed and random model Circulating Probability Unification) algorithm implemented in R package (Liu et al. 2016a) was used for the association mapping. Population structure analysis performed in our recent study identified that the 143 lines can be divided into three sub-population (Q 1-3 ) using Bayesian clustering algorithm (Bhatta et al. 2019b). Therefore, the first three Q 1-3 matrices were used for controlling the population structure in GWAS analysis. In the FarmCPU model, Q matrix was used as a fixed effect whereas kinship matrix (K, controlling genetic relatedness) was used as a random effect (Liu et al. 2016a). BLUEs calculated for each trait in each year were taken as phenotype and n■ Table 1 Basic summary statistics with mean, standard deviation (SD), minimum (Min), and maximum (Max), broad sense heritability (H 2 ) values, and number of significant marker-trait associations (NMTAs) for 35 traits for 143 diverse synthetic and bread wheat accessions evaluated in 2017 and 2018 at Omsk, Siberia b Geno, genotype. c NMTAs, the total number of significant marker-trait associations identified at the 5% level of significance with a P . 1.08 x 10 26 (-log 10 P = 5.97). (Table S2) were taken as a genotype for the GWAS. The model fit was tested using the quantile-quantile plot considering the deviation of the observed test statistics values from the expected test statistics values ( Figure S1). The identified MTAs were corrected for multiple testing and the MTAs were tested against a Bonferroni correction at the 5% level of significance with a P . 1.08 · 10 26 (-log 10 P = 5.97). The percent of phenotypic variance explained by each significant SNP was estimated using multiple-linear mixed model as described previously (Bhatta et al. 2018d).

Linkage disequilibrium estimation
Linkage disequilibrium (LD) among markers was estimated using the full matrix and sliding window size of 50 in Tassel V5.2.52 (Bradbury et al. 2007), using 46,268 SNPs, for the A, B, D, and entire genomes. LD was calculated as the squared allele frequency correlations (r 2 ). Pairwise LD r 2 values were plotted against the corresponding physical distance, and a non-linear regression model was fitted to estimate the genomewide LD decay (Hill and Weir 1988;Remington et al. 2001). The critical value of r 2 beyond which the LD was likely to be caused by linkage was set at r 2 = 0.2. The average LD decay of the association mapping panel was determined as the point at which LD curve intercepts the critical r 2 .

Putative candidate gene annotation
The putative genes underlying the significant SNPs and their annotations were obtained from the IWGSC RefSeq V1.0 annotations provided for the Chinese spring wheat (International Wheat Genome Sequencing Consortium 2018). The predicted effect of the SNPs on the protein function was obtained from Variant Effect Predictor by EnsemblPlants for wheat (http://plants.ensembl.org/Triticum_aestivum/Tools/VEP/).

Data availability
Supplementary files including genotyping-by-sequencing derived SNP marker data are available in figshare at https://gsajournals.figshare.com/ s/6434bc736b63a242e546. Figure S1 contains Manhattan and quantilequantile (Q-Q) plots for 35 traits in 143 diverse wheat accessions obtained from a genome-wide association study. Figure S2 contains principal component biplot analysis among agronomic, disease resistance, and grain quality traits in 143 diverse wheat accessions. Figure S3 contains the physical distribution of 46,268 genotyping-by-sequencing derived SNPs within a 1-Mb window size on 21 chromosomes (Chr) of 143 diverse accession of wheat. Table S1 contains details of the 143 diverse wheat accessions used in this study. Table S2 contains genotypeby-sequencing derived SNPs identified in 143 diverse wheat lines. Table  S3 contains details of significant markers associated with 35 traits in 143 diverse wheat accessions grown in two seasons (2017 and 2018) in Siberia. Table S4 contains details of potential candidate gene functions harboring SNPs affecting agronomic, disease resistance, and grain quality traits from two years (2017 and 2018) of experiments conducted in Siberia using 143 diverse wheat accessions. Table S5. Genomic fingerprinting of 143 diverse accessions of wheat showing the distribution of favorable (1), unfavorable (0) and heterozygote (0.5) alleles of significant marker-trait associations identified in this study. The synthetic and bread wheat seeds are available upon request. Supplemental material available at figshare: https://doi.org/10.25387/g3.9943682.

Phenotypic evaluation
The combined analysis of variance (ANOVA) revealed significant cross-over genotype x year interaction for most of the traits under study (Table 1). Therefore, individual ANOVA was performed and it revealed significant genetic variation for all traits (Table 1). Variation for grain yield ranged from 54 to 549 g m -2 with an average of 322 g m -2 in 2017 and it ranged from 77 to 657 g m -2 with an average of 391 g m -2 in 2018. Similarly, variation for grain protein content ranged from 14 to 22% with an average of 18% in 2017 and it ranged from 14 to 21% with an average of 17% in 2018 (Table 1).
A principal component analysis (PCA) based on correlation matrix was performed to investigate the relationship among all 35 traits ( Figure  S2). Approximately, 68.4% of the total variation in 143 lines were explained by the first four PCs. The first two PCs were predominantly associated with agronomic and grain quality traits, and it contributed $54% of the total variation in the data. Wheat rust diseases such as leaf and stem rusts represented the third PC, and it contributed 8.9% of the total variation. Similarly, root traits such as total root length and root volume were associated with the fourth PC, and it contributed 5.5% of the total variation.

Marker density and linkage disequilibrium
A total of 192,876 GBS derived SNPs were obtained based on the Chinese Spring RefSeq v1.0. After removing SNPs with MAF ,5% and missing data .20%, 46,268 SNPs were retained for subsequent analysis (Table  S2). The marker density for the A, B, and D genome were 20.2, 20.8, and 20.2 Mb per SNPs, respectively ( Figure S3).
A total of 46,268 markers were used to evaluate LD decay for the A, B, D, and the whole genome. The scatter plot of r 2 against physical distance showed that the LD decay with increasing physical distance. The average LD decayed below r 2 = 0.2 for the entire genome was 1.06 Mb where, the greatest extent of LD was observed on the D genome (3.06 Mb), followed by the B (1.71 Mb) and A (1.06 Mb) genomes ( Figure 1).

Genome-wide association study
A GWAS identified a total of 243 significant SNPs (P . 1.08 · 10 26 ) that were associated with 35 traits including agronomic, disease resistance, and grain quality traits, (Figure 2, Tables 1 and S3). These significant SNPs were distributed across all 21 chromosomes of wheat, and the phenotypic variance explained (PVE) by these SNPs ranged from 0.3 to 25.0%. The highest number of MTAs were identified on the B genome (85 MTAs) followed by the A (72 MTAs), and the D (62 MTAs) genomes of wheat. Results of specific group of traits will be discussed hereafter.
Agronomic performance traits A total of 185 MTAs were identified for the agronomic performance traits evaluated in this study (Figure 2). The MTAs were distributed across all chromosomes with PVE ranging from 0.3 to 25% (Table S3). Of the 185 MTAs, 72, 68, and 45 MTAs were detected on the A, B, and D genome, respectively. The highest number of MTAs were observed for harvest index (12) and spikelet number (12), followed by grain length (11), thousand kernel weight (10), spike length (10), and number of plants (10), while the lowest number of MTAs were observed for emergence (2) and grains per spike (2).

Disease resistance traits
A total of 42 MTAs were identified for six disease-related traits distributed on all chromosomes except for chromosomes 4B and 5D (Figure 2) with PVE ranging from 1.0 to 14.0% (Table S3). In detail, 12, five, seven, three, eight, and seven MTAs were associated with leaf rust, stem rust, powdery mildew, Septoria, leaf rust AUDPC, and stem rust AUDPC scores, respectively.

Grain quality traits
Grain protein concentration and gluten content are two grain quality traits measured in this study. A total of 16 MTAs were identified for grain quality-related traits on chromosomes 1A, 1D, 2A, 2D, 3A, 3B, 4A, 5A, 5B, and 7D ( Figure 2) with PVE ranging from 1.9 to 13.9% (Table S3). Of these, nine and seven MTAs were associated with grain protein and gluten content, respectively.

Putative candidate gene annotation
The functional annotation of genes underlying significant SNPs was identified through the IWGSC RefSeq v1.1 annotation and the impact of the genes was identified through variant effect predictor from EnsemblPlants (http://plants.ensembl.org/Tools/VEP). A total of 116 MTAs were found within genes and of these, 21 MTAs were annotated as a missense variant and had a moderate impact (Table S4). Moderate impact genes were found for several traits such as grain length, grain perimeter, heading date, leaf rust AUDPC, number of productive tillers, number of spikes, peduncle length, plant weight with roots, protein content, Septoria, spike length, spike weight, spikelet number, and thousand kernel weight. The detailed annotations of these gene-IDs are provided in Table 2.

Multi-trait and stable marker trait associations
Multi-trait MTAs are the common genomic regions controlling multiple traits. The present study identified 23 multi-trait (group affects 2-4 traits) MTAs located on chromosomes 1A, 1B, 2A, 2D, 3B, 4A, 4B, 5A, 5B, 5D, 6B, 7A, and 7D (Table 3) and of these 23 MTAs, 18 were within genes. For example, a SNP at S1A_440183259 on chromosome 1A was significantly associated with number of plants and spikes and was present in rhomboid family protein (TraesCS1A01G248200.1) gene. Also, a SNP at S2A_734585818 bp on 2A chromosome was significantly associated with grain area, length, and perimeter and was present in 70 kD heat shock protein (TraesCS2A01G506900.1) gene. Stable markers (marker consistently identified in both years) were identified for thousand kernel weight (S4A_732725628) on chromosome 4A at 733 Mb, powdery mildew (S6D_5824400) on chromosome 6D at 582 Mb, and stem rust AUDPC (S7D_29550951) on chromosome 7D at 296 Mb (Table S3).
Genomic fingerprinting and favorable alleles distribution in synthetic and bread wheat accessions Genomic fingerprinting for 243 MTAs with the distribution of favorable, unfavorable, and heterozygote alleles in 143 diverse wheat lines is shown in Figure 3 and Table S5. Favorable alleles in this study were defined as the alleles of MTAs that had an increasing effects on agronomic performance, disease resistance, and grain quality traits in wheat accessions. For grain yield and yield components, 18 out of 27 fingerprinted MTAs had favorable alleles in more than 55% of lines ( Figure 3A). Similarly, 15/38, 24/48, 20/54, 5/18, 13/42, and 5/16 fingerprinted MTAs, respectively, for grain related-, spike productivity-, phenological-, root-, disease resistance-, and grain quality-traits, had favorable alleles in more than 50% of lines ( Figure 3 and Table S5). Top five lines (2017ENTRY # 91, 94, 112, 158, and 164) recommended in this study had 92 to 102 MTAs with favorable alleles.
A wide range of distribution of favorable alleles between synthetic and bread wheat accessions was observed (Figure 4 and Table S5). For example, synthetic wheat originated from CIMMYT and Japan had favorable alleles ranging from 83 to 102 and 86 to 95, respectively. Bread wheat originated from Kazakhstan, Russia, and USA had favorable alleles ranging from 76 to 110. This study identified a total of 62 MTAs on the D genome where 42 of them are novel MTAs. Of these 42 novel MTAs, synthetic wheat originated from CIMMYT and Japan had favorable alleles ranging from 12 to 26 and 20 to 28, respectively (Table S5).

DISCUSSION
Multiple traits such as grain yield and disease related traits are complex and quantitative in nature, and are largely governed by multiple genes and environmental factors . Genetic dissection of these traits is still a challenge, however, the recent advances in sequencing technologies has provided better resolution in deciphering these traits. Furthermore, the utilization of recently published fully annotated wheat reference genome will enhance wheat genetic improvement. The current study used diverse accessions of bread and synthetic wheat to understand the genetic architecture of multiple traits that are collected in most of the breeding programs.

Phenotypic evaluation
In this study, diverse wheat accessions showed significant level of variation for agronomic, disease, and quality traits with low to high heritability. Significant cross-over genotype by year interaction was observed for most of the traits, indicating high influence of environmental variation and trait instability/complexity (meaning the genes are greatly affected by the environment). Previous studies have also identified these traits as complex traits with similar heritability (Ogbonnaya et al. 2013;Bhatta et al. , 2018bSehgal et al. 2017). The grain yield heritability ranged from 0.20 to 0.75 (Bhatta et al. 2018b;Sun et al. 2019) in previous studies was also in line with our findings. We have recommended five top performing lines with high yield, multiple disease resistance, and better grain quality for its direct use in a breeding program.

Linkage disequilibrium
A recent study on a diverse panel of 166 wheat lines from China have identified the LD decay at 6 Mb, 4 Mb, 11 Mb, and 8Mb for the A, B, D, and the entire genomes, respectively (Liu et al. 2017b). The LD decay identified in our study was lower than this recent study in wheat (Liu et al. 2017b). However, the highest LD decay on the D genome observed in our study was consistent with many previous reports (Edae et al. 2014;Sukumaran et al. 2015;Liu et al. 2017b), indicating that fewer markers are needed for GWAS on the D genome compared to either A or B genome. Relatively higher LD decay on the D genome may be attributed to the inclusion of diverse accessions of Ae. tauschii primarily from synthetic hexaploid wheat.

Genome-wide association study
Genome-wide association study was performed on the 35 traits with the use of $46,000 GBS-derived SNPs based on the multi-locus mixed linear model implemented in FarmCPU algorithm that remove the confounding between testing markers and kinship that results in false negatives, prevents model overfitting, and control false positives simultaneously (Liu et al. 2016a). The present study identified many significant SNPs (on all chromosomes) and related putative candidate genes associated with agronomic, disease resistance, and grain quality traits by employing the GWAS approach.

MTAs for agronomic performance traits
Previous GWAS and QTL (quantitative trait loci) mapping studies found QTL/MTAs for grain yield on chromosomes 3A (Neumann et al. 2011;Bordes et al. 2014;Hoffstetter et al. 2016;Sehgal et al. 2017;Bhatta et al. 2018b), 3D (Bordes et al. 2014;Bhatta et al. 2018b), 5B (Pinto et al. 2010;Neumann et al. 2011;Bordes et al. 2014; Figure 2 Significant marker-trait associations (P . 1.08 · 10 26 ) identified on each chromosome for agronomic, disease resistance, and grain quality traits obtained from the genome-wide association study of 143 synthetic and bread wheat lines grown in 2017 and 2018 growing seasons in Siberia. Edae et al. 2014;Sukumaran et al. 2015), and 6D (Bhusal et al. 2017). However, it is difficult to align our findings with previous studies due to the absence of precise physical location information in published studies, use of different marker systems (90K SNP, short sequence repeat, and diversity arrays technology marker), and different versions of wheat reference genome used than the recent IWGSC RefSeq v1.0 (Liu et al. 2017b;Bhatta et al. 2018b). Despite this fact, the identification of MTAs on the same chromosome as earlier studies could provide confidence on these associations. Since, we cannot directly compare with QTL reported by other previous studies, we have assumed the significant MTAs identified in this study as the novel marker if it was not reported on the same chromosome in previous studies. For example, to the best of our knowledge, an MTA identified for grain yield on chromosome 7D in this study has not been previously reported and it is potentially a novel MTA responsible for controlling grain yield in wheat.
Understanding the genetic architecture of grain is very important for determining the grain yield of any crop. However, there is limited study on understanding genetic architecture and complexity of grain related traits (Arora et al. 2017). This study evaluated several grain related traits and identified several associated MTAs for grain related traits such as grain length, perimeter, area, grains per spike, grain weight per spike, etc. Specifically, this study identified 17 novel MTAs associated with improving architecture traits in wheat on chromosomes 3A and 6D for grain length, on chromosomes 3A and 4A for grain perimeter, on chromosomes 1D and 3A for grain area, on chromosome 7D for grain weight per spike, on chromosomes 1B, 2A, and 3B for grain weight per plant, and on chromosomes 1B, 2B, 4B, and 5A for grain circularity. Remaining MTAs identified for grain related traits were previously reported on chromosomes 1D (Xiao et al. 2011 (Zhang et al. 2010) and 5D (Liu et al. 2006;Zhang et al. 2010) for grains per spike. This study provided better understanding of genetic architecture of grain related traits and the results of these traits will be useful in improving grain yield in wheat and related cereal crops.
Spike-related traits play an important role in determining the yield potential in wheat (Reynolds et al. 2009). This study evaluated six spikerelated traits to unravel the genetic basis of spike architecture in wheat. We identified 28 novel MTAs for spike-related traits such as number of spikes (3), spikelet numbers (6), spike density (3), spike length (7), spike weight (4), and spike harvest index (5). Remaining identified MTAs for spike productivity traits were previously reported on chromosomes 3D (Liu et al. 2006) and 7A (Guan et al. 2018) for number of spikes; on chromosomes 4D (Chu et al. 2008), 5A (Li et al. 2002), and 7A (Li et al. 2002) for spikelet numbers; on chromosomes 1B (Li et al. 2002) 3D (Chu et al. 2008), and 4B (Liu et al. 2006) Table 2 List of important genes underlying significant marker-trait associations for several traits under study whose annotation showed moderate impact and consequence as missense variant weight have rarely been studied for QTL study. All the MTAs (40) identified for these traits have not been previously reported and they are potentially novel MTAs controlling these traits. For other phenological traits such as plant height and heading date, most of the QTL identified in this study have been reported previously on chromosomes 1B, 3A, and 5A (Chu et al. 2008;Sukumaran et al. 2015;Guan et al. 2018). However, seven MTAs identified for plant height (chromosomes 1D and 4A) and heading date (chromosomes 1D, 2B, 4A, 5D, and 7A) have not been previously reported, and are potentially novel MTAs governing these traits. Root traits are complex, labor intensive, and expensive traits to measure. Limited studies have been conducted on the field based root traits analysis in wheat (Bhatta et al. 2018b) and a few QTL have been identified (Canè et al. 2014;Maccaferri et al. 2016;Bhatta et al. 2018b). The current study examined the roots traits such as root length, root volume, and root diameter after 3-4 days of flowering. This study identified nine novel MTAs associated with root traits on chromosomes 2D, 3B, 4A, 4D, 5D, 6A, and 7D. However, several MTAs identified in our study have been previously reported for root length on chromosomes 3B (Maccaferri et al. 2016;Bhatta et al. 2018b), 5A (Maccaferri et al. 2016), 6B (Maccaferri et al. 2016), 7A (Maccaferri et al. 2016;Bhatta et al. 2018b), and 7B (Maccaferri et al. 2016), and for root volume on chromosomes 5A and 7A (Maccaferri et al. 2016).

MTAs for disease resistance traits
Genetic resistance is one of the sustainable approaches for controlling disease. This study have identified several MTAs for leaf rust, stem rust, powdery mildew, and Septoria similar to previous studies, where they have identified QTL/genes for rusts on all 21 chromosomes (Rosewarne et al. 2012;McIntosh et al. 2014;Kertho et al. 2015;Jighly et al. 2016); for powdery mildew QTL on chromosomes 2B (Liu et al. 2001;Mingeot et al. 2002;Bougot et al. 2006;Bai et al. 2012), 2D (Bougot et al. 2006;Bai et al. 2012), 4A (Mingeot et al. 2002;Bougot et al. 2006;Chen et al. 2009), 5B (Bougot et al. 2006;Liu et al. 2017a), and 6D (Chen et al. 2009); and for Septoria QTL on chromosomes 1B (Chartrain et al. 2005;Tabib Ghaffary et al. 2011), 2D (Simón et al. 2004;Tabib Ghaffary et al. 2011), and 7B n■ Table 3 Multi-trait marker-trait associations with phenotypic variance explained (PVE) and SNP effect identified from genome-wide association study on 143 diverse accessions of wheat    Simón et al. 2004). However, an MTA identified for powdery mildew on the chromosome 3A has not been previously reported and it is potentially a novel MTA. Area under disease progress curve (AUDPC) for leaf rust and stem rust was measured based on the repetitive disease severity assessments and is used to assess quantitative disease resistance in wheat accessions (Jeger and Viljanen-Rollinson 2001). Two MTAs identified for leaf rust AUDPC were also reported in earlier studies on chromosomes 1B (Schnurbusch et al. 2004) and 3B (Lu et al. 2017). However, six MTAs identified for leaf AUDPC on 1A, 1D, 5A, 6A, 6B, and 6D, and all seven MTAs for stem rust AUDPC have not been reported previously and are potentially novel MTAs controlling these traits.
MTAs for grain quality traits Several MTAs identified for grain quality traits such as grain protein and gluten content have been previously reported in the literature. For instance, the previous QTL similar to our study for grain protein concentration were on chromosomes 1A (Li et al. 2013), 1D (Pushpendra et al. 2007;Li et al. 2013), 2A (Pushpendra et al. 2007;Li et al. 2013), 2D (Pushpendra et al. 2007), 3A (Pushpendra et al. 2007), 3B (Sun et al. 2008a;Heo and Sherman 2013), 4A (Pushpendra et al. 2007;Li et al. 2013), and 5B (Heo and Sherman 2013). Similarly, QTL for gluten content on chromosomes 3B (Liu et al. 2016b) and 7D (Li et al. 2013) were identified previously. However, five MTAs identified for gluten content on the chromosomes 2A, 5A, and 5B have not been reported previously and are potentially novel MTAs responsible for gluten content in wheat.

Putative candidate gene annotations
The variant effect predictor available on Ensemble was used to determine the functional consequences of the variant within 5 kb of the SNP location. In this study, 116 significant SNPs were identified within genes based on in silico candidate gene analysis. Of which, 21 SNPs were annotated as a missense variant and had moderate impact on genes. These missense variants cause an amino acid change and such changes may alter the function of genes, which makes these annotated gene as a strong candidate gene for future functional characterization studies in wheat. Some of the genes such as NBS-LRR disease resistance protein (Bhatta et al. 2018a(Bhatta et al. , 2018b(Bhatta et al. , 2018d, Protein DETOXIFICATION (Yokosho et al. 2016;Bhatta et al. 2018b), Zinc finger protein (Ma et al. 2009;Bhatta et al. 2018b), F-box protein (Lechner et al. 2006;Bhatta et al. 2018b), and Kinase family protein (Yan et al. 2017;Bhatta et al. 2018b) were also previously identified as important genes in wheat for agronomic and disease related traits (Bhatta et al. 2018a(Bhatta et al. , 2018b(Bhatta et al. , 2019a. We have identified MTAs associated with the same or associated traits located within genes whose annotation is exactly same. For example, harvest index, spike harvest index, number of plants, total root length, and grain perimeter were located within genes annotated as F-box family protein. This result indicated that these putative gene families may play an important in controlling these traits.

Multi-trait MTAs and favorable alleles
Several multi-trait MTAs identified in this study were also positively associated with each other. This result suggested that multiple traits may have similar genetic basis and simultaneous improvement of multiple traits could be possible. Additionally, most of the co-located MTAs were present in genes. We have identified several favorable alleles that increases the agronomic performance, disease resistance, and quality traits in wheat accessions. Although the number of favorable alleles distribution in different breeding program did not significantly vary from each other, alleles present in each breeding program were different for different traits, which indicates the use of diverse accessions of wheat for different breeding objectives. Genomic fingerprinting for MTAs with favorable and unfavorable allele distribution in 143 diverse accessions could assist in building strategies while integrating the useful traits in an elite breeding lines. This study have recommended top performing five lines with higher number of favorable alleles for multiple traits, indicating the potential use of these lines in a breeding program. The favorable alleles identified for multiple traits in this study could be useful for pyramiding superior alleles in elite wheat germplasm upon validation in an independent population.

CONCLUSIONS
The present study found considerable useful genetic variation for improving multiple traits and recommended five superior lines for directly use in a breeding program. A total of 243 genomic regions for improving agronomic performance, disease resistance, and grain quality related traits were identified. Of these, 120 favorable genomic regions were found to be novel MTAs, where large numbers were distributed on the D genome, indicating the usefulness of diverse lines in improving the D-genome diversity in elite wheat germplasm. Furthermore, this study identified co-located multi-trait markers within important gene families. The present study fingerprinted 143 diverse wheat lines for trait-associated markers and provided a detailed understanding of genetic architecture of agronomic, disease resistance and quality-related quantitative traits using diverse accessions of synthetic and bread wheat lines. However, further investigation on identified genomic regions could assist multi-trait marker-assisted breeding program.

ACKNOWLEDGMENTS
International Maize and Wheat Improvement Center (CIMMYT) at Turkey is supported by CRP WHEAT; Ministry of Food, Agriculture and Livestock of Turkey; Bill and Melinda Gates Foundation, and UK Department for International Development, grant OPP1133199, and Omsk State Agrarian University is supported by the Russian Science Foundation Project No. 16-16-10005. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.