Root trait diversity, molecular marker diversity, and trait-marker associations in a core collection of Lupinus angustifolius

Highlight
Roots in narrow-leafed lupin (Lupinus angustifolius) exhibit large phenotypic and genetic diversity. An association between root traits and DArT markers demonstrates potential for a marker-assisted selection programme for this species.


Introduction
Narrow-leafed lupin (Lupinus angustifolius L.) has been a predominant grain legume crop and an important component of sustainable farming systems in southern Australia since its domestication was completed in Western Australia in the 1960s and 1970s (Gladstones et al., 1998; Buirchell, 2008; Berger et al., 2012a). It makes up about 50% of the total grain legume production in Australia, with approximately 80% produced in Western Australia (French and Buirchell, 2005; ABARES, 2014). As a legume species, narrow-leafed lupin provides substantial benefits to farming systems via the symbiotic fixation of nitrogen from the air and by acting as a disease break crop in cereal rotations. Although lupins grown in Australia are primarily used as animal feed, they also provide health benefits to humans, being gluten-free, high in protein, and low in fat and carbohydrates (Coffey, 1989; WADAF, 2007; Berger et al., 2013).
Despite their economic, agricultural, and dietary importance, grain yields and planting areas of narrow-leafed lupin have declined globally in the last decade (FAO, 2013) because of low productivity and poor market value. Unlike wheat (Triticum aestivum), barley (Hordeum vulgare), and other dominant crops in Mediterranean regions, current commercial cultivars of narrow-leafed lupin incorporate only a small fraction of the genetic diversity of the species because of its short and fragmented domestication history (Zohary, 1999; Berger et al., 2013). However, the established germplasm pool of L. angustifolius at the Australian Lupin Collection, which comprises 2056 accessions (1327 wild, 214 cultivars, 431 advanced lines, 22 landraces, and 62 mutated lines) from diverse climatic and geographic locations, provides a broad genetic basis to improve crop breeding in this species (Clements and Cowling, 1991; Berger et al., 2013). In that respect, studies on phenotypic diversity and its relationship with genetic diversity in the wild germplasm offer opportunities for discovering unexploited traits that can be used to increase the yield of Australian cultivars across a range of environmental conditions. Diversity in root system architecture (RSA) across a substantial subsample of the world collection of narrow-leafed lupin was characterized in a recent study (Chen et al., 2012). However, how the phenotypic diversity reflects the genetic diversity remains unknown. There is increasing interest in a genetic analysis of RSA and function, although the focus on the links between genes and root traits is primarily on the effects of genes that directly mediate small-scale phenomena (e.g. Little et al., 2005; Bengough et al., 2006; Cai et al., 2012; Canè et al., 2014; Mlodzinska et al., 2015).
One of the most effective approaches to dissecting complicated quantitative traits is an analysis of quantitative trait loci (QTL), which helps to identify specific genes responsible for trait variation (Beebe et al., 2006; Weih et al., 2006; Acuna et al., 2014; Burton et al., 2014). Several types of molecular markers, such as Diversity Arrays Technology (DArT), simple sequence repeats, amplified fragment length polymorphisms, single nucleotide polymorphisms, and sequence-tagged sites, have been developed for analysing genetic diversity in various crop species (e.g. Kwon et al., 2012; El-basyoni et al., 2013; Aitken et al., 2014; Maccaferri et al., 2015; Moore et al., 2015; Uga et al., 2015; Zurek et al., 2015). The development of microarray hybridization-based technology, such as DArT, provides a useful tool to identify DNA variation at hundreds of genomic loci in parallel regardless of the sequence information (Jaccoud et al., 2001). Array-based marker technology permits the detection of population structure and relative kinship within collections (Wenzl et al., 2004; Maccaferri et al., 2015). In L. angustifolius, DArT marker analysis revealed the low genetic diversity present in the domesticated forms (Berger et al., 2012b), highlighting the need to identify and exploit useful diversity in the wild germplasm. The present study analysed genetic diversity in a core collection of wild narrow-leafed lupin using DArT markers and revealed correlations between root trait diversity (phenotype) and genetic diversity (molecular markers).

Plant material and phenotyping
A set of 111 accessions of narrow-leafed lupin (L. angustifolius), consisting of 108 wild genotypes, one landrace, and two cultivars from 13 countries and four regions (Supplementary Table S1), was evaluated for phenotypic diversity in RSA traits (Table 1). They were studied under glasshouse conditions in Perth (31°58′S, 115°49′E) using a recently developed semi-hydroponic phenotyping system (Chen et al., 2011a, 2012). Detailed plant growth conditions, measurements, and calculations are described in Chen et al. (2012). Root parameters were measured 6 weeks after planting. Taproot lengths were measured 2, 4, and 6 weeks after planting, and root growth rates (RGR) were calculated and classified according to incremental increases in taproot length within a given growth period. Root subsamples were scanned in greyscale using a desktop scanner (Epson Expression 1680; Epson, CA, USA) and root images were processed in WinRHIZO v2009 Pro (Regent Instruments, QC, Canada) for root length, root surface area, volume, average root diameter, and diameter class length (DCL, root length in a diameter class). The upper 0−20 cm section of the root system (measured from the plant collar) was referred to in this study as 'topsoil' and the lower part as 'subsoil'. There were 38 root traits included in this study, 17 of which had not been reported previously (Chen et al., 2012).

DArT genotyping
A set of 191 DArT markers, including 37 mapped markers, was included in the assay (Berger et al., 2012b; Kroc et al., 2014). Genotyping was performed by Diversity Arrays Technology Pty Ltd (Canberra, Australia) using the protocols described by Kilian et al. (2012). Briefly, DNA samples of each genome were subjected to the PstI/BanII complexity reduction method (Jaccoud et al., 2001). Fluorescent nucleotides were used to label the resulting genomic representations, which were hybridized on a microarray printed with the DArT clones. Following hybridization and washing, the microarrays were scanned for analyses. The DArT markers were scored as either '1' (fragment present) or '0' (fragment absent).

Statistical analysis of trait data
IBM SPSS Statistics (Version 19, IBM Corp., Armonk, NY, USA) was used to analyse root trait data for genotype main effects with a general linear model (GLM) multivariate analysis after identifying non-significant differences between bins and harvesting times (Chen et al., 2012). Descriptive statistics were computed for each trait across all genotypes in the same software. The coefficient of variation (CV) was calculated by dividing the SD by the mean value. Pearson correlation coefficients (r) were used to determine the general relationship between root trait pairs (P ≤ 0.05) and to generate an agglomerative hierarchical clustering (AHC) dendrogram. Variability in root traits across genotypes was determined by principal component analysis (PCA; Jolliffe, 2002). Rotation converged in 30 iterations using Varimax with the Kaiser Normalization method; principal components (PCs) with eigenvalues >1.0 were considered significant (Tabachnick and Fidell, 1996).
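The two descriptive statistics named above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study: the function names are hypothetical, and the use of the sample (n − 1) SD is an assumption.

```python
import math
import statistics


def coefficient_of_variation(values):
    """CV as a proportion: sample SD (n - 1 denominator, an assumption) / mean."""
    return statistics.stdev(values) / statistics.mean(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long trait vectors."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical trait values for four genotypes (not data from the study).
root_length = [50.0, 60.0, 70.0, 80.0]
branch_number = [10.0, 14.0, 15.0, 21.0]
print(coefficient_of_variation(root_length))
print(pearson_r(root_length, branch_number))
```

A CV above 0.5, the threshold highlighted in Table 1, means that the SD exceeds half of the trait mean.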

Marker diversity analysis
The polymorphic information content (PIC) value indicates the informativeness of a marker locus or marker system. PIC was determined from P_i, the frequency of the ith allele in the examined genotypes (Weir, 1990). PIC values and the marker present frequency of each DArT marker were computed in PowerMarker 3.25 (Liu and Muse, 2005). The quality parameter Q for each marker was calculated by dividing the variance of the hybridization level for the marker between the two clusters (i.e. present and absent) by the total variance of the hybridization level of the marker.
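As a hedged illustration of these two marker statistics, the sketch below assumes the Botstein et al. form of PIC. The formula is not reproduced in the text, so this is an assumption; under it a two-state marker peaks at 0.375, which matches the upper bound reported in the Results. The Q function reads "variance between clusters over total variance" as a ratio of sums of squares, which is also an assumption. All function names are hypothetical.

```python
import statistics


def pic_botstein(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed form)."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross_term


def marker_quality(present, absent):
    """Q = between-cluster sum of squares / total sum of squares of the
    hybridization level, for the 'present' and 'absent' clusters."""
    all_values = list(present) + list(absent)
    grand_mean = statistics.fmean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = (
        len(present) * (statistics.fmean(present) - grand_mean) ** 2
        + len(absent) * (statistics.fmean(absent) - grand_mean) ** 2
    )
    return ss_between / ss_total


# A marker present in 41% of genotypes (the average frequency reported).
print(pic_botstein([0.41, 0.59]))
```

Q approaches 1 when the two clusters are tightly separated and falls towards 0 when the hybridization signal overlaps between presence and absence calls.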

Population structure analysis
The genetic diversity structure of the 111 genotypes was analysed using a distance-based method (Schlüter and Harris, 2006) and a model-based approach (Pritchard et al., 2000). Principal coordinate analysis (PCoA) was performed on Jaccard similarity matrices in FAMD 1.25 (Schlüter and Harris, 2006). Two-dimensional scores were calculated and used to produce scatter plot matrices of scores. Jaccard's similarity coefficient is defined as:

$$J_{ij} = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$$

where $n_{xy}$ denotes the number of markers for which the indicated combination of character states is found for a pair of samples i and j. Character states are band presence (1), band absence (0), and missing data (?). Jaccard's coefficient was used for the clustering analysis with the neighbour-joining (NJ) method. Members (accessions/genotypes) in subgroups were identified using a model-based approach for dominant DArT markers implemented in the STRUCTURE software (Pritchard et al., 2000). We used an admixture co-ancestry model with independent and correlated allele frequencies and a burn-in time of 50 000. The number of Markov Chain Monte Carlo replications after burn-in was set at 100 000 (Pritchard and Wen, 2004), with a K (number of populations) of up to 15 on the entire dataset (111 genotypes). The software provides the likelihood (the posterior probability) of the data for a given number of assumed populations K, and the value of K with the highest likelihood can be interpreted as an estimate of the underlying number of clusters. An ad hoc quantity (ΔK) based on the rate of change of the log probability of data between successive K values was used to determine the best K (Evanno et al., 2005):

$$\Delta K = \frac{m\left(\left|L(K+1) - 2L(K) + L(K-1)\right|\right)}{s\left[L(K)\right]}$$

where $L(K)$ is the log probability of the data for a given K, m denotes the mean over replicate runs, and s denotes the SD of $L(K)$ across runs. The relative kinship (K_r) estimate (Ritland, 1996) approximates identity by descent by adjusting the probability of identity by state between two individuals using the average probability of identity by state between random individuals (Yu et al., 2006). Using the best K, STRUCTURE computed a pairwise matrix, the allele-frequency divergence (i.e. the net nucleotide distance, δ), which was used to construct a phylogenetic tree topology according to an NJ method (Saitou and Nei, 1987) in MEGA 2.1 (Tamura et al., 2011). The analysis of molecular variance (AMOVA) was performed using standard Jaccard's coefficients and a distance transformation ($d = 1 - s$) to identify significant differences among populations and within populations (Excoffier et al., 1992).

Table 1. Mean, median, and CV values for each trait are given. CV values >0.5 are in bold. Number of significantly correlated traits at P < 0.05 is given for each trait according to Pearson correlation coefficient analysis. Branch length and number refer to first-order branches unless specified. Topsoil = 0−20 cm depth; subsoil = 20−120 cm depth.
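The Jaccard coefficient used for the PCoA and NJ clustering can be sketched for a pair of presence/absence profiles as follows. This is a minimal illustration, not the FAMD implementation: skipping markers with missing data in either sample is an assumption, and `None` stands in for the '?' state.

```python
def jaccard_similarity(sample_i, sample_j):
    """Jaccard similarity n11 / (n11 + n10 + n01) for two DArT score vectors.

    Scores are 1 (present), 0 (absent), or None (missing). Markers missing
    in either sample are skipped (an assumption); shared absences (0, 0)
    do not contribute to the coefficient.
    """
    n11 = n10 = n01 = 0
    for a, b in zip(sample_i, sample_j):
        if a is None or b is None:
            continue
        if a == 1 and b == 1:
            n11 += 1
        elif a == 1 and b == 0:
            n10 += 1
        elif a == 0 and b == 1:
            n01 += 1
    informative = n11 + n10 + n01
    if informative == 0:
        raise ValueError("no informative markers shared by the two samples")
    return n11 / informative


# Hypothetical six-marker profiles for two accessions.
acc_a = [1, 1, 0, 0, None, 1]
acc_b = [1, 0, 1, 0, 1, 1]
similarity = jaccard_similarity(acc_a, acc_b)
print(similarity, 1 - similarity)  # similarity and a simple distance 1 - s
```

Because shared absences are ignored, two accessions that both lack many fragments are not counted as similar on that basis, which suits dominant markers where absence can have several causes.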
Shannon's index of diversity (H′), its variance, and SD were calculated to measure the diversity of populations in the core collection (Shannon, 1948):

$$H' = -\sum_{i=1}^{s} \frac{n_i}{N} \log \frac{n_i}{N}$$

where s is the number of populations observed, $n_i$ is the number observed from the ith population, and N is the total number of individuals observed in the sample. P-values of t-tests based on H′ and variances were computed using both Bowman's and the bootstrap (10 000 times) method (Bowman et al., 1969). To further assess the existence of a genetic structure between identified clusters, pairwise F_st values were estimated (Weir and Cockerham, 1984). The total population used to calculate Φ_ST between Pop1 and Pop2 corresponded to Pop1 + Pop2 as identified by STRUCTURE, and significance was tested by permutation. The F_st values range from 0 to 1. A zero value implies complete panmixia (the two populations are interbreeding freely), whereas a value of 1 implies that all genetic variation is explained by the population structure (the two populations do not share any genetic diversity).

Fig. 2. Dendrogram of AHC using the Pearson correlation coefficient on 38 root traits in XLSTAT (v2013.1). The 111 genotypes were assigned into one of six general groups (G1 to G6) at 0.75 similarity level (upper dashed line) containing 16 subgroups at 0.9 similarity level (lower dashed line). For more details on genotypes see Supplementary Table S1. This figure is available in colour at JXB online.
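Shannon's index, as defined in terms of the counts n_i and the total N, can be sketched as below. The natural logarithm is an assumption because the text does not state the base, so the base is exposed as a parameter; the function name is hypothetical.

```python
import math


def shannon_index(counts, base=math.e):
    """H' = -sum((n_i / N) * log(n_i / N)) over categories with n_i > 0.

    The logarithm base is an assumption (natural log by default).
    """
    total = sum(counts)
    return -sum(
        (n / total) * math.log(n / total, base) for n in counts if n > 0
    )


# Hypothetical category counts (not data from the study).
print(shannon_index([5, 5, 5, 5]))   # four equally frequent categories
print(shannon_index([17, 1, 1, 1]))  # one dominant category, lower H'
```

H′ is largest when individuals are spread evenly across categories and falls towards zero as one category dominates.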

Trait-marker association analysis
A mixed linear model (MLM) association test of root traits incorporating population structure (Q) and relative kinship (K_r) matrices was performed using the TASSEL (v. 2.1) software package (Yu et al., 2006; Bradbury et al., 2007). We also performed GLM (Bradbury et al., 2007) and structured association (SA) (Thornsberry et al., 2001) analyses with the same data, incorporating population structure information as a covariate and using 1000 permutations for the correction of multiple testing. Given that the MLM method performs better in controlling spurious associations (Yu et al., 2006; Aulchenko et al., 2007), we first ranked significant associations from the MLM (P ≤ 0.05) and then compared the significance of these markers (P ≤ 0.05) in the permutation-based GLM and SA association tests.

Fig. 3. PCoA of 111 L. angustifolius genotypes based on 191 DArT markers. The graphs show the position of each accession in the space spanned by coordinate 1 versus coordinate 2 (a), and coordinate 1 versus coordinate 3 (b) of a relative Jaccard similarity matrix with FAMD. For root trait notations see Table 1.

Root trait variation and correlations
A total of 38 root traits, including 17 previously not described, were obtained from the phenotyping experiment (Chen et al., 2012; Table 1). The number of traits significantly correlated to an individual trait ranged from 12 to 34 at P ≤ 0.05 (Table 1; correlation matrix not shown). To account for these correlations, multivariate traits were constructed using PCA, resulting in nine components (PCs) with eigenvalues >1 (Supplementary Fig. S1). The number of root traits allocated to an individual PC varied from 1 to 14, with PC4 containing only root growth rate at 0−2 weeks (RGR_2wk), and PC1 having 14 root traits including branch length (BL), branch number (BN), root length (RL), and root mass (RM) (Table 1). The scree plot of the PCA showed the total variance explained by each component. Nine components accounted for 90.7% of the variance (Supplementary Fig. S1). Among these, the first three components (PC1, PC2, and PC3; Fig. 1) represented 41.1%, 16.1%, and 12.5% of the variance, respectively, together explaining 69.7% of the variance.

Phenotypic diversity among the collection
An AHC similarity dendrogram constructed with the Pearson correlation coefficients of root trait data showed a large diversity in root architecture traits among the core collection (Fig. 2). Six general groups of genotypes with relatively homogeneous root traits were identified at a similarity level of 0.75. The number of group members (genotypes) varied widely among groups. The smallest group (G2) contained two genotypes whereas the largest group (G4) consisted of 64 genotypes. At a similarity level of 0.9, groups G1, G3, G4, and G6 were further divided into two, four, five, and three subgroups, respectively. The grouping outcomes for genotypic variability and similarity in root traits did not reflect geographic origin (cf. Supplementary Table S1).

DArT marker variation
A total of 191 DArT markers were polymorphic among the 111 accessions. The present set of DArT markers contained between 0 and 28% missing observations. The PIC values of these markers varied from 0.086 to 0.375, with an average PIC value of 0.330 (Table 2). Marker present frequency of each marker ranged from 0.11 to 0.86 with an average of 0.41 (data not shown).

Genetic diversity in the collection
The genetic diversity of the 111 genotypes was assessed by PCoA using 191 DArT markers. PCoA identified 65 principal coordinates with positive eigenvalues, including 28 with values >1, indicating large diversity in the collection. The generated Jaccard similarity matrix was used to construct principal coordinate plots deciphering the genetic relationships among the genotypes. The first two principal coordinates derived from the scores jointly explained 23.3% of the total variance (Fig. 3). NJ tree topology constructed on the basis of the inter-individual genetic similarity (Jaccard's coefficient) against 191 DArT markers showed a clear separation for most of the genotypes, suggesting significant diversity in this collection of accessions (Fig. 4).

Population structure of the collection
The genetic relationship among the 111 genotypes was analysed based on the DArT dataset. The true number of groups (populations) was determined using the admixture model in STRUCTURE, and the calculation of an ad hoc statistic (ΔK) resulted in 10 distinct populations (Fig. 5). Differences among the 10 populations and within populations were significant based on AMOVA (P < 0.001) (Table 3). The genetic distances among populations were illustrated by an NJ tree using Jaccard's coefficient (Fig. 4), and were consistent with the analysis using the allele-frequency divergence (net nucleotide distance, δ; data not shown). The average distance between individuals within each population ranged from 0.172 (Pop10) to 0.301 (Pop06) (Table 4). The size of each population varied between 3 (Pop08) and 28 (Pop09) genotypes, and each population consisted of collections from two to eight countries of origin. The two Australian cultivars in population 9 (Pop09) had a close genetic relationship with 26 other genotypes originating from Spain (12), Greece (7), Morocco (2), Germany (2), Belarus (1), France (1), and Italy (1). Forty-one genotypes from Spain were grouped into eight populations, indicating large diversity even within the same country of origin. There was no clear correlation between genetic relationship and geographic origin.
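The ΔK criterion used to choose the number of populations can be sketched as follows, assuming that replicate STRUCTURE runs are available for each K. The number of replicates is not stated in the text, and the values below are hypothetical. The sketch takes the second difference of the mean log probabilities, a common simplification of the Evanno et al. statistic.

```python
import statistics


def delta_k(lnp_runs):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_runs maps K to a list of log probabilities L(K) from replicate runs.
    delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / SD of L(K).
    K values whose replicate runs have zero SD are skipped.
    """
    mean_l = {k: statistics.fmean(runs) for k, runs in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 in mean_l and k + 1 in mean_l:
            sd = statistics.stdev(lnp_runs[k])
            if sd > 0:
                second_diff = mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1]
                result[k] = abs(second_diff) / sd
    return result


# Hypothetical log probabilities from two replicate runs per K.
runs = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-50.0, -52.0],
    4: [-49.0, -51.0],
}
scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The statistic is undefined for the smallest and largest K tested, so the range of K explored (up to 15 here) needs to extend beyond the expected number of populations.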

Genetic variation among populations
Shannon's index of diversity (H′) and the associated variance and SD (Table 5) were computed for the 10 populations identified by STRUCTURE. Both Bowman's and bootstrap methods generated similar H′ values for specific populations and geographic regions. However, Bowman's method produced larger variances for each category than the bootstrap method (Table 5). Pop06 had the lowest H′ value with the largest variance among all populations in both analyses. In contrast, Pop02 had the highest H′ value with the smallest variance. Variations in Shannon's index of diversity were related to the size of populations, reflecting variation in the geographic locations of each population (Tables 4 and 5). T-test analyses on population data showed significant differences between most population pairs (P ≤ 0.05; Table 6). For example, Pop01 significantly differed from all other populations (seven populations at P ≤ 0.01 and one population at P ≤ 0.05) except for Pop10. Pop03 was significantly different from Pop01 and Pop06 (P < 0.01) and Pop10 (P ≤ 0.05), but did not statistically differ from the other six populations. Population-population distances based on the Bayesian method ranged from 0.077 (Pop03 versus Pop05) to 0.146 (Pop02 versus Pop03), indicating varied genetic relationships among populations (Supplementary Table S2). The genetic structure between identified populations was further assessed using pairwise F_st analysis. The F_st values estimated for population pairs ranged from 0.129 to 0.398, confirming pronounced genetic differentiation among populations (Supplementary Table S3).

Trait-marker association
The MLM association test of root traits revealed associations between root traits and DArT markers. All 38 root traits showed significant (P ≤ 0.05) associations with DArT markers, while the number of markers associated with an individual trait ranged from 2 to 13 (Table 7). At a significance level of 0.01, 30 traits were associated with one to four marker(s) (Tables 7 and 8). Of these, the branch number topsoil to subsoil ratio (BNR) was associated with four markers (lPb−328947, lPb−329087, lPb−329141, and lPb−332488) (Table 8), and average root diameter (RD) was associated with four different markers. Thirty of the 191 markers showed a significant association with root traits (α = 0.01). Among them, 16 were associated with multiple traits (two to eight), whereas each of the remaining 14 was associated with a single trait. Marker lPb−333104 had the highest association with root traits, including branch density (BD), branch number (BN), subsoil branch length (BL_sub), and root length in diameter class <0.75 mm (DCL_thin) (Table 8). The percentage of phenotypic variation explained by a marker (Marker R²) ranged from 6.4% (branch length topsoil to subsoil ratio, BLR) to 21.8% (root tissue density, RTD), with 15 associations having Marker R² values >10%. Genetic variation values ranged from 0 to 7994, with 23 associations having values >240. A wide range of values was observed for residual variation (0−17 897).

Discussion
A wide genetic diversity in a range of root traits was identified in a collection of narrow-leafed lupin (L. angustifolius) comprising 108 wild types from around the world (Table 1; Fig. 4). Exploiting the diverse genetic and adaptive resources of this species is critical for its future (Berger et al., 2013) because the production of narrow-leafed lupin in Australia is hampered by terminal drought and a range of subsoil constraints (e.g. soil compaction, acidity, and aluminium toxicity; Turner and Asseng, 2005). These constraints limit root growth into deep horizons and thus restrict root access to water and nutrients (Adcock et al., 2007;Chen et al., 2014).
Although the present study focused on characterizing genetic diversity in root traits, additional above-ground traits were measured in the phenotyping experiment. These included leaflet number, shoot height, shoot dry mass, total dry mass, the ratio of root dry mass to shoot dry mass, and the ratio of root dry mass to total dry mass (Chen et al., 2012). Pearson correlation analysis revealed a strong correlation (mostly at P < 0.01) between 15 root traits (e.g. root length, branch length, branch number, specific root length, and root tissue density) and a number of above-ground traits (e.g. leaflet number and shoot dry weight) (Chen et al., 2012). RSA critically influences foraging and the capture of water and nutrients, and it thus determines crop productivity (Lynch, 1995). Studies have flagged root length, branching at depth, and seminal root angle as key traits likely to underpin further increases in the yield of crops such as wheat. An increased capacity to take up water from deep soil horizons has been linked to increased yield potential in sugar beet (Beta vulgaris) (Ober et al., 2005; Lynch and Wojciechowski, 2015); a similar connection was made for wheat in western and southern Australia (Wong and Asseng, 2006; Manschadi et al., 2010) and rice (Oryza sativa; Kondo et al., 1999; Kamoshita et al., 2000). Recently, we observed better performance in 2 of 10 selected wild L. angustifolius genotypes when compared with local cultivars at a Western Australian farm with subsoil compaction (Chen et al., 2014). Specifically selecting for improved root traits, such as root proliferation at depth, may result in yield increases, especially in drier soil conditions. This is particularly important because attempts to increase root density at depth using agronomic approaches (e.g. deep fertiliser placement and deep ripping) have been largely unsuccessful (e.g. Baddeley et al., 2007).
Therefore, it may be possible to improve the ability of lupin genotypes to adapt to subsoil constraints by selecting for proxy root traits from new and exotic germplasm sources. The subset of the world collection of L. angustifolius evaluated in this study exhibited large phenotypic and genetic diversity in a range of root traits (Table 1). Genetic material from a wide latitudinal range, involving 108 wild types, was used in our study to ensure the identification of genotypic variability in various RSA traits. Large morphological diversity in relation to geographical origins has been observed previously in narrow-leafed lupin accessions from the western Mediterranean (Gladstones and Crosbie, 1979) and Aegean (Clements and Cowling, 1994) regions. Crop cultivars with proxy RSA traits may have improved desirable agronomic traits such as yield, drought tolerance, and resistance to nutrient deficiencies (Tuberosa et al., 2002; Beebe et al., 2006; Steele et al., 2007). Developing high-throughput screening techniques for accurate and efficient phenotyping is critical for characterizing root-related traits in a wide-scale germplasm pool (De Dorlodot et al., 2007). We have recently established a novel semi-hydroponic phenotyping system to determine genetic variation in intrinsic RSA in the world collection of narrow-leafed lupin. Based on the results of a glasshouse phenotyping experiment (Chen et al., 2012), 10 genotypes with contrasting root characters were further examined in two different types of soils (Chen et al., 2011b) and in the field (Chen et al., 2014). There was relatively consistent ranking of genotypes between the two separate phenotyping experiments, and between phenotyping experiments and two different soil media in the glasshouse and the field (Chen et al., 2011b, 2012, 2014). Eco-geographical studies and field phenotyping of above-ground traits have been conducted previously (Clements and Cowling, 1994).
Because root phenotypic data reported here were obtained from the phenotyping experiment under carefully controlled environmental conditions, field phenotyping of the same set of the lupin collection for root traits is required to explore the potential gene-by-environment interactions. The genotypic variability in root traits and potential traits of interest identified in our glasshouse phenotyping experiment form a basis for field study.
This study used a set of DArT markers for genetic analysis and demonstrated a high level of polymorphism and high quality as assessed by the call rate, scoring reproducibility, and PIC values of these markers (Table 2). Genetic markers with high-level polymorphism are critical for use in fingerprinting and marker-assisted selection (MAS) programmes (Smith et al., 2000; Mace et al., 2008). Diversity arrays have been widely used for rapid and economical genotyping of any genome or complex genomic mixture (Jaccoud et al., 2001; Akbari et al., 2006). The DArT markers used in this study comprised 37 markers mapped on the genome of narrow-leafed lupin (Table 8). Marker technology is developing rapidly and future research will be able to incorporate 50 000 DArTseq markers (Matthew Nelson, unpublished data).
Our study showed significant correlations between root traits and molecular markers using genome-wide association analysis (Tables 7 and 8). These results have a potential application in the selection of suitable root traits for targeted edaphic environmental adaptation. Short regions of conserved synteny between L. angustifolius and two model legume species (Medicago truncatula and Lotus japonicus) have been identified (Nelson et al., 2006, 2010; Kroc et al., 2014), and a low-density survey sequence of the L. angustifolius genome was described with a small proportion of scaffolds and large-insert library clones assigned to linkage groups (Lesniewska et al., 2011; Yang et al., 2013). An improved reference genetic map of L. angustifolius comprising 1475 primarily gene-based marker loci was recently reported (Kamphuis et al., 2015). The recent progress in genome mapping in narrow-leafed lupin provides useful tools for MAS and QTL cloning for RSA in wild L. angustifolius by exploiting genomic resources, candidate genes, and the knowledge gained from model species, particularly Arabidopsis (Sergeeva et al., 2006), M. truncatula and L. japonicus (Choi et al., 2004; Nelson et al., 2010), rice (Horii et al., 2005; Steele et al., 2007), and maize (Zea mays) (Giuliani et al., 2005). Combining phenotypic data of RSA features and genetic marker/QTL analysis will enable us to explore the inheritance of RSA traits in narrow-leafed lupin and to identify proxy traits, such as deeper roots and lateral root proliferation at depth, for enhancing adaptation to different edaphic environments, particularly drying soil conditions.

Supplementary data
Supplementary data are available at JXB online.
Table S1. Breeding status and country of origin of 111 L. angustifolius genotypes used in this study.
Table S2. Population-population distances: chord distance from allele-frequency estimates based on the Bayesian (non-uniform prior from among-population information) method (FAMD).
Table S3. Estimates of pairwise F_st values for populations based on random allelic permutation testing of the DArT dataset (P < 0.01).
Figure S1. Scree plot of the PCA of all 38 root traits across 111 genotypes of L. angustifolius showing the total variance explained for each component (PC).

Trait-marker association was performed with an MLM model incorporating population structure (Q-matrix) and kinship (K_r) in TASSEL 2.1. Marker R² is the percentage of phenotypic variation explained by the marker. Only significant (α = 0.01) trait-marker associations were included. Each trait is assigned to one of the nine PCs based on PCA with eigenvalues >1. The number of DArT markers found for each trait at α = 0.01 and 0.05 is presented.