Genome-wide association mapping in a sweet cherry germplasm collection (Prunus avium L.) reveals candidate genes for fruit quality traits

Abstract In sweet cherry (Prunus avium L.), large variability exists for various traits related to fruit quality. There is a need to discover the genetic architecture of these traits in order to enhance the efficiency of breeding strategies for consumer and producer demands. With this objective, a germplasm collection consisting of 116 sweet cherry accessions was evaluated for 23 agronomic fruit quality traits over 2–6 years, and characterized using a genotyping-by-sequencing approach. The SNP coverage collected was used to conduct a genome-wide association study using two multilocus models and three reference genomes. We identified numerous SNP–trait associations for global fruit size (weight, width, and thickness), fruit cracking, fruit firmness, and stone size, and we pinpointed several candidate genes involved in phytohormone, calcium, and cell wall metabolisms. Finally, we conducted a precise literature review focusing on the genetic architecture of fruit quality traits in sweet cherry to compare our results with potential colocalizations of marker–trait associations. This study brings new knowledge of the genetic control of important agronomic traits related to fruit quality, and to the development of marker-assisted selection strategies targeted towards the facilitation of breeding efforts.


Introduction
Sweet cherry (Prunus avium L.), a diploid species (2n = 2x = 16 chromosomes), is one of the most economically important perennial fruit species growing in temperate regions [1,2].The world production of sweet cherries has increased in the last two decades to reach 2.5 million tons per year (FAOSTAT, http://faostat.fao.org/, 2022).The top producers are Turkey, USA, Chile, China, and Iran, and production in Europe is dominated by Italy and Spain, followed by France.However, the general trend in cherry production in France has been declining markedly since the 1980s, from 112 000 in 1980 to 35 000 tons per year nowadays, partly due to climate change impacts, the general drop in agricultural production, and social change, as well as increased biotic pressure from pathogens such as the fruit f ly Drosophila suzukii [3].
For consumers, important sweet cherry quality traits are large fruit, light or dark red skin color, the balance between sugar and acid content, and f lesh firmness and low cracking susceptibility [4].In addition, for producers, resistance to fruit cracking due to rainfall in spring is particularly important as it may be responsible for high production losses [5][6][7][8].However, the creation of new sweet cherry cultivars from traditional breeding is still a long process, because of the extended juvenile phase, the size of the area required for evaluation, and complex polygenic traits [9].
However, QTL detection in biparental populations can suffer from limitations [22] and success in finding QTLs depends on several elements, such as the heterozygosity level of the two parents of the progeny and the phenotypic variability of the trait studied between the two parents.Fortunately, several studies with the use of many years of data have allowed a significant reduction of QTL confidence intervals for traits of interest in sweet cherry [11,19].Genome-wide association study (GWAS) is complementary to QTL mapping and includes the genetic diversity of individuals with little or no familial relatedness [23].However, association mapping has some crucial factors to be accounted for, such as the need for high-density genotyping data, the level of linkage disequilibrium (LD), the phenotypic variability of the individuals represented within the population, the population structure and kinship, the complexity of the trait, and the inclusion of rare alleles, all conditioning the detection of marker-trait associations [24,25].In sweet cherry, GWAS was recently implemented to characterize the genetic architecture of fruit-size-related traits, which confirmed the previous locus involved in fruit firmness on chromosome 4 detected by linkage mapping [26].
New approaches, such as genotyping-by-sequencing (GBS), allow the sequencing of many samples at the same time and can be used to obtain high-density sets of SNPs at a lower cost [27].During recent years, the usefulness of GBS in perennial tree fruit species has been well documented [28][29][30][31], including the creation of a high-density linkage map in Prunus species [32,33] and GWAS in peach [34,35].These GBS approaches in Prunus species have been made possible thanks to the availability of peach [36] and sweet cherry [37,38] reference genomes.
In the present work, we report the results of a GWAS for fruit quality traits in a germplasm collection of sweet cherry accessions genotyped using the GBS approach.Specifically, the main objectives of this study were to (i) examine the existing variability of 23 fruit quality traits within a sample of 116 genotypes of sweet cherry, (ii) identify genome-wide SNP markers associated with these traits by GWAS using three reference genomes and two mixed linear models, (iii) compare our results with those previously reported, and (iv) investigate the putative functions, based on searches for orthologs, of the candidate genes associated with these SNPs.

Phenotypic variation of sweet cherry fruit quality traits
For each trait, descriptive statistics, including minimum and maximum values, means and standard deviations, are given in Table 1.A high phenotypic variation was observed for most of the 23 fruit quality traits in the 116 accessions.For instance, concerning fruit-related traits, fruit weight (FW) varied from 1.70 to 16.61 g depending on the year and the accession.On the contrary, pH had a low range of variation, from 2.93 to 4.04 in 2015 (Table 1).As expected, strong positive correlations were found between fruitrelated traits [FW, fruit height (FH), fruit thickness (FT), and fruit width (FWi)], and extended from 0.88 to 0.98 (Fig. 1).The same was true between stone-related traits [stone weight (SW), stone height (SH), stone thickness (ST), and stone width (SWi)], which were positively correlated (0.49-0.88).Fruit-related traits were also positively correlated with stone-related traits, indicating that a bigger fruit tends to have a bigger stone.Fruit-cracking-related traits were slightly positively correlated (from 0.36 to 0.49); on the contrary, pH tended to be negatively correlated with both fruitand stone-related traits.
Broad-sense heritability (H 2 ) was high for most of the traits (Table 1).For instance, H 2 varied from 0.81 to 0.92 for traits related to fruit size (FW, FH, FWi, and FT).Cracking-related traits, however, tended to be less inheritable (between 0.36 and 0.64), as were juice-related traits (between 0.31 and 0.75).The least inheritable trait was fruit suture (FSU), with H 2 = 0.29.Some traits did not show a normal distribution (Supplementary Data Fig.S1).For example, cracking-related traits had a left-skewed distribution, and both fruit juice color (FJC) and fruit pistil end (FPE) had a nonnormal distribution.

SNP discovery through genotyping-by-sequencing
Sequencing of the prepared GBS libraries of the panel resulted in 389 682 215 raw reads with an average of 7.9 million reads per sample (Supplementary Data Fig.S2), which were aligned to three reference genomes.Alignment on the P. avium var.'Regina' reference genome and variant calling yielded a total of 574 556 SNPs (Supplementary Data Table S1).After removing SNPs based on minimum quality, depth, and <20% missing data to ensure a highquality SNP set for GWAS, we retained 75 916 SNPs.After filtering for >5% minor allele frequency (MAF), GWAS was conducted with 28 198 SNPs.The same filtering method was applied after alignment on P. avium var.'Satonishiki' and Prunus persica 'PLov2-2n' genomes, and we retained 34 864 and 33 760 SNPs, respectively (Supplementary Data Table S1).

Population structure and linkage disequilibrium
Based on a cross-validation procedure and considering the grouping threshold of Q > 0.70 in structure analyses, the most likely number of subpopulations depended on the SNP dataset used: K = 10, K = 9, and K = 6 for the GBS sequences aligned to the 'Regina' genome (Supplementary Data Fig.S3-a1), 'Satonishiki' genome (Supplementary Data Fig.S3-b1), and 'PLov2-2n' genome (Supplementary Data Fig.S3-c1), respectively.The six subpopulations found using the SNP obtained with the 'PLov2-2n' genome were also detected with the two other SNP datasets, although they were not strictly identical (Supplementary Data Fig.S3a-2, b-2, and c-2).For instance, the subpopulation (colored in yellow in the Fig. S3) that consisted of 12 accessions when using the SNP dataset obtained with the 'Regina' genome were all found within the corresponding subpopulations identified when using SNPs from 'Satonishiki' (19 accessions) and SNPs from 'PLov2-2n' (34 accessions).Overall, a large proportion of genotypes were classified as admixed (e.g.59% using SNPs obtained from 'Regina').The different subgroups identified in structure analyses including the admixed genotypes were projected onto the first two components of the principal component analysis (PCA) performed (Supplementary Data Fig.S3a-3, b-3, and c-3), in order to characterize the genetic relatedness among genotypes.The subpopulations identified in structure analyses were in line with the clustering pattern of the PCA.Moreover, the subpopulation containing well-known landraces (colored in yellow in the Fig. S3), appeared as substantially distant from the other subpopulations, when using SNPs from 'Satonishiki' and 'PLov2-2n' genomes in particular.
The LD estimates (measured as r 2 ) and extent of LD decay were calculated using SNPs including those for which MAF was <5% (Supplementary Data Table S1).The genome-wide LD decayed, at 50 kb, to r 2 ∼ 0.11 using the 'Regina' genome (Supplementary Data Fig.S4a), r 2 ∼ 0.08 using the 'Satonishiki' genome (Supplementary Data Fig.S4-b) and r 2 ∼ 0.03 using the 'PLov2-2n' genome (Supplementary Data Fig.S4c).As a consequence, we observed a faster LD decay when alignment of reads was performed on the P. persica genome compared with the two P. avium genomes.

SNP-trait associations for fruit size and fruit-cracking-related traits
We compared a multilocus mixed model (MLMM) and the Fixed and random model Circulating Probability Unification (FarmCPU) method for all analyses.Using the 'model selection' option implemented in GAPIT, both kinship matrix and structure (principal component, PC) were included for all traits.However, according to the Bayesian information criterion (BIC), the best number of PCs to include for FW, for example, was zero (Supplementary Data Table S2).We checked whether the structure was inf luencing the 23 traits.We performed a PCA on the 23-BLUP (best linear unbiased prediction) phenotypic dataset, and we colored the individuals by their corresponding structure group (Supplementary Data Fig.S5).By obtaining overlapping ellipses of structure groups in the PCA of the traits, we observed no clear pattern of correlation between the phenotypes and the structure.
A total of 65 unique SNP-trait associations were found for eight traits, depending on the reference genome and the GWAS model (Table 2).We found SNPs significantly associated with  the variation of fruit-cracking-related traits, fruit size (FW, FWi, and FT), stone size (SW and ST), and stone shape (SS).We did not find any SNPs associated with 15 traits, such as titratable acidity (TA), fruit side cracking (FS), fruit skin color (FC) and FJC.We found the highest number of associated SNPs with FW (Table 2).We obtained associated SNPs only considering the FarmCPU model, but we identified strong similar 'hotspots' in the Manhattan plots when comparing both models and the three reference genomes (Fig. 2).For instance, the two SNPs associated with FW at the beginning and end of chromosome 1 (11 561 777 and 50 388 238 bp) considering FarmCPU/'Regina' were also detected using MLMM, but were not significant.Moreover, the SNP associated in the middle of the chromosome 4 (12 024 925 bp) was detected in both models for 'Regina' and 'Satonishiki', but only significantly when considering FarmCPU/'Regina'.Similarly, the hotspot at the end of the chromosome 2 was found in all configurations, but was only significant when considering FarmCPU/'PLov2-2n'.
The SNP Chr1-50 388 238 bp, detected considering FarmCPU/ 'Regina', explained the largest part of FW variance (18.8%), with the major allele A as favorable (vs allele G with an estimated effect of −0.57 on the corresponding BLUP value).This SNP was also significantly associated with two correlated traits: FWi (15.0% of the variance explained) and FT (17.3%).The boxplots of the alleles of this SNP, for the three traits, are shown in Fig. 3.In addition, the SNP Chr2-4 882 204 bp, detected considering FarmCPU/'Regina', explained 5.8 and 7.2% of FW and FT variances, respectively, with the major allele G as favorable (vs allele A with an estimated effect of −0.31 and −0.68 on the corresponding BLUP values, respectively).Similarly, the SNP Chr7-18 276 189 bp, detected considering FarmCPU/'Regina', was associated with FWi and FT.Altogether, these three SNPs explained 32.0% of FT variance (Table 2).
We identified another association with global fruit size considering FarmCPU/'PLov2-2n'.The SNP Chr3-21 433 100 bp was associated with FW and FT, but explained <0.1% of the variance for both traits.For FW, we detected another associated SNP considering FarmCPU/'PLov2-2n' and the SNP Chr2-28 325 332 bp, explaining 7.3% of the variance.We did not find associations for any of these three fruit size-related traits using 'Satonishiki'.
Finally, concerning fruit-cracking-related traits, we detected associations for both fruit stem end cracking (SE) and fruit pistillar end cracking (PI) (Table 2).As previously observed for fruit sizerelated traits, we found close hotspots in the Manhattan plots when comparing the different models and genomes, but with different significant SNP-trait associations (Supplementary Data Fig.S6).All together, we found 9 associated loci on chromosomes 1, 4, 7, and 8 for SE and 11 associations on chromosomes 1, 4, 5, 6, and 7 for PI (Table 2; Supplementary Data Fig.S6).The associated SNP Chr4-4 381 656 bp was significant in both GWAS models and explained the largest effect (30.9%) for SE.Considering FarmCPU/'Satonishiki', the total SE variance explained by the three best SNPs reached 67.8%.

SNP-trait associations for stone-related traits
For stone-size-related traits, we found seven and eight SNP-trait associations for SWi and ST, respectively.Contrary to fruit-sizerelated traits, we did not find any locus in common between these two highly correlated traits (Table 2), while we observed similar hotspots between both traits (Supplementary Data Fig.S7).For example, we identified two significant loci at the beginning of chromosomes  Global representation of the genetic architecture of fruit quality traits in sweet cherry.Each marker-trait association shape correspond to a study: Holušová et al. [26], GWAS using SNPs mapped on 'Tieton'; Crump et al. [14], GWAS and QTL mapping using SNPs mapped on 'PLov2-2n'; Cai et al. [10], QTL mapping on multiple progenies and SNPs mapped on 'PLov2-2n'; [13], QTL mapping on multiple progenies and SNPs mapped on 'Satonishiki' (major loci); Calle et al. [12], QTLs on one progeny and SNPs mapped on 'PLov2-2n'; Zhang et al. [17], QTLs on one progeny and SSRs coming from Prunus species (QTLs in centimorgan placed on 'Regina'); Rosyara et al. [16], QTLs on multiple progenies and SSRs and SNPs coming from Prunus species (QTLs with intervals given in centimorgan placed on 'Regina'); Campoy et al. [11], QTLs on multiple progenies and SNPs mapped on 'PLov2-2n' (QTLs with intervals given in centimorgan and major loci); Quero-García et al. [18], QTLs on multiple progenies and SNPs mapped on 'PLov2-2n' (QTLs with intervals given in centimorgan and major loci).The QTLs are placed to the right of the corresponding chromosome.We divided all traits into three main categories, colored in the figures as follows: cracking-related traits in yellow, global fruit-size-related traits in blue, and firmness-related traits in red.

Comparison of GWAS results including or not minor allele frequency filtering
When we compared the GWAS results including or not including MAF filtering, we found associations for the same traits (SE, FW, FWi, FT, SWi, and ST), and we also identified SNPs associated with fruit firmness (FF) (Supplementary Data Table S3).For each trait, only one or two SNP-trait associations were strictly identical or at a highly close position to those previously identified.For instance, for SE we identified 10 and 11 associated SNPs, all GWAS models and genomes combined, including and not including MAF filtering, respectively, but only the SNP Chr1-26 464 729 bp was found in both lists, considering FarmCPU/'Satonishiki'.However, as was the case when comparing GWAS models, the Manhattan plots showed similar strong hotspots when we compared the results including or not including MAF filtering.For example, for FW considering the 'Regina' genome, we observed in both cases hotspots in chromosomes 1, 2, 3, 4, and 6, using the MLMM model (Supplementary Data Fig.S8).
Concerning FF, we identified seven associated SNPs using the 'Satonishiki' genome (Supplementary Data Table S3), while we did not obtain any significant SNP when we included MAF filtering.Using the MLMM model, we found the SNP Chr4-15 932 813 bp, which explained 20.6% of the variance, with an allelic effect of 10.7 for the favorable A major allele.Using the FarmCPU model, we identified additional loci on chromosomes 3, 4, 5, 6, and 8.

Colocalizations of SNP-trait associations identified between different reference genomes
In order to identify potentially colocalizing SNP-trait associations of fruit quality traits in sweet cherry, we summarize results from this study and nine previously published articles (Fig. 4).Interestingly, we found an SNP on chromosome 4 at 15.93 Mbp using 'Satonishiki' for FF (no MAF filtering results in Supplementary Data Table S3), where Crump et al. [14] identified an SNP associated with the same trait using the 'Tieton' genome at 16.00 Mbp.We searched for colocalizations of close SNPs associated with the same trait (or strongly correlated traits) between two genomes by using BLAST with the surrounding sequence (Supplementary Data Table S4).For FF, the two positions previously cited on chromosome 4 are not so close since the sequence around 15.93 Mbp of 'Satonishiki' corresponded to a sequence around 17.99 Mbp on 'Tieton'.Consequently, these two loci, thought at first to be colocalizing, are actually more physically distant.This was the case for the largest part of the SNP-trait associations potentially colocalizing that we focused on, particularly for FF and fruit sizerelated traits (Supplementary Data Table S4).However, although they did not perfectly colocalize, the SNP-trait associations for fruit-cracking-related traits can be considered as physically close.We also identified three colocalizations, including an SNP for SE on chromosome 4 at 4.38 Mbp using 'Satonishiki', corresponding to a locus at 3.46 Mbp on 'PLov2-2n',which colocalized with a QTL identified by Quero-García et al. [18] using the 'PLov2-2n' genome between 1.98 and 3.92 Mbp.Interestingly, if the potentially colocalizing loci are physically distant, the framing genes have in most cases the same annotation (Supplementary Data Table S4).

Identification of genes putatively involved in fruit quality
Based on the association results, we searched for candidate genes and their predicted functions in the upstream and downstream f lanking regions of all significant SNP-trait associations, defined by the LD blocks and considering MAF filtering.The size of LD block intervals varied significantly, with an average of 20 772 bp (from no block identified to a block of 134 216 bp).First, we explored associated SNPs within gene coding sequences, here referred to as 'putatively best candidate genes' (Table 3).
Regarding fruit-cracking-related traits, we identified 12 candidate genes: six for SE, such as a cyclin-dependent kinase inhibitor protein, a MADS-box transcription factor, and a SAURlike auxin-responsive protein, and six for PI , which encoded e.g. a nucleotide-sugar transporter protein, a pectin lyase-like protein, and an RTL2 RNAse THREE-like protein.We finally pinpointed 12 additional candidate genes for stone-size-related traits, such as three protein kinases, including a cell-wall-associated kinase.
More globally, we also considered additional candidate genes, corresponding to all genes identified within the defined LD blocks for the SNP-trait association (Supplementary Data Table S5).For a given trait, we identified some candidate genes on different chromosomes but with identical or similar functional annotation using the different reference genomes.For instance, for SE we found genes encoding calcium-dependent protein kinase and putative calcium-transporting ATPase using 'Regina', while a gene encoding a probable calcium-binding protein was identified using the 'PLov2-2n' genome.Additionally, for SE we observed genes encoding different types of glycosyltransferases using the three genomes (putative xylogalacturonan β-1,3-xylosyltransferase for 'Regina', UDP-glucosyltransferase for 'Satonishiki', and probable galacturonosyltransferase-like for 'PLov2-2n').
Concerning FW, we found genes encoding various transporters (MATE, sugar, and solute), transcription factors (MYB, Hap3/ NF-YB, and DOG), and RLK/Pelle kinases (for four of the nine associated LD blocks) using 'Regina'.Several candidate genes were specific for traits FWi and FT, such as genes encoding a fasciclin-like arabinogalactan protein and a terpenoid synthase, and we found genes encoding ethylene response proteins potentially involved in global fruit size variation (putative ethyleneresponsive binding factor-associated repression ninja family for FW using 'Regina' and ethylene-responsive transcription factor for FT using 'PLov2-2n').
Finally, we found genes encoding a putative LRR receptor-like serine/threonine protein kinase potentially involved in ST and stone shape using 'Satonishiki' and 'PLov2-2n' genomes, and Fbox domain protein-encoding genes using the 'PLov2-2n' genome for the same traits.Then, for global stone size (SWi and ST) we found genes previously identified for fruit-related traits, such as genes encoding a RLK/Pelle kinase, a GDSL lipase, and an AP2 transcription factor.

Contrasted heritabilities are associated with categories of traits
Large phenotypic variability was found for most traits, and was comparable to that found in previously published GWAS on sweet cherry.For instance, the range of FH values, from 12.76 to 24.34 mm (average 18.73 mm), was comparable to those obtained for fruit length by Holušová et al. [26], from 16.70 to 26.98 mm (average 21.31 mm).Regarding FW, our values ranged from 1.70 to 16.61 g (average 5.68 g), while Holušová et al. [26] obtained a smaller range, from 2.88 to 10.98 g (average 5.97 g), proving the large variability of the phenotypes of our germplasm collection.Some traits could be considered as normally distributed, i.e. suitable to be efficiently exploited in GWAS studies.For example, FF was normally distributed, as observed in Crump et al. [14].However, a few distributions, such as for cracking-related traits, were found to be skewed.Between 2014 and 2019, our germplasm collection location experienced climatic conditions with low values of rainfall, resulting in a large number of genotypes with zero cracked fruits.Moreover, the variable transformation of these phenotypes (square root of the arcsine) did not improve their distributions.Left-skewed distributions were already observed for cracking-related traits by Quero-García et al. [18], whereas Crump et al. [14] showed a right-skewed distribution of cracking incidence in 2020.This was likely due to the increased total rainfall and the less accurate fruit cracking assessment method performed by Crump et al. [14].However, no distribution showed a multimodal pattern, except for FC, FJC, and FPE.Similar asymmetric phenotypic distributions were reported in other fruit species, such as apple [39].
We found that FW and fruit-size-related traits were positively correlated (from 0.88 to 0.98), as reported in the work of Holušová et al. [26], where FW was positively correlated with fruit length (0.90), FWi (0.94), and FT (0.93).In our study, FW tended to be slightly positively correlated with FF (0.45), as already noted in sweet cherry by Piaskowski et al. [40], with a positive correlation of 0.51.However, Campoy et al. [11] found negative correlations using two mapping progenies, probably due to the unique genetic background of the three parents involved ('Regina', 'Lapins', and 'Garnet').The correlation between FW and FF shows inconstant behavior within Rosaceae: the correlation was found negative in apple [41], close to zero in strawberry [42], and positive in peach [43].Moreover, a negative correlation between TA and fruit size was reported by Piaskowski et al. [40], while we found a negative correlation between pH and fruit size, juice pH and TA being negatively correlated.Globally, besides the low values observed for a few traits, heritabilities were moderate to high, which is indicative of good reproducibility of the phenotyping and a relevant genetic contribution to the observed variability in the measured traits, thereby offering an excellent opportunity for improving these traits through selection.Our calculated broad-sense heritabilities showed high values for traits related to fruit and stone size, as observed in previous studies on sweet cherry [11,[15][16][17]40].Furthermore, our results were also consistent with previous work, which reported high values of H 2 for FF (H 2 = 0.73-0.97or h 2 = 0.70) in sweet cherry [10,11,14,40].
For fruit cracking traits, our findings demonstrated a moderate to low broad-sense heritability (H 2 values ranged from 0.36 for FS to 0.64 for PI).A similar trend was found by Quero-García et al. [18] using mapping progenies, with a higher value for PI, and our H 2 values were also in line with the work of Crump et al. [14], which obtained h 2 = 0.54 for cracking incidence.Nevertheless, the values estimated in our study for such traits were higher compared with the ones obtained in other fruit species, such as apple where H 2 = 0.03-0.22[39,44].Low heritability values are known for this category of traits that are strongly inf luenced by the environment [7].For the traits concerned, the lower values of heritability might also be partially explained by the availability of only two years of data.

Phenotypic variances are not affected by sweet cherry population structure
The two approaches used to analyze the structure of the sweet cherry panel provided complementary information that was in line with previous results obtained by Campoy et al. [45] in the INRAE sweet cherry germplasm collection (210 individuals) using the RosBREED 6 K array.These authors first identified two groups, mainly constituted by bred cultivars on the one hand and landraces on the other hand.Our analyses confirmed this first structure since the distinct group obtained using SNPs from the 'PLov2-2n' genome contained exclusively landraces and mainly the same as those identified by Campoy et al. [45].Regarding the substructure, as in our case, Campoy et al. [45] and Barreneche et al. [46] both identified a subgroup constituted by particular interesting French landraces, such as 'Saint Georges', 'Xapata', and 'Durette'.The choice of the best K value (number of inferred subpopulations or clusters within the dataset) can be inf luenced by the reference genome considered to obtain the set of SNPs, because the reference genome is used to align and identify genetic variations in the sample genomes.Different reference genomes may have variations in the regions they cover, leading to the identification of different SNPs (and different missing data) and differences in the genetic diversity captured by the SNPs, and potentially inf luencing the inferred structure.
We checked whether the structure was inf luencing the phenotypic variance of the 23 traits and we observed no clear pattern of correlation between the phenotypes and the structure.As a consequence, all the traits did not seem to be correlated with the structure.The inf luence of population structure on phenotypic variance was weak, maybe because kinship accounted partially for the structure.This was observed in other studies using GAPIT in sweet cherry [14,26] and in walnut [47], and explained why no PC was considered as a structure covariable in further GWAS analyses using the BIC.
For all reference genomes, the LD was estimated as r 2 and dropped below 0.2 within a distance of <50 kb.This value is in agreement with previous studies, where r 2 < 0.2 was found at ∼100 kb in 210 sweet cherry accessions using 1215 SNPs not LD-pruned [45] or with 35 SSRs [48], and in Xanthopoulou et al. [49], where LD decayed to 90% at 55 kb using whole-genome resequencing.Differences in the number of markers used (>70 000 SNPs for the two sweet cherry genomes vs >160 000 SNPs for the P. persica genome) may affect the LD value, as reported in the literature, where LD estimates tended to be higher with denser SNP panels [50,51].In this study we used GBS, where the alignment accuracy of short-read sequences is affected by the choice of the reference genome [52], which would explain the differences obtained for the SNP sets, and consequently, the LD decay estimations.Globally, the fast LD decay in our sweet cherry population is helpful in the search for candidate genes, but requires a large number of SNPs.As a rough guide, with a sweet cherry genome size estimated at 270 Mb and r 2 = 0.2 estimated at ∼5 kb in our panel, we would need an SNP set of ∼50 000 SNPs.With 30 000 SNPs on average in our study, we probably did not capture the entirety of loci potentially involved in the variation of a trait and the number of SNPs was lower than in Holušová et al. [26] (1 767 106 SNPs), but much higher than in Crump et al. [14] (3302 SNPs).

SNP-trait associations are dependent on data dimension, GWAS model, reference genome and MAF filtering
In addition to structure and relatedness, our study highlights several other factors that inf luence significantly the loci associated with a trait: GWAS model, reference genome, and MAF filtering.Firstly, the number of accessions and the degree of their phenotypic variations can inf luence the reliability of GWAS analysis.The number of accessions used in this study may seem low in order to achieve statistical power [22,53].However, several studies in fruit species have recently detected significant SNPtrait associations with panels of <200 accessions: 172 genotypes in apple [54] and even 132 in peach [34].In Crump et al. [14], 259 individuals were retained for sweet cherry GWAS analyses but they were seedlings from 22 breeding parents and may represent a lesser level of phenotypic variability.Our sampling could therefore be suitable for GWAS analyses, according to our H 2 values and phenotypic ranges.
Moreover, although our study highlighted different results when comparing different models and different reference genomes, the patterns of hotspots were very close in our study and gave weight to our significant SNP-trait associations.In our study, the FarmCPU model identified more SNP-trait associations than the MLMM.The performance of FarmCPU was already highlighted in a soybean simulated dataset, as it consistently identified a single highly significant SNP close to known published genes for six qualitative traits [55].Furthermore, we chose to work with three genomes because the role of the reference genome used for the mapping of short reads is crucial [56].This is also in accordance with previous work in maize that proved the usefulness of genotyping datasets obtained from multiple reference genomes [57].We used sweet cherry and peach genomes, which share a high level of synteny [58], and for years the P. persica genome was the Prunus reference genome.Our study showed that the set of SNPs retained after alignment and filtering were different between genomes.
Finally, considering MAF filtering or not can also be crucial.In our analysis, when we included SNPs with MAF <5%, and even if the original set of SNPs with MAF >5% was conserved, we had mostly different SNPs associated with traits (even on different chromosomes).In other words, the increase in the number of SNPs included in our analyses strongly inf luenced the GWAS significance.In Holušová et al. [26] and Crump et al. [14], MAF filtering was set to 1% but both used >200 accessions.Since in our study we may have had a limited number of individuals and a limited number of SNPs, we rather considered results including MAF filtering primarily.However, it could be interesting to focus on the SNP-trait associations obtained without MAF filtering, as proposed by Tabangin et al. [59].

Complexity of the genetic architecture of fruit quality traits in sweet cherry
Our study highlighted the complex genetic determinism of fruit quality traits, which are highly quantitative.For instance, we identified 11 SNPs associated with FW at distinct loci using all GWAS models and genomes, and 6 additional SNPs, when considering SNPs with MAF <5%.
The global size of cherry fruit (including FW, FWi, FH, and FT) is a trait that researchers have particularly focused on.As in different QTL studies focusing on FW [11,16], we found several SNPs associated with FW and its dimensions at distant positions within chromosome 2.A major locus was previously found at the beginning of chromosome 2 at 8.0 Mbp for FWi [26] and FW [11,16,17].Our analyses confirmed this locus as we found five SNPs associated with FW, FWi, and/or FT using the three genomes within an interval between 4.9 and 12.0 Mbp at the beginning of chromosome 2.
We also confirmed another major locus identified at the end of the chromosome 2 for FW and size in previous studies [13,26].Our study highlighted six SNPs associated with FW, FWi, and/or FT using the three genomes within an interval between 23.1 and 36.5 Mbp at the end of chromosome 2.Moreover, when we considered the SNPs with MAF <5%, the major loci were those at the end of chromosome 2, as reported by Holušová et al. [26].When applying MAF filtering, the locus explaining the largest percentage of FW variance was at the end of chromosome 1.As a consequence, considering no MAF filtering could be more conceivable when results are in line with those of previous published articles.
Additionally, we found several SNPs close to previously identified loci for these fruit size-related traits, particularly on chromosomes 1 and 6 (Fig. 4).Campoy et al. [11] found SNPs associated with FW on all chromosomes, except chromosomes 4 and 7, and Holušová et al. [26] found loci involved in FW and fruit size-related traits on all chromosomes, except chromosome 7.Our study showed the complex genetic determinism of global cherry size as we found associated SNPs on each chromosome, except chromosome 5.We found several identical SNPs involved in positively correlated fruit-size-related traits on chromosomes 1, 2, 3, and 7 when MAF filtering was applied (chromosomes 1, 2, 4, and 6 when MAF filtering was not applied), while Holušová et al. [26] identified similar SNPs also involved in positively correlated fruit-size-related traits on chromosomes 1, 2, 4, and 5.As a consequence, our results suggest that including correlated traits in GWAS can give clues for the identification of major loci shared by these traits, but can also identify loci more specific for one trait.This finding can be useful in post-GWAS prioritization, first focusing on very close loci associated with multiple correlated traits.
For FF, we identified associated SNPs only when including those with MAF <0.5.A major locus is mentioned in the literature at the beginning of chromosome 4 [10,13,14] but, surprisingly, we did not find any SNP-trait association at this locus.We also did not find an FF-associated locus on chromosome 1, yet one was mentioned in the literature [10,11,14,26].However, we identified one SNP using 'Satonishiki' on chromosome 4 at 15.9 Mbp explaining 20.6% of the variance, and Holušová et al. [26] mentioned that their locus explaining the largest part of the FF variance measured by hand (1, very soft; 9, very firm) was also on chromosome 4 at 16.0 Mbp using 'Tieton'.In the same way, we identified an SNP using 'Satonishiki' on chromosome 8 at 19.6 Mbp, whereas Holušová et al. [26] found a significant SNP at 19.2 Mbp using 'Tieton', explaining the largest part of the FF variance measured by a texturometer.
Regarding cracking-related traits, we identified SNP-trait associations colocalizing with QTLs associated with PI [18].We also found three loci colocalizing with QTLs reported by Quero-García et al. [18] using 'PLov2-2n', but for different cracking positions: (i) we identified an SNP associated with PI at 7.9 Mbp on chromosome 7, while Quero-García et al. [18] identified a QTL for FS between 4.3 and 12.1 Mbp; (ii) we identified an SNP associated with SE at 14.7 Mbp on chromosome 5, while Quero-García et al. [18] identified a QTL for PI between 10.7 and 15.5 Mbp; and (iii) we identified an SNP associated with PI at 17.7 Mbp on chromosome 7, while Quero-García et al. [18] identified a QTL for SE between 17.0 and 20.8 Mbp.Moreover, a promising QTL was also found controlling up to 20-25% of SE variance between 1.98 and 3.92 Mbp on chromosome 4 using 'PLov2-2n' [18], and we identified an SNP using 'Satonishiki' highly associated with this trait within the QTL interval.As expected, several loci related to cracking explained part of the phenotypic variance for PI and FS, or for SE and FS, since it has been proven that FS is mostly an extension of PI and SE [60].On the other hand, PI and SE occur at opposite regions of the cherry fruit, but given the highly complex and polygenic nature of cracking tolerance, it is likely that loci controlling PI and SE may coexist in proximal genomic regions.Although the methodology for evaluating cracking tolerance in our study was the same as the one reported by Quero-García et al. [18], the nature of the studied material and their location and observation periods clearly differed.In particular, high levels of rainfall at harvest during several study years were reported for the first study [18], while our observations were made under drier and more continental weather in spring and summer.
We also pinpointed two other loci on chromosomes 1 and 5 using 'PLov2-2n' associated with SE that colocalized with QTLs identified by Crump et al. [14] using the same reference genome: chromosome 1, 13.5-21.0Mbp (18.1 Mbp in our study); and chromosome 5, 12.9-15.5Mbp (14.7 Mbp in our study).Interestingly, for PI we also found an SNP on chromosome 1 using 'PLov2-2n' (13.7 Mbp) colocalizing with the aforementioned QTL identified by Crump et al. [14].Even if Crump et al. [14] determined the incidence of cracking without specifying the location of cracking, it could be hypothesized that Crump et al. [14] observed a sufficiently high amount of cracked fruit at different positions, and in particular at PI or SE, so that several of their significant associations for 'general' cracking tolerance correspond in fact to specific loci related to either PI or SE, as highlighted in our study and in Quero-García et al. [18].Moreover, the search for colocalizations from the literature permitted us to identify additional ones, confirming the reliability of several marker-trait associations in this study concerning cracking-related traits.
Finally, the comparison of potentially colocalizing loci from the literature showed that, in most cases, the surrounding genes have the same annotation, even if they are physically distant.This would mean that searching for colocalization using physical positions may be meaningless if different genomes have been used to conduct the analyses.Even if a high level of synteny is expected between two genomes of the same species, the structural variations may explain such differences in physical positions.

The molecular processes involved in sweet cherry fruit quality may be conserved in fruit species
A large proportion of the candidate genes identified are expected to be involved in fruit quality.We identified two genes encoding 2-oxoglutarate and Fe(II)-dependent oxygenase proteins at two different loci potentially involved in FW.In tomato, these proteins play crucial roles in the biosynthesis of steroidal glycoalkaloids, of polyphenol class of f lavonoids, and of hormones such as gibberellins and ethylene during tomato growth and ripening [61].This may explain why we also found ethylene metabolismrelated genes linked with fruit size, such as a putative ethyleneresponsive binding factor-associated repression ninja family and an ethylene-responsive transcription factor.Moreover, abscisic acid is another hormone known to be involved in fruit ripening [62,63] and we identified a gene encoding an ABI 5 binding protein known to be involved in strawberry development and ripening [64].
For FW, we also identified a gene encoding a glucan endo-1,3-βglucosidase, and such glycosyl hydrolases are enzymes linked to cell wall polysaccharide regulation [65] and cell wall remodeling [66,67] during fruit ripening.Holušová et al. [26] also identified a gene encoding a β-glucosidase as a candidate gene for fruit size, making it a strong candidate gene.The fasciclin-like arabinogalactan proteins are also involved in cell wall biosynthesis in hemp and cotton [68,69], and we identified a gene encoding its precursor associated with FWi and FT.Finally, we highlighted various transcription factors, such as MADS-box, MYB, and Hap3/NF-YB, that are known to be involved in fruit ripening, as in strawberry [70], banana [71], papaya [72], apricot [73], and tomato [74].Similarly, Holušová et al. [26] also pinpointed genes encoding a cellwall-associated receptor kinase and an endoglucanase, showing that the cell wall is undoubtedly directly involved in fruit size.More surprisingly, similarly to Holušová et al. [26], we did not identify any gene involved in cell number regulation associated with global fruit size, although it was identified by De Franceschi et al. [15].
Concerning fruit cracking, we identified a gene encoding a cyclin-dependent kinase inhibitor protein, which is usually considered an important component of cell cycle arrest in senescent and quiescent cells [75].We also identified a programmed cell death regulatory protein that was shown to be involved in pear russet fruit skin [76].In various fruits, including litchi, pomegranate and tomato, transcriptomics studies highlighted that the decomposition of pectin and the subsequent cell wall disassembly are crucial mechanisms leading to fruit cracking [77][78][79].We found candidate genes encoding a putative xylogalacturonan β-1,3-xylosyltransferase, a probable galacturonosyltransferase-like, and a pectin methylesterase that are all linked to pectin metabolism and may play a role in the integrity of the sweet cherry cell wall [80].Calcium is also an important nutrient in plants and has a role in cell wall integrity in case of disturbed metabolism [81,82].We found several candidate genes involved in calcium metabolism that may explain differences in susceptibility in SE, such as genes encoding a calcium-dependent protein kinase, a putative calcium-transporting ATPase, and probable calcium-binding protein.Our results are in line with the work of Wang et al. [78] in litchi, where calcium-related genes were downregulated in a cracking-susceptible cultivar compared with a resistant one.The same authors also found downregulation of MYB and AP transcription factors in the susceptible cultivar, and we identified both candidate genes.Moreover, the identification of a gene encoding a SAUR-like auxin-responsive protein tends to show that auxin could be involved in fruit cracking, maybe through cell wall acidification activity, as reported in Stortenbeker and Bemer [83].Finally, we identified a gene encoding an asparagine synthetase and such an enzyme is expected to be involved in sweet cherry cracking [84].To go further concerning fruit cracking, we looked for SNPs within coding sequences of genes that have been demonstrated to be involved in cell wall and cuticle formation and possibly with cracking, such as lipid transfer protein (PaLTPG1) and 3-ketoacyl-CoA synthase (PaKCS6) genes [85,86].We did identify variations in these genes in our population; however, these SNPs were not linked significantly to the variation in cracking-related traits.
If abscisic acid has a preponderant role in the development of non-climacteric fruits such as sweet cherry [63], our study tends to show that other hormones are probably also crucial for this physiological process, such as auxins [87], gibberellins, and, more surprisingly, ethylene.Together with hormones and primary metabolisms, fruit quality may be dependent on cell wall integrity, calcium response, and transcription factors.At the crossroads of these metabolisms, MATE transporters have an essential role, e.g. in f lavonoid and phytohormone transport [88], and we found one MATE-encoding gene, together with other sugar and solute transporters, as global fruit size candidate genes.Finally, we identified several candidate genes, such as genes encoding a βgalactosidase, another cell wall hydrolytic enzyme, for an MYB, for a calcium-binding protein, and for a RLK/Pelle family of kinases, similar to genes potentially involved in global nut-size-related traits in walnut [47].In summary, global fruit quality in sweet cherry could be partly the result of fine tuning of hormones, calcium, and cell wall metabolisms, and metabolite transporters, as already observed in other fruits.

Conclusions
Although much is already known about sweet cherry fruit quality, there is still need for better understanding of the genetic control of such traits, in the context of breeding strategies for producer and consumer demands.This study provides the first attempt to evaluate genome-wide SNP-trait associations in sweet cherry for fruit quality traits, including fruit size, fruit firmness, and fruit cracking, using multiple reference genomes and multiple years of phenotyping.We successfully applied the GBS approach to generate a powerful set of SNPs that covered the entire cherry genome, and several SNP-trait associations related to different agronomically relevant traits were detected.We confirmed previously identified major loci involved in global fruit size and fruit cracking, together with numerous new loci, which constitutes the first step in the development of marker-assisted selection, in order to facilitate sweet cherry breeding.Finally, we pinpointed potential candidate genes mainly involved in hormone and cell wall metabolisms.Interestingly, our study showed the importance of the choice of the reference genome used to conduct a GWAS, together with consideration or not of MAF filtering.

Plant material
Plant material consisted of a panel of 116 accessions belonging to the sweet cherry collection maintained by the INRAE Prunus Genetic Resources Center at Bourran (Lot and Garonne, France).The panel was chosen to (i) include the maximum genetic diversity from a collection of 210 accessions based on a previous study [45], and (ii) cover a wide range of phenotypic variability in fruit quality traits.The panel gathered accessions from France and other 15 countries in America, Asia, and Europe.It comprised old landraces and cultivars released by breeding programs more or less recently.The French accessions represented about half (48%) of the total panel, 78% of them being old landraces from different regions of the country (Supplementary Data Table S6).

Phenotyping and phenotype modeling
Fruits were collected at maturity, randomly from one tree per accession.Assessment of ripening was subjective and involved mainly skin color, texture, and taste.A summary of the 23 traits evaluated is provided in Supplementary Data Table S7.All fruit traits were evaluated during the harvest day for each accession, on a homogenous sample of 10 fruits except for fruit-crackingrelated traits, which were evaluated on 50 fruits.Each fruit in each sample was identified with a code number in order to score individually the data for each fruit and its corresponding stem and stone.Traits were measured with digital tools [balance for fruit weight (FW) and caliper for fruit height (FH), width (FWi), and thickness (FT)] and then averaged.Fruit firmness (FF) was measured using a Durofel ® (Setop Giraud Technologie, Cavaillon, France) texture analyzer as described by Campoy et al. [11].After removing the stone from the fruits of each accession, fruits were pooled and the fruit juice was extracted with a hand press.Soluble sugar content (SSC) of the juice was measured with a digital refractometer (Atago Co., Ltd, Tokyo, Japan).The pH and titratable acidity (TA) were evaluated by using an equal volume of juice from 10 fruits as described by Dirlewanger et al. [89].
Fruit pistillar end cracking (PI), fruit stem end cracking (SE), and fruit side cracking (FS) were evaluated from a homogenous batch of 50 fruits.Each fruit was visually inspected and any observable crack was recorded by differentiating between three distinct fruit regions [18].As proposed by these last authors, cracking tolerance was computed with percentages of count data, a priori following a multinomial distribution.Hence, an arcsine (square root) transformation was applied in order to stabilize the variance and to estimate genetic variances and heritabilities.
Fruit cracking susceptibility (FCS), fruit skin color (FC), and fruit juice color (FJC) were evaluated visually, according to the recommendations of the European Cooperative Programme for Plant Genetic Resources (ECPGR) Prunus working group (European Prunus Data Base: http://www.bordeaux.inra.fr/eucherrydb/),following an ordinal scale from 1 to 9. The Guidelines for the Conduct of Tests for Distinctness, Uniformity and Stability for sweet cherry recommended by the International Union for the Protection of New Varieties of Plants (UPOV) were used to determine fruit pistil end (FPE), fruit suture (FSU), and stone shape in central view (SS), following an ordinal scale from 1 to 3. Finally, productivity (PROD) was estimated according to an ordinal scale from 0 to 9, following the recommendations of INRAE Plant Genetic Resources Information System (Siregal) for sweet cherry (https://urgi.versailles.inrae.fr/siregal/siregal/grc.do).Nine traits were evaluated during 6 years (2014-19) and the remaining traits during 2 years (2014-15).Each year of evaluation was performed by a unique pair of evaluators.
The phenotypic data were analyzed using the 'lme4' R package [90].For each trait, the means of genotypic effects were obtained for each accession using BLUPs as mixed linear models and by adjusting for the random year effect (2 or 6 years depending on the trait), as described in Bernard et al. [47].
The variance components were then used to estimate broadsense heritability (H 2 ) for all traits and Pearson correlation coefficients of each pair of the analyzed traits were calculated using the BLUP values with the 'corrplot' R package.

DNA isolation, genotyping-by-sequencing library construction, and sequencing
Genomic DNA was extracted from young leaf tissues using the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany).The DNA quality and concentration were measured with a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, USA) and a Qubit ® 2.0 f luorometer (Thermo Fisher Scientific, Waltham, USA).
GBS library construction for multiplexed individuals was done at the CIRAD genotyping platform, Montpellier, France (https://www.gptr-lr-genotypage.com),according to a protocol adapted from Elshire et al. [27] with the use of the ApeKI restriction enzyme.Subsequently, the final pooled library was transferred to the GeT-PlaGe core facility (GenoToul GIS, Toulouse, France, https:// get.genotoul.fr/)for sequencing using the Illumina ® paired-end protocol on a single lane in a HiSeq3000 sequencer (2 × 150 cycles).

Genotyping-by-sequencing data handling, processing, and SNP calling
Raw fastq reads of all accessions were split into separate fastq files, based on their barcodes, using Sabre software (https:// github.com/najoshi/sabre).The read quality of each fastq was checked with FastQC v.0.11.2 [91] and visualized with MultiQC v.1.7 [92].All unligated adapters and low-quality read sequences were trimmed with Trimmomatic version 0.33 [93] with the following options: ILLUMINACLIP 2:30:10, LEADING 3, TRAILING 3, SLIDINGWINDOW 4:15, MINLEN 36.The trimmed and cleaned fastq files sequences were aligned to two of the available sweet cherry genome sequences, P. avium var.'Regina' [37] and P. avium var.'Satonishiki' [38], and the peach genome sequence P. persica var.'PLov2-2n' v2 [36], using the Burrows-Wheeler Aligner 0.7.5a-r405 [93] and SAMtools [95].The peach genome was used because of the high quality of both structural and functional annotations and the high synteny existing among Prunus species.SNPs were called using the HaplotypeCaller algorithm as recommended by the Genome Analysis Toolkit v.3.7-1[96].Several filters were applied to minimize the number of false-positive SNPs using VCFtools 0.1.14[97].Only SNPs with all the following characteristics were retained for analysis: (i) missing data <20%; (ii) only one alternative allele; (iii) minor allele frequency (MAF) not less than 5%; (iv) minimum read depth of 10; and (v) a Phred scale mapping quality >60 (QUAL value >60).For some traits, we conducted analyses including the SNPs with MAF <5%.These analyses concerned the traits for which we found significant associations in GWAS analyses with the MAF filter to compare the results, and the traits that were highly inheritable for which we did not find significant associations with the MAF filter but for which known genomic regions were linked with in other studies.
Finally, we imputed the remaining 20% of missing data based on a matrix factorization approach using the 'impute' function from the 'LEA' R package [98].Imputation of missing data is strongly recommended in genetic association studies [99] as it limits the discovery of false-positives.

Population structure analysis and linkage disequilibrium decay
PCA was conducted with the 'snpgdsPCA' function of the 'SNPRelate' R package [100].We also performed genetic clustering (structure) analysis using the sparse non-negative matrix factorization (sNMF) software, implemented in the 'LEA' R package [98].For the sNMF analysis, 15 potential numbers (1-15) of ancestral populations (K) were tested using a cross-validation procedure, and the one with the lowest cross-entropy criterion error was chosen as the best K value [101].
LD decay was calculated with PopLDdecay v3.27 (https:// github.com/BGI-shenzhen/PopLDdecay,accessed 28 October 2019) with SNPs with a maximum distance of 300 kb.LD decay was plotted as pairwise LD versus pairwise distance between SNP using the 'smooth.spline'function of the 'stats' R package.

Genome-wide association analyses
The filtered SNP datasets from the 116 individuals of the sweet cherry panel, obtained after alignment with the three genomes, were used to perform a GWAS for the 25 traits studied.The GWAS was carried out by applying two different multilocus and mixed linear models implemented in the 'GAPIT v3.0' R package [102]: the multilocus mixed model (MLMM) of Segura et al. [103] and the Fixed and random model Circulating Probability Unification (FarmCPU) of Liu et al. [104].These two models are known to increase the power of detecting significant marker-trait associations while controlling false-positives due to confounding factors [55].Familial relatedness was accounted for using the VanRaden method [105] and population structure was accounted for using the 'model selection' option, which defines the best number of PC to include in the model according to a Bayesian information criterion (BIC), both implemented in GAPIT.Significant marker-trait associations were adjusted by multiple test correction using Bonferroni correction with an α value of 1% and were then inspected in Manhattan plots.The cutoff value (green solid line in the Manhattan plots) was calculated as −log 10 (α/k), where α is the alpha value (0.01) and k is the number of SNPs.A quantile-quantile (Q-Q) plot was used to verify if the model was correctly accounting for confounding variables.We also conducted analyses including the SNPs with an MAF <5% for fruit size-related traits, cracking-related traits, and fruit firmness, to observe whether such filtering has a strong inf luence.

Search for colocalizations of SNP-trait associations between reference genomes
As we performed GWAS using multiple reference genomes, we searched for potential colocalizations of SNP-trait associations that are relatively close in our results between two genomes, or close to those included in the results of nine previously published articles regarding fruit quality traits in sweet cherry.Consequently, we took sequences of 100-150 bp around the SNP associated with one trait of the first genome and we searched for the position of this sequence on the second genome.We then determined whether this sequence colocalized with the SNP associated with the same trait on the second genome, using the Basic Local Alignment Search Tool (BLAST) from the National Center for Biotechnology Information (NCBI) database (https:// www.ncbi.nlm.nih.gov/).

Search of candidate genes within the linkage disequilibrium blocks of associated loci
LD block analysis for the chromosomes identified with the SNPs significantly associated with a trait was performed using Haploview 4.2 software [106].We obtained our Haploview files using PLINK software and the command '-recodeHV' [107].An exploration around each physical position of these SNPphenotype associations was conducted to identify the other SNPs that are in linkage disequilibrium and to determine the genomic regions to search for candidate genes.Haplotype blocks in the region were defined with the 'solid spine of LD' method.The identified LD blocks were then searched for candidate genes using the 'GFF3' annotations of the three genomes.

Figure 2 .Figure 3 .
Figure 2. GWAS results for fruit weight.Manhattan plots followed by Q-Q plots using the MLMM (left) and FarmCPU (right) models for the three reference genomes.The horizontal line corresponds to the 1% Bonferroni threshold automatically implemented in GAPIT.

Figure 4 .
Figure 4. Global representation of the genetic architecture of fruit quality traits in sweet cherry.Each marker-trait association shape correspond to a study: Holušová et al.[26], GWAS using SNPs mapped on 'Tieton'; Crump et al.[14], GWAS and QTL mapping using SNPs mapped on 'PLov2-2n'; Cai et al.[10], QTL mapping on multiple progenies and SNPs mapped on 'PLov2-2n';[13], QTL mapping on multiple progenies and SNPs mapped on 'Satonishiki' (major loci); Calle et al.[12], QTLs on one progeny and SNPs mapped on 'PLov2-2n'; Zhang et al.[17], QTLs on one progeny and SSRs coming from Prunus species (QTLs in centimorgan placed on 'Regina'); Rosyara et al.[16], QTLs on multiple progenies and SSRs and SNPs coming from Prunus species (QTLs with intervals given in centimorgan placed on 'Regina'); Campoy et al.[11], QTLs on multiple progenies and SNPs mapped on 'PLov2-2n' (QTLs with intervals given in centimorgan and major loci); Quero-García et al.[18], QTLs on multiple progenies and SNPs mapped on 'PLov2-2n' (QTLs with intervals given in centimorgan and major loci).The QTLs are placed to the right of the corresponding chromosome.We divided all traits into three main categories, colored in the figures as follows: cracking-related traits in yellow, global fruit-size-related traits in blue, and firmness-related traits in red.

Table 1 .
Continued Figure 1.Pearson correlation matrix of the 23 studied traits.

Table 2 .
Summary of SNP-trait associations related to sweet cherry fruit quality.

Stone shape in central view
a R 2 is the percentage of variance explained corrected for genome-wide background.bThefirstallelementioned is the minor allele.cTheallelic effect is the difference in mean of BLUPs between genotypes with one or other allele (sign: minor allele).

Table 3 .
Summary of LD block intervals and putatively best candidate genes.

Table 3 .
Continued Physical positions in italics indicate identical marker-trait associations between GWAS models.Physical positions in bold indicate identical marker-trait associations between traits.Candidate genes in bold are identical.aLD block intervals determined using Solid spine of LD method implemented in Haploview software.