Quantitative trait locus mapping of fruit aroma compounds in cucumber (Cucumis sativus L.) based on a recombinant inbred line population

Abstract The fresh and unique flavor of cucumber fruits, mainly composed of aldehydes and alcohols, is one of its most important fruit qualities. However, little is known about the genetic basis of aroma compounds in cucumber fruit and the related quantitative trait loci (QTLs). In this study, genomic screening of QTLs underlying aroma compounds was performed based on a genetic linkage map constructed using 1301 single-nucleotide polymorphism (SNP) markers from genotyping-by-sequencing of a recombinant inbred line (RIL) population developed from Q16 × Q24. Significant genetic variation in aroma compounds was observed in the RIL population, and a total of 28 QTLs were detected. A major QTL (qol8-2.1) related to (E,Z)-2,6-nonadien-1-ol was detected with a markedly high LOD score (10.97 in 2020 and 3.56 in 2019) between mk190 and mk204 on chromosome 2. Genome scans identified a cluster of nine lipoxygenase genes in this region. A significant positive correlation was detected between the expression of CsaV3_2G005360 (CsLOX08) and the content of (E,Z)-2,6-nonadien-1-ol, and five amino acid variations were detected between the CsLOX08 protein sequences of the two parental lines. Based on the genomic variation of CsLOX08, we developed an InDel marker. Genotyping with the InDel marker was consistent with the content of (E,Z)-2,6-nonadien-1-ol in the RILs, and the marker was also tested in nine cucumber inbred lines. The results will give breeders guidance for obtaining better flavor in cucumber.


Introduction
Cucumber is cultivated and consumed worldwide, and is valued by consumers for its fresh and unique flavor, especially for eating raw [1]. More than 70 aroma compounds have been identified in cucumber fruit, including aldehydes, alcohols, ketones, terpenes, esters, and furans, which all contribute to the typical flavor of cucumber fruits [2,3]. Of these, C6 and C9 aldehydes and alcohols are the main compounds of cucumber flavor [4,5]. With the fast development of aroma measurement methods, solid-phase microextraction (SPME) combined with gas chromatography-mass spectrometry (GC-MS) has proved to be an effective method for the qualitative and quantitative analysis of aroma compounds [6]. More aroma compounds have been detected, and measurements of aroma compounds have become more convenient and accurate.
It was reported that the content of aroma compounds in cucumber fruit was influenced by multiple genes and the environment. The heritability of characteristic aroma compounds in cucumber fruit was low, indicating that the content of aroma compounds was affected dramatically by environmental factors [15]. Understanding the genetics of aroma compounds is highly relevant for flavor breeding [16]. The genetics of fruit aroma compounds in plants has been demonstrated in several studies. Quantitative trait loci (QTL) related to the content of S-methyl-thioacetate in melon fruits were investigated by QTL mapping in a recombinant inbred line (RIL) population, and the CmThAT1 gene was identified as a candidate gene [17]. A total of 166 QTLs for 82 different volatile organic compounds were identified based on a melon RIL population [18]. In tomato, a total of 102 QTLs for 39 aroma compounds were located on different chromosomes using 169 RIL populations [19]. A major QTL associated with the content of phenylalanine-derived volatiles in tomato fruit was mapped on chromosome 4. Fine mapping showed that FLORAL4 was a candidate gene, and the gene function was verified using the CRISPR-Cas9 system [20]. QTL mapping of fruit aroma compounds has also been studied in some fruit trees. In apple, MdoOMT1 and AAT1 were identified as candidate genes for methylated phenylpropenes and esters, respectively, by QTL mapping [21,22]. A cluster of terpene synthase (TPS) genes was identified at a QTL associated with the content of 1,8-cineole on linkage group 29a in kiwifruit, and AcTPS1b was confirmed as a critical gene for the production of 1,8-cineole by enzyme character analysis and gene expression analysis [23]. However, few reports have identified the loci that control cucumber aroma compounds using QTL analysis.
In this study, QTL mapping of cucumber fruit aroma compounds was performed using 129 RILs developed by crossing lines Q16 and Q24. A major QTL (qol8-2.1) related to (E,Z)-2,6-nonadien-1-ol was detected, and nine lipoxygenase genes were identified in this QTL region. Correlation analysis between (E,Z)-2,6-nonadien-1-ol content and the expression of the nine CsLOXs was carried out, and sequence analysis indicated that CsaV3_2G005360 (CsLOX08) might be the candidate gene. An InDel marker was developed based on the genomic variation of CsLOX08. The marker could distinguish individuals with high and low (E,Z)-2,6-nonadien-1-ol content in cucumber fruits, which will provide cucumber breeders with a basis for molecular breeding.

Phenotypic variation of aroma compounds
The aroma compound phenotypes of the two parental lines were determined in two years. The results revealed that the total content of C6 aroma compounds (six-carbon aldehydes and alcohols) and C9 aroma compounds (nine-carbon aldehydes and alcohols) was significantly higher in line Q16 than in line Q24 (Table 1). The content of most aroma compounds of the two parental lines overlapped in both years. The difference in aroma compounds between the two parental lines was analyzed in the two years. The contents of 2-hexenal, (E)-4-nonenal, and (E,Z)-2,6-nonadien-1-ol were stably and significantly higher in Q16 than in Q24 in both years. However, the differences in some aroma compounds between the two parental lines were inconsistent between the two years, such as (Z)-3-hexen-1-ol, (Z,Z)-3,6-nonadienal, (E)-2-nonenal, (Z)-3-nonen-1-ol, and (E)-6-nonen-1-ol.
The frequency distribution of aroma compounds in the RIL population in the two years is illustrated in Fig. 1. Most aroma compounds exhibited a normal or near-normal distribution in the RIL population (Fig. 1). This suggests that these aroma compounds are quantitative traits. The violin plots showed that most aroma compounds were stable in the two years except for a few, such as (E,E)-2,4-nonadienal, (Z)-2-nonenal, (Z)-6-nonen-1-ol, and 1-hexanol.

Multivariate analysis and correlations among different aroma compounds
To determine the main variables accounting for the phenotypic variation of aroma compounds in the RIL population, principal component analysis (PCA) was employed to perform multivariate analysis. The first two principal components, PC1 and PC2, accounted for 30.32% and 20.37% of the total phenotypic variation, respectively (Fig. 2). The RIL population was distributed evenly without forming clear clusters. The two parental lines were clearly separated from each other, indicating that they have relatively different aroma compound profiles.
The correlations between different aroma compounds were calculated and are depicted in Fig. 3 and Supplementary Data Tables S1 and S2. It is noteworthy that significant positive correlations (P < .01) were observed between (E,Z)-2,6-nonadienal and (E,Z)-2,6-nonadien-1-ol in 2019 (r = 0.604) and 2020 (r = 0.293). In addition, there was a significant positive correlation between (E)-6-nonenal and (E)-6-nonen-1-ol in 2019 (r = 0.654) and 2020 (r = 0.634), indicating that there was a significant correlation between aldehydes and their corresponding alcohols. The strongest positive correlations were found between (E)-6-nonen-1-ol and (E,Z)-2,6-nonadien-1-ol (r = 0.95) and between nonanal and (E)-6-nonenal (r = 0.91).
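As an illustration of how a pairwise coefficient such as those reported above is obtained, the following minimal sketch computes Pearson's r for one aldehyde and its corresponding alcohol. The function name and the content values are hypothetical and are not data from this study; the authors performed their calculations in SPSS.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the centered values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical contents (arbitrary units) for five RILs; not data from the study.
aldehyde = [1.2, 2.0, 2.9, 4.1, 5.0]
alcohol = [0.8, 1.1, 1.9, 2.4, 3.1]
r = pearson_r(aldehyde, alcohol)
print(f"r = {r:.3f}")
```

Applying the same function to every pair of compounds would give the kind of matrix summarized in a correlation heat map.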

Genotyping by sequencing of RIL population and construction of a high-density linkage map
The RIL population and the two parental lines were sequenced using the genotyping-by-sequencing approach. A total of 70.23 Gb of high-quality clean bases (Q20 ≥ 96.16% and Q30 ≥ 89.32%) and 484.90 M clean reads were obtained and aligned to the cucumber 9930 V3 reference genome. Based on the genotyping results of the two parental lines, 241 166 single-nucleotide polymorphisms (SNPs) were found, and a total of 163 900 SNPs were identified among the RIL population. After removing SNPs that showed segregation distortion (P < .001), we obtained 1301 effective SNPs. Based on these SNPs, a high-density linkage map of the RIL population was constructed, which contained 1301 SNP markers distributed in seven linkage groups and covered a total genetic distance of 1111.41 cM, with an average of 0.85 cM per marker (Fig. 4, Supplementary Data Table S3).

QTL mapping of C6 and C9 aroma compounds in the RIL population
We performed QTL analysis of C6 and C9 aroma compounds with a 2-year phenotypic dataset from the RIL population based on the linkage map constructed in this study. Composite interval mapping (CIM) was used for whole-genome scanning of QTLs based on the phenotype data of aroma compounds. A total of 28 QTLs were detected for 19 aroma compounds; they were distributed on six chromosomes, with none on chromosome 4 (Fig. 4, Table 2). The QTLs explained 2.11% to 39.30% of the phenotypic variance of aroma compounds (Table 2). Most QTLs were distributed on chromosomes 2, 3, and 5. Among them, four QTLs associated with C6 aldehydes and alcohols were located on chromosome 5, nine QTLs associated with C9 aldehydes were located on chromosome 3, and six QTLs related to C9 alcohols were located on chromosome 2.
Genes in the lipoxygenase pathway, especially genes for enzymes, were searched in the region from mk190 to mk204 on chromosome 2. The region flanked by these two markers in the cucumber (Chinese Long) V3 reference genome spanned an interval of chromosome 2 from 1 251 767 to 3 966 418 bp and included a total of 381 genes. To search for the candidate gene for (E,Z)-2,6-nonadien-1-ol, genome-wide in silico analysis was performed to examine the functional annotation of genes, especially genes related to the synthesis of aldehydes and alcohols through the lipoxygenase pathway in cucumber. In this QTL region, there were nine lipoxygenase genes located in a physical region of 177 kb, and they were flanked by markers mk190 and mk200 (Fig. 5b).
The nine lipoxygenase genes identified in this region were CsaV3_2G005350, CsaV3_2G005360, CsaV3_2G006370, CsaV3_2G006380, CsaV3_2G006390, CsaV3_2G006420, CsaV3_2G006440, CsaV3_2G006460, and CsaV3_2G006470 (Fig. 5b). CsaV3_2G005350 and CsaV3_2G005360 were located within 14 kb of each other, upstream of the clustered genes CsaV3_2G006370, CsaV3_2G006380, and CsaV3_2G006390. CsaV3_2G006420, CsaV3_2G006440, CsaV3_2G006460, and CsaV3_2G006470 were located in a region of 56 kb. To understand the structure and function of the nine lipoxygenase genes, we conducted a phylogenetic analysis of the nine cucumber lipoxygenase genes with lipoxygenase genes of other species, including AtLOXs from Arabidopsis, LeLOXs from tomato, and NtLOXs from tobacco. This showed that all nine cucumber lipoxygenase genes were in the same clade as AtLOX1, AtLOX5, and LeLOX2, all of which were previously characterized as 9-LOX genes (Fig. 5c). The enzymes encoded by 9-LOX genes oxygenate linolenic or linoleic acid to form the corresponding 9-hydroperoxides.

Sequence variation of nine lipoxygenase genes
The sequences of the nine lipoxygenase genes were analyzed based on the sequencing dataset of the two parental lines. No significant variations were found in the sequences of these lipoxygenase genes between the two parental lines except for CsaV3_2G005360 and CsaV3_2G006370 (Supplementary Data Table S4). Compared with the CsaV3_2G005360 amino acid sequence in Q16, five amino acid variations (V27L, F114S, S365T, L381V, and R628Q) were found in Q24 (Supplementary Data Fig. S1). Furthermore, an 8-bp insertion was found in the fifth intron in the genomic DNA of Q24 compared with that of Q16. Twenty-two variable SNPs were detected between the CsaV3_2G005360 promoters of the parents (Supplementary Data Fig. S2). Nine different cis-elements were found between the promoters of the two parental lines, including B3, C2H2, Dof, G2-like, EIL, MYB, TCP, WRKY, and YABBY (Supplementary Data Table S5). CsaV3_2G005360 encoded a protein of 864 amino acids. Alignment of the CsaV3_2G005360 amino acid sequence with the predicted lipoxygenases in GenBank showed high sequence similarity between CsaV3_2G005360 and CsLOX08. Thus, the CsaV3_2G005360 gene was named CsLOX08 in this study. Compared with the coding DNA sequence of CsaV3_2G006370 in Q16, there were six synonymous SNPs in the coding DNA sequence of Q24. There were 38 variable SNPs and 2 InDels detected between the CsaV3_2G006370 promoters of the parents. CsaV3_2G006370 encoded a protein of 860 amino acids. Alignment of the CsaV3_2G006370 amino acid sequence with the predicted lipoxygenases in GenBank revealed high sequence similarity between CsaV3_2G006370 and CsLOX06. Thus, the CsaV3_2G006370 gene was named CsLOX06.

Expression pattern of nine CsLOX candidate genes and correlation analysis with aroma compounds
The expression patterns of the nine CsLOX genes clustered on chromosome 2 were analyzed at 0, 3, 6, 9, 12, and 15 days post-anthesis (dpa) of Q16 and Q24 fruits (Fig. 6b). In Q16, the expression of CsaV3_2G005350, CsLOX08, CsLOX06, CsaV3_2G005380, and CsaV3_2G006420 first increased and then decreased during fruit development, and expression of the other three decreased during fruit development. In Q24, the expression of CsaV3_2G005370 and CsaV3_2G006420 first increased and then decreased during fruit development; expression peaked at 6 dpa. The expression of CsaV3_2G005350 and CsaV3_2G005380 increased with fruit development, and the expression of others decreased with fruit development in Q24. Aroma compound content was analyzed at 0, 3, 6, 9, 12, and 15 dpa of Q16 and Q24 fruits (Fig. 6a). The content of (E,Z)-2,6-nonadien-1-ol was higher at the late stage of cucumber fruit development and lower at the early stage of cucumber fruit development in the two lines.
To find which CsLOX gene could be associated with the content of (E,Z)-2,6-nonadien-1-ol, a correlation analysis was performed between the expression of the CsLOX genes and the content of (E,Z)-2,6-nonadien-1-ol (Fig. 7a). This showed that the expression of CsLOX08 was significantly and positively correlated with the content of (E,Z)-2,6-nonadien-1-ol (R² = 0.76, P < .05) in Q16, but there was no significant correlation between them in Q24. The expression patterns of CsLOX08 in Q16 and Q24 were different. The expression pattern of CsLOX08 was the same as the accumulation pattern of (E,Z)-2,6-nonadien-1-ol in Q16, which was upregulated at stages from 0 to 12 dpa, but expression was low in Q24 at stages from 0 to 12 dpa (Fig. 6b).

Development of InDel marker and genotyping of RIL population
According to the 8-bp DNA insertion in the fifth intron of CsLOX08, an InDel marker was developed. The RILs were genotyped using the InDel marker, and three genotypes ('a', 'b' and 'h') were present in RILs. Genotypes 'a', 'b', and 'h' stood for dominant homozygous, recessive homozygous, and heterozygous, respectively; 'a' represented the loci derived from Q16, 'b' represented the loci derived from Q24, and 'h' represented loci that were heterozygous. The content of (E,Z)-2,6-nonadien-1-ol of individuals with the genotype 'b' was significantly lower than that of individuals with the genotype 'a' in both years (Fig. 9). The genotype results in RIL individuals showed that this InDel marker was closely linked with a high or low content of (E,Z)-2,6-nonadien-1-ol. Individuals genotyped 'a' had a high content of (E,Z)-2,6-nonadien-1-ol, and individuals genotyped 'b' had a low content of (E,Z)-2,6-nonadien-1-ol (Figs 8 and 9).
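The comparison between marker genotypes and alcohol content described above can be sketched as a simple grouping step. The function name and all values below are hypothetical and only illustrate the idea; they are not measurements from the RIL population, and the significance test used by the authors is not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def mean_content_by_genotype(records):
    """Group (genotype, content) pairs and return the mean content per genotype."""
    groups = defaultdict(list)
    for genotype, content in records:
        groups[genotype].append(content)
    return {genotype: mean(values) for genotype, values in groups.items()}


# Hypothetical (genotype, content) pairs; 'a' = Q16 allele, 'b' = Q24 allele,
# 'h' = heterozygous. Contents are in arbitrary units.
rils = [
    ("a", 3.1), ("a", 2.7), ("a", 3.5),
    ("b", 0.9), ("b", 1.2), ("b", 0.6),
    ("h", 2.0),
]
summary = mean_content_by_genotype(rils)
for genotype in sorted(summary):
    print(genotype, round(summary[genotype], 2))
```

A marker that is useful for selection should show clearly separated group means for 'a' and 'b', which is the pattern reported for the RILs.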

Verification of the InDel marker in some inbred lines
Nine cucumber inbred lines were selected to verify the validity of the InDel marker (Fig. 10). Four inbred lines, S63, XABP, 8681, and DA33, which had a higher content of (E,Z)-2,6-nonadien-1-ol, were identified as genotype 'a', and inbred line JY4, which had a lower content of (E,Z)-2,6-nonadien-1-ol, was identified as genotype 'b'. However, the genotyping of 9930, Gy14, 26, and 14 was inconsistent with the content of (E,Z)-2,6-nonadien-1-ol; 9930 was identified as genotype 'a' but had a low content of (E,Z)-2,6-nonadien-1-ol, and Gy14, 26, and 14 were identified as genotype 'b' but had a high content of (E,Z)-2,6-nonadien-1-ol.
Figure 8. Association between the allelic variation of CsLOX08 and the content of (E,Z)-2,6-nonadien-1-ol in extreme individuals from the RIL population. The InDel marker was used to determine the genotypes of CsLOX08 by polyacrylamide gel electrophoresis of the PCR products. The content of (E,Z)-2,6-nonadien-1-ol in extreme individuals was analyzed by GC-MS. Data are means ± standard error (n = 3).
Figure 10. Association between the allelic variation of CsLOX08 and the content of (E,Z)-2,6-nonadien-1-ol in nine cucumber inbred lines. The InDel marker was used to determine the genotypes of CsLOX08 in nine cucumber inbred lines by polyacrylamide gel electrophoresis of the PCR products. The content of (E,Z)-2,6-nonadien-1-ol was analyzed by GC-MS. Data are means ± standard error (n = 3).

Different kinds and contents of important aroma compounds give fruits distinct flavors
Nineteen characteristic aroma compounds identified by GC-MS were analyzed in parents and RIL populations. All these aroma compounds were known to occur in cucumber fruit, and many were proved to contribute to cucumber fruit flavor [3,4,24,25]. (E,Z)-2,6-nonadienal and (E,Z)-2,6-nonadien-1-ol give fruits a typical cucumber-like flavor, which is the most important constituent of the cucumber flavor. Nonanal and nonanol give citrus and rose notes, and (Z)-6-nonenal gives melon-like notes [12,26]. Marked differences in C6 and C9 aroma compound content were found among different genotypes of the RIL population, which was consistent with previous studies in apple and strawberry, where the variation of aroma compounds depends on genotype [27,28]. C6 and C9 aroma compounds are synthesized by two different branches of the lipoxygenase pathway, in which 13-LOX and 13-HPL catalyze α-linolenic acid and linoleic acid into C6 aroma compounds, and 9-LOX and 9-HPL catalyze α-linolenic acid and linoleic acid into C9 aroma compounds [29]. In both years, significant positive correlations were found among aroma compounds from the same metabolic pathway. For instance, hexanal, 2-hexenal, and 1-hexanol showed significant pairwise correlations in the two years. Significant correlations were also found among C9 aldehydes and alcohols.

QTLs controlling aroma compounds of cucumber fruit
The content of aroma compounds is a complex quantitative trait controlled by multiple genes and environmental factors. QTL mapping is an efficient tool to screen candidate genes that control quantitative traits. However, few reports have identified the loci that control cucumber aroma compounds using QTL analysis. In our QTL mapping analysis, 28 QTLs associated with C6 and C9 aroma compounds were distributed on six chromosomes (chromosomes 1, 2, 3, 5, 6, and 7), and these regions may contain the primary candidate genes that control aroma compound variation. Among them, only 21% of QTLs were reproducible in both years, and the remaining 79% of QTLs were detected in only one year. A possible explanation for most QTLs being detected in only one environment is that the aroma compounds of different individuals were affected by the environment in different ways.
QTLs associated with C6 aldehydes and their corresponding alcohols were mainly distributed on chromosome 5, and QTLs associated with C9 aldehydes and their corresponding alcohols were mainly distributed on chromosome 2 and chromosome 3. Several QTL clusters were found on chromosome 2, chromosome 3, and chromosome 5. For example, a QTL cluster (qol3-2.1, qol4-2.1, qol6-2.1, and qol7-2.1) associated with C9 alcohols was found on chromosome 2, and another QTL cluster (qol1-5.1 and qol2-5.1) associated with C6 alcohols was found on chromosome 5. The detection of clusters of QTLs could be due to compounds of similar chemical structure being formed from a common biochemical pathway, as had been previously reported in Arabidopsis [30], strawberry [31], peach [32], and tomato [33].

Variation of important aroma compound content may be controlled by lipoxygenase genes
C6 and C9 aroma compounds are synthesized through the lipoxygenase pathway, and lipoxygenases are considered to be the key limiting enzymes controlling the synthesis of C6 and C9 compounds [10,34]. In our study, a major QTL (qol8-2.1) related to (E,Z)-2,6-nonadien-1-ol was repeatedly detected on chromosome 2, and two consecutive QTL peaks were observed in 2020. We therefore speculated that an additional QTL linked to qol8-2.1 might be present at this locus. Genome scans were performed mainly for the genes involved in the enzymatic action of the lipoxygenase pathway, and a cluster of nine CsLOXs was identified in this region. Sequence variation analysis showed that the amino acid sequence of CsLOX08 was polymorphic between the two parental lines, and no polymorphism was found in the amino acid sequences of the remaining eight lipoxygenase genes between the two parental lines. Meanwhile, 22 SNPs were detected between the CsLOX08 promoters of the parents. The expression of CsLOX08 was significantly and positively correlated with the content of (E,Z)-2,6-nonadien-1-ol (R² = 0.76, P < .05) in Q16, but there was no significant correlation between them in Q24. We speculated that the variation of CsLOX08 between the two parental lines might be related to the variation of (E,Z)-2,6-nonadien-1-ol content. Previous studies have shown that CsLOX08 has an obvious expression signal in flowers, fruits, and roots [35], and that the expression of CsLOX08 responds to wounding, ACC, MeJA, ABA, NaCl, and KCl treatments [36]. Five amino acid variations (V27L, F114S, S365T, L381V, and R628Q) were found between the two parental lines. There have been many reports that amino acid variation in lipoxygenase proteins leads to changes in lipoxygenase enzyme activity. For example, the variation of the amino acid (A215V) in Anabaena minilipoxygenase causes 9-LOX to be converted to 13-LOX [37], and soybean sLOX-1 amino acid change of sF557 was found to increase the activity of the enzyme [38].
Thus, the amino acid variation of the CsLOX08 protein in Q24 may lead to a change in lipoxygenase enzyme activity. The promoter variation of CsLOX08 in Q16 and Q24 resulted in nine different cis-elements between the promoters of the parents, including B3, C2H2, Dof, G2-like, EIL, MYB, TCP, WRKY, and YABBY. The promoter variation in Q16 and Q24 may affect transcription factor binding ability, which could lead to different expression levels of CsLOX08. Previous studies reported that MYB transcription factors (FaMYB9, FaMYB10, and SlMYB75) and MADS transcription factors (RIN and SlMBP8) directly regulated the lipoxygenase pathway to affect the accumulation of aroma compounds in strawberry and tomato [39-43].

The molecular marker linked to aroma compound content
The content of aroma compounds is a key factor affecting the flavor of cucumber fruit. Marker-assisted selection (MAS) has been widely used in fruit quality breeding. However, there is no report on molecular markers of aroma compounds in cucumber. In this study, we developed an InDel molecular marker closely linked to (E,Z)-2,6-nonadien-1-ol. Two alleles of the InDel marker were present in the eight cucumber varieties, which indicated that this variation of the LOX gene was widespread in natural populations. The genotype pattern of the InDel marker and the aroma content of several cucumber varieties were not consistent, which may be because aroma is a trait controlled by multiple genes.

Plant materials
Two inbred lines, Q16 and Q24, were used in this experiment. Q16 belonged to a Northern China ecotype of cucumber, and Q24 was a landrace cucumber. RILs of the F6 generation, composed of 148 progeny developed from Q16 × Q24, were used in the autumn of 2019 and spring of 2020 field trials. Nine cucumber inbred lines (9930, Gy14, 26, 14, S63, JY4, XABP, 8681, and DA33) were used in this experiment, and the origin and market class of the nine inbred lines are listed in Supplementary Data Table S6.

Sampling of plant materials
Plant materials were grown in a plastic tunnel at the Yangling Experimental Demonstration Station (34° …). Q16, Q24, and 129 RILs were planted in the autumn of 2019 and the spring of 2020. Young cucumber leaves were collected and stored at −20 °C for DNA extraction. For RILs and parental lines, 12 plants of each line were planted. A total of six to nine well-developed, disease-free fruits from different individuals of each line were sampled at 8-9 a.m. at 12 dpa, and only one fruit was collected per plant. From most lines we collected nine fruits, which were divided equally into three groups of three fruits, each group serving as one biological replicate. For a few lines we collected six fruits, which were likewise divided into three groups of two fruits, each group serving as one biological replicate. The collected fruits were immediately frozen in liquid nitrogen, powdered using a grinder (IKA A11 basic, Germany), and stored at −80 °C until analysis. Uniform and disease-free fruits from the parents at 0, 3, 6, 9, 12, and 15 dpa were collected in three biological replicates; the fruits were stored at −80 °C for GC-MS and gene expression analysis. Uniform and disease-free fruits from the nine cucumber inbred lines at 12 dpa were sampled and pooled into three biological replicates for DNA extraction, GC-MS, and gene expression analysis.

Measurement of aroma compound contents
The content of aroma compounds was measured according to Chen et al. [11]. The aroma of 5 g of frozen fruit powder was extracted with a solid-phase microextraction fiber. After extraction, the contents of the volatile compounds were determined by GC-MS (Thermo Fisher Scientific, Waltham, MA, USA) fitted with an HP-INNOWax column (60 m length, 0.25 mm inner diameter, 0.25 μm film thickness; Agilent, Shanghai, China).

SNP identification by genotyping by sequencing and linkage map development
Young leaves collected from Q16, Q24, and 129 RIL individuals were stored at −80 °C. Genomic DNA was isolated from leaves using the modified CTAB method [44]. The DNAs of Q16, Q24, and 129 RIL individuals were digested with the EcoRI and HaeII restriction enzymes, and genotyping-by-sequencing (GBS) libraries were then constructed. The GBS libraries were sequenced using Illumina HiSeq2000 (Novogene, Beijing, China). After filtering the raw sequencing data, high-quality reads were aligned to the cucumber (Chinese Long) reference genome V3 (http://cucurbitgenomics.org/organism/20); the average mapping rate was 98.21% and the average sequence coverage depth was 17.59×. SAMtools was used to detect SNPs from the sequencing data. SNPs that showed distorted segregation (P < .001, χ² test) were filtered out. Genetic linkage maps were constructed with ASMap [45].
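The segregation distortion filter can be illustrated with a minimal sketch. It assumes an expected 1:1 ratio of the two parental homozygous genotypes in a RIL population and ignores heterozygous and missing calls; both assumptions, the function names, and the marker counts are ours and are not stated in the original methods. For one degree of freedom, the χ² P-value equals erfc(sqrt(χ²/2)).

```python
import math


def segregation_chi2(n_a, n_b):
    """Chi-square statistic and P-value (1 df) for a 1:1 expectation."""
    total = n_a + n_b
    if total == 0:
        raise ValueError("marker has no homozygous genotype calls")
    # With expected counts of total/2 each, the statistic simplifies to
    # (n_a - n_b)^2 / total.
    chi2 = (n_a - n_b) ** 2 / total
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def keep_marker(n_a, n_b, alpha=0.001):
    """Keep a marker unless its segregation is distorted at P < alpha."""
    _, p_value = segregation_chi2(n_a, n_b)
    return p_value >= alpha


# Hypothetical genotype counts (n_a, n_b) for two SNPs in 129 RILs.
markers = {"snp_balanced": (66, 63), "snp_distorted": (100, 29)}
kept = [name for name, (n_a, n_b) in markers.items() if keep_marker(n_a, n_b)]
print(kept)
```

The threshold alpha = 0.001 mirrors the P < .001 cut-off given in the text; everything else is illustrative.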

QTL mapping
QTL mapping was carried out with the R/qtl package (http://rqtl.org/) using the composite interval mapping (CIM) method. The number of permutations and the significance level were set at 1000 and 0.05, respectively. The QTL information reported in this study includes the chromosome, marker, position, confidence interval, LOD value, and the proportion of aroma phenotypic variance explained by a single QTL (r²). For each LOD peak, the 1.5-LOD support interval was determined. QTLs with LOD scores ≥ 3.0 are listed in Table 2. QTLs were named according to trait name and chromosome location, in which 'al' denotes an aldehyde and 'ol' denotes an alcohol. The first number refers to the kind of volatile compound, the second number refers to the chromosome, and the third number is the serial number of the QTL on that chromosome. For example, qol8-2.1 refers to the first QTL for ol8 [(E,Z)-2,6-nonadien-1-ol] on chromosome 2, and qal9-3.1 refers to the first QTL for al9 [(E,Z)-2,6-nonadienal] on chromosome 3. QTL positions and 1.5-LOD confidence intervals were drawn using MapChart 2.2 for Windows.
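The naming convention and the 1.5-LOD support interval can be sketched as follows. This is a simplified illustration and not the R/qtl implementation: the interval is taken as the contiguous run of scanned positions around the peak whose LOD stays within 1.5 units of the maximum. The function names and the LOD profile are hypothetical.

```python
import re

# q + trait class ('al' or 'ol') + compound index + '-' + chromosome + '.' + serial
QTL_NAME = re.compile(r"^q(al|ol)(\d+)-(\d+)\.(\d+)$")


def parse_qtl_name(name):
    """Split a QTL name such as 'qol8-2.1' into its components."""
    match = QTL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid QTL name: {name!r}")
    trait_class, compound, chromosome, serial = match.groups()
    return {
        "trait_class": "aldehyde" if trait_class == "al" else "alcohol",
        "compound_index": int(compound),
        "chromosome": int(chromosome),
        "serial": int(serial),
    }


def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) positions of the support interval."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[peak], positions[right]


print(parse_qtl_name("qol8-2.1"))

# Hypothetical LOD profile along one chromosome (positions in cM).
positions = [0, 2, 4, 6, 8, 10]
lods = [1.0, 2.8, 4.0, 5.2, 3.9, 2.0]
print(lod_support_interval(positions, lods))
```

In this toy profile the peak lies at 6 cM, and the interval extends to the neighbouring positions whose LOD remains above 3.7.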

Screening of candidate genes in qol8-2.1 region
The candidate genes associated with (E,Z)-2,6-nonadien-1-ol in the qol8-2.1 genomic region were screened referring to the cucumber 9930 V3 genome, especially for the genes that participated in the lipoxygenase pathway. To analyze the variation of nine CsLOXs between the two parental lines, Q16 and Q24 were re-sequenced on the Illumina sequencing platform by Novogene Co., Ltd (Beijing, China).
The sequences of the CsaV3_2G006370 and CsaV3_2G005360 genes were amplified from the cDNA and DNA of Q16 and Q24, respectively, using gene-specific primers (Supplementary Data Table S7), which were cloned into the pMD19-T vector (Takara) and sequenced. A 2000-bp promoter fragment of CsaV3_2G006370 and CsaV3_2G005360 was cloned from the genomic DNA of Q16 and Q24, respectively. The promoter fragments were cloned into the pMD19-T vector and sequenced. Sequence alignment was performed using Biowire Jellyfish software (version 1.5, Biowire.com). CsLOX promoters were analyzed for transcription factor binding sites (TFBSs) using the PlantTFDB website (http://planttfdb.cbi.pku.edu.cn/). According to the results of sequence alignment, CsaV3_2G006370 was named CsLOX06, and CsaV3_2G005360 was named CsLOX08.
The homologous sequences of CsLOXs were searched and downloaded from the NCBI database, and further analysis of the protein sequences was conducted and screened by BLASTP in the cucumber genome database (http://cucurbitgenomics.org/). The redundant sequences and sequences lacking the LOX domain were removed manually. The phylogenetic tree was analyzed by MEGA 5.0 software (http://www.megasoftware.net/) with 1000 bootstrap replications.

Correlation between candidate gene expression level and alcohol content
Total RNA isolation was performed according to Wan et al. [46]. Gene-specific primers for RT-qPCR were designed using Primer 5, and the primer sequences are given in Supplementary Data Table S7. Gene expression analysis of CsLOXs was performed using SYBR Green in a Bio-Rad iQ5 PCR thermal cycler (Bio-Rad, USA). The associations between CsLOX gene expression and alcohol content were investigated using GraphPad Prism 5.0 (GraphPad Software Inc., La Jolla, CA, USA).

Polymorphism analysis of candidate genes in parents and InDel marker development
Alignments of amino acid sequences of CsLOXs were performed using ClustalX (version 2.1). A pair of InDel primers was developed based on the 8-bp insertion using Primer Premier 5.0. Polymorphism analysis of InDel markers was carried out by polyacrylamide gel electrophoresis [47].

Statistical analysis of main aroma compounds
Statistical analysis of compound contents was conducted to estimate the mean, minimum, maximum, and standard deviation in RILs. Homogeneity of variances was tested before ANOVA. Significant differences in aroma compound content between parents were confirmed using Duncan's multiple range test at a level of P < .05. The normality of the distribution of the main aroma compounds in RILs was evaluated by the Shapiro-Wilk test (P < .05). The correlation between the main aroma compounds was analyzed using Pearson's correlation. PCA was employed to analyze the relationship between CsLOX expression and aroma compound content during cucumber fruit development. The variables for PCA were normalized by the Z-score method. All statistical analyses were carried out using SPSS 22.0 software (Statistical Package for the Social Sciences, Chicago, IL, USA). Violin plots of the distribution of the main aroma compounds in RILs were made using Python (version 3.8). Line graphs were generated by GraphPad Prism (version 5; GraphPad Software, USA). The correlation heat map was plotted using MeV software (MultiExperiment Viewer 4.7.4, TIGR, China).
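To make the Z-score normalization and the meaning of the proportion of variance explained by a principal component concrete, a minimal sketch follows. It standardizes each variable and estimates the first principal component by power iteration on the covariance matrix of the standardized data. This is not the SPSS procedure used in the study; the function names and sample values are hypothetical, and power iteration is chosen only because it needs no external library.

```python
import math


def zscore_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    n = len(rows)
    standardized = []
    for column in zip(*rows):
        mu = sum(column) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in column) / (n - 1))
        if sd == 0:
            raise ValueError("a constant variable cannot be standardized")
        standardized.append([(v - mu) / sd for v in column])
    return [list(row) for row in zip(*standardized)]


def covariance_matrix(rows):
    """Sample covariance matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def pc1_explained_ratio(rows, iterations=500):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance_matrix(zscore_columns(rows))
    p = len(cov)
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the unit vector approximates the largest eigenvalue.
    eigenvalue = sum(vec[i] * cov[i][j] * vec[j]
                     for i in range(p) for j in range(p))
    trace = sum(cov[i][i] for i in range(p))
    return eigenvalue / trace


# Hypothetical contents of three compounds in five samples (arbitrary units).
samples = [
    [1.2, 0.8, 3.0],
    [2.0, 1.1, 2.1],
    [2.9, 1.9, 2.8],
    [4.1, 2.4, 1.0],
    [5.0, 3.1, 1.9],
]
ratio = pc1_explained_ratio(samples)
print(f"PC1 explains {ratio:.1%} of the variance")
```

Because the variables are standardized first, compounds present at high absolute concentrations do not dominate the components simply because of their scale.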