Ryoichi Yano, Kyoko Takagi, Saeko Tochigi, Yukiko Fujisawa, Yuhta Nomura, Hiroki Tsuchinaga, Yuya Takahashi, Yoshitake Takada, Akito Kaga, Toyoaki Anai, Chigen Tsukamoto, Hikaru Seki, Toshiya Muranaka, Masao Ishimoto, Isolation and Characterization of the Soybean Sg-3 Gene that is Involved in Genetic Variation in Sugar Chain Composition at the C-3 Position in Soyasaponins, Plant and Cell Physiology, Volume 59, Issue 4, April 2018, Pages 797–810, https://doi.org/10.1093/pcp/pcy019
Abstract
Soyasaponins are specialized metabolites present in soybean seeds that affect the taste and quality of soy-based foods. The composition of the sugar chains attached to the aglycone moiety of soyasaponins is regulated by genetic loci such as sg-1, sg-3 and sg-4. Here, we report the cloning and characterization of the Sg-3 gene, which is responsible for conjugating the terminal (third) glucose (Glc) at the C-3 sugar chain of soyasaponins. The gene Glyma.10G104700 is disabled in the sg-3 cultivar, ‘Mikuriya-ao’, due to the deletion of genomic DNA that results in the absence of a terminal Glc residue on the C-3 sugar chain. Sg-3 encodes a putative glycosyltransferase (UGT91H9), and its predicted protein sequence has a high homology with that of the product of GmSGT3 (Glyma.08G181000; UGT91H4), which conjugates rhamnose (Rha) to the third position of the C-3 sugar chain in vitro. A recombinant Glyma.10G104700 protein could utilize UDP-Glc as a substrate to conjugate the third Glc to the C-3 sugar chain, and introducing a functional Glyma.10G104700 transgene into the mutant complemented the sg-3 phenotype. Conversely, induction of a premature stop codon mutation in Glyma.10G104700 (W270*) resulted in the sg-3 phenotype, suggesting that Glyma.10G104700 was Sg-3. The gmsgt3 (R339H) mutant failed to accumulate soyasaponins with the third Rha at the C-3 sugar chain, and the third Glc and Rha conjugations were both disabled in the sg-3 gmsgt3 double mutant. These results demonstrated that Sg-3 and GmSGT3 are non-redundantly involved in conjugation of the third Glc and Rha at the C-3 sugar chain of soyasaponins, respectively.
Introduction
Soybean [Glycine max (L.) Merr.] is one of the most economically important crop plants in the world, as its seeds are a major source of proteins and fats for humans and domestic animals. In addition to the primary metabolites, soybean seeds also accumulate a variety of specialized metabolites, including triterpenoid saponins. Saponins account for >2% of the hypocotyl dry weight in mature seeds (Fenwick et al. 1991). In soybeans, saponins are broadly classified into two types based on their chemical structure: group A saponins and 2,3-dihydro-2,5-dihydroxy-6-methyl-4H-pyran-4-one (DDMP) saponins. Group A saponins are bisdesmoside-type saponins that have two sugar chains at the C-3 and C-22 hydroxyl groups of the aglycone moiety, soyasapogenol A (3β, 21β, 22β, 24-tetrahydroxyolean-12-ene) (Shiraiwa et al. 1991) (Fig. 1). DDMP saponins are monodesmoside-type saponins that contain a sugar chain at the C-3 hydroxyl group of the aglycone moiety in which a DDMP residue forms a hemiacetal link with soyasapogenol B (3β, 22β, 24-trihydroxyolean-12-ene) at the C-22 hydroxyl group (Kudou et al. 1992, Kudou et al. 1993). Although both soyasapogenol A and soyasapogenol B are synthesized from 2,3-oxidosqualene via β-amyrin through cyclization and oxidation (Shibuya et al. 2006, Takada et al. 2013), the C-21 hydroxyl group is specific to soyasapogenol A (Takada et al. 2013, Yano et al. 2017). Saponins have important agronomic functions such as pathogen defense (Papadopoulou et al. 1999, Geisler et al. 2013), and some soybean saponins are associated with physiological responses in humans. For example, DDMP saponins and their degraded derivatives (group B and group E saponins) have health-promoting functions such as prevention of dietary hypercholesterolemia (Fenwick et al. 1991, Murata et al. 2005, Murata et al. 2006), suppression of colon cancer cell proliferation (Ellington et al. 2005, Ellington et al. 2006), prevention of lipid peroxidation, and liver-protecting activities induced by accelerated thyroid hormone secretion (Ishii and Tanizawa 2006). In contrast, acetylated group A saponins can cause bitter and astringent aftertastes in soy products (Okubo et al. 1992). Because these properties depend on the chemical structure and concentration of the saponins, differences in the saponin components of soybean seeds could confer different health-promoting effects on the final product (Tsukamoto and Yoshiki 2006).
Soybean genes involved in soyasaponin biosynthesis. (A) Cartoon illustrating the chemical structures of DDMP saponins (left) and group A saponins (right). Genes identified by biochemical enzyme assays and genetic studies are shown in pink and yellow boxes, respectively. Chromosomes are also mentioned, with gene names. Glyma 2.0 gene IDs for CYP93E1, Sg-5 (CYP72A69), Sg-1 (UGT73F2), GmSGT2 (UGT73P2) and GmSGT3 (UGT91H4) are Glyma.08G350800, Glyma.15G243300, Glyma.07G254600, Glyma.11G053400 and Glyma.08G181000, respectively (Sayama et al. 2012, Shibuya et al. 2006, Shibuya et al. 2010, Yano et al. 2017). Sg-4 has not been cloned yet (Takada et al. 2012, Tsukamoto et al. 1993). The glucose residue on the terminal sugar moiety at the C-3 position of soyasapogenols is controlled by the Sg-3 locus, while the rhamnose residue of the third sugar is glycosylated by GmSGT3. (B) List of soyasaponins analyzed in this study. The sugar moieties attached to the R1 and R2 sites are indicated.
The diversity of sugar moieties in soybean saponins (soyasaponins) is genetically regulated by seven naturally occurring alleles at three loci, Sg-1, Sg-3 and Sg-4 (Shiraiwa et al. 1990, Tsukamoto et al. 1993, Kikuchi et al. 1999, Takada et al. 2010, Takada et al. 2012). Among them, Sg-1 (Glyma.07G254600) has been shown to encode a UDP-sugar-dependent glycosyltransferase of the UDP-glycosyltransferase 73 (UGT73) family (Sayama et al. 2012). A single amino acid residue at position 138 of the Sg-1 protein sequence determines substrate specificity; the Sg-1a allele encodes the xylosyltransferase UGT73F4, while Sg-1b encodes the glucosyltransferase UGT73F2. The loss-of-function sg-10 allele results in the absence of a second sugar moiety at the C-22 position of soyasapogenol A, thus preventing the generation of acetylated group A saponins. The sg-10 allele was therefore used to develop the ‘Kinusayaka’ commercial soybean cultivar that does not produce bitter and astringent group A saponins (Kato et al. 2007). The sg-3 and sg-4 alleles have been identified as genetic factors involved in the conjugation of the third glucose (Glc) moiety and the second arabinose (Ara) moiety, respectively, at the C-3 position of both DDMP and group A saponins (Takada et al. 2012). Homozygous sg-3 seeds lack saponins Ab and αg, both of which have the third Glc moiety at the C-3 position (Fig. 1), while homozygous sg-4 seeds lack saponins with the second Ara moiety at the C-3 position. In addition to these factors, GmSGT2 (UGT73P2) and GmSGT3 (UGT91H4) have also been identified by their in vitro enzyme activities. Recombinant GmSGT2 and GmSGT3 have been shown to conjugate the second galactose (Gal) moiety and the third rhamnose (Rha) moiety at the C-3 position in saponins, respectively (Shibuya et al. 2010).
In this study, we identified Glyma.10G104700 as the Sg-3 gene. The natural sg-3 mutant cultivar ‘Mikuriya-ao’ has a large deletion in the genomic DNA around the Glyma.10G104700 locus. This gene encodes a glycosyltransferase enzyme of the UGT91 family (UGT91H9). We used an enzyme activity assay to show that the soybean Sg-3 protein could conjugate the third Glc moiety to the C-3 sugar chain of saponins. The sg-3 mutant phenotype (absence of saponins Ab and αg) was restored by the functional Sg-3 transgene. Furthermore, an induced premature stop codon mutation in Glyma.10G104700 (W270*) phenocopied the natural sg-3 mutant, indicating that Glyma.10G104700 is responsible for the sg-3 phenotype. The sg-3 gmsgt3 double mutant did not produce Ab, Ac, αg and βg saponins, all of which carry the third sugar moiety; however, it accumulated saponin γg, which lacks the third sugar moiety at the C-3 position. Therefore, Sg-3 and GmSGT3 are necessary and sufficient for the conjugation of the third sugar moiety at the C-3 position in saponins. Interspecies comparisons of UGT91 proteins in soybean and common bean (Phaseolus vulgaris) suggested that Sg-3 and GmSGT3 originally evolved by tandem gene duplication in the common ancestors of these plants.
Results
Identification of Glyma.10G104700 as a candidate gene for Sg-3
In a previous study, we had mapped the Sg-3 locus between the simple sequence repeat (SSR) markers Satt633 and Satt241 on chromosome 10 of the soybean genome (Takada et al. 2012). According to the Glyma2.0 soybean gene annotation (Schmutz et al. 2010), there are at least three putative glycosyltransferase genes in this region (Glyma.10G104700, Glyma.10G107900 and Glyma.10G108400) (Fig. 2A). Among these, the predicted amino acid sequence of the product of Glyma.10G104700 had a strong homology with that of GmSGT3 (BLAST search, E-value = 0.0, bit score = 659, sequence identity = 72%), which has been previously reported to show UDP-Rha glycosyltransferase activity against soyasaponin III (Shibuya et al. 2010). Southern blot analysis indicated that the sg-3 cultivar ‘Mikuriya-ao’, which is characterized by an absence of soyasaponins Ab and αg (Fig. 1) (Takada et al. 2012), lacks a genomic DNA fragment around the Glyma.10G104700 locus, when compared with the Sg-3 cultivars ‘Williams 82’, ‘Enrei’, ‘Jack’ and ‘Fukuyutaka’ (Fig. 2B). PCR also failed to amplify a Glyma.10G104700 genomic fragment in ‘Mikuriya-ao’, while a product was amplified in the Sg-3 cultivars (Fig. 2C). In contrast, genomic fragments of the neighboring genes Glyma.10G104500 and Glyma.10G104800 were successfully amplified by PCR in both ‘Mikuriya-ao’ and Sg-3 cultivars. As for the other two putative UDP-glycosyltransferase genes in this region (Glyma.10G107900 and Glyma.10G108400), the DNA fragments of these genes were also amplified by PCR in ‘Mikuriya-ao’ and Sg-3 cultivars. These results suggested Glyma.10G104700 as the most plausible candidate for the Sg-3 gene.
Identification of Glyma.10G104700 as a candidate gene for Sg-3. (A) Graphical map illustrating the Sg-3 locus on soybean chromosome 10. In the top portion of the chart, filled and open horizontal arrow boxes indicate Glyma.10G104700 and other putative glycosyltransferase genes (Glyma.10G107900 and Glyma.10G108400), respectively. In the middle of the chart, numbered positions from 1 to 5 correspond to the genomic regions amplified by PCR, shown in (C). Filled and open horizontal arrows indicate Glyma.10G104700 and its neighboring predicted genes, respectively. In the bottom section of the chart, the filled and open boxes indicate coding regions and untranslated regions, respectively, of Glyma.10G104700. The dotted bar indicates the genomic region used as the probe for the Southern blot analysis shown in (B). (B) Southern blot analysis of Glyma.10G104700 in the Sg-3 genotypes (‘Enrei’, ‘Jack’, ‘Williams 82’ and ‘Fukuyutaka’) and the sg-3 genotype ‘Mikuriya-ao’. Genomic DNA was digested with EcoRI. The probe was amplified using PCR with primers specific for Sg-3. The filled arrow indicates an undetected band in ‘Mikuriya-ao’ (sg-3). (C) Genomic PCR analysis of candidate genes (left) or positions (right) in the Sg-3 genotypes (‘Williams 82’, ‘Enrei’ and ‘Fukuyutaka’) and the sg-3 genotype ‘Mikuriya-ao’. Details of positions 1–5 are illustrated in (A).
Recombinant Glyma.10G104700 protein exhibits UDP-Glc glycosyltransferase activity against soyasaponin Bb'
To investigate the enzyme activity of Glyma.10G104700, we performed an in vitro enzyme assay for glucosyltransferase activity. The recombinant Glyma.10G104700 protein was expressed in Escherichia coli as a fusion protein with the chaperone trigger factor (TF) and a poly-histidine tag (His-tag) and then purified. TF was chosen for fusion as it enhances the expression and solubility of recombinant glucosyltransferase protein in E. coli. When the TF–Glyma.10G104700 recombinant protein was incubated with soyasaponin Bb' (also called soyasaponin III; it lacks the third sugar residue at the C-3 position) as an acceptor molecule and UDP-Glc as a sugar-donor, a substantial amount of the reaction product was observed (Fig. 3). This product had the same retention time (10.3 min) and molecular mass (m/z = 958) as a soyasaponin Ba standard that has the Glc moiety at the third position of the C-3 sugar chain (Fig. 3B, D). TF alone, which served as a negative control protein, failed to produce this product. In addition, when UDP-Gal or UDP-GlcA were used as the sugar donor, no such reaction product was observed. We then tested another acceptor, soyasaponin Bc', to investigate whether recombinant Glyma.10G104700 could utilize other acceptor molecules. When recombinant Glyma.10G104700 was incubated with soyasaponin Bc' and UDP-Glc, a reaction product with a retention time of 11.0 min was observed (Fig. 3B). The molecular mass of this product was consistent with that of soyasaponin Bx (m/z = 928), which carries the Glc moiety at the third position of the C-3 sugar chain. UDP-Gal and UDP-GlcA were not suitable sugar donors for soyasaponin Bc' in the presence of recombinant Glyma.10G104700. However, a substantial amount of soyasaponin Bc' appeared to remain unreacted even when it was incubated with recombinant Glyma.10G104700 and UDP-Glc (Fig. 3B), suggesting that the enzymatic reaction of Glyma.10G104700 is more efficient for soyasaponin Bb' than for Bc'. 
Soyasapogenol B 3-O-monoglucuronide (SBMG) and soyasapogenol B also failed to serve as sugar acceptors for recombinant Glyma.10G104700 and UDP-Glc (Supplementary Fig. S1). These results suggested that Glyma.10G104700 utilizes UDP-Glc as a sugar donor substrate to conjugate a Glc residue to the terminal end of the C-3 sugar chain of soyasaponin.
UDP-sugar glycosyltransferase assay of recombinant Glyma.10G104700 protein activity. (A) Chemical structures of group B soyasaponins. (B) Liquid chromatography–mass spectrometry (LC-MS) chromatograms for the enzyme reaction products. Soyasaponin Bb' (left) or Bc' (right) were used as acceptor molecules, along with three distinct UDP-sugar donors: UDP-glucose (+UDP-Glc), UDP-galactose (+UDP-Gal) and UDP-glucuronic acid (+UDP-GlcA). The full-length coding sequence for Glyma.10G104700 was cloned from the Sg-3 cultivar ‘Williams 82’ and the chaperone trigger factor (TF)-fused recombinant protein was subjected to UDP-sugar glycosyltransferase assay (TF-Glyma.10G104700). TF alone was used as a negative control (TF). A chromatogram of group B saponin standards is shown at the top; Ba (soyasaponin V), Bb (soyasaponin I), Bb' (soyasaponin III) and Bc' (soyasaponin IV). (C) SDS–PAGE analysis of the TF–Glyma.10G104700 recombinant protein. The molecular weight of the TF–Glyma.10G104700 recombinant protein is higher than that of the negative control (TF alone). (D) Mass spectrum of the enzyme reaction products shown in (B). The mass spectrum of the reaction product generated by TF–Glyma.10G104700, soyasaponin Bb' and UDP-Glc (middle) is identical to that of authentic soyasaponin Ba (top). The m/z value of the parental ion of the reaction product generated by TF–Glyma.10G104700, soyasaponin Bc' and UDP-Glc is identical to the theoretically expected molecular weight of soyasaponin Bx (m/z = 928).
Isolation and genetic complementation of an additional sg-3 mutant allele
To investigate whether Glyma.10G104700 is responsible for the sg-3 phenotype (absence of the Glc residue at the terminal end of the C-3 sugar chain in soyasaponins), we introduced a functional Sg-3 DNA fragment into an experimental sg-3 line named ‘JM’ (Fig. 4A), which has not only a recessive sg-3 allele from the sg-3 cultivar ‘Mikuriya-ao’, but also the ‘Jack’ genetic background to enable transformation. The transgene segregated in the siblings of the transformants (Fig. 4B). Soyasaponin accumulation analysis in these siblings revealed that those which had the Sg-3 transgene accumulated both soyasaponins Ab and αg, just like the Sg-3 cultivar ‘Jack’ (Fig. 4C). As soyasaponins Ab and αg had the Glc residue at the terminal end of the C-3 sugar chain, we concluded that the functional UDP-Glc glycosyltransferase enzyme was expressed in these siblings. In contrast, siblings without the Sg-3 transgene failed to accumulate soyasaponins Ab and αg, just like the ‘Mikuriya-ao’ line (Fig. 4C).
Genetic complementation of sg-3 by the Glyma.10G104700 transgene. (A) Map of the transformation vector designated pUHR:Sg-3. A genomic DNA fragment containing Glyma.10G104700 was cloned from the Sg-3 cultivar ‘Williams 82’ and inserted into a plasmid vector pUHR for soybean transformation. (B) RT–PCR analysis of transgene expression in the siblings of the transformants. ‘JM’ represents the sg-3 soybean experimental line that was subjected to transformation. Transgene expression was analyzed in the hypocotyl of developing seeds. (C) Liquid chromatography–mass spectrometry (LC-MS) analysis of endogenous soyasaponins in the siblings of transformants. Peaks for soyasaponins Ab, Ac, Af, αg and βg are indicated. It should be noted that accumulation of soyasaponins Ab and αg was restored in the seed hypocotyl of transgenic siblings that expressed the transgene (‘JM with trans-Sg-3’). ‘Mikuriya-ao’ (sg-3) and ‘Jack’ (Sg-3) were used as controls (B, C).
To obtain further genetic evidence, we screened for another mutant allele of Glyma.10G104700 in an induced mutant library developed in the ‘Enrei’ cultivar (Tsuda et al. 2015). One mutant, designated EnT-0406, had a premature stop codon mutation (W270*) in Glyma.10G104700 (Fig. 5A). While the wild-type ‘Enrei’ could accumulate soyasaponins Ab and αg, this mutant failed to accumulate either of these soyasaponins, which was similar to the sg-3 cultivar ‘Mikuriya-ao’ (Figs. 4, 5B). However, accumulation of soyasaponins Ac and βg, both of which have a Rha residue at the terminal end of the C-3 sugar chain, was not affected in EnT-0406. Taken together, these results indicated that Glyma.10G104700 is responsible for the sg-3 phenotype.
Changes in endogenous soyasaponin composition in induced mutants for Glyma.10G104700 (Sg-3) and GmSGT3. (A) Cartoon illustrating gene structures and mutation positions in Glyma.10G104700 (top) and GmSGT3 (bottom). The EnT-0406 mutant line has a premature stop codon mutation (W270*) in Glyma.10G104700, while EnT-0440 has an amino acid substitution mutation (R339H) in GmSGT3. These mutants were identified with a high-resolution melting screen from the induced mutant library developed in the ‘Enrei’ cultivar (Tsuda et al. 2015). (B) Liquid chromatography–mass spectrometry (LC-MS) analysis of endogenous soyasaponins extracted from EnT-0406 (sg-3 mutant) and EnT-0440 (gmsgt3 mutant) seed hypocotyls. The wild-type ‘Enrei’ was used as a control. Peaks for soyasaponins Ab, Ac, Af, αg and βg are indicated. Note that EnT-0406 (sg-3 mutant) is deficient in soyasaponins Ab and αg, whereas EnT-0440 (gmsgt3 mutant) is deficient in soyasaponins Ac and βg. The Ab and αg soyasaponins have a terminal Glc residue at the C-3 sugar chain, while Ac and βg have a terminal Rha residue at the C-3 sugar chain (Fig. 1).
The sg-3 gmsgt3 double mutant does not conjugate the terminal sugar residue to the C-3 sugar chain in soyasaponins
To obtain further genetic evidence for the role of Sg-3 and GmSGT3 in soyasaponin biosynthesis, we isolated an induced mutant for GmSGT3. One mutant, designated EnT-0440, was found to carry a single amino acid substitution (R339H) in the deduced protein sequence of GmSGT3 (Fig. 5A). Unlike the wild-type ‘Enrei’ and the EnT-0406 sg-3 mutant, the EnT-0440 gmsgt3 mutant failed to accumulate soyasaponins Ac and βg, indicating that the UDP-Rha glycosyltransferase activity required for the conjugation of the Rha residue at the terminal end of the C-3 sugar chain was absent in EnT-0440 (Fig. 5B). This phenotype was consistent with the results of a previous biochemical analysis by Shibuya et al. (2010), who demonstrated the UDP-Rha glycosyltransferase activity of GmSGT3. We obtained a sg-3 gmsgt3 double mutant by crossing the EnT-0440 mutant with the experimental sg-3 line, ‘JM’. The resultant double mutant failed to accumulate not only soyasaponins Ac and βg, but also soyasaponins Ab and αg (Fig. 6). This lack of soyasaponin biosynthesis indicated that the double mutant could conjugate neither the Glc nor the Rha residues at the terminal end of the C-3 sugar chain. However, soyasaponins Af and γg, both of which lack the terminal sugar residue at the C-3 sugar chain, accumulated normally in the mutant. These results suggested that Sg-3 (Glyma.10G104700) and GmSGT3 are both required for the terminal sugar conjugation at the C-3 position in soyasaponins.
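The genotype-to-phenotype pattern described above can be summarized as a small lookup. The following is only an illustrative sketch: the function name `expected_saponins` and its boolean parameters are hypothetical, and it assumes that Af and γg (which lack the third sugar) accumulate regardless of genotype, as reported for the double mutant.

```python
# Hypothetical helper, not part of the study: it encodes the saponin
# patterns reported for the sg-3 and gmsgt3 single and double mutants.
def expected_saponins(sg3_functional, gmsgt3_functional):
    """Return the set of soyasaponins expected in the seed hypocotyl."""
    # Af and gamma-g lack the third sugar at C-3 (assumed genotype-independent).
    saponins = {"Af", "γg"}
    if sg3_functional:
        # Sg-3 adds the terminal Glc: Ab and alpha-g.
        saponins |= {"Ab", "αg"}
    if gmsgt3_functional:
        # GmSGT3 adds the terminal Rha: Ac and beta-g.
        saponins |= {"Ac", "βg"}
    return saponins


if __name__ == "__main__":
    for sg3, sgt3, label in [
        (True, True, "wild type"),
        (False, True, "sg-3"),
        (True, False, "gmsgt3"),
        (False, False, "sg-3 gmsgt3"),
    ]:
        print(label, sorted(expected_saponins(sg3, sgt3)))
```

Because each gene contributes its own pair of saponins independently, the double mutant is simply the intersection of the two single-mutant profiles, which is what a non-redundant model predicts.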
Changes in endogenous soyasaponin composition in the sg-3 gmsgt3 double mutant. (A) Liquid chromatography–mass spectrometry (LC-MS) analysis of endogenous soyasaponins in the seed hypocotyl of sg-3 gmsgt3 double mutants. Peaks for Ab, Ac, Af, αg, βg and γg soyasaponins are indicated. Note that soyasaponins Ab, Ac, αg and βg are deficient in the sg-3 gmsgt3 double mutant, which, however, still accumulates soyasaponins Af and γg. (B) Mass spectrometry of γg (left) and βg (right) soyasaponins present in the sg-3 gmsgt3 double mutant and the ‘Enrei’ cultivar, respectively. The m/z values for the parental ions are identical to the theoretically expected molecular weights for γg (m/z = 923.50) and βg (m/z = 1069.56).
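As a rough plausibility check on the reported m/z values, the difference between βg (1069.56) and γg (923.50) can be compared with the mass of a single sugar residue. The sketch below is not from the study: the helper names are hypothetical, the residue formulas are standard anhydro-sugar compositions, and the second check assumes that Bb' and Bc' differ only in carrying Gal versus Ara as the second sugar.

```python
# Standard monoisotopic atomic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def residue_mass(c, h, o):
    """Monoisotopic mass of a residue with formula C(c)H(h)O(o)."""
    return (c * MONOISOTOPIC["C"]
            + h * MONOISOTOPIC["H"]
            + o * MONOISOTOPIC["O"])


# Glycosidic residues (sugar minus one water).
DEOXYHEXOSE = residue_mass(6, 10, 4)  # e.g. Rha
HEXOSE = residue_mass(6, 10, 5)       # e.g. Glc, Gal
PENTOSE = residue_mass(5, 8, 4)       # e.g. Ara


def matches(observed_delta, expected_delta, tol):
    """True if an observed m/z difference agrees with an expected residue mass."""
    return abs(observed_delta - expected_delta) <= tol


if __name__ == "__main__":
    # beta-g vs gamma-g: expected to differ by one Rha residue.
    print(round(1069.56 - 923.50, 2), round(DEOXYHEXOSE, 2))
    # Ba vs Bx (nominal m/z 958 vs 928): assumed to differ by Gal vs Ara.
    print(958 - 928, round(HEXOSE - PENTOSE, 2))
```

Under these assumptions, both differences agree with the expected residue masses to within the precision of the reported values.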
Phylogenetic analysis of Sg-3-related UGT91 genes in soybean and common bean
To obtain insight into the evolutionary relationships between Sg-3-related UGT91 genes, we performed a phylogenetic analysis using the predicted UGT91 amino acid sequence information from soybean, common bean (Phaseolus vulgaris) and barrel medic (Medicago truncatula). Soybean and common bean are considered to have the same ancestral origin; the soybean genome has been shown to have undergone an additional round of whole-genome duplication (Schmutz et al. 2010, Cannon and Shoemaker 2012). The genome of the common bean exhibits genome-wide synteny with the soybean genome (Wojciechowski et al. 2004, Schmutz et al. 2010, Cannon and Shoemaker 2012); however, the barrel medic genome has less synteny with the soybean genome than the common bean genome does (Young et al. 2011). The soybean genome contains at least two Sg-3 homologs that are classified as UGT91 members: GmSGT3 (UDP-Rha glycosyltransferase) and Glyma.15G051400 (Fig. 7A; BLASTp, E-value cut-off = 1e-150). The common bean genome was found to have at least three Sg-3-related UGT91 genes: Phvul.006G208300, Phvul.006G208400 and Phvul.006G208500, under the same BLASTp cut-off conditions. According to bidirectional best hit (BDBH) BLASTp analysis with the whole gene information of soybean and common bean (88,647 vs. 31,638 genes), GmSGT3, Glyma.15G051400 and Sg-3 were the BDBH partners of Phvul.006G208300, Phvul.006G208400 and Phvul.006G208500, respectively, implying possible orthologous relationships between these genes (Table 1; Fig. 7B). Sg-3 had the highest degree of similarity to Phvul.006G208500 among all the soybean and common bean UGT91 genes (77% amino acid sequence identity based on BLASTp; E-value = 0.0). Likewise, GmSGT3 had the highest degree of similarity to Phvul.006G208300 (85% identity). Glyma.15G051400 and Phvul.006G208400 had 87% sequence identity. Interestingly, while the three common bean UGT91 genes are tandemly positioned on chromosome 6 (Fig. 7B), the soybean UGT91 genes are located on different chromosomes (Chr. 8, 15 and 10), suggesting that Sg-3-related UGT91 genes might have scattered to different chromosomes during the evolution of the soybean genome. We also studied Sg-3 homologs in the barrel medic genome and found at least seven Sg-3-like genes (Fig. 7A; BLASTp, E-value cut-off = 1e-150). Although GmSGT3 and Glyma.15G051400 had BDBH relationships with barrel medic Medtr2g008220.1 and Medtr2g008226.1, respectively, there were no BDBH genes in the barrel medic genome for soybean Sg-3 (Table 1). In fact, six of the seven Sg-3-like barrel medic genes exhibited higher degrees of similarity to GmSGT3 than to Sg-3 or Glyma.15G051400 (Table 1; Fig. 7A).
Table 1. A summary of the bidirectional best hit (BDBH) search for Sg-3-related UGT91 genes
| Query | Top hit | BDBH partner | % identity | E-value | Score |
|---|---|---|---|---|---|
| Soybean vs. common bean | | | | | |
| Glyma.08G181000.1 (GmSGT3) | Phvul.006G208300.1 | True | 85.0 | 0 | 833 |
| Glyma.15G051400.1 | Phvul.006G208400.1 | True | 86.7 | 0 | 841 |
| Glyma.10G104700.1 (Sg-3) | Phvul.006G208500.1 | True | 77.2 | 0 | 740 |
| Phvul.006G208300.1 | Glyma.08G181000.1 (GmSGT3) | True | 85.0 | 0 | 833 |
| Phvul.006G208400.1 | Glyma.15G051400.1 | True | 86.7 | 0 | 841 |
| Phvul.006G208500.1 | Glyma.10G104700.1 (Sg-3) | True | 77.2 | 0 | 715 |
| Soybean vs. barrel medic | | | | | |
| Glyma.08G181000.1 (GmSGT3) | Medtr2g008220.1 | True | 77.2 | 0 | 755 |
| Glyma.15G051400.1 | Medtr2g008226.1 | True | 81.9 | 0 | 782 |
| Glyma.10G104700.1 (Sg-3) | Medtr2g008220.1 | False | 70.1 | 0 | 684 |
| Medtr2g008220.1 | Glyma.08G181000.1 (GmSGT3) | True | 77.2 | 0 | 755 |
| Medtr2g008225.1 | Glyma.08G181000.1 (GmSGT3) | False | 74.4 | 0 | 731 |
| Medtr6g036650.1 | Glyma.08G181000.1 (GmSGT3) | False | 72.3 | 0 | 704 |
| Medtr4g078060.1 | Glyma.08G181000.1 (GmSGT3) | False | 70.3 | 0 | 647 |
| Medtr4g077960.1 | Glyma.08G181000.1 (GmSGT3) | False | 70.3 | 0 | 647 |
| Medtr6g042310.1 | Glyma.08G181000.1 (GmSGT3) | False | 66.6 | 0 | 637 |
| Medtr2g008226.1 | Glyma.15G051400.1 | True | 81.9 | 0 | 782 |
BDBH homolog pairs were identified by genome-wide BLASTp search using the whole-genome information of soybean, common bean and barrel medic.
Phylogenetic analysis of UGT91 family genes. (A) Phylogenetic tree analysis of UGT91 family genes in soybean, common bean and barrel medic. Bidirectional best hit (BDBH) partner relationships between Sg-3-related UGT91 genes of soybean and common bean are indicated by brackets. Numbers I–III are used to distinguish between the BDBH groups. Amino acid sequences of UGT73 family proteins are used as controls. (B) Cartoon illustrating the positional relationship of Sg-3-related UGT91 genes in soybean and common bean. In the common bean genome, Sg-3-related UGT91 genes are tandemly positioned on chromosome 6, while in the soybean genome, they are randomly positioned on distinct chromosomes. Connected lines indicate a homologous relationship between two genes, and double lines represent an orthologous relationship inferred by BDBHs in BLASTp analysis (E-value = 0.0). Percentage values next to each line indicate the protein amino acid sequence identity.
Based on these phylogenetic relationships, we grouped soybean and common bean Sg-3-related genes into three distinct homologous groups: group I (Sg-3 and Phvul.006G208500), group II (GmSGT3 and Phvul.006G208300) and group III (Glyma.15G051400 and Phvul.006G208400). Based on the soybean mutant phenotypes (Figs. 5, 6) and the biochemical enzyme activities observed (Fig. 3) (Shibuya et al. 2010), each group probably has a different role in the biosynthesis of soyasaponins or other compounds. We next compared the protein sequences of Sg-3-related UGT91 gene products (Sg-3, GmSGT3, Glyma.15G051400, Phvul.006G208300, Phvul.006G208400 and Phvul.006G208500) to identify amino acid substitution(s) associated with UDP-sugar specificity. All six proteins have a plant secondary product GT (PSPG) motif, which is conserved in plant UGTs and is thought to play an important role in UDP-sugar specificity (Fig. 8) (Hughes and Hughes 1994, Paquette et al. 2003, Masada et al. 2007, Osmani et al. 2008). Among the 17 amino acid substitutions found within the PSPG motif between these proteins, 12 were present between Sg-3 and GmSGT3 (Fig. 8). These results suggested that differences in UDP-sugar specificity and/or enzyme function might be attributed to the substitution of only a few amino acids within these proteins.
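Counting substitutions within a motif, as done above for the PSPG motif, amounts to comparing two aligned sequences position by position over the motif window. The following Python sketch illustrates the idea under stated assumptions: the two fragments are toy strings, not the real Sg-3 or GmSGT3 sequences, and the helper name and window coordinates are hypothetical.

```python
def count_substitutions(aligned_a: str, aligned_b: str, start: int, end: int) -> int:
    """Count mismatched residues in aligned_a[start:end] vs. aligned_b[start:end].

    Both inputs must come from the same alignment (equal length).
    Positions where either sequence has a gap ('-') are not counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    window = zip(aligned_a[start:end], aligned_b[start:end])
    return sum(1 for a, b in window if a != b and "-" not in (a, b))


# Toy aligned fragments for illustration only (NOT real UGT91 sequences).
toy_a = "MKWAPQLEILSHPAVG"
toy_b = "MKWAPQVEILAHPSVG"

# Hypothetical motif window: alignment columns 2..13 (0-based, end exclusive).
print(count_substitutions(toy_a, toy_b, 2, 14))  # 3
```

With real data, the aligned strings would come from the CLUSTALW output and the window from the annotated motif boundaries.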
Amino acid sequence alignment of Sg-3-related UGT91 proteins. Amino acid sequences of soybean and common bean Sg-3-related UGT91 gene products were subjected to CLUSTALW alignment analysis. The conserved putative plant secondary product glycosyltransferase (PSPG) motif is indicated by gray lines. Orthologous groups inferred by phylogenetic tree and BLASTp bidirectional best hits analyses (Fig. 7) are indicated by the numbers to the left of the gene names. Amino acid substitutions between Sg-3 and GmSGT3 within the PSPG motif are indicated by triangles.
Discussion
Loss-of-function mutation in Glyma.10G104700 is responsible for the sg-3 phenotype
In this study, we present evidence that loss of Glyma.10G104700 function is responsible for the absence of soyasaponins Ab and αg in sg-3 mutants. First, we showed that a functional Sg-3 transgene derived from the Sg-3 cultivar ‘Williams 82’ is sufficient to restore the accumulation of soyasaponins Ab and αg in the sg-3 genetic background (Fig. 4C). Because both Ab and αg saponins have a terminal Glc residue at the C-3 sugar chain, it is evident that the Sg-3 transgene encodes a functional UDP-Glc glycosyltransferase that converts soyasaponins Af and γg into Ab and αg. This is consistent with the enzyme assay, which showed the UDP-Glc glycosyltransferase activity of the recombinant Glyma.10G104700 protein (Fig. 3). Although soyasaponins Af and γg were not used as substrates in this enzyme assay, Glyma.10G104700 could act on soyasaponin Bb', a derivative of naturally occurring soyasaponin. Along with the genetic complementation of the sg-3 mutant, these findings indicated that Glyma.10G104700 could act not only on soyasaponin Bb', but also on soyasaponins Af and γg. We also showed that a premature stop codon mutation (EnT-0406; W270*) in Glyma.10G104700, which was found in the ‘Enrei’ cultivar background, could cause the absence of soyasaponins Ab and αg, similar to the ‘Mikuriya-ao’ cultivar (Fig. 5). This mutant allele is independent of the naturally occurring sg-3 allele, thus reinforcing the hypothesis that Glyma.10G104700 alone is responsible for the sg-3 phenotype. Interestingly, soyasaponins Ac and βg were found to accumulate in both ‘Mikuriya-ao’ and EnT-0406 (W270* stop codon mutant) (Figs. 4C, 5B). As both these soyasaponins carry the terminal Rha residue at the C-3 sugar chain, this would suggest that UDP-Rha glycosyltransferase is active in both mutants. Previously, Shibuya et al. (2010) identified GmSGT3 as a UDP-Rha glycosyltransferase in an in vitro enzyme assay. 
Although genetic evidence for the role of GmSGT3 had been lacking, our present study demonstrated that the amino acid substitution mutant EnT-0440 (R339H) did not accumulate soyasaponins Ac and βg (Fig. 5B). The accumulation of soyasaponins Ab and αg was not affected in the gmsgt3 mutant, indicating that this mutant retained functional UDP-Glc glycosyltransferase activity. These results clearly demonstrated that Sg-3 and GmSGT3 play non-overlapping roles in the glycosylation of soyasaponins. This model is further supported by the phenotype of the sg-3 gmsgt3 double mutant, which does not accumulate Ab, Ac, αg or βg soyasaponins (Fig. 6A). Soyasaponins Af and γg, both of which lack the terminal sugar residue at the C-3 sugar chain, accumulated significantly in the double mutant. Thus, glycosyltransferase activity was absent at the terminal sugar residue of the C-3 sugar chain in the sg-3 gmsgt3 double mutant. Taken together, our results demonstrated that Sg-3 and GmSGT3 are required biosynthetic factors responsible for terminal Glc and Rha residue conjugation, respectively, at the C-3 sugar chain in soyasaponins. Although some saponin biosynthesis mutants are known to exhibit stunted growth in barrel medic (Naoumkina et al. 2010, Carelli et al. 2011), the sg-3 gmsgt3 double mutant did not show any visible phenotypes under normal growth conditions. Thus, neither Sg-3 nor GmSGT3 is likely to be essential for normal plant growth and development.
Evolution of Sg-3-related UGT91 genes in soybean
The most interesting feature of the soybean genome is that it has undergone at least two rounds of whole-genome duplication (Schmutz et al. 2010, Cannon and Shoemaker 2012). The size and chromosome number of the soybean genome are almost twice those of the common bean genome, and there is a strong syntenic relationship across all the chromosomes of the two plants (Schmutz et al. 2014). It would be interesting to compare the two genomes to obtain an insight into the gene evolution in these species. Interestingly, we found a BDBH relationship between the Sg-3-related UGT91 genes of soybean and common bean. For example, GmSGT3 and Glyma.15G051400 showed the highest degree of sequence identity to Phvul.006G208300 and Phvul.006G208400, respectively, among all the UGT91 genes (Fig. 7B; 85–87% amino acid sequence identity). Sg-3 exhibited 73% sequence identity with GmSGT3, although a greater degree of identity was observed between Sg-3 and its possible ortholog, Phvul.006G208500 (77% identity). However, unlike most of the soybean and common bean genes that show inter-chromosomal syntenic relationships (Cannon and Shoemaker 2012, Yano et al. 2017), there was no synteny between the UGT91 genes of these two species. While common bean Sg-3-related UGT91 genes (Phvul.006G208300, Phvul.006G208400 and Phvul.006G208500) are tandemly positioned on chromosome 6, soybean Sg-3, GmSGT3 and Glyma.15G051400 are randomly positioned on three different chromosomes (Fig. 7B). The tandem repeat nature of the common bean UGT91 genes likely reflects an ancestral form, compared with the randomly positioned soybean UGT91 genes, as it is unlikely that randomly generated homologous genes were rearranged into a tandem repeat pattern at a specific genomic position. Rather, it is likely that the genes were first duplicated as tandem repeats at a specific position and then scattered onto distinct chromosomes. 
These phylogenetic data, along with the genetically distinct roles of Sg-3 and GmSGT3, suggested that gene function first diversified during a gene duplication event in the common ancestor of common bean and soybean; the UGT91 genes were then rearranged on different chromosomes during the course of genomic evolution in the ancestor of the soybean. Indeed, Sg-3 does not appear to have a BDBH partner in the barrel medic genome (Table 1). However, the distinct substrate specificity of Sg-3-related UGT91 genes might have been achieved by only a few amino acid substitutions, as there are only 12 substitutions within the PSPG motif, between Sg-3 and GmSGT3 (Fig. 8). To assess the exact evolutionary relationship between the Sg-3-related UGT91 genes in soybean and common bean, it will be important to perform genetic complementation studies as well as enzyme assays, which could clarify the precise mechanism underlying UGT91 gene evolution and functionalization.
Materials and Methods
Plant materials and growth conditions
Seeds of soybean cultivars ‘Mikuriya-ao’ (JP29211), ‘Fukuyutaka’ (JP29668), ‘Enrei’ (JP28862), ‘Williams 82’ (PI518671) and ‘Jack’ (PI540556) were obtained from the NARO Genebank (Tsukuba, Japan) for JP accessions and the Germplasm Resources Information Network (GRIN) for PI accessions. The ‘JM’ line used for genetic complementation analysis was selected from the progeny of a cross between ‘Jack’ and ‘Mikuriya-ao’. The sg-3 genotype was screened for two SSR marker loci using the following primers: Glyma10.19870k, 5′-CAATTCATCTGAAAGTAAAATTAGA-3′ and 5′-GACACACGCAAGTTGGAATA-3′; and Satt241, 5′-CAAGGGGAACATAAGGTAGCA-3′ and 5′-GTAGAAAGCAACATTCTCAGGA-3′. Plants were grown in a soil mixture comprising ‘Nippi’ (Japan Agricultural Cooperatives) and ‘SuperMix’ (Sakata Seed Corp.) (2:1, v/v) at 28°C under 16/8 h light/dark conditions in air-conditioned greenhouses. The soil was inoculated with Rhizobium and Azospirillum species (Tokachi Nokyoren) to facilitate plant growth. For isolating genomic DNA or total RNA, harvested plant tissues were frozen in liquid nitrogen and stored at −80°C.
Genomic PCR and Southern blot analysis of the Sg-3 genomic region
Genomic DNA was extracted from fresh leaves using the standard cetyltrimethylammonium bromide (CTAB) method (Murray and Thompson 1980). Genomic DNA fragments of Glyma.10G104500 (0.3 kb), Glyma.10G104700 (1.7 kb), Glyma.10G104800 (0.6 kb), Glyma.10G107900 (1.6 kb) and Glyma.10G108400 (2.0 kb) were detected by PCR using the following primers: (Glyma.10G104500) 5′-CCAGTTAAAATTAGAGGCTCAGGCTGT-3′ and 5′-ACTACCCCGACCCTAGGACAA-3′; (Glyma.10G104700) 5′-TCCCAGGTGAAAAAGAAAACACACTGAC-3′ and 5′-AGTGGTATGGGTCGTATCTATAATTGCAC-3′; (Glyma.10G104800) 5′-GAGCTTAAGAATTGCAAGGGACTAGTTTTA-3′ and 5′-CCACATGGAGGGACTCGCAGTT-3′; (Glyma.10G107900) 5′-ACATGCCAACCCAAGACACTACTAC-3′ and 5′-CCTTTTGTTTCAGTAACCTCCTGCATTAG-3′; and (Glyma.10G108400) 5′-ATTTGTCTCACTAGGTGTCCATTAGTACAT-3′ and 5′-CTTCACTCTTCAGGCAAGCTTCAATC-3′. Genomic DNA fragments of five positions in the mapped region of Sg-3 were also amplified by PCR using the following primers: (Position 1) 5′-ATGGAAAAATGTGGTCATCGCCAAGC-3′ and 5′-GCCGTTCATACTCAATCTTGTAACATCTA-3′; (Position 2) 5′-AATCAAGTCCGATGACGTGTAAGGC-3′ and 5′-TGCTTAAGTTTCCTGAGTGCACCCTTT-3′; (Position 3) 5′-AATAATTCACCATAAAAATGGTTGCAACTT-3′ and 5′-TTACATAAATTAATATAGGATAAAGATTGT-3′; (Position 4) 5′-AAATATTCTTACTGAAATTGATTGTATATG-3′ and 5′-CGACGTTCTTGGCTAATGTCTTCTG-3′; and (Position 5) 5′-GAAACTCTTTAAAAAATAACATAAATGACG-3′ and 5′-TTTTAATTTATTTTTCTTTCATATGACCGA-3′. The amplicons were analyzed by electrophoresis on 0.8–2.0% agarose gels. Southern blot analysis was performed using the ECL Direct Nucleic Acid Labelling and Detection System (GE Healthcare), as previously described (Ishimoto et al. 2010). The DNA probe used to detect the Sg-3 genomic region was prepared by PCR using the primers 5′-TCAACATTCCCTGTGCACACTACAACT-3′ and 5′-AAAGCTCAATACCATGAGCCAACTCG-3′.
Enzyme assay for UGT glycosyltransferase activity
The open reading frame of Glyma.10G104700 (Sg-3) was amplified by PCR from a cDNA library derived from soybean seeds using the primers 5′-CACCATGCCTCTACACATTGCAATGCTCCCG-3′ and 5′-TTAACAGTTGGAATTAGGAGTCTTGTACTTTTGAAGA-3′. The underlined nucleotides indicate the bases that were introduced for cloning. The resulting PCR products were cloned via pENTR™/D-TOPO® (Invitrogen) into a Gateway-adapted version of the pCold TF vector (TAKARA BIO INC.) using the Gateway LR clonase II Enzyme mix (Invitrogen). To express the recombinant Sg-3 protein fused with TF in E. coli, the resultant plasmid was introduced into the E. coli Rosetta™ (DE3) strain (Novagen). The TF enhances the expression and solubility of the recombinant glucosyltransferase in E. coli. The transformed cells were cultured at 37°C in a Luria–Bertani broth containing ampicillin (50 μg ml–1) and chloramphenicol (34 μg ml–1). The cultures were incubated at 37°C until the OD600 reached 0.5, after which they were cooled at 15°C for 30 min. For induction of protein expression, isopropyl-β-d-thiogalactopyranoside was added at a final concentration of 0.1 mM, followed by incubation at 15°C for 22 h. The cells were harvested by centrifugation (7,000×g, 10 min) and resuspended in buffer A [40 mM sodium Pi (pH 8.0), with 1 mM EDTA, 0.3 M NaCl, 0.1% TritonX-100, 10% glycerol, 1 g l–1 lysozyme and 0.8 mM imidazole] supplemented with 1 mM dithiothreitol and a complete protease inhibitor cocktail (Roche). The cells were disrupted by ultrasonication and the cell lysate was centrifuged at 7,300×g for 10 min at 4°C to remove the cell debris. The supernatant was incubated with NiNTA agarose beads (Qiagen) at 4°C for 2 h, and the resin was transferred into a micro bio spin column (BioRad). The His-tagged recombinant protein was eluted with buffer B [40 mM sodium Pi (pH 8.0), with 1 mM EDTA, 0.3 M NaCl, 0.1% TritonX-100, 10% glycerol and 20 mM imidazole]. 
The protein concentration was determined using the Bio-Rad Protein Assay Dye Reagent (Bio-Rad) with bovine serum albumin (BSA) as a standard. SDS–PAGE was performed, and the proteins in the gels were visualized with CBB stain One (Nacalai Tesque).
Enzymatic reactions were carried out in 50 mM Tris–HCl buffer (pH 7.0) containing 2 μg of the recombinant TF–Glyma.10G104700 protein, 10 μM acceptor substrate (soyasaponin Bb' or Bc'), 50 μM UDP-sugar, 14 mM 2-mercaptoethanol and 100 μM MgCl2 in a total volume of 100 μl for 25 h at 30°C. The reaction mixtures were then extracted with 500 μl of 1-butanol, and evaporated to dryness with a centrifugal evaporator. The resultant residues were resuspended in 500 μl of methanol. The sample solutions were filtered through a GL-chromato disc 4 A filter (0.2 μm pore size, GL Sciences), and analyzed by a Waters ACQUITY TQD UPLC/MS system equipped with a UPLC HSS C18 column (2.1 mm×150 mm, 1.8 μm particle size, Waters) and a UPLC HSS C18 VanGuard pre-column (2.1 mm×5 mm, 1.8 μm particle size, Waters). Column temperature was maintained at 30°C. The flow rate was 0.2 ml min–1. The mobile phases were: A, 0.025% (v/v) acetic acid in water; and B, 0.025% (v/v) acetic acid in acetonitrile. The gradient program was as follows: 0–5 min, 30% B; 5–6 min, 30–40% B (linear gradient); 6–18 min, 40–50% B (linear gradient); 18–28 min, 50–100% B (linear gradient); 28–31.5 min, 100% B (isocratic); 31.5–36.5 min, decrease from 100% to 30% B; 36.5–38.5 min, equilibration with 30% B. Detection of reaction products by electrospray ionization mass spectrometry (ESI-MS) was performed in a negative ion mode with a capillary voltage of 2.5 kV, a dry temperature of 350°C and a flow rate of high-purity dry nitrogen gas of 10.0 l min–1. The selected-ion monitoring (SIM) mode with target ions at m/z 957.5 for soyasaponin Ba (V), m/z 941.5 for soyasaponin Bb (I), m/z 795.5 for soyasaponin Bb' (III), m/z 927.5 for soyasaponin Bx, m/z 911.5 for soyasaponin Bc, m/z 765.5 for soyasaponin Bc' (IV), m/z 633.4 for SBMG and m/z 457.4 for soyasapogenol B was used to detect target compounds. 
Additional ions at m/z 619.5, 809.6 and 971.5 were also detected together with the target ions above to confirm the UDP-sugar donor specificity of Glyma.10G104700 protein. UDP-glucose, UDP-galactose and UDP-glucuronic acid were purchased from Sigma-Aldrich. Soyasaponin III (Bb') and soyasaponin IV (Bc') were purchased from ChromaDex. Soyasaponin I (Bb), soyasaponin V (Ba) and soyasapogenol B were purchased from Tokiwa Phytochemical. SBMG was isolated from a crude soybean saponins powder (Wako Pure Chemical Industries) as described below. A 2 g aliquot of the crude soybean saponins powder was extracted with 200 ml of 70% (v/v) methanol for 22 h at room temperature with vigorous shaking. The extracts were then filtered and evaporated at 50°C. The resultant residue was resuspended in 5 ml of methanol and injected into an Inertsil ODS-3 C18 column (a main column, 10×250 mm, a particle size of 5 μm, GL Sciences; and its guard column, 3.0×10 mm, a particle size of 3 μm, GL Sciences) for HPLC analysis with a Jasco PU-2089 Plus Quaternary Gradient Pump and a UV-2075 Plus Intelligent UV/VIS Detector (Jasco). Column temperature was held constant at 30°C. The mobile phases were: A, 0.1% (v/v) formic acid in water; and B, 0.1% (v/v) formic acid in acetonitrile. Absorbance at 210 nm was continuously monitored for 90 min. Isocratic elution was performed with 60% A and 40% B at a flow rate of 4 ml min–1. Solution eluted from the column was fractionated every 5 min, and each of the fractions was analyzed by liquid chromatography–mass spectrometry (LC-MS). Fractions containing SBMG were mixed together, evaporated to dryness and dissolved in methanol again.
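The UPLC gradient program given above for the enzyme assay can be expressed as a piecewise-linear function of time, which is a convenient way to check the mobile phase composition at any point in the run. The following Python sketch is illustrative only: the function name is hypothetical, and the return from 100% to 30% B between 31.5 and 36.5 min is assumed to be linear, which the text does not state explicitly.

```python
# Each segment: (start_min, end_min, %B at start, %B at end).
GRADIENT = [
    (0.0, 5.0, 30.0, 30.0),      # 30% B hold
    (5.0, 6.0, 30.0, 40.0),      # linear gradient
    (6.0, 18.0, 40.0, 50.0),     # linear gradient
    (18.0, 28.0, 50.0, 100.0),   # linear gradient
    (28.0, 31.5, 100.0, 100.0),  # isocratic
    (31.5, 36.5, 100.0, 30.0),   # decrease to 30% B (assumed linear)
    (36.5, 38.5, 30.0, 30.0),    # equilibration
]


def percent_b(t_min: float) -> float:
    """Return the percentage of mobile phase B at time t_min (minutes)."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + (b_end - b_start) * fraction
    raise ValueError(f"time {t_min} min is outside the 0-38.5 min program")


for t in (0.0, 5.5, 12.0, 23.0, 30.0, 38.0):
    print(f"{t:5.1f} min: {percent_b(t):5.1f}% B")
```

The percentage of mobile phase A at any time is simply 100 minus the value returned.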
Biochemical analysis of saponins
Soybean saponins were extracted with 80% (v/v) aqueous methanol and analyzed using LC-MS, as previously described (Takada et al. 2013).
Isolation of total RNA and RT–PCR analysis
Frozen soybean tissues were ground with a multi-beads shocker (Yasui Kikai) that had been pre-cooled with liquid nitrogen. Total RNA was extracted and purified using the RNeasy plant mini kit (Qiagen). Isolated total RNA was stored at −80°C. To synthesize cDNA, a 1 μg aliquot of the RNA was added to a reverse transcription reaction, which was performed using the QuantiTect Rev. Transcription Kit (Qiagen) in a total volume of 20 μl. The resulting cDNA products (1 μl) were subjected to PCR to amplify cDNAs using the following specific primers: (Glyma.10G104700) 5′-CCCTCAAACAAAAGCTATTACC-3′ and 5′-CCTTATCTGAATGGATGGTGGAACC-3′; (Actin 3) 5′-ATTGAACCCCTTGTTTGCGA-3′ and 5′-ATCAGGAAGCTCATGGCTTT-3′.
Genetic complementation test
For the genetic complementation of sg-3, we constructed a vector, designated pUHR:Sg-3 (Fig. 4A). Briefly, a 5.8 kb genomic DNA fragment containing the candidate gene (Glyma.10G104700) was obtained from the Sg-3 cultivar ‘Williams 82’ by PCR using the following primers: 5′-CTGAGTCGACCATTTCTATTGCGCAACACCTTCTAAACTG-3′ (underlined sequence is an introduced SalI site) and 5′-GACTGAATTCAAAGTGATTGTGCGGTTGAACTCAGGT-3′ (underlined sequence is an introduced EcoRI site). The obtained fragment was then cloned into the pCR4Blunt-TOPO vector (Invitrogen). After confirming the DNA sequence with a genetic analyzer 3730xl (Applied Biosystems) using a BigDye-Terminator v.3.1 cycle sequencing kit (Applied Biosystems), the inserted DNA fragment (Glyma.10G104700) was subcloned into the soybean transformation vector pUHR (Nishizawa et al. 2006) using the SalI and EcoRI restriction enzyme sites. The vector was constructed with hygromycin phosphotransferase (hpt) and red fluorescent protein (DsRed2) genes as selection marker genes. The primers used for dideoxy sequencing were: 5′-TATTGATAGTATTTTTCGCTAGTTGAGAAT-3′, 5′-GATGTTGTTCTTAAATATAATTTTTTTGCT-3′, 5′-AAAAAAATGATCATTGAAAAGTCGGGTTGT-3′, 5′-CAAAGATTTATTTTTATCCATCATCTTGAC-3′, 5′-AAATAGTTACCACTTATCTTCAAAAAGGTT-3′, 5′-TTATTTTCACAGTCACCACTCTTATATCAT-3′, 5′-GGGACACTTTGTTACCTTTATAAGCACT-3′, 5′-GATTGGGTTTTTTACGATTTCGCAACCGAG-3′, 5′-TTGGCTATTGGTGGTAACCACTCGGT-3′, 5′-GAAGGAGAATGGTTGGATTATCTTGCTCAC-3′, 5′-TCACTCACTGTGGTACCAATTCTCTCG-3′, 5′-ACTCCAGTTTCTTCTCTGTTTTTGTTTTGC-3′, 5′-ATGTACATTTATTTTTATCCCCAATCTCTC-3′, 5′-TGTTGGAGATCTCACCTTGAGTAGATATAG-3′, 5′-GAGTTTTTGTAACTCTCATGCGTA-3′ and 5′-GAATGTTGTCTTCGTATTGAAAAAGTGTTG-3′.
For soybean transformation, somatic embryos were induced in the immature cotyledons of the ‘JM’ soybean experimental line (see above). The cotyledons were cultured on MSD40 medium (Finer and Nagasawa 1988) and maintained in FNL medium (Samoylov et al. 1998) at 25°C under white fluorescent light (12.1 µmol m–2 s–1; 23/1 h light/dark cycle). Transformation of the somatic embryos was performed by particle bombardment using a Biolistic PDS-1000/He Particle Delivery System (Bio-Rad). Plant regeneration was then performed, as previously described (Khalafalla et al. 2005). The regenerated plants with hygromycin B resistance (Roche Diagnostics) were examined with an M165 FC fluorescence stereomicroscope (Leica Microsystems), and those expressing the red fluorescent protein (DsRed2) were grown under greenhouse conditions. The presence of the transgene was confirmed in the T1 generation by PCR using the primers: 5′-TCAACATTCCCTGTGCACACTACAACT-3′ and 5′-AAAGCTCAATACCATGAGCCAACTCG-3′.
Isolation of mutants with induced mutations in Sg-3 and GmSGT3
The EnT-0406 (W270*, sg-3 mutant) and EnT-0440 (R339H, gmsgt3 mutant) soybean lines were isolated from the induced mutant library, which was developed using the Japanese cultivar ‘Enrei’, as previously described (Tsuda et al. 2015). The primary candidates were screened from 1,536 M2 soybean lines by high-resolution melting (HRM) analysis in the MeltDoctor HRM master mix (Life Technologies) and the ViiA7 QPCR system (Applied Biosystems). The EnT-0406 and EnT-0440 lines were isolated using the following HRM primers: (Sg-3_F2) 5′-CCTCAAACAAAAGCTATTACCTCAAGC-3′ and (Sg-3_R3) 5′-CCACATGGGTTTGAGGAGAGA-3′; and (GmSGT3_F3) 5′-CCTTACGAGTTTTTGAGAGCATACG-3′ and (GmSGT3_R4) 5′-CAAGTGGCTGTTGAGGTTCC-3′. The W270* premature stop codon mutation in Sg-3 (EnT-0406) and the R339H amino acid substitution mutation in GmSGT3 (EnT-0440) were further verified by dideoxy sequencing (see above), using the following sequencing primers: (Sg-3_R3) 5′-CCACATGGGTTTGAGGAGAGA-3′ and (GmSGT3_F4) 5′-CCTTACGAGTTTTTGAGAGCATACG-3′. For genotyping progeny, 3–5 thin slices of seed cotyledons were obtained from individual seeds and crushed with a TissueLyzer (Qiagen). Genomic DNA was then extracted from these samples using the BioSprint 96 DNA Plant Kit (Qiagen). The zygosity of the mutations was analyzed by dideoxy sequencing, as described previously. The DDBJ accession numbers for Sg-3 and GmSGT3 cDNA sequences are as follows: LC317062 (Sg-3 cDNA, ‘Enrei’), LC317063 (GmSGT3 cDNA, ‘Enrei’), LC317064 (sg-3 cDNA, EnT-0406) and LC317065 (gmsgt3 cDNA, EnT-0440).
Sequence alignments and phylogenetic tree analysis of UGT91 genes
Sg-3-related UGT91 genes were identified by BLASTp (Camacho et al. 2009), with an E-value threshold of 1e-150 in the genomes of soybean (G. max; Glyma2.0) (Schmutz et al. 2010), common bean (P. vulgaris; release v1.0) (Schmutz et al. 2014) and barrel medic (M. truncatula) (Young et al. 2011), which are available in the Phytozome database (http://www.phytozome.net/). To identify orthologous relationships between soybean and common bean Sg-3-related UGT91 proteins, a bidirectional best hits BLASTp search was performed, using both soybean and common bean sequences as queries. Protein amino acid sequence alignment was performed using CLUSTALW version 2.1 (Larkin et al. 2007). Phylogenetic analysis was performed using MEGA version 6.0 (Tamura et al. 2013). Accession numbers of protein amino acid sequences used for the phylogenetic analyses were obtained from the GenBank/EMBL/DDBJ databases: Sg-1a, AB628091; Sg-1 b, AB628089; MtUGT73F3, FJ477891; GeIF7GlcT, AB098614; MtUGT73K1, AAW56091; SaGT4A, AB182385; MtUGT73P3, FJ477889; GmSGT2, BAI99584; AcUGT73G1, AAP88406; BvUGT73C10, AFN26666; BvUGT73C11, AFN26667; AtUGT73C5, AAD20156; SbUBGlcT, AB031274; BvUGT73A4, AY526080; Gt3’GlcT, AB076697; FaGT7, ABB92749; AcUGT73J1, AAP88407; OsUGT91G7, AAL83350; AtUGT91B1, ABH04468; FaUGT91A2, AAU09445; AtUGT91A1, AAL36076; and AtUGT91C1, AAO63454. Gm, Glycine max; At, Arabidopsis thaliana; Ac, Allium cepa; Fa, Fragaria×ananassa; Sb, Scutellaria baicalensis; Gt, Gentiana triflora; Bv, Beta vulgaris; Mt, M. truncatula; Sa, Solanum aculeatissimum; and Ge, Glycyrrhiza echinata.
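The bidirectional best hit criterion used above can be sketched in a few lines of Python. The sketch assumes that BLASTp results have already been reduced to (query, subject, score) tuples; the function names are hypothetical, and the example data contain only a subset of the top hits and scores reported in Table 1 for the soybean vs. barrel medic comparison, not full BLAST output.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(forward, reverse):
    """Return the set of (a, b) pairs where b is a's best hit and a is b's best hit."""
    fwd, rev = best_hits(forward), best_hits(reverse)
    return {(a, b) for a, b in fwd.items() if rev.get(b) == a}


# Top hits and scores as listed in Table 1 (soybean vs. barrel medic).
soy_to_medic = [
    ("Glyma.08G181000.1", "Medtr2g008220.1", 755),  # GmSGT3
    ("Glyma.15G051400.1", "Medtr2g008226.1", 782),
    ("Glyma.10G104700.1", "Medtr2g008220.1", 684),  # Sg-3
]
medic_to_soy = [
    ("Medtr2g008220.1", "Glyma.08G181000.1", 755),
    ("Medtr2g008225.1", "Glyma.08G181000.1", 731),
    ("Medtr2g008226.1", "Glyma.15G051400.1", 782),
]

pairs = bidirectional_best_hits(soy_to_medic, medic_to_soy)
for soy, medic in sorted(pairs):
    print(soy, "<->", medic)
```

In this example Sg-3 (Glyma.10G104700.1) is absent from the result because its top barrel medic hit, Medtr2g008220.1, has GmSGT3 as its own best soybean hit.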
Supplementary Data
Supplementary data are available at PCP online.
Funding
This work was supported by the Program for the Promotion of Basic and Applied Research for Innovations in the Bio-oriented Industry (BRAIN) [to H.S., T.M. and M.I.]; the Ministry of Agriculture, Forestry and Fisheries of Japan and the Scientific Technique Research Promotion Program for Agriculture, Forestry, Fisheries and Food Industry [to H.S., T.M. and M.I.]; Genomics-based Technology for Agricultural Improvement [to A.K., T.A. and M.I.]; and the National Institute of Agrobiological Sciences (NIAS) Strategic Research Fund [to A.K. and M.I.].
Acknowledgments
We thank Dr. Kazuhito Fujiyama and Dr. Takao Ohashi (International Center for Biotechnology, Osaka University) for their valuable inputs in discussions on the catalytic mechanism and sugar donor specificity of UGTs and Junko Kamiya (NICS) for the production and care of transgenic soybeans. We also thank Dr. Munenori Suzuki (Osaka University) for technical advice on the separation of SBMG from crude soybean saponins.
Disclosures
The authors have no conflicts of interest to declare.
References
Abbreviations
- Ara
arabinose
- BDBH
bidirectional best hit
- DDMP
2,3-dihydro-2,5-dihydroxy-6-methyl-4H-pyran-4-one
- Gal
galactose
- Glc
glucose
- HRM
high-resolution melting
- LC-MS
liquid chromatography-mass spectrometry
- Rha
rhamnose
- SBMG
soyasapogenol B 3-O-monoglucuronide
- SSR
simple sequence repeat
- TF
trigger factor
- UGT
UDP-glycosyltransferase
Author notes
Present address: Tohoku Agricultural Research Center, NARO, Fukushima, 960-2156 Japan








