Tea GOLDEN2-LIKE genes enhance catechin biosynthesis through activating R2R3-MYB transcription factor

Abstract The biosynthesis of catechins, a major type of flavonoids accumulated in tea, is mediated by developmental cues and environmental stimuli. Light enhances but shading treatment reduces catechin accumulation in tea leaves. However, the transcription factors involved in light-mediated catechin biosynthesis remain to be identified. Two GOLDEN2 LIKE genes from tea plant (CsGLK1 and CsGLK2) were isolated and characterized in both tomato and tea plants. Transcripts of both CsGLK1 and CsGLK2 were affected by light intensity in tea plants. Overexpression of CsGLK1 and CsGLK2 promoted chloroplast development and carotenoid accumulation in tomato fruits. An integrated metabolomic and transcriptomic approach revealed that both catechin content and related biosynthetic genes were upregulated in CsGLK-overexpressing tomato leaves. Our further studies in tea plants indicated that CsGLKs directly regulate the transcription of CsMYB5b, a transcription factor involved in catechin biosynthesis. Suppression of CsGLKs in tea leaves led to the reduction of both CsMYB5b expression and catechin accumulation. Taken together, the results show that CsGLKs are involved in light-regulated catechin accumulation in tea plants by regulating expression of CsMYB5b and have great potential for enhancing the accumulation of both carotenoids and flavonoids in fruits of horticultural crops.

The MYB transcription factors, including AtPAP1, AtTT2, MtMYB14, and VvMYBPA1, play a predominant role in the accumulation of flavan-3-ols and their polymers, proanthocyanidins (PAs), in many plant species [16][17][18]. Their homologous MYB genes in tea plant include CsMYB5b and CsMYB75. CsMYB5b enhances the synthesis of both catechins and PAs in tobacco leaves by upregulating tobacco LAR and ANR [12,14]. CsMYB75 is responsible for anthocyanin hyperaccumulation in purple tea [19]. In addition, catechins and PAs were accumulated by co-expressing an Arabidopsis PAP1 MYB transcription factor (AtPAP1) and a Medicago ANR (MtANR) in tobacco and Medicago [20]. Light plays an indispensable role in the biosynthesis of catechins and PAs [21][22][23]. However, the transcription factors involved in light-mediated catechin biosynthesis remain to be identified.
The Golden2-like (GLK) transcription factors are key regulators of chloroplast development in many plant species, including maize [24], Arabidopsis [25,26], rice [27], moss [28], pepper [29], tomato [30,31], and kiwifruit [32]. The flowering plants usually contain two GLK genes (GLK1 and GLK2), due to a recent genome duplication [33]. Both GLK1 and GLK2 are expressed in photosynthetic tissue and exhibit functional redundancy [25,31]. Knockout of both GLK genes in Arabidopsis resulted in reduced chlorophyll levels and chloroplast size and number, because of the impaired expression of genes involved in photosystems and chlorophyll biosynthesis [25,26]. Overexpression of tomato GLK2 (SlGLK2), whose mutation is responsible for the uniform ripening phenotype in fruits, increased the chloroplast levels and thereby enhanced the nutritional quality of fruits [30,31] since the carotenoids are synthesized and stored in chromoplasts, which are converted from chloroplasts. More chromoplasts lead to greater accumulation of carotenoids [34,35].
Light induces GLK expression, suggesting that GLKs are important for light-induced chloroplast development [25]. A recent study indicated that dark induces GLK1 degradation while light can stabilize and activate GLK1 by BIN2-mediated phosphorylation in Arabidopsis [36]. In tomato, SlGLK2 was degraded by the CUL4-DDB1-DET1 ubiquitin E3 ligase complex [37], which is a crucial component of light signaling and is required for plant photomorphogenesis [38]. Loss-of-function mutants of the E3 ligase complex displayed higher chloroplast levels and increased carotenoid accumulation in tomato fruits [34,35,39], possibly because SlGLK2 is stabilized in these mutants [37]. UV-B light increases SlGLK2 expression and overexpression of the tomato UV-B receptor UVR8 (SlUVR8) also enhances chloroplast development and carotenoid accumulation by increasing SlGLK2 accumulation [40].
The functions of GLK genes in chloroplast development and carotenoid accumulation were characterized previously. However, it remains to be determined whether GLK genes regulate flavonoid biosynthesis. In this study, we demonstrated that tea GLK genes (CsGLK1 and CsGLK2) not only promote chloroplast development and carotenoid accumulation but also participate in the light-mediated biosynthesis of catechins and PAs.

Protein alignment and subcellular localization
The tomato GLK2 protein sequence was used as a BLAST query against the tea genome sequence database (http://tpia.teaplant.org/index.html) [41]. Six loci were obtained and are highlighted in the green frame in Supplementary Data Fig. S1. Only two loci (TEA015144 and TEA009544) contain the complete nuclear localization signal (NLS, indicated in the yellow frame in Supplementary Data Fig. S1), GARP DNA-binding domain (DBD, indicated in the blue frame in Supplementary Data Fig. S1), and AREAEAA/AREVEAA hexapeptide (highlighted in the red frame in Supplementary Data Fig. S1). All these domains are conserved and essential for GLK functions [24]. For further functional characterization, these two genes were isolated and named CsGLK2 and CsGLK1, respectively, based on protein sequence alignment (Supplementary Data Fig. S2A) and phylogenetic tree analysis (Supplementary Data Fig. S2B).
The protein sequence alignments (Supplementary Data Fig. S2A) showed that the CsGLKs have an N-terminal DBD (marked by a black line) and a conserved GOLDEN2 C-terminal (GCT) box [24]. In addition, the hexapeptide sequence (AREAEAA/AREVEAA, marked with a black asterisk) is also highly conserved in the GLK proteins [25]. The alignment analysis showed that CsGLKs contain the typical domains that define the GLK protein family. The phylogenetic tree analysis (Supplementary Data Fig. S2B) showed that CsGLK1 and CsGLK2 were more closely related to kiwifruit AchGLK [32].
To examine the subcellular localization of the CsGLKs, green fluorescent protein (GFP) was fused to the C termini of CsGLK1 and CsGLK2. Transient expression in tobacco protoplasts showed that CsGLK1-GFP and CsGLK2-GFP were localized in the nucleus, whereas the signal from GFP alone was detected in the cytoplasm (Supplementary Data Fig. S3). These results confirmed the nuclear localization of both CsGLK1 and CsGLK2.

Expression pattern of CsGLK1 and CsGLK2
To study the tissue-specific expression of CsGLK1 and CsGLK2, total RNAs were extracted from different tissues of tea plants. qRT-PCR assays showed that CsGLKs were expressed in the leaves and stems tested, but at a very low level in roots and flowers (Fig. 1A). CsGLK2 showed much higher expression in tea fruits than CsGLK1 (Fig. 1A), similar to the patterns of SlGLKs in tomato [30]. In addition, both genes were downregulated by treatment with decreased light intensity (Fig. 1B). Only 34% of CsGLK1 transcripts and 18% of CsGLK2 transcripts were detected in the dark, compared with control light (216 μmol m⁻² s⁻¹) (Fig. 1B). These results indicated that light regulates the expression of CsGLKs in tea plants.
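The text reports transcript levels relative to the control light condition but does not state how the qRT-PCR data were normalized. As an illustration only, the sketch below assumes the widely used 2^-ΔΔCt calculation with a single reference gene; the function name and all Ct values are hypothetical and are not taken from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative transcript level of a target gene (treated vs. control).

    Assumes the 2^-ddCt calculation with one reference gene and
    roughly 100% amplification efficiency (an assumption, not a
    detail given in the text).
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated condition against the control condition.
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target needs two more cycles in the dark
# than in control light, while the reference gene is unchanged.
dark_vs_light = relative_expression(26.0, 18.0, 24.0, 18.0)
print(f"Relative expression in dark: {dark_vs_light:.2f}")
```

Under this calculation, a value of 0.34 would correspond to the "34% of transcripts" wording used above.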

CsGLK overexpression enhanced chloroplast development and carotenoid accumulation in tomato
The 'Micro-Tom' tomato is suitable for functional verification of GLKs because it contains an insertion mutation in the SlGLK2 gene, which causes the uniform ripening phenotype in fruits [30]. Therefore, we used the CaMV35S promoter and ectopically overexpressed CsGLK1 and CsGLK2 in 'Micro-Tom' tomato plants to investigate their function in the regulation of chloroplast development. More than 10 transgenic tomato lines were obtained. For each construct, three independent transgenic lines (CsGLK1-OE-1, -2, and -3; CsGLK2-OE-1, -2, and -3) with the highest expression were chosen for further analysis and their transcription levels were verified by qRT-PCR (Fig. 2A). As expected, all transgenic plants overexpressing CsGLK1 or CsGLK2 displayed darker green leaves (Fig. 2B) and immature fruits ( [...] and leaves (Supplementary Data Fig. S4C and D). Similar phenotypes were observed in tomato fruits overexpressing SlGLK2 [30,31] or AchGLK [32], suggesting that CsGLKs share the conserved function in promoting chloroplast development and chlorophyll synthesis.
Previous studies showed that the carotenoid content of fruit is positively correlated with chloroplast development [31,34]. Therefore, we also analyzed carotenoid accumulation in the fully ripened fruits of CsGLK1/CsGLK2-overexpressing lines, and the results indicated that total carotenoid contents were obviously increased in the transgenic lines (Fig. 2F).

CsGLK-overexpressing leaves
Since tea is made from the leaves of tea plants and it remains largely unknown how GLK influences leaf metabolism, we used a metabolomic assay to investigate the alterations of leaf metabolism in CsGLK-overexpressing tomato plants. Six biological replicates were analyzed for WT and CsGLK1- and CsGLK2-overexpressing plants. Each replicate was a pooled sample collected from three independent transgenic lines or WT individuals. A total of 297 annotated metabolites were identified in at least one of the samples (Supplementary Data Table S1), including ∼30% amino acids, 28% lipids, 18% carbohydrates, 10% nucleotides, 8% cofactors and vitamins, 4% xenobiotics, and 2% other groups (Fig. 3A). All identified metabolites were analyzed by agglomerative hierarchical clustering and are presented in a heat map (Supplementary Data Fig. S5). Principal component analysis (PCA) showed that these samples could be divided into three groups for WT, CsGLK1-, and CsGLK2-overexpressing plants, respectively, indicating the reliable repeatability of these samples derived from three different genotypes (Fig. 3B).
Next, more than 160 differentially accumulated metabolites (DAMs) were identified among the three genotypes (Supplementary Data Table S1). KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis indicated that these DAMs were clearly enriched in the processes of primary metabolism, such as the metabolism of amino acids, ABC transporters, and energy, as well as in some secondary metabolic pathways, such as α-linolenic acid metabolism and isoquinoline alkaloid biosynthesis (Fig. 3C). Among them, phenylpropanoid, flavonoid, flavone, and flavonol biosyntheses were highly related (highlighted by red frames in Fig. 3C). These results indicated that CsGLKs are involved not only in primary metabolism but also in some secondary metabolic pathways, especially flavonoid biosynthesis, which is pivotal for the nutrient qualities of tea.
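The statistical test behind the KEGG enrichment is not described in the text. As a minimal sketch, pathway enrichment of this kind is often assessed with a one-sided hypergeometric (over-representation) test; the function below assumes that approach, and the pathway sizes used in the example call are hypothetical placeholders.

```python
from math import comb


def enrichment_p(total, in_pathway, selected, selected_in_pathway):
    """One-sided hypergeometric P-value for pathway over-representation.

    total               -- all annotated metabolites (background)
    in_pathway          -- background metabolites assigned to the pathway
    selected            -- metabolites in the list of interest (e.g. DAMs)
    selected_in_pathway -- listed metabolites that fall in the pathway

    Returns P(X >= selected_in_pathway) when `selected` metabolites are
    drawn without replacement from the background.
    """
    upper = min(in_pathway, selected)
    favourable = sum(
        comb(in_pathway, i) * comb(total - in_pathway, selected - i)
        for i in range(selected_in_pathway, upper + 1)
    )
    return favourable / comb(total, selected)


# 297 annotated metabolites and 160 DAMs follow the counts in the text;
# the pathway size (12) and overlap (10) are hypothetical.
p = enrichment_p(total=297, in_pathway=12, selected=160, selected_in_pathway=10)
print(f"Enrichment P-value: {p:.4g}")
```

In practice such P-values would usually be corrected for the number of pathways tested, which this sketch does not do.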
To further investigate how CsGLKs affect flavonoid biosynthesis (Fig. 4A), we presented the flavonoid-related metabolites detected in the metabolomic assays as the heat map shown in Fig. 4B. Among these metabolites, cinnamic acid, coumaric acid, epicatechin, and catechin were significantly increased in both the CsGLK1- and CsGLK2-overexpressing lines compared with WT plants (Fig. 4B). Catechin was increased by 4.49 and 4.05 times in the CsGLK1- and CsGLK2-overexpressing lines, respectively (Supplementary Data Table S1). To further confirm these results, we used two commercial assay kits to quantitatively measure the total flavonoids ( [...] catechin monomers (Supplementary Data Fig. S6) also confirmed that catechin (C) and epicatechin gallate (GCG) are substantially accumulated in the transgenic tomato leaves. The results showed that CsGLK overexpression enhanced flavonoid and OPC accumulation in both fruits and leaves.
Next, we performed transcriptome analysis for the same samples as those used in the metabolomic analysis, and all the differentially expressed genes (DEGs) are listed in Supplementary Data Table S2. Most of the flavonoid biosynthetic genes (Fig. 4A) were detected in our transcriptome and the average levels of their transcripts in three replicates are shown in a heat map in Fig. 4E. The PAL, C4H, F3′H, DFR, and ANR genes were upregulated in both CsGLK1- and CsGLK2-overexpressing lines (Fig. 4E). Most of their expression patterns were validated by qRT-PCR (Supplementary Data Fig. S7).
The enhanced expression of PALs and C4H might be responsible for the increased accumulation of cinnamic and coumaric acids. Upregulation of the F3′H, DFR, and ANR genes (Fig. 4E) should contribute to the increased accumulation of epicatechin and catechin (Fig. 4B). Consistent with the transcriptome analysis, an increase in ANR expression was detected in the CsGLK1- and CsGLK2-overexpressing lines by qRT-PCR assays (Fig. 4F). Although the LAR gene was not identified in our transcriptomic analysis, our qRT-PCR showed that it was also upregulated in both transgenic lines (Fig. 4F). These results indicated that the enhanced catechin and epicatechin accumulation possibly resulted from upregulation of some flavonoid biosynthetic genes, including SlLAR and SlANR, which play important roles in catechin and epicatechin biosynthesis.
[...] the co-expression of CsANR-Pro::GUS with 35S::CsGLK1 (or 35S::CsGLK2) had no obvious effect on GUS activity (Fig. 5A). Similarly, the combined expression of CsLAR-Pro::GUS with 35S::CsGLK1 (or 35S::CsGLK2) did not produce higher GUS activity than the individual expression of CsLAR-Pro::GUS (Fig. 5A). These results indicated that CsGLKs cannot activate the promoters of the CsLAR or CsANR genes. Among the DEGs from our transcriptomic analysis, several MYB transcription factors were identified (Supplementary Data Table S2 and Supplementary Data Fig. S8), including SlMYB54, a close homolog of CsMYB5b, which was demonstrated to positively regulate catechin biosynthesis by activating LARs and ANRs in tea plants [12,14]. The qRT-PCR analysis confirmed that SlMYB54 and other identified MYB genes were indeed upregulated in the CsGLK1- and CsGLK2-overexpressing transgenic tomato lines (Supplementary Data Fig. S8), suggesting that CsGLKs might enhance catechin synthesis by directly activating the transcription of CsMYB5b in tea plants.
To support this notion, we expressed CsMYB5b-Pro::GUS alone or co-expressed with 35S::CsGLK1 (or 35S::CsGLK2) in tobacco leaves. We observed significantly higher GUS activities in the co-expressed combinations than with the expression of CsMYB5b-Pro::GUS alone (Fig. 5A). In addition, the native promoter of another MYB gene (CsMYB5a), which promotes the accumulation of PAs [42], was also cloned and fused with GUS. The combined expression of CsMYB5a-Pro::GUS and 35S::CsGLK1 (or 35S::CsGLK2) did not alter the GUS activities compared with expression of CsMYB5a-Pro::GUS alone (Fig. 5A). These results suggest that CsGLKs transcriptionally activate CsMYB5b but not CsMYB5a. Next, the electrophoretic mobility shift assay (EMSA) was used to study whether CsGLKs can physically bind the promoter of CsMYB5b. Based on the identified cis-elements that GLK binds in Arabidopsis [26,36], we found three similar element motifs in the promoter of CsMYB5b and designed three corresponding probes (probes 1, 2, and 3) (Supplementary Data Fig. S9A). EMSA results showed that CsGLKs were capable of binding probes 1 and 2 in vitro (Fig. 5B), but not probe 3 (Supplementary Data Fig. S9B). These results indicated the direct interaction between CsGLKs and the promoter of CsMYB5b.
To further confirm the CsGLK-mediated regulation of CsMYB5b expression and catechin biosynthesis, a gene-specific antisense oligonucleotide (AsODN) method was employed to suppress the expression of CsGLK1 and CsGLK2 in C. sinensis leaves [43]. Consequently, the expression of CsGLK1 in tea leaves treated with AsODN_CsGLK1 was clearly decreased in three independent tests, compared with leaves treated with sense oligonucleotides of CsGLK1 (sODN_CsGLK1) (Fig. 6A). The mRNA levels of CsMYB5b (Fig. 6B), CsANR (Fig. 6D), and CsLAR (Fig. 6E) and OPC contents (Fig. 6F) of the tea leaves were all significantly reduced in the AsODN_CsGLK1 plants compared with those in control (sODN_CsGLK1) plants. Similarly, CsGLK2 silencing (Fig. 6A) also impaired the expression of these genes (Fig. 6B, D, and E) and OPC contents (Fig. 6F). HPLC analysis of catechin monomers also confirmed that all the detectable monomers were significantly lower in AsODN leaves than in sODN leaves (Supplementary Data Fig. S10). Consistent with the results of the above-mentioned GUS assays, suppression of CsGLKs had no obvious effect on CsMYB5a expression (Fig. 6C). These results confirmed that CsGLKs mediate catechin biosynthesis in tea leaves by transcriptionally regulating CsMYB5b but not CsMYB5a. In addition, light intensity also affects expression levels of CsMYB5b, CsANR, and CsLAR in the tea plant (Supplementary Data Fig. S11), which is consistent with the expression pattern of CsGLKs (Fig. 1B).

Discussion
GLKs have been demonstrated to play a key role in chloroplast development in many plant species [25,26,30,31]. Our results indicate that CsGLK1 and CsGLK2 from the tea plant also possess the conserved function in enhancing chloroplast development, as indicated by the increased chloroplast number and size in fruits and leaves of CsGLK-overexpressing tomato plants (Supplementary Data Fig. S4). The carotenoids, including lycopene, are very important nutrients of tomato fruits and are synthesized and accumulated in chromoplasts. CsGLK-overexpressing tomato fruits exhibit higher contents of carotenoids (Fig. 2) since they contain more chromoplasts than WT plants.
In most tomato cultivars, flavonoids are hardly produced in the fruits because of low expression of flavonoid biosynthetic genes in fruits [44,45]. Flavonoids, a large family of polyphenolic compounds, include anthocyanins, flavan-3-ols, flavonols, flavones, flavanones, and isoflavones [46]. Overexpression of the petunia CHI gene increased flavonol levels in tomato fruits [44]. Mutation [47] or fruit-specific silencing [48] of the tomato DET1 gene led to higher contents of flavonol (quercetin) and flavanone (naringenin) in fruits. Fruit-specific co-expression of a bHLH transcription factor, Del, and an MYB-related transcription factor, Ros1, resulted in purple tomato fruits with enhanced accumulation of anthocyanins [45]. Recent studies indicated that SlAN2-like (an MYB transcription factor) also regulates anthocyanin accumulation in both peel and flesh of tomato fruits [49,50]. To our best knowledge, enhanced flavan-3-ol biosynthesis in tomato fruits has not been accomplished yet. Our work indicated that overexpression of CsGLKs enhanced the accumulation of flavan-3-ols, including catechin, epicatechins, and their polymers (PAs) in both leaves and fruits (Fig. 4).
The biosynthesis of catechin, a major type of flavonoid accumulated in tea, is mediated by both developmental cues and environmental stimuli [23,51]. In most cases, light promotes catechin biosynthesis [21,52,53] and shading treatment reduces catechin accumulation in tea leaves [22,23,51,54]. However, it remains largely unknown how light regulates catechin biosynthesis. Our results indicated that the transcription of both GLK genes was affected by various light intensities (Fig. 1B) and overexpression of CsGLKs significantly increased the contents of catechin and epicatechin (Fig. 4; Supplementary Data Fig. S6). Suppression of CsGLK1 or CsGLK2 expression in tea leaves resulted in reduced accumulation of catechin monomers (Supplementary Data Fig. S10) and OPC contents (Fig. 6). Interestingly, CsGLK-overexpressing tomato leaves substantially accumulated catechin and epicatechin gallate (Supplementary Data Fig. S10), indicating that tomato might have all the structural genes required for catechin biosynthesis and has great potential for studying catechin accumulation. In addition, these results suggested that CsGLKs are involved in light-mediated catechin accumulation in tea plants and that shading treatment decreases catechin contents, possibly through suppressing expression levels of CsGLKs (Supplementary Data Fig. S12). In Arabidopsis, GLK genes were also induced by light [25] and GLK proteins were degraded in the dark [36]. It appears that light-mediated GLK expression is also conserved in tea plants.
PAL catalyzes the rate-determining step in phenylpropanoid synthesis [55]. Previous studies indicated that both gene expression and enzyme activity of PAL are affected by shading treatment and PAL is involved in light-mediated catechin biosynthesis [21][22][23]. In CsGLK-overexpressing plants, transcripts of PALs and the catalytic product cinnamic acid were significantly increased (Fig. 4). In addition, transcripts of C4H, F3′H, ANR, and LAR were decreased by shading treatment [23] and all were increased in CsGLK-overexpressing plants in our study (Fig. 4). These results indicate that CsGLKs participate in light-mediated activation of flavonoid biosynthetic genes.
LARs catalyze the conversion of leucoanthocyanidins to catechins [56] and ANRs convert anthocyanidins into epicatechins [57]. Co-expression of LAR and ANR enhances the accumulation of flavan-3-ols and their polymeric PAs [12]. Although upregulation of LAR and ANR was observed in the CsGLK-overexpressing tomato lines, CsGLKs did not directly activate the promoters of CsLAR and CsANR (Fig. 5). Previous studies in many plant species indicated that MYBs, such as AtTT2, VvMYBPA1, and MtMYB14, mediate the biosynthesis of flavan-3-ols and PAs [16][17][18]. Their homologous MYB gene in the tea plant (CsMYB5b) was also shown to promote accumulation of flavan-3-ols and PAs by upregulating LAR and ANR [12,14]. Our transcriptomic results revealed that several MYB genes, including SlMYB54, the homologous gene of CsMYB5b in tomato, were upregulated in the CsGLK-overexpressing plants (Supplementary Data Fig. S7). Therefore, we speculate that CsGLKs increased the transcripts of LAR and ANR genes, possibly through directly activating transcription of CsMYB5b in tea plants. Our further analysis confirmed that both CsGLKs recognized the promoter of CsMYB5b (Fig. 5). The AsODN-mediated suppression of CsGLKs in tea plant leaves also led to decreased expression of CsMYB5b and reduced accumulation of PAs (Fig. 6B and F). These results indicated that CsGLKs enhance catechin biosynthesis by directly regulating CsMYB5b transcription in tea leaves (Supplementary Data Fig. S12).
Metabolic activities in chloroplasts are highly oxidizing and the rapid electron flow frequently results in the production of chloroplastic reactive oxygen species (ROS) [58,59]. The presence of ROS-generating centers makes the chloroplast a major organelle of ROS production, especially under fluctuating light intensity [60,61]. Flavonoids, including catechin, are natural antioxidants and act as scavengers of ROS [62,63]. On one hand, CsGLKs promote chloroplast development. On the other hand, they also enhance the biosynthesis of flavonoids, possibly to alleviate the oxidative stress caused by more chloroplasts under fluctuating irradiation. This speculation is consistent with a previous observation indicating that light-sensitive tea leaves produced large amounts of flavonoids, including catechin and epicatechin, which function in photoprotection and facilitate the acclimatization of tea plants by scavenging ROS [51]. Further investigation is required to demonstrate the coordination between CsGLK-enhanced chloroplast development and flavonoid biosynthesis.
Our characterization of tea GLK genes (CsGLK1 and CsGLK2) has indicated that CsGLKs are involved in both chloroplast development and catechin accumulation in tea plants. These two genes are also promising candidates for genetic manipulation to enhance accumulation of both carotenoids and flavan-3-ols in fruits of horticultural crops.

Plant material and growth conditions
Tea plants (C. sinensis 'Shuchazao') were grown in a tea plantation at Anhui Agricultural University, Hefei, China (117.27°E, 31.86°N). Roots, stems, mature leaves, flowers, and fruits of tea were collected and immediately frozen in liquid nitrogen. Shade treatment was performed using shading nets with a transmittance rate of 50% and the whole tea plants were covered by the shading nets for 24 hours. Control and shaded plants were grown in an incubator at 24 °C with continuous light (light intensity 216 μmol m⁻² s⁻¹) for 24 hours. Dark treatment was performed in the same incubator without light.
WT tomatoes (Solanum lycopersicum 'Micro-Tom') were used in our laboratory. They were grown in a constant-temperature cultivation chamber at 24 °C with a photoperiod of 16 hours light/8 hours dark.
Tobacco (Nicotiana tabacum and N. benthamiana) plants, which were used for GUS activation and subcellular localization, respectively, were cultivated at a constant temperature of 24 °C with a photoperiod of 16 hours light/8 hours dark.

Phylogenetic analysis of GLKs
The protein sequences of GLKs from kiwifruit, rice, tomato, and Arabidopsis were used. The tree was constructed according to the alignments, using MEGA 7.0 with the neighbor-joining method under the standard parameters. Alignment of amino acid sequences was conducted using DNAMAN 8.0.

Subcellular localization
The coding sequences of CsGLK1 and CsGLK2 were amplified by PCR with gene-specific primers (Supplementary  Data Table S3). The product was inserted into vector pART27 containing the CaMV35S promoter and a GFP. The recombinant vectors 35S::CsGLK1/CsGLK2-GFP were transformed into A. tumefaciens strain GV3101, which was infiltrated into tobacco (N. benthamiana) leaves. The infiltrated plants were cultured for 2 days. Protoplasts from the injection site were isolated by enzymolysis (Cellulase R-10 and Macerozyme R-10, Yakult, Japan) and subsequently treated with 10 μg/mL DAPI for 30 min. The GFP signal was detected with a confocal microscope (FV1000; Olympus, Tokyo, Japan).

Tomato transformation
Whole coding sequences for CsGLK1 and CsGLK2 were amplified and ligated into plant expression vector pBI121 containing the 35S promoter. The recombinant vectors 35S::CsGLK1/CsGLK2OE were transformed into A. tumefaciens strain EHA105. A. tumefaciens-mediated transformation was used to obtain transgenic plants according to the described procedure [34].

Transmission electron microscopy analyses of chloroplast cells
For analysis by TEM, the pericarps of mature green fruits (20 days after anthesis) and leaves from transgenic and WT tomato plants were cut and fixed in electron microscope fixative (Servicebio, Wuhan, China). The sections were analyzed using a transmission electron microscope (Hitachi, HT7700). The chloroplast area was measured using ImageJ software.

Metabolomic and transcriptomic analysis
Leaf tissues (six biological replicates, each of which was a pooled sample from three independent transgenic lines) of 40-day-old transgenic lines (CsGLK1OE and CsGLK2OE) and WT plants were harvested and examined by ultrahigh-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS). All metabolites were analyzed by comparison of the ion features in the samples according to a library of chemical standard entries, including molecular weight, preferred adducts, retention time, and in-source fragments as well as associated MS spectra and the results were checked for accuracy by visual inspection for quality control [65].
RNA-Seq cDNA libraries were obtained from RNA isolated from the above-mentioned tomato leaves (three biological replicates for each genotype) using the NEBNext mRNA Library for Illumina, and subsequently PCR-amplified using NEBNext Multiplex Oligos for Illumina (New England Biolabs, USA). A High Sensitivity DNA chip on a 2100 Bioanalyzer (Agilent Technologies, USA) was used to determine the quality and average length of cDNAs in the library. RNA-Seq libraries were sequenced on a HiSeq 2500 (Illumina) system according to the manufacturer's instructions. DEG data were analyzed using methods similar to those described previously [66]. Genes with a fold change ≥2 and P-value < 0.01 were determined to be differentially expressed.
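The DEG criterion stated above (fold change ≥2 and P-value < 0.01) can be sketched as a simple filter. The sketch assumes that the fold-change cutoff is applied symmetrically, so that genes reduced to half or less also count; the text does not say this explicitly. The gene identifiers and values in the example are hypothetical.

```python
import math


def is_deg(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.01):
    """Return True if a gene meets both DEG thresholds.

    fold_change is the expression ratio (transgenic / WT) and must be
    positive. The cutoff is applied on the log2 scale so that up- and
    downregulation are treated symmetrically (an assumption).
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    passes_fc = abs(math.log2(fold_change)) >= math.log2(fc_cutoff)
    return passes_fc and p_value < p_cutoff


def filter_degs(records):
    """Keep (gene_id, fold_change, p_value) tuples that pass is_deg."""
    return [r for r in records if is_deg(r[1], r[2])]


# Hypothetical records: (gene identifier, fold change, P-value).
example = [
    ("geneA", 3.2, 0.001),   # upregulated, significant
    ("geneB", 0.4, 0.004),   # downregulated, significant
    ("geneC", 1.5, 0.0001),  # significant but below the fold-change cutoff
    ("geneD", 5.0, 0.05),    # large change but not significant
]
for gene_id, fc, p in filter_degs(example):
    print(gene_id, fc, p)
```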

Transient expression and GUS assays
The native promoter (2023 bp upstream sequence of the coding region) of CsMYB5b (KY827397) was isolated from 'Shuchazao' and inserted into pBI121 to replace the CaMV35S promoter followed by the GUS reporter. Similarly, the CsMYB5a (KY827396) promoter (2216 bp), the CsANR (GU992402) promoter (2179 bp), and the CsLAR (GU992401.1) promoter (2598 bp) were isolated and cloned to construct CsMYB5a-Pro::GUS, CsANR-Pro::GUS, and CsLAR-Pro::GUS, respectively. Six-week-old N. tabacum leaves were injected with either a reporter construct alone or combined with 35S::CsGLK1 or 35S::CsGLK2. The constitutive expression vector (35S::GUS) was used as the positive control by A. tumefaciens-mediated transient expression. The methods of GUS staining and enzyme activity determination were as described previously [66].

Electrophoretic mobility shift assay
The full-length coding sequences of CsGLK1 and CsGLK2 were amplified and inserted into the pET32a vector to generate the recombinant His-CsGLK1 and His-CsGLK2 plasmids. The recombinant plasmids were introduced into Rosetta cells, and His-CsGLK1 and His-CsGLK2 proteins were expressed and purified using His Sepharose beads. Two conserved cis-element motifs were found in the CsMYB5b promoter. The oligonucleotide probes containing the GLK-binding site CCAAAC and the G-box TACGTT were labeled with biotin at the 3′ end of the sense strand using an EMSA probe biotin labeling kit (Beyotime GS008). The EMSA was performed according to a previous study [67]. To confirm the specificity of the shifted band, a 100- to 200-fold excess of non-labeled cold probe was used. A Chemiluminescent EMSA Kit (Beyotime GS009) was used to detect protein-DNA binding.
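Locating candidate probe sites amounts to scanning the promoter for the two motifs named above on both DNA strands. The following is a minimal sketch of such a scan, not the procedure used in this study; the function names and the short test sequence are hypothetical, and only the motifs CCAAAC and TACGTT come from the text.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_motif_sites(promoter, motif):
    """List (position, strand) for each exact motif match.

    position is the 0-based index of the leftmost base on the forward
    strand; strand is "+" for a forward match and "-" when the reverse
    complement of the motif is found.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(promoter) - len(motif) + 1):
        window = promoter[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        # Skip the reverse-strand report for palindromic motifs.
        if window == rc_motif and rc_motif != motif:
            hits.append((i, "-"))
    return hits


# Hypothetical 18-bp sequence, not the real CsMYB5b promoter.
toy_promoter = "GGCCAAACTTAACGTAGG"
for motif in ("CCAAAC", "TACGTT"):
    print(motif, find_motif_sites(toy_promoter, motif))
```

Reporting the strand matters for probe design, because the text states that the biotin label was placed on the sense strand.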

Gene suppression of CsGLK1 and CsGLK2 in tea plants
SOLIGO software (https://sfold.wadsworth.org/cgi-bin/soligo.pl) was used to select candidate AsODNs against CsGLK1 and CsGLK2 (Supplementary Data Table S3). AsODNs were synthesized by General Biosystems Company. Tea seedlings at the stage of one bud and two leaves were injected with 20 μM AsODN solution and treated for 24 hours (16 h light/8 h dark). Sense oligonucleotides (sODNs) were used as control. The experiments were performed according to the methods described in the previous study [43].
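An antisense oligonucleotide is the reverse complement of a window of the target transcript, and the matching sense oligonucleotide is the window itself. The sketch below enumerates such AsODN/sODN pairs and applies a simple GC-content filter. It does not reproduce SOLIGO, which ranks candidates by predicted target accessibility; the window length, GC limits, and example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_of(sense_dna):
    """Return the antisense (reverse-complement) DNA oligo."""
    return sense_dna.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_pairs(target, length=18, gc_min=0.4, gc_max=0.6):
    """Enumerate (start, AsODN, sODN) for windows within the GC limits.

    target may be given as cDNA or mRNA; U is converted to T so that
    DNA oligos are returned. The default length and GC limits are
    illustrative assumptions only.
    """
    target = target.upper().replace("U", "T")
    pairs = []
    for start in range(len(target) - length + 1):
        sense = target[start:start + length]
        if gc_min <= gc_fraction(sense) <= gc_max:
            pairs.append((start, antisense_of(sense), sense))
    return pairs


# Hypothetical 10-nt target fragment scanned with 6-nt windows.
for start, asodn, sodn in candidate_pairs("ATGGCCAAGT", length=6,
                                          gc_min=0.5, gc_max=0.7):
    print(start, "AsODN:", asodn, "sODN:", sodn)
```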