Identification of a WRKY transcriptional activator from Camellia sinensis that regulates methylated EGCG biosynthesis

ABSTRACT Naturally occurring methylated catechins, especially methylated EGCG in tea leaves, are known to have many health benefits. Although the genes involved in methylated EGCG biosynthesis have been studied extensively, the transcription factors that control methylated EGCG biosynthesis are still poorly understood. In the present study, a WRKY domain-containing protein termed CsWRKY57like was identified, which belongs to group IIc of the WRKY family and contains one conserved WRKY motif. CsWRKY57like was found to localize in the nucleus and function as a transcriptional activator; its expression was positively correlated with methylated EGCG level. In addition, CsWRKY57like activated the transcription of three genes related to methylated EGCG biosynthesis (CCoAOMT, CsLAR, and CsDFR), specifically interacting with their promoters by binding to the cis-acting element (C/T)TGAC(T/C). Further assays revealed that CsWRKY57like physically interacts with CsVQ4 and participates in the metabolic regulation of O-methylated catechin biosynthesis. We conclude that CsWRKY57like may positively impact the biosynthesis of methylated EGCG in the tea plant. These results comprehensively enrich the regulatory network of WRKY TFs associated with methylated EGCG and provide a potential strategy for the breeding of specific tea plant cultivars with high methylated EGCG levels.


Introduction
Tea is the second most consumed beverage in the world after water. The quality of tea and its health-promoting effects on humans depend on its leaves and leaf buds, which are rich in varied secondary metabolites, especially flavonoids, theanine, and caffeine [1][2][3]. Numerous studies suggest that tea plays a significant role in reducing vascular disease, heart disease, and cancers [4,5]. Catechins, especially epigallocatechin gallate (EGCG), are the main functional components in tea leaves. Recently, methylated tea catechin derivatives, especially (−)-epigallocatechin-3-O-(3-O-methyl)-gallate (EGCG3 Me, Figure 1a), have attracted significant attention for their role in the prevention of arteriosclerosis and their strong antiallergic and antihypertensive activities [6][7][8].
Catechin is an important and abundant compound in tea plants and can be detected in all tea plant tissues. However, methylated catechins are present in limited quantities in a few tea cultivars. Previously, we found that EGCG3 Me accumulated to high levels in two tea cultivars, 'Jinmudan' and 'Mingke 1', among several Chinese tea cultivars [9] (Figure 1b). Nevertheless, the molecular mechanism of methylated catechin biosynthesis and the transcriptional regulatory mechanism of related genes during EGCG3 Me production remain unclear. Therefore, an in-depth analysis of the catechin biosynthesis regulatory network is essential for breeding tea plant cultivars with high EGCG3 Me content.
CsLAR, CsDFR, and CCoAOMT were positively correlated with the accumulation of EGCG3 Me [9], indicating that the biosynthesis of methylated catechins may be a complex process mediated by the dynamic balance of putative enzymes at the transcript and protein levels. However, the transcriptional regulatory mechanism of methylated catechin biosynthesis is still unclear.
Many transcription factors (TFs) are known to be involved in the biosynthesis of flavonoids in different plant species; these include members of the MYB, WRKY, bHLH, and WD40 families [13][14][15][16]. In tea plants, potential TFs controlling catechin biosynthetic pathways have been identified, but there has been little functional characterization. The WRKY TFs are a type of protein that is structurally defined by the presence of one or two WRKY domains [17]. Numerous studies indicate that WRKYs can control target gene expression via interaction with specific DNA sequences, and they are involved in plant growth and adaptation to biotic and abiotic stresses [18][19][20]. More importantly, several members of the WRKY TFs are involved in flavonoid biosynthesis. For example, MdWRKY11 was found to participate in the biosynthesis of flavonoids in apple (Malus) [14], and a grapevine TTG2like WRKY is reported to regulate flavonoid biosynthesis and vacuolar transport [21]. Previously, we identified two WRKY TFs that may negatively affect methylated EGCG biosynthesis. However, whether WRKYs in tea plants can upregulate the biosynthesis of polyphenols, particularly methylated catechin, is not clear.
In this study, the common tea cultivar 'Fuding Dabaicha' and the two high-EGCG3 Me tea cultivars 'Jinmudan' and 'Mingke 1' were used for transcriptomic analysis. A positive transcriptional activator termed CsWRKY57like was identified, which may be associated with methylated catechin biosynthesis. A dual-luciferase assay was carried out to investigate the transcriptional regulation by CsWRKY57like of promoters of three genes related to methylated EGCG biosynthesis (CCoAOMT, CsLAR, and CsDFR) in tobacco. ChIP-PCR and EMSA were used to evaluate the binding activities of CsWRKY57like to the promoters of CsLAR, CsDFR, and CCoAOMT. Our findings provide new evidence for the WRKY-mediated regulation of methylated EGCG biosynthesis in tea plants.

FPKM expression analysis of CsWRKY57 genes among different cultivars
WRKY TFs play an important regulatory role in plant metabolism. To further study the possible role of the CsWRKY genes in the biosynthesis of methylated EGCG, three tea cultivars with different methylated EGCG contents (Figure 1b, 'Fuding Dabaicha', 'Mingke 1', and 'Jinmudan') were subjected to transcriptomic analysis. Previous studies have identified and characterized fifty-nine WRKY genes in the tea genome (http://www.plantkingdomgdb.com/tea_tree/) [22]. We performed an FPKM statistical analysis, in which FPKM values were calculated to indicate the expression of CsWRKY57like [|log2(Fold Change)| > 0]. Interestingly, our results suggested that CsWRKY57like had higher expression in 'Mingke 1' and 'Jinmudan' cultivars than in 'Fuding Dabaicha' (Table S2).
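As an illustration of the quantities named above, the following minimal sketch computes FPKM from a raw fragment count and a log2 fold change between two cultivars. The function names, the pseudocount, and all read counts are hypothetical and are not taken from this study; the gene length of 891 bp is used only as a convenient example value.

```python
import math

def fpkm(read_count, gene_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

def log2_fold_change(fpkm_test, fpkm_control, pseudocount=1.0):
    """log2 ratio; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((fpkm_test + pseudocount) / (fpkm_control + pseudocount))

# Hypothetical counts for illustration only (not data from this study).
control = fpkm(read_count=200, gene_length_bp=891, total_mapped_reads=20_000_000)
high = fpkm(read_count=800, gene_length_bp=891, total_mapped_reads=20_000_000)
lfc = log2_fold_change(high, control)

print(f"control FPKM = {control:.2f}, high-EGCG3 Me FPKM = {high:.2f}, log2FC = {lfc:.2f}")
```

A positive log2 fold change corresponds to higher expression in the test cultivar than in the control cultivar.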

CsWRKY57like belongs to the IIc sub-group of the WRKY family
According to our RNA-seq and Camellia sinensis var. sinensis (CSS) (pcsb.ahau.edu.cn:8080/CSS/) databases, one full-length WRKY gene, designated CsWRKY57like, was upregulated in 'Mingke 1' and 'Jinmudan'. The cDNA length of CsWRKY57like is 891 bp, and it encodes 296 amino acids. The molecular weight and pI values of CsWRKY57like are 32.05 kDa and 6.01, respectively. Analysis of its amino acid sequence suggested that CsWRKY57like possesses a highly conserved WRKY domain and a zinc-finger motif (C2H2 type) at the C terminus (Figure 2a).
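The relationship between the reported cDNA length and protein length can be checked with simple arithmetic. The sketch below assumes that the 891 bp figure includes the stop codon, and it uses the common rule of thumb of roughly 110 Da per residue, which gives only a coarse estimate and is not the method used to obtain the 32.05 kDa value above.

```python
def orf_to_residues(orf_length_bp, has_stop_codon=True):
    """Number of amino acids encoded by an open reading frame."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    return codons - 1 if has_stop_codon else codons

residues = orf_to_residues(891)      # 297 codons minus one stop codon
approx_kda = residues * 110 / 1000   # rule-of-thumb estimate, an assumption

print(residues, approx_kda)
```

The rough estimate lands close to the reported molecular weight, as expected for a protein of this length.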
The WRKY family can be divided into three major groups (I-III) based mainly on the number of conserved WRKY domains and the type of zinc finger structure. Group II is further divided into five subclasses (IIa-IIe). Phylogenetic analysis showed that CsWRKY57like is an IIc sub-group member, along with AtWRKY48, AtWRKY23, and AtWRKY12 (Figure 2b). AtWRKY23 and AtWRKY12 are capable of regulating plant secondary metabolism [23], suggesting the possible involvement of CsWRKY57like in the regulation of secondary metabolism in tea plants.
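The grouping rule described above can be written as a small decision function. This is a simplified sketch of the commonly used scheme (two WRKY domains for group I; one domain with a C2H2 finger for group II; one domain with a C2HC finger for group III). The function name is hypothetical, and assignment to subclasses IIa-IIe is not attempted because it relies on phylogenetic analysis rather than on a simple rule.

```python
def wrky_group(n_wrky_domains, zinc_finger):
    """Assign a major WRKY group from domain count and zinc-finger type."""
    finger = zinc_finger.upper()
    if n_wrky_domains == 2:
        return "I"
    if n_wrky_domains == 1 and finger == "C2H2":
        return "II"
    if n_wrky_domains == 1 and finger == "C2HC":
        return "III"
    return "unclassified"

# CsWRKY57like: one WRKY domain and a C2H2-type zinc finger (from the text).
print(wrky_group(1, "C2H2"))
```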

Analysis of CsWRKY57like expression and sub-cellular localization
Similar to the FPKM expression data, RT-qPCR analysis showed that the transcript levels of CsWRKY57like were significantly higher in 'Jinmudan' and 'Mingke 1' than in the control cultivar 'Fuding Dabaicha' (Figure 1c), consistent with the EGCG3 Me contents of these cultivars. These results imply that CsWRKY57like may positively regulate methylated EGCG biosynthesis in tea plants.
Transcriptional regulation usually occurs in the nucleus. To explore the subcellular localization of CsWRKY57like in vivo, the coding region of CsWRKY57like was inserted into a GFP vector, and the GFP-Empty plasmid was used as the control. Both CsWRKY57like-GFP and the control plasmid were transiently expressed in tobacco leaves. As shown in Figure 3a, a GFP signal was observed throughout whole cells expressing GFP-Empty control, whereas CsWRKY57like-GFP was expressed only in the nucleus. This result implies that CsWRKY57like is a nuclear protein, which is characteristic of a typical TF.

CsWRKY57like possesses trans-activation ability
To investigate whether CsWRKY57like possesses transactivation activity, we analyzed the transcriptional activity of CsWRKY57like in yeast cells and tobacco. The coding region of CsWRKY57like was inserted into pGBKT7. As shown in Figure 3b, the yeast expressing pGBKT7-CsWRKY57like grew well on SD plates (−Trp-His-Ade), as did the positive control (p53 + T-antigen).
This yeast-based assay indicates that CsWRKY57like possesses trans-activation ability.
A dual-luciferase reporter system was used to examine the transcriptional activity of CsWRKY57like in tobacco. The full-length sequence of CsWRKY57like was inserted into the pBD vector to generate pBD-CsWRKY57like, and the empty pBD and VP16 vectors were used as the negative and positive controls, respectively. After expression in tobacco, both pBD-CsWRKY57like and VP16 showed a markedly higher LUC/REN ratio than the pBD-empty vector control (Figure 3c). This luciferase assay indicates that CsWRKY57like may act as a transcriptional activator.

Three genes related to methylated EGCG biosynthesis are direct targets of CsWRKY57like
Based on the above findings, we sought to determine the transcriptional regulatory mechanism by which CsWRKY57like affects methylated EGCG biosynthesis, and in particular whether CsWRKY57like directly targets genes involved in this pathway. Previous studies show that WRKY proteins directly target the W-box cis-elements in CsLAR, CsDFR, and CCoAOMT [9]. To explore whether CsWRKY57like could specifically recognize and directly bind to the promoters of genes related to methylated EGCG biosynthesis, an EMSA was performed. The probes covered about 50 bp of the predicted promoter sequence and contained the cis-acting element (C/T)TGAC(T/C) (Figure 4a). The coding region of CsWRKY57like (from amino acid position 145-222) was cloned into the pGEX-4T construct and transformed into Escherichia coli strain BL21(DE3). The CsWRKY57like recombinant protein was successfully expressed and purified (Figure 4a). The EMSA demonstrated that CsWRKY57like was capable of binding to the promoters of CsLAR, CsDFR, and CCoAOMT via the W-box motif, and the binding disappeared upon addition of excess unlabeled competitor probes. By contrast, GST protein could not bind sequences containing the cis-acting element (C/T)TGAC(T/C). This assay demonstrates that CsWRKY57like specifically binds to W-box cis-elements in the promoters of CsLAR, CsDFR, and CCoAOMT, indicating that genes involved in methylated EGCG biosynthesis are likely to be direct targets of CsWRKY57like.
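Candidate binding sites of the form (C/T)TGAC(T/C) can be located in a promoter sequence with a regular expression that scans both strands. The sketch below is one possible way to do this; the function names and the toy sequence are hypothetical and do not correspond to the actual CsLAR, CsDFR, or CCoAOMT promoters or to the tool used in this study.

```python
import re

# (C/T)TGAC(T/C) as a regular expression; the lookahead reports overlapping sites.
W_BOX = re.compile(r"(?=([CT]TGAC[TC]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_w_boxes(promoter):
    """Return (strand, 0-based start on the forward strand, site) per match."""
    seq = promoter.upper()
    hits = [("+", m.start(), m.group(1)) for m in W_BOX.finditer(seq)]
    rc = reverse_complement(seq)
    for m in W_BOX.finditer(rc):
        # Convert the reverse-strand coordinate back to the forward strand.
        start_on_forward = len(seq) - (m.start() + 6)
        hits.append(("-", start_on_forward, m.group(1)))
    return sorted(hits, key=lambda h: h[1])

# Toy sequence for illustration only (not a real promoter).
toy_promoter = "AATTTGACTGGCCAGTCAAGCATG"
for strand, start, site in find_w_boxes(toy_promoter):
    print(strand, start, site)
```

Scanning both strands matters because a transcription factor can recognize the element in either orientation; a probe in which the TGAC core is mutated returns no hits.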
The ability of CsWRKY57like to bind directly to target genes was further verified by an in vivo ChIP-qPCR assay using a polyclonal anti-CsWRKY57like antibody. As expected, compared with the IgG control, the CsLAR, CsDFR, and CCoAOMT promoter regions containing the cis-acting element (C/T)TGAC(T/C) were significantly enriched in anti-CsWRKY57like groups (Figure 4b). Taken together, these data illustrate that CsWRKY57like binds directly to the cis-acting element (C/T)TGAC(T/C) and directly targets genes related to methylated EGCG biosynthesis.

CsWRKY57like regulates promoter activities of three genes related to methylated EGCG biosynthesis
Dual-luciferase assays were performed to further understand the regulation of three genes related to methylated EGCG biosynthesis by CsWRKY57like. The promoter sequences of CsLAR, CsDFR, and CCoAOMT were cloned into the 0800-LUC plasmid, and CsWRKY57like was cloned into the pEAQ plasmid. These plasmids were transiently expressed in tobacco plants. As shown in Figure 5, co-expression of CsWRKY57like with the CsLAR, CsDFR, or CCoAOMT promoter produced a significantly higher LUC/REN ratio than co-expression of the empty vector with the same promoter. This result implies that CsWRKY57like is capable of activating the promoters of genes related to methylated EGCG biosynthesis.
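The LUC/REN comparison can be sketched as follows: each firefly luciferase (LUC) reading is normalized to the Renilla (REN) internal control from the same sample, and the mean ratio with the effector is divided by the mean ratio with the empty vector. All luminescence values and function names below are hypothetical and serve only to show the calculation.

```python
from statistics import mean

def luc_ren_ratios(luc, ren):
    """Normalize each LUC reading to its paired REN internal control."""
    return [l / r for l, r in zip(luc, ren)]

def fold_activation(effector_ratios, empty_ratios):
    """Mean LUC/REN with the effector relative to the empty-vector control."""
    return mean(effector_ratios) / mean(empty_ratios)

# Hypothetical luminescence readings, three replicates each.
empty = luc_ren_ratios([1000, 1100, 900], [500, 550, 450])
effector = luc_ren_ratios([3000, 3300, 2700], [500, 550, 450])
fold = fold_activation(effector, empty)

print(f"fold activation = {fold:.1f}")
```

Normalizing to REN first removes differences in infiltration efficiency between samples, so the fold value reflects promoter activation rather than transformation efficiency.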

CsWRKY57like interacts with CsVQ4
A large number of studies have shown that WRKY proteins can interact with different types of proteins, including VQ and MAPK proteins [24,25]. To investigate whether CsWRKY57like might also interact with VQ motif-containing proteins, a Y2H library assay was performed. The candidate VQ motif-containing protein CsVQ4 was identified. As shown in Figure 6a, yeast cells transformed with the negative control (pGBKT7-empty) or pGBKT7-CsVQ4 did not grow on the selective SD plates and showed no α-galactosidase activity, indicating that CsVQ4 has no self-activation activity. Therefore, the full-length coding sequence of CsVQ4 was cloned into the pGBKT7 vector, and CsWRKY57like was cloned into the pGADT7 vector for a Y2H assay. As shown in Figure 6b, the cells expressing BD-CsVQ4 + AD-CsWRKY57like grew well on SD plates (−Trp-His-Ade-Leu), as did the positive control (p53 + T-antigen), whereas the negative control did not grow. This Y2H assay indicates that CsWRKY57like physically interacts with CsVQ4.
The interaction between CsWRKY57like and CsVQ4 was further analyzed by a BiFC assay. In this experiment, the coding regions of CsWRKY57like and CsVQ4 were cloned into YCE and YNE vectors, respectively. As shown in Figure 6c, co-expression of either CsWRKY57like-YCE and CsVQ4-YNE or CsWRKY57like-YNE and CsVQ4-YCE produced a strong yellow fluorescent signal in tobacco, whereas single fusion proteins produced no yellow fluorescent signal. These results demonstrated that CsWRKY57like could interact with CsVQ4 in the nucleus (Figure S1) and may co-regulate the biosynthesis of methylated EGCG.

Discussion and conclusions
The WRKYs play indispensable roles in regulating plant physiological processes and are widely studied in plant stress resistance pathways [26,27]. For instance, Arabidopsis WRKY52 was found to be involved in responses to disease and stresses [28]. CsWRKY2 was suggested to function in the drought stress response of tea plants [29]. Among the TF families, MYB and bHLH TFs have been extensively reported to participate in plant metabolic regulation [30,31]. Recent studies have shown that WRKY TFs regulate several plant secondary metabolites, including flavonoids, alkaloids, and other substances [23,32,33]. Previous studies have shown that the biosynthesis of tea flavonoids is jointly regulated by structural genes and regulatory factors. In terms of methylated EGCG biosynthesis, the relevant structural genes have already been explored. However, the involvement of WRKY TFs in promoting the biosynthesis of methylated EGCG has been unclear. Therefore, we performed a systematic study of WRKY transcription factors in the tea plant (Figure 7).
In the present research, the WRKY subfamily IIc TF CsWRKY57like was identified from tea. We found that the expression of CsWRKY57like correlated well with the elevated accumulation of EGCG3 Me (Figure 1). This finding prompted us to further investigate the function of CsWRKY57like in the biosynthesis of methylated EGCG in tea plants. A phylogenetic analysis showed that CsWRKY57like is clustered in the same clade with AtWRKY12 and AtWRKY23 of the Arabidopsis IIc subfamily (Figure 2). AtWRKY12 and AtWRKY23 belong to the same family and are involved in regulating the biosynthesis of lignin and flavonoids [32,34]. Proteins of the same family usually have similar functions [35], suggesting that CsWRKY57like may regulate the biosynthesis of secondary metabolites in tea plants. However, it should be pointed out that studying the protein level of CsWRKY57like will be important in the future.
In general, TFs can only play their transcriptional regulatory role in the nucleus. Our assay showed that CsWRKY57like is a nuclear protein (Figure 3a), which is consistent with the nuclear localization of TFs [36]. WRKY TFs can act as either transcriptional activators or transcriptional repressors to activate or inhibit target gene expression and regulate different biological processes [37,38]. For example, WSR1 is a novel WRKY transcription factor in Brassica napus that regulates leaf senescence by activating the expression of senescence-associated gene 14 [39]. In this study, we provided evidence that CsWRKY57like controls the biosynthesis of methylated EGCG by regulating the transcription of three genes related to methylated EGCG biosynthesis: CsLAR, CsDFR, and CCoAOMT. We found that CsWRKY57like acts as a transcriptional activator (Figure 3b and c) and can specifically bind to the cis-acting element (C/T)TGAC(T/C) in the CsLAR, CsDFR, and CCoAOMT promoters (Figure 4), activating their transcription. This indicates that CsWRKY57like may positively regulate EGCG3 Me biosynthesis (Figure 5). Taken together, our results provide new evidence about how WRKY TFs regulate methylated EGCG biosynthesis and are consistent with previous research suggesting that WRKYs may regulate the biosynthesis of catechins, L-theanine, and caffeine in tea plants [40,41].
Recently, flavonoid biosynthesis was shown to be controlled by transcriptional complexes such as MYB-bHLH-WD40 (MBW) in tea plants, in which MYB interacts with bHLH and WD40 to co-regulate the biosynthesis of important secondary metabolites [35,42]. Similarly, extensive studies have established that WRKY TFs, particularly subfamily I or IIc WRKYs, interact with several different types of proteins, such as VQ proteins [43,44]. In the current study, our Y2H and BiFC assays revealed that CsWRKY57like physically interacts with the VQ protein CsVQ4 (Figure 6). This finding suggests that CsVQ4 may act together with CsWRKY57like to regulate the biosynthesis of EGCG3 Me. In general, interaction with VQs may differentially affect the properties and DNA-binding activities of the interacting WRKYs. For instance, Arabidopsis WRKY33 can interact with two VQ motif-containing proteins to enhance its DNA binding activity [45]. It would therefore be interesting to investigate in future studies whether CsVQ4 represses or enhances the activity of CsWRKY57like. In addition, the biosynthesis of catechins is controlled by many TFs and is believed to be a complicated biological process [46]. Identifying and dissecting other regulators, such as MYBs, bHLHs, and MADS, as well as their functions associated with methylated catechin biosynthesis, will be interesting and important for future research. Overall, our findings shed light on the regulatory network of methylated EGCG biosynthesis.
In summary, we demonstrated that the WRKY TF CsWRKY57like from C. sinensis may positively regulate methylated EGCG biosynthesis. CsWRKY57like acts as a transcriptional activator of three genes involved in methylated EGCG biosynthesis by specifically interacting with their promoters. Furthermore, CsWRKY57like may physically interact with CsVQ4. We propose that CsWRKY57like collaborates with CsVQ4 and participates in the metabolic regulation of methylated catechin biosynthesis (Figure 7). Taken together, our findings shed light on the regulatory network of catechin biosynthesis, as well as the transcriptional regulatory mechanism of secondary metabolism in tea plants. They suggest a potential strategy for the breeding of specific tea cultivars with high methylated EGCG levels.

Plant materials
The three tea cultivars 'Jinmudan', 'Mingke 1', and 'Fuding Dabaicha' were grown at the experimental station of Gao Qiao tea garden, Changsha, Hunan, China. Fresh tea leaves (two leaves and a bud) were obtained from each cultivar, and three biological replicates were taken for each sample. The samples were flash frozen in liquid nitrogen and stored at −80 °C for further use.

cDNA synthesis, sequencing, and real-time PCR analysis
Total RNA was isolated from each frozen tea leaf sample using the RNeasy Mini kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. RNA quantity and purity were measured by 1.0% agarose gel electrophoresis and spectrophotometry. Genomic DNA was removed from the total extracted RNA, which was then reverse-transcribed into first-strand cDNA following the manufacturer's protocol.
According to our transcriptome FPKM expression data related to catechins and EGCG3 Me biosynthesis, the upregulated WRKY gene CsWRKY57like was identified and selected. The full-length coding region of CsWRKY57like was amplified and confirmed by sequencing. The gene sequence was blasted at NCBI to identify homologous sequences. GeneDoc and ClustalX software were used for sequence alignment, and MEGA5 software was used for phylogenetic tree construction. The primers used are described in Table S1.
qRT-PCR was performed to measure gene expression levels. The reactions were performed in a total volume of 20 μL, and the PCR conditions were as described in our previous study [9,47]. The relative gene expression levels were calculated using the formula $2^{-\Delta Ct}$. The primers are described in Table S1.
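A minimal sketch of the relative expression calculation is given below, assuming that ΔCt is the Ct of the target gene minus the Ct of a reference gene and that amplification efficiency is close to 100%. The reference gene is not named in this section, so the Ct values and the function name are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression as 2^(-dCt), with dCt = Ct(target) - Ct(reference)."""
    delta_ct = ct_target - ct_reference
    return 2 ** (-delta_ct)

# Hypothetical Ct values for illustration only.
control_cultivar = relative_expression(ct_target=24.0, ct_reference=20.0)
high_cultivar = relative_expression(ct_target=22.0, ct_reference=20.0)

print(control_cultivar, high_cultivar, high_cultivar / control_cultivar)
```

Because Ct is on a log2 scale, a target Ct that is two cycles lower at the same reference Ct corresponds to a fourfold higher relative expression.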

Subcellular localization
To study the subcellular localization of CsWRKY57like, it was transiently expressed in tobacco. The full length of CsWRKY57like was cloned into a GFP vector. The primers are described in Table S1. 35S-CsWRKY57like-GFP and control 35S-GFP were transiently transformed into 1-month-old tobacco leaves as described previously [9,48]. Fluorescent images were obtained with a Zeiss Axioskop 2 Plus microscope after 2-3 days of transient expression.

Transcriptional activation in yeast cells
Transcriptional activities of CsWRKY57like and CsVQ4 were assayed in yeast cells. The full-length sequences of the genes were independently cloned into the pGBKT7 vector (Clontech, USA). Primer sequences are shown in Table S1. All constructs were separately transformed into yeast cells and plated on different media (SD/−Trp or SD/−Trp-His-Ade). The transcriptional activities of CsWRKY57like and CsVQ4 were assessed on the basis of the growth status and α-galactosidase activity of the yeast cells. Three biological replicates were performed for all transcriptional activation assays.

Protein purification and EMSA assay
The cDNA sequence of CsWRKY57like (from amino acid position 145-222) covering the WRKY protein domain was cloned into the pGEX-4T vector (Amersham Biosciences) and expressed in E. coli strain BM Rosetta (DE3). The quality of the recombinant CsWRKY57like protein was estimated by SDS-PAGE, and it was then stored at −80 °C for the EMSA assay. The primers are described in Table S1.
Probes including the cis-acting element (C/T)TGAC(T/C) from the promoters of CsLAR, CsDFR, and CCoAOMT were labeled with biotin. The EMSA experiment was performed as described previously [48,49]. In brief, the assay mixtures of CsWRKY57like protein and biotin-labeled probes were incubated together, and the DNA-CsWRKY57like protein complexes were then analyzed by 6% native PAGE according to the manufacturer's protocol.

CsWRKY57like-specific antibody production and ChIP-qPCR analysis
The coding region of CsWRKY57like (from amino acid position 136-240) was cloned into the pET-B2M construct and transformed into E. coli strain BL21(DE3). The recombinant CsWRKY57like protein was induced, affinity purified, and separated by SDS-PAGE. Antibodies to CsWRKY57like were produced by Jinkairui Biotechnology Company (Wuhan, China). The ChIP-qPCR assay was performed as described by Fan et al. [49]. In brief, fresh tea leaves were crosslinked in 1% formaldehyde and then neutralized with glycine (0.125 M). The chromatin was sheared by sonication to an average length of 500 bp. Immunoprecipitation was performed using a specific anti-CsWRKY57like antibody. The preimmune serum IgG was used as the negative control. Protein A/agarose beads were used to capture the DNA-protein-antibody complex for 1 h at 4 °C, followed by pelleting and washing of the beads. The immunoprecipitated material was eluted with gentle rotation, and the crosslinks were reversed to release the immunoprecipitated DNA. After treatment with proteinase K, the immunoprecipitated DNA was purified and eluted. The DNA immunoprecipitated by the anti-CsWRKY57like antibody was amplified by qRT-PCR. The percentage of IP/Input was calculated as $2^{-\Delta Ct} = 2^{-[Ct(\mathrm{IP}) - Ct(\mathrm{Input})]}$. The primers are described in Table S1.
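The IP/Input calculation can be sketched as below. The core expression follows the formula given above; the `input_fraction` parameter, which corrects for cases where only a fraction of the chromatin is saved as input, is an added assumption and defaults to no correction. All Ct values are hypothetical.

```python
def percent_input(ct_ip, ct_input, input_fraction=1.0):
    """IP/Input (%) = 100 * input_fraction * 2^-(Ct(IP) - Ct(Input)).

    input_fraction is an assumption for illustration: use 0.01 if the
    input sample represents 1% of the chromatin used for the IP.
    """
    return 100.0 * input_fraction * 2 ** (-(ct_ip - ct_input))

# Hypothetical Ct values for one promoter region.
anti_wrky = percent_input(ct_ip=27.0, ct_input=25.0)
igg = percent_input(ct_ip=31.0, ct_input=25.0)
enrichment = anti_wrky / igg

print(f"anti-CsWRKY57like: {anti_wrky}%  IgG: {igg}%  enrichment: {enrichment}x")
```

Comparing the antibody sample with the IgG control at the same promoter region, as in Figure 4b, separates specific enrichment from background pull-down.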

Dual-luciferase reporter assay
To assess the transcriptional activity of CsWRKY57like in tobacco, the full-length coding sequence of CsWRKY57like was cloned into the pBD vector. The pBD vector is a double-reporter vector that contains a GAL4-LUC and an internal control REN, as described previously [48,49]. VP16 was used as a positive control.
To assess the interaction of CsWRKY57like with the promoters of genes related to methylated EGCG biosynthesis, the CsWRKY57like coding region was cloned into the pEAQ vector as an effector, and the promoters of CsLAR, CsDFR, and CCoAOMT were ligated into the 0800-LUC vector as reporters. The primers are described in Table S1. Agrobacterium EHA105 containing all the constructs was infiltrated into tobacco leaves. LUC and REN luciferase activities were measured 3 days after infiltration using a dual-luciferase assay kit (Promega, USA). The transactivation ability of CsWRKY57like was indicated by the LUC/REN ratio.

Y2H and BiFC assay
For the Y2H assay, full-length coding sequences of CsWRKY57like and CsVQ4 were cloned into pGADT7 and pGBKT7 vectors, respectively. The primers are described in Table S1. After testing for self-activation ability, all constructs were introduced into yeast strain Y2HGold for interaction assays according to the manufacturer's instructions. Interaction assays were performed on different media (SD/−Trp-Leu or SD/−Trp-His-Ade-Leu), and the yeast growth status and α-galactosidase activity were measured. Three biological replicates were performed.
Full-length coding sequences of CsWRKY57like and CsVQ4 were separately cloned into the yellow fluorescent protein (YFP) vectors pSPYCE and pSPYNE, respectively. The primers are described in Table S1. Agrobacterium EHA105 containing all the constructs was transformed into tobacco. YFP fluorescent images were obtained under a fluorescence microscope.

Statistical analyses
All assays were performed with three biological replicates. All data in this research are presented as Mean ± S.E. The statistical significance of differences between means was tested by Student's t-test, and p < 0.05 or p < 0.01 were considered significant.
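The summary statistics described above can be sketched with the Python standard library. The example assumes the classical two-sample Student's t-test with pooled variance (the section does not state whether a pooled or Welch variant was used) and compares the statistic with the standard two-tailed critical values for 4 degrees of freedom, which is the case for three replicates per group. The measurements are hypothetical.

```python
import math
from statistics import mean, stdev, variance

def mean_se(values):
    """Mean and standard error of the mean."""
    return mean(values), stdev(values) / math.sqrt(len(values))

def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Two-tailed critical values of t for df = 4.
T_CRIT_DF4 = {0.05: 2.776, 0.01: 4.604}

# Hypothetical measurements, three biological replicates per group.
control = [1.0, 1.2, 0.8]
treated = [3.0, 3.4, 2.6]

m, se = mean_se(control)
t, df = student_t(treated, control)
significant_001 = abs(t) > T_CRIT_DF4[0.01]
print(f"control = {m:.2f} ± {se:.2f}, t = {t:.2f}, df = {df}, p < 0.01: {significant_001}")
```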