Chemical and transcriptomic analyses of leaf trichomes from Cistus creticus subsp. creticus reveal the biosynthetic pathways of certain labdane-type diterpenoids and their acetylated forms

Abstract Labdane-related diterpenoids (LRDs), a subgroup of terpenoids, exhibit structural diversity and significant commercial and pharmacological potential. LRDs share the characteristic decalin–labdanic core structure that derives from the cycloisomerization of geranylgeranyl diphosphate (GGPP). Labdanes derive their name from the oleoresin known as ‘Labdanum’, ‘Ladano’, or ‘Aladano’, used since ancient Greek times. Acetylated labdanes, rarely identified in plants, are associated with enhanced biological activities. Chemical analysis of Cistus creticus subsp. creticus revealed labda-7,13(E)-dien-15-yl acetate and labda-7,13(E)-dien-15-ol as major constituents. In addition, novel labdanes such as cis-abienol, neoabienol, ent-copalol, and one as yet unidentified labdane-type diterpenoid were detected for the first time. These compounds exhibit developmental regulation, with higher accumulation observed in young leaves. Using RNA-sequencing (RNA-seq) analysis of young leaf trichomes, it was possible to identify, clone, and eventually functionally characterize labdane-type diterpenoid synthase (diTPS) genes, encoding proteins responsible for the production of labda-7,13(E)-dien-15-yl diphosphate (endo-7,13-CPP), labda-7,13(E)-dien-15-yl acetate, and labda-13(E)-ene-8α-ol-15-yl acetate. Moreover, the reconstitution of labda-7,13(E)-dien-15-yl acetate and labda-13(E)-ene-8α-ol-15-yl acetate production in yeast is presented. Finally, the accumulation of LRDs in different plant tissues showed a correlation with the expression profiles of the corresponding genes.

In the initiation step, enzyme-induced multistep carbocation rearrangements set the foundation for terpene cyclization.Proliferation occurs when the intermediate ion attacks π-electrons from an adjacent C=C bond, generating another carbenium ion.This process can be repeated sequentially, depending on the availability of other C=C bonds in the correct orientation within the substrate.Termination can occur via various alternative routes that may involve: (i) deprotonation of the terminal carbocation, leading to C=C bond formation, or (ii) quenching with a nucleophile, resulting in the formation of alcohol or a diphosphate ester moiety (Bohlmann et al., 1998;Christianson, 2008;Ueberbacher et al., 2012).The vast chemical diversity of natural metabolites is a direct result of the alternative initiation and termination reactions facilitated by TPS enzymes.
Dephosphorylation of the terpenoid diphosphate moiety usually occurs through a single-step reaction in a TPS-based pathway.Class I TPSs utilize a conserved trinuclear metal cluster to facilitate the molecular recognition of the substrate diphosphate group and initiate catalysis.This process is triggered by the ionization of the isoprenoid diphosphate substrate, leading to the formation of an allylic cation and inorganic pyrophosphate (PPi) (Wendt and Schulz, 1998;Aaron and Christianson, 2010).However, alternative pathways exist (Sun et al., 2016), involving endogenous phosphatases, Nudix hydrolases, or even a combination of both.A subset of the Nudix hydrolase superfamily have been shown to act as monophosphatases by removing the first phosphate group on nonnucleoside diphosphate-bearing terpenic substrates (Magnard et al., 2015;Henry et al., 2018;Liu et al., 2018;Li et al., 2020;Sun et al., 2020;Bergman et al., 2021).
The terpenic scaffold can be further modified by tailoring enzymes such as BAHD acyltransferases (Moghe et al., 2023).This superfamily of plant-specific, cytosolic (Fujiwara et al., 1998), acyl-CoA-dependent, monomeric enzymes introduces acyl groups into oxygen, nitrogen, or thiol nucleophiles, producing acyl ester derivatives.These enzymes participate in the synthesis of polymers and secondary metabolites, including lignin, volatile esters, anthocyanins, phytoalexins, and alkaloids.The identification of BAHD enzymes is based on several factors, including the presence of the HXXXD catalytic motif located in the middle of the protein sequence and the conserved C-terminal DFGWG structural motif.Members of this enzyme family typically have an average molecular mass of 48-55 kDa.They consist of ~445 amino acids, lack a signal peptide sequence, and exhibit low overall sequence identity (25-34%) (D'Auria, 2006).Prediction of the possible function of uncharacterized BAHD enzymes based solely on amino acid sequences is challenging due to the large number of various putative substrates and the low overall sequence identity among members of this family.
In previous studies conducted by our group, the trichomespecific class II diterpenoid cyclase gene responsible for encoding copal-8-ol diphosphate synthase (CcCLS) was identified (Falara et al., 2010).While the biosynthetic route of copal-8-ol diphosphate has been studied, the formation of many LRD compounds in the glandular trichomes of C. creticus remains largely unexplored.
This study reports the isolation, cloning, and functional characterization of CcLDDS1 and CcLDDS2, two distinct class II endo-7,13-CPP-encoding genes, along with CcLAT, a gene encoding a BAHD acetyltransferase capable of acetylating labdane-type diterpenoids such as labda-7,13(E)-dien-15-ol and labda-13(E)-ene-8α,15-diol.The identification of these sequences was achieved through RNA-sequencing (RNA-seq) analysis on trichomes isolated from young leaves.Subsequently, the sequences were cloned and introduced into heterologous expression systems, such as Escherichia coli, Nicotiana benthamiana, and/or Saccharomyces cerevisiae.The LRDs obtained from transformed cultures and/or plants were identified using GC-MS analysis.Moreover, the reconstitution of labda-7,13(E)-dien-15-yl acetate and labda-13(E)-ene-8α-ol-15-yl acetate production in yeast is presented.Additionally, this study reports the presence of labdanes cis-abienol, neoabienol, ent-copalol, and an as yet unidentified labdane-type diterpenoid in C. creticus subsp.creticus for the first time.Furthermore, the potential involvement of Nudix hydrolases in the dephosphorylation reaction of 8-OH-CPP and endo-7,13-CPP diphosphate esters is discussed.Finally, the accumulation of LRDs in different plant tissues (leaf stages 1-4, stem, flower, blossom, fruit, and root) and the expression profiles [quantitative real-time PCR (qRT-PCR) analysis] of the corresponding genes were examined.
The plants were cultivated under controlled temperature conditions (25/18 °C, day/night, winter; 32/20 °C, day/night, summer) and natural photoperiod.The greenhouse was equipped with climate control systems and a data logger.Tissue samples collected were stored in Falcon tubes, flash-frozen in liquid nitrogen, and kept at -80 °C until further use, including total RNA and n-hexane extraction.

Isolation of trichomes
Cistus creticus subsp.creticus trichomes were isolated from young leaves of developmental stage S2, following the protocol as described in Falara et al. (2008).Subsequently, RNA-seq and cDNA synthesis were performed on the extracted RNA from these trichomes.
The tissues were extracted three times in Erlenmeyer flasks, with each extraction using 20 ml of high-purity n-hexane (Sigma-Aldrich Co.).Cells were disrupted by applying a brief incubation period (15 min) with mild agitation (100 rpm) at room temperature, followed by water bath sonication (15 min, room temperature).MgSO 4 , used as a drying agent, was promptly removed using Whatman No.1 filter paper.The resulting 60 ml of extracted tissue samples were evaporated under a stream of air for volume reduction until they could be transferred to a 2 ml screw-cap PTFE septum GC-MS vial.

RNA extraction and cDNA synthesis
Total RNA extraction from C. creticus subsp.creticus tissues was performed using the Spectrum Plant Total RNA Kit (Sigma-Aldrich Co.).To eliminate potential genomic DNA contamination, the On-Column DNase I Digest Set (Sigma-Aldrich Co.) was employed.The trichome RNA intended for cDNA synthesis was subjected to an additional purification step using the Rneasy MinElute Cleanup Kit (Qiagen, Valencia, CA, USA).The concentration and quality of the extracted RNA samples were initially assessed using the NanoDrop™ 2000c (Thermo Fisher Scientific Inc., Waltham, MA, USA) and agarose bleach gel electrophoresis with a 2% chlorine solution (Aranda et al., 2012).Finally, the integrity of the RNA samples was confirmed through analysis on the Agilent 2100 bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA, USA).
cDNAs were synthesized from 1 μg of high-quality total RNA using the Superscript III first-strand synthesis kit (Thermo Fisher Scientific Inc.).Random primers (Thermo Fisher Scientific Inc.) were incorporated into the synthesis of cDNA templates used for qRT-PCR analysis.All steps of RNA extraction, purification, and cDNA synthesis were conducted according to the manufacturer's instructions.

RNA-seq
Library preparation, transcriptome sequencing, and bioinformatic analysis of trichome RNA were conducted by Beijing Genomics Institute (BGI-Group, Shenzhen, Guangdong, China) using the Illumina HiSEq2000 genome sequencer (Illumina Inc., San Diego, CA, USA).To perform transcriptome sequencing, a minimum concentration of 0.3 μg (15 μl, 20 ng μl -1 ) of high-quality RNA was required to be shipped with dry ice.
Using Trinity (Grabherr et al., 2011), raw reads were analyzed to assemble high-quality contiguous sequences known as contigs.Unigenes were subsequently categorized into two classes through clustering: one cluster consisted of those with the prefix CL (distinct cluster) and another cluster included singletons (distinct singleton) with the prefix unigenes.The functional classification of unigenes against the Clusters of Orthologous Group (COG) database is presented in Supplementary Fig. S3.

Identification of contigs sharing high similarity with sequences of interest
Candidate contigs were fished-out utilizing BLAST searches against sequences of: (i) characterized class I, II, and I/II plant diTPSs (tBLASTn search) (selected from Supplementary Table S3), and (ii) BAHD acyltransferase-encoding genes (D'Auria, 2006), including a partial candidate sequence previously detected in C. creticus subsp.creticus trichomes EST analysis (Falara et al., 2008) (BLAST-n search).
The tBLAST-n (NCBI local BLAST) analysis, employing the Bioedit package (Hall, 1999), was conducted utilizing contigs identified from RNA-seq, set as -nucleotide database-, against diTPS protein sequences, set as -query sequences-.Additionally, a second BLAST-n search was performed against nucleotide query sequences of plant BAHD alcohol acyltransferase-encoding genes, including a partial sequence of a candidate BAHD alcohol acetyltransferase gene from C. creticus subsp.creticus.In both instances, contig sequences with a BLAST E-value of 1.0E-10 or better, based on the BLOSUM62 matrix, were retained for further analysis.Candidate contigs retrieved from local tBLAST-n (NCBI local BLAST) were translated into amino acid sequences using ExPASy (Gasteiger et al., 2003).
The translated amino acid sequences of candidate contigs were queried against the BLAST-p NCBI database (http://www.ncbi.nlm.nih.gov/BLAST/) (Altschul et al., 1990) to verify sequence identities against known protein sequences and tentatively classify those lacking the motif sequence in their partial read sequences.The top BLAST hits for each translated sequence were collected, and the presence or absence of a motif sequence is reported (Supplementary Table S4).
The ORF is defined as the part of a reading frame that can be translated into a protein.Contigs of interest were analyzed using the ORFfinder service (https://www.ncbi.nlm.nih.gov/orffinder/).Additional ORF sequences were explored by combining overlapping partial contig sequences identified in this study (Supplementary Table S4) with sequences generated from: (i) an EST analysis of C. creticus subsp.creticus trichomes (Falara et al., 2008), and (ii) an RNA-seq project on C. creticus subsp.creticus wounded tissue (BioProject ID: PRJNA1074482).
The optimal (unrooted) tree depicting phylogenetic relationships among BAHD acyltransferases (Fig. 4; Supplementary Table S5) was constructed using the Neighbor-Joining method based on the Poisson  S1).
correction model (Zuckerkandl and Pauling, 1965).The tree was drawn to scale, with branch lengths representing the number of amino acid substitutions per site, employing 100 bootstrap repetitions (Felsenstein, 1985).This analysis was based on the phylogenetic analysis of the superfamily of BAHD acyltransferases (Clade I-V) performed by D'Auria (2006).

Bioinformatics analysis
Protein molecular weight (kDa) and pI for CcLDDS1, CcLDDS2, and CcLAT were calculated using the ExPASy server (Gasteiger et al., 2003).Prediction of putative transit chloroplast peptide was carried out using the SignalP-5.0and ChloroPV1.1 online prediction servers (Emanuelsson et al., 1999(Emanuelsson et al., , 2007)).Sequence alignment was conducted using the ClustalW algorithm in the Sequence Alignment Editor of the Bioedit Software Program (Hall, 1999), as well as the AlignX program from the Vector NTI software package (Invitrogen-Thermo Fisher Scientific Inc., Waltham, MA, USA).

Isolation of sequences of interest from C. creticus subsp. creticus cDNA and cloning into expression vectors
The full-length cDNA of the CcLDDS1 ORF sequence was obtained through PCR conducted on mRNA extracted from trichomes isolated from young leaves (S2), utilizing KAPA Taq DNA Polymerase (Roche Sequencing Solutions Inc., Pleasanton, CA, USA) and a set of manually designed primers (Supplementary Table S6).Subsequently, TA cloning was performed on the resulting PCR products using both the pGEM®-T Easy Vector (Promega Corporation, Madison, WI, USA) and the pCR™II-TOPO™ Vector (Thermo Fisher Scientific Inc.) cloning systems.
The full-length cDNA of CcLAT, as well as cDNAs of CcLDDS2(Δ42) lacking the sequence corresponding to the N-plastidial signal peptide, were isolated by PCR performed on mRNA from young leaves (S2).Amplification was carried out using Phusion® Hot Start Flex DNA Polymerase (2 U μl -1 ) (New England Biolabs Inc., Ipswich, MA, USΑ) and/or KAPA HiFi HotStart DNA Polymerase (1 U μl -1 ) (Roche Sequencing Solutions Inc.).Specially designed cloning primers (Supplementary Table S6) were employed to introduce the necessary restriction site overhangs at the 5' and 3' end of the amplified gene sequences (Invitrogen-Thermo Fisher Scientific Inc.).
The vector and amplicon DNA were subjected to double digest using restriction endonucleases (for directional cloning) (New England Biolabs Inc.).The digestion mix, consisting of vector, insert, enzyme, and buffer, was allowed to incubate for ~3.5 h in a 37 °C water bath.Following incubation, a small amount of the linearized vector DNA was compared with an equivalent amount of undigested circular vector DNA through electrophoresis in a 1% agarose gel.Subsequently, T4 DNA ligase (New England Biolabs Inc.) catalyzed the ATP-aided formation of phosphodiester bonds between the sticky ends of the insert and the vector, using ATP as cofactor.Throughout the procedure, purification of PCR products was carried out using the NucleoSpin® Gel and PCR Clean-up Kit (Macherey-Nagel GmbH & Co. KG, Germany) following the manufacturer's specifications.
Prior to the initiation of gene expression studies, the gene sequences that were isolated and cloned into expression vectors were validated through DNA sequencing performed by CeMIA SA (Larissa, Greece).S1,  S2).The carbon skeleton of labdanes is presented in the lower-right corner.S3). C. creticus subsp.creticus labda-7,13(E)-dien-15-yl diphosphate (endo-7,13-CPP) synthases (CcLDDS1 and CcLDDS2) and copal-8-ol diphosphate (8-OH-CPP) synthase (CcCLS) are set in pink font color.Color-coded patterns used in tree branches differentiate among plant groups.Terpenoid synthase subfamilies TPS-c, TPS-d, TPS-b, TPS-h, and TPS-e/f are inscribed along the tree, while information on their enzymatic activity is presented in the table in the lower-right corner.Bootstrap confidence percentages >70% are shown.
The obtained results from DNA sequencing were analyzed and visualized using the BioEdit (Ibis Therapeutics, Carlsbad, CA, USA) and the ContigExpress (Thermo Fisher Scientific Inc.) software packages.

In vitro functional characterization of CcLDDS1
The mature form of CcLDDS1 was expressed in a complex with DnaK to enhance solubility in E. coli.To that end, the CcLDDS1(Δ40) cDNA sequence was successfully transferred from the pGEM-Teasy-CcLDDS1 construct into the E. coli pXCK-K vector (Kyratsous et al., 2009) utilizing specifically designed restriction enzyme overhangs (BamHI/NotI, Supplementary Table S6).
Prior to induction, bacterial starter cultures were incubated in a shaking incubator (200-250 rpm) set at 37 °C until the exponential phase (OD 600 ~0.6) was reached.Induction was tested using 0.1 mM and 0.4 mM isopropyl-β-d-thiogalactopyranoside (IPTG).Following induction, cultures were allowed to grow at 24-26 °C for an additional 14 h and 18 h.
Protein profiles of the crude, soluble, and insoluble fractions were analyzed by SDS-PAGE.The 157 kDa 6×His-tagged protein complex DnaK/CcLDDS1(Δ40) was detected in the total (crude) extract at both IPTG concentrations tested during the initial electrophoresis stage.After confirming the successful expression of the protein complex, E. coli cells were disrupted using ultrasound sonication treatment (20 s, five repetitions) and the various fractions were separated via centrifugation (13 000 g, 20 min, 4 °C).Post-centrifugation, the protein complex was detected in both the soluble (supernatant) and insoluble fractions (cell debris-pellet).
The protein complex was purified using TALON Metal Affinity Resin (Takara Bio Inc., Otsu, Shiga, Japan).Subsequently, pure CcLDDS1(Δ40) was isolated from the purified protein complex through a thrombin enzymatic reaction.

Expression of CcLDDS1 in Nicotiana benthamiana
The pART7/27 binary vector system (Gleave, 1992) was employed to build a chimeric genetic construct enclosing the complete ORF sequence of CcLDDS1 (2427 bp).To that end, the cDNA sequence of the CcLDDS1 gene was PCR-amplified from the pCRII-TOPO-CcLDDS1 construct and directionally subcloned into the pART7 expression vector (KpnI/XbaI, Supplementary Table S6).Upon sequence confirmation, the pART27-CcLDDS1 construct was utilized to transform the Agrobacterium tumefaciens disarmed strain GV3101.Additionally, as a negative control, bacterial cultures transformed with the pART27 vector were also prepared.
To ensure transient expression of the CcLDDS1 transgene, infiltrated plants were incubated for 4.5 d.Leaf samples collected from each plant were freeze-dried in a container filled with liquid nitrogen.Next, frozen leaves were ground utilizing a mortar and pestle with the addition of small amounts of liquid nitrogen.A 4 g aliquot of pulverized leaves was extracted with 3 × 4 ml of n-hexane, following the procedure outlined in Božić et al. (2015).Finally, the extracted plant tissue was analyzed by GC-MS.
Yeast transformation was performed using 1 μg of each DNA plasmid, following a slightly modified LiAc-mediated transformation method (Gietz and Schiestl, 2007).Yeast fermentation procedures were conducted according to Brückner et al. (2014) and Trikka et al. (2015a).Hexane extraction was performed as reported in Božić et al. (2015).Decane extraction, adapted from Trikka et al. (2015b), was carried out using Erlenmeyer flasks and a shaking incubator.

GC-MS
GC-MS analysis was conducted on a Shimadzu GCMS-QP2010 Ultra instrument system equipped with an AOC20i-AOC20s autosampler (Shimadzu Corp., Kyoto, Japan).For the analysis, 2 μl of samples were injected onto the column at 230 °C in splitless mode.Helium (99.999%) was used as the carrier gas with a constant flow rate of 1 ml min -1 under electron ionization (EI) mode.Compound separation was achieved on a capillary, non-polar DB-5MS column (30 m×0.25 mm internal diam-eter×0.25 μm film thickness) (Agilent Technologies, Inc.).
A GC thermal cycle of 24 min duration, as described by Morrone et al. (2011), was employed to analyze samples from organic extracts of: (i) in vitro characterization of the CcLDDS1 gene product (Fig. 5A); (ii) transformed yeast cultures producing CcLDDS1 and CcLDDS2 gene products (Fig. 7); and (iii) in vivo characterization of CcLDDS2 (Fig. 6) and CcLAT gene products (Fig. 8).
A second GC thermal cycle of 76 min duration was employed to analyze n-hexane extracts of the in vivo characterization of the CcLDDS1 gene product (Fig. 5B).Here, the oven temperature (60 °C) was initially raised by 3 °C min -1 up to 240 °C (5 min) and then raised again by 10 °C min -1 until reaching 300 °C, where it was maintained for an additional 5 min.
A third GC thermal cycle of 56 min duration was employed to analyze n-hexane extracts of C. creticus subsp.creticus tissues (Fig. 1; Supplementary Fig. S1; Supplementary Table S1).The GC thermal cycle employed consisted of the oven temperature raised from 50 °C (3 min) at a rate of 12 °C min -1 up to 160 °C, followed by an increase of 3 °C min -1 up to 230 °C, and finally by 20 °C min -1 until reaching 325 °C where it remained for 15 min.

Quantitative real-time PCR
qRT-PCRs were performed on the Applied Biosystems StepOne Real-Time of PCR System (Thermo Fisher Scientific Inc.) using the KAPA SYBR FAST qPCR Kit (Roche Sequencing Solutions Inc.) and manually designed primers producing amplicons of ~100 bp length (Supplementary Table S8).PCR amplicons were verified using DNA sequencing.
The relative transcript abundance was calculated using the equation: 2^-ΔCT (ΔCT=CT Target gene -CT Reference genes , CT: cycle threshold value).
Experiments were conducted on three biological replicates using two technical replicate measurements.Transcript abundance values, measured through qRT-PCR analysis, were expressed in arbitrary units and normalized using C. creticus subsp.creticus endogenous CcActin and CcELFa genes.Student's t-test was employed to evaluate the statistical significance of the differences in gene expression among the examined plant tissues and are depicted as *P<0.05,**P<0.01,***P<0.001with confidence level equal to 0.95.

RNA-seq
Trichomes isolated from S2 leaves were selected for RNAseq analysis based on the accumulation of LRDs and the expression of CcCLS gene in young leaves (Falara et al., 2008(Falara et al., , 2010) ) (BioProject ID: PRJEB71924).The contig and unigene statistics, along with the assembly quality, are summarized in Supplementary Table S9.Sequences of interest were categorized into three main functional classes: (Q) biosynthesis, transport, and catabolism of secondary metabolites, (S) function unknown, and (V) defense mechanisms (Supplementary Fig. S3).
Bioinformatic sequence analysis and identification of CcLDDS1, CcLDDS2, and CcLAT genes A total of 43 contigs, sharing strong sequence similarities with known genes coding for class II (DxDD) plant diTPSs, were retrieved by Basic Local Alignment Search Tool (BLAST) analysis (Supplementary Table S4).Among them, contig 68 (2824 nt) contained an ORF sequence of 2427 bp, corresponding to a potential 808 amino acid putative class II diTPS (DTDD) with a mol.wt of 92.52 kDa and isoelectric point (pI) of 5.95.Additionally, a 55 amino acid chloroplast transit peptide in the N-terminal protein domain was identified.The newly discovered diTPS was named CcLDDS1 (C.creticus Labdanetype Diterpenoid Diphosphate Synthase 1, GenBank ID: MT666220).
In an effort to enrich the number of full-length contigs, additional BLAST searches were carried out against contigs obtained from RNA-seq analysis of isolated trichomes from S2 wounded leaves of C. creticus subsp.creticus (BioProject ID: PRJNA1074482).The analysis identified contigs 936 (686 nt) and 541 (1075 nt) as parts of a single ORF sequence (2490 bp) encoding a putative class II diTPS (829 amino acids, DxDD) (Supplementary Fig. S9) with estimated mol. wt 94 kDa and pI of 5.77.Additionally, a 50 amino acid chloroplast transit peptide located in the N-terminal protein domain was identified.This putative diTPS was termed CcLDDS2 (C.creticus Labdane-type Diterpenoid Diphosphate Synthase 2, GenBank ID: MT666221).
The amino acid sequence alignment of CcLDDS1, CcLDDS2, and the previously reported CcCLS (Falara et al., 2010) is presented in Supplementary Fig. S10.The sequences of CcLDDS1 and CcLDDS2 shared similarities equal to 72% and 75% with CcCLS, respectively, while sharing only 60% similarity with each other (Supplementary Table S10).
characterized members of the TPS-b, TPS-c, TPS-d, TPS-e/f, and TPS-h families of plant diTPSs (Fig. 3; Supplementary Table S3).Protein sequences from 42 plant species, consisting of angiosperms, gymnosperms, lycophytes, a liverwort, and a bryophyte were considered in this analysis.CcLDDS1 and CcLDDS2 clustered together with CcCLS in the TPS-c subfamily branch of diTPSs, inferring their function as putative class II diTPSs (DxDD).
Contig sequences obtained from RNA-seq were compared against known BAHD acyltransferase-encoding genes employing BLAST search.A putative alcohol BAHD acetyltransferase-encoding gene sequence was identified from the trichomes EST library of C. creticus subsp.creticus in a previous study (Falara et al., 2008).BAHD acetyltransferases play an active role in the biosynthetic pathway of various terpenoids (D'Auria, 2006).Here, it was investigated whether this class of enzymes is involved in the acetylation reaction leading to the formation of labda-7,13(E)-dien-15-yl acetate and labda-13(E)-ene-8αol-15-yl acetate in C. creticus subsp.creticus.BLAST search of the partial ESTs against the contigs obtained from RNA-seq retrieved a complete putative gene sequence (contig 149, 1640 nt) and four additional partial sequences (contigs: 6138, 11 027, 3042, and 4520) ranging from 143 nt to 246 nt.
The full-length sequence was subjected to bioinformatic analysis, revealing an ORF of 1347 bp, which encoded a putative acetyltransferase consisting of 448 amino acids, with a mol.wt of 50.59 kDa and pI of 8.42.The newly identified BAHD acetyltransferase was termed CcLAT (C.creticus Labdane AcetylTransferase, GenBank ID: MT666224).CcLAT contained the characteristic motifs HXXXD and DFGWG (NFGWG) of the BAHD superfamily of acetyltransferases (Supplementary Fig. S11).
In vitro characterization of the CcLDDS1 gene product involved the overexpression of the construct pXCK-K-CcLDDS1(Δ40) in E. coli.The CcLDDS1 enzyme, following purification and incubation with GGPP, yielded a diphosphate product.Subsequently, the intermediate diphosphate product, following dephosphorylation with alkaline phosphatase, n-hexane extraction, and GC-MS analysis, revealed a peak (peak 15), which was confirmed as labda-7,13(E)-dien-15-ol by comparison with commercial MS libraries (Fig. 5A).
Functional characterization of CcLDDS2 was conducted through the yeast expression system using the pWTDH3myc-CcLDDS2(Δ42) construct.As a positive control, a transformed yeast culture expressing the SmCPSKSL1 gene was utilized (Fig. 6).

Tissue and developmental expression of genes involved in the biosynthesis of labdane-related diterpenoids
To study the regulation of gene expression, qPCR analysis was achieved by monitoring CcLDDS1, CcLDDS2, and CcLAT genes in various tissues of C. creticus subsp.creticus, including leaf stages 1-4, stem, flower, blossom, fruit, and root (Fig. 9).Overall, the highest expression levels were uncovered in all aerial parts, predominantly in young leaves and stems, while roots exhibited little to no expression.CcLAT mRNA exhibited the highest accumulation among all studied genes, predominantly in young leaves of S1, followed by older leaves and stems.Among diTPS genes, CcLDDS1 transcripts showed the highest expression levels, with greater accumulation in young leaves of S1, followed by stems and blossoms.CcLDDS2 mRNA levels were mainly accumulated in young leaves (S1-S2), followed by stems and blossoms.Finally, CcCLS transcripts showed the least expression across all tissues (Fig. 9).

Labdane-type diterpenoid biosynthesis
The elucidation of the enzyme activity of CcLDDS1, CcLDDS2, and CcLAT has greatly contributed towards advancing our understanding of the catalytic landscape involved in labdane-type diterpenoid production in C. creticus subsp.creticus.Previous research in our lab on the C. creticus subsp.creticus trichome-specific EST library led to the in vitro characterization of CcCLS, which was the first characterized class II diTPS enzyme producing oxygen-containing labdane-type diterpenoids (Falara et al., 2010).In this study, we report the functional characterization of three novel genes: (i) CcLDDS1 and CcLDDS2, two class II endo-7,13-CPP synthase-encoding genes, and (ii) CcLAT, a BAHD alcohol acetyltransferase-encoding gene involved in the biosynthesis of labda-7,13(E)-dien-15-yl acetate and labda-13(E)-ene-8αol-15-yl acetate.
Experimental parameters, such as pH conditions during the extraction process, affect scaffold stability, resulting in the nonenzymatic formation of labdanes.An illustration of this is the generation of sclareol by acid hydrolysis and/or manoyl oxide isomers by acid dehydration of 8-OH-CPP (LDPP) (Schalk et al., 2012).
A previously reported mutagenic study aimed at identifying the catalytic base responsible for the class II bifunctional active site of abietadiene synthase (AgAS) of the gymnosperm Abies grandis.That study revealed the ability of mutant AgAS:H348A/D621A to produce 8-OH-CPP, (+)-CPP, and endo-7,13-CPP intermediate metabolites (Criswell et al., 2012).Similar observations were also made in different site-directed studies of Oryza sativa syn-CPP synthase (OsCPS4, Potter et al., 2016) and Salvia divinorum clerodienyl diphosphate synthase (SdCPS2) (Pelot et al., 2017).Specifically, the OsCPS4 mutant H501 was found to produce syn-halimadienyl diphosphate (Potter et al., 2016), while modifications of four catalytic residues in the sequence of SdCPS2 enabled the production of four distinct products (Pelot et al., 2017).
These findings provide insight into the enzymatic plasticity of plant TPSs.Based on these observations, we hypothesized that the genes CcCLS, CcLDDS1, and CcLDDS2, when expressed in yeast (Fig. 7), could also produce an array of labdane-type compounds, albeit in minor amounts, due to low catalytic specificity.To test this hypothesis, a modified protocol was followed, where incubation time and volume of yeast cultures were prolonged and augmented, respectively, which by a comparative analysis of the products, revealed the presence of endo-7,13-CPP-, 8-OH-CPP-, and (+)-CPPrelated metabolites, in varying amounts, demonstrating a surprising enzymatic plasticity.
This study provides insights into the catalytic activity of the BAHD alcohol acetyltransferase CcLAT in the acetylation reaction of labda-7,13(E)-dien-15-ol and labda-13(E)ene-8α,15-diol using the yeast expression system for pathway reconstitution (Fig. 8).Future research should focus on unraveling the biosynthetic pathway of less explored ent-labdanes, such as ent-3β-acetoxy-13-epi-manoyl oxide.Furthermore, the potential involvement of Nudix hydrolases in the dephosphorylation reaction of labdane-type diterpenoid diphosphate esters should also be investigated further.

Tissue-specific accumulation of labdane-related diterpenoids and corresponding gene expression
This study examined the accumulation of LRDs (Figs 1, 2; Supplementary Fig. S1; Supplementary Table S1) and the relative expression patterns of CcCLS, CcLDDS1, CcLDDS2, and CcLAT genes in various plant tissues, including leaves, stems, flowers, blossoms, fruits, and roots (Fig. 9).The biosynthesis of LRDs predominantly takes place in the glandular capitate trichomes covering the aerial parts of the plant.The higher expression of genes correlated with the increased accumulation of metabolic products in young leaves, suggesting tissue and developmental control.
In conclusion, this study significantly advances our understanding of the biosynthetic pathways in C. creticus subsp.creticus, a non-model plant, by characterizing CcLDDS1 and CcLDDS2, two class II endo-7,13-CPP-encoding genes and CcLAT, a BAHD alcohol acetyltransferase-encoding gene involved in the biosynthesis of labda-7,13(E)-dien-15-yl acetate and labda-13(E)-ene-8α-ol-15-yl acetate.Labdane-type diterpenoids, identified in the plant's resin known as 'Ladano', are bioactive compounds characterized by a distinct decalinlabdanic scaffold.Acetylated labdanes are rarely produced by plants and have been shown to exhibit enhanced antibacterial, antifungal, and cytotoxic activities.This study presents novel labdane-type diterpenoids, including cis-abienol, neoabienol, ent-copalol, and one as yet unidentified labdane, expanding our knowledge of the plant's secondary reservoir.These findings lay the groundwork for future research, in particular the exploration of the biosynthetic pathway of production of labdane-type diterpenoids such as ent-13-epi-manoyl oxide, ent-3β-hydroxy-13-epi-manoyl oxide, and ent-3β-acetoxy-13-epi-manoyl oxide in C. creticus subsp.creticus.Labdanes present favorable physicochemical and biological properties that could potentially be used in pharmaceutical and crop protection applications.The successful reconstitution of acetylated labdanes in the yeast expression system marks an initial step towards large-scale and sustainable industrial production.In a nutshell, this work has not only further elucidated the steps of labdane-type diterpenoid biosynthesis, but it has also provided the basis for future research in the field of pharmaceutical crops.

Supplementary data
The following supplementary data are available at JXB online.Table S1.List of labdane-related diterpenoids (LRDs) identified in n-hexane extracts from various tissues of C. creticus subsp.creticus by GC-MS analysis.
Table S2.Retention index (RI) values calculated for each of the 19 peaks identified in this study compared with their corresponding RI values as cited in GC/MS libraries (FFNSC 2; Adams, 2007; massfinder 3) and the literature.
Table S3.List of known plant diterpenoid synthases (diTPSs) that were used for the phylogenetic analysis of CcLDDS1 and CcLDDS2.
Table S6.List of primers used for gene cloning in this study (manually designed).
Table S9.Summary of Contig and Unigene statistics of RNA-seq data analysis from C. creticus subsp.creticus trichomes isolated from leaves of developmental stage 2.