Structure and mechanism of an Arabidopsis medium/long-chain length prenyl pyrophosphate synthase

Prenyltransferases (PTSs) are involved in the biosynthesis of terpenes with diverse functions. Here a novel PTS from Arabidopsis thaliana is identified as a trans-type polyprenyl pyrophosphate synthase (AtPPPS), which forms a trans double bond during each homoallylic substrate condensation, rather than a homomeric C 10 -geranyl pyrophosphate synthase as originally proposed. Biochemical and genetic complementation analyses indicate that AtPPPS synthesizes medium/long-chain products. Its close relationship to other long-chain PTSs is also uncovered by phylogenetic analysis. A surface mutant was produced by replacing four contiguous charged surface amino acids with alanines to facilitate crystallization of the enzyme. The crystal structures of AtPPPS determined here in apo and ligand-bound forms further reveal an active-site cavity sufficient to accommodate the medium/long-chain products. The two monomers in each dimer adopt different conformations at the entrance of the active site depending on the binding of substrates. Taken together, these results suggest that AtPPPS is endowed with a unique functionality among the known PTSs.


INTRODUCTION
Over 55,000 terpenes (isoprenoids), the largest class of plant metabolites, have been identified and are involved in numerous vital biological processes, including growth, development, and response to environmental stresses (Fig. 1) (Pichersky et al., 2006; Gershenzon and Dudareva, 2007). Terpenes also have considerable applications as pharmaceuticals, fragrances, and nutritional supplements (Kirby and Keasling, 2009). These diverse compounds are derived from the rather simple universal precursors of linear prenyl pyrophosphates (LPPs), ranging from C 10 to C 10,000 in the number of carbon atoms, which are synthesized by groups of conserved prenyltransferases (PTSs) (Kellogg and Poulter, 1997; Liang et al., 2002). The various chain lengths of these LPPs, reflecting their distinctive physiological functions (Fig. 1), are in general determined by the highly developed active site of PTSs via condensation reactions of allylic substrates (dimethylallyl diphosphate, C 5 -DMAPP; geranyl pyrophosphate, C 10 -GPP; farnesyl pyrophosphate, C 15 -FPP; geranylgeranyl pyrophosphate, C 20 -GGPP) with corresponding number of isopentenyl biosynthesis of plant volatiles to attract pollinators, mediators in inter-plant communication, and secondary metabolites for defense (Kessler and Baldwin, 2001; Gershenzon and Dudareva, 2007). Intriguingly, enzymes possessing the GPPS activity have been identified to be either homo- or heteromeric proteins (Burke et al., 1999; Bouvier et al., 2000; Burke and Croteau, 2002; Tholl et al., 2004; Van Schie et al., 2007; Schmidt and Gershenzon, 2008; Orlova et al., 2009; Wang and Dixon, 2009; Schmidt et al., 2010), in contrast to most PTSs, which are homomeric (Liang, 2009).
We previously reported the structure of mint heterotetrameric GPPS composed of two active catalytic large subunits (LSU) and two regulatory non-catalytic small subunits (SSU) (Chang et al., 2010), which is distinct from known homomeric PTSs such as C 15 -FPP synthase (FPPS) and C 20 -GGPP synthase (GGPPS) (Chang et al., 2006; Kavanagh et al., 2006). The LSU is closely akin to the subunit of homomeric PTSs but lacks enzymatic activity on its own, and it requires interaction with the SSU to achieve a functional assembly (Kloer et al., 2006; Chang et al., 2010). The product fidelity of heterotetrameric GPPS is regulated via a regulatory loop in the SSU, which controls the product release from the catalytic LSU (Hsieh et al., 2010).
The homomeric Arabidopsis thaliana GPPS (AtGPPS) has been used as a model target for assessing the GPPS activity in angiosperms over the past decade (Bouvier et al., 2000; Lange and Ghassemian, 2003; Van Schie et al., 2007; Orlova et al., 2009). A recent study showed that A. thaliana also expresses a heteromeric GPPS that is distinct from the homomeric type in its subunit composition and sequence homology (Supplemental Fig. S1) (Wang and Dixon, 2009). This discovery raises two questions: Supplemental Fig. S3 and Table S1).
To resolve this controversy, we further analyzed the phylogenetic relationships between AtPPPS and other plant PTSs (Fig. 3). The phylogenetic tree shows that AtPPPS is evolutionarily more closely related to the long-chain PTSs (e.g., C 45 -solanesyl pyrophosphate synthase (SPPS) and C 50 -decaprenyl pyrophosphate synthase (DPPS)) than to the short-chain PTSs (e.g., GGPPS and FPPS). An exception, naturally, is the grouping with other angiosperm GPPSs. Previous studies suggest that the active site of PTSs has been exquisitely developed to control their substrate and product specificities (Ohnuma et al., 1996; Tarshis et al., 1996; Guo et al., 2004; Sun et al., 2005; Chang et al., 2006). Therefore, the specifically conserved amino acid sequence of PTSs has been used to predict the chain length of the final product (Kellogg and Poulter, 1997; Ogura and Koyama, 1998; Liang, 2009 Table S2). Its activity was subsequently measured using four allylic substrates (C 5 -DMAPP, C 10 -GPP, C 15 -FPP, and C 20 -GGPP) in the presence of C 5 -[ 14 C]IPP. Surprisingly, the reaction yielded a broad spectrum of multiple products ranging from C 25 to C 45 (Fig. 4). Except for C 5 -DMAPP, AtPPPS can recognize the other three allylic substrates and react them with C 5 -[ 14 C]IPP, resulting in similar multiple-product distribution patterns with C 35 as the major product (Fig. 4 and Table I). In the subsequent time-course assay, multiple products were detected simultaneously (i.e., not sequentially) in the chain elongation process (Supplemental Fig. S4). This observation further indicates that the medium/long-chain products synthesized by AtPPPS (Fig. 4) are not a result of a longer reaction time.
Our results also imply that released products with chain lengths longer than C 25 would have a lower frequency of re-binding to the active site for further elongation. Based on its product distribution, we rename this enzyme a polyprenyl pyrophosphate synthase (PPPS). Intriguingly, most PTSs are mono-functional enzymes that exclusively synthesize products of a single chain length (Tarshis et al., 1996; Ogura and Koyama, 1998; Guo et al., 2004; Sun et al., 2005; Chang et al., 2006), whereas a few PTSs from Cryptosporidium parvum, Methanobacterium thermoautotrophicum, Myzus persicae, Picea abies, Toxoplasma gondii, and Zea mays also possess the catalytic promiscuity to produce more than one product (Chen and Poulter, 1993; Cervantes-Cervantes et al., 2006; Ling et al., 2007; Artz et al., 2008; Vandermoten et al., 2008; Schmidt et al., 2010).
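The chain-length arithmetic behind these product patterns can be sketched in a few lines of Python. This is only an illustration, assuming that every condensation adds exactly one C 5 unit from IPP; the function and variable names are hypothetical and are not part of the original study.

```python
# Illustrative bookkeeping only: each condensation with IPP adds one C5 unit.
IPP_CARBONS = 5

# Allylic substrates accepted by AtPPPS in the assay (C5-DMAPP was not accepted).
ACCEPTED_SUBSTRATES = {"GPP": 10, "FPP": 15, "GGPP": 20}


def condensations_needed(substrate_carbons, product_carbons):
    """Return how many IPP condensations turn the substrate into the product."""
    extra = product_carbons - substrate_carbons
    if extra < 0 or extra % IPP_CARBONS != 0:
        raise ValueError("product must be the substrate plus whole C5 units")
    return extra // IPP_CARBONS


def elongation_table(product_lengths=range(25, 50, 5)):
    """Map each accepted substrate to the condensations needed per product length."""
    return {
        name: {c: condensations_needed(carbons, c) for c in product_lengths}
        for name, carbons in ACCEPTED_SUBSTRATES.items()
    }


if __name__ == "__main__":
    # Print one row per substrate for the observed C25 to C45 product range.
    for name, row in elongation_table().items():
        print(name, row)
```

Under this assumption, reaching the major C 35 product from C 15 -FPP takes four condensations, whereas starting from C 20 -GGPP takes three.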

Overall structure and active site
To further understand its function, we determined the crystal structure at 2.6-Å resolution of the wild-type AtPPPS in its apo form, denoted WT-AtPPPS (Fig. 5A et al., 1996; Liang, 2009). A stable homodimer was also detected by gel filtration analysis in a protein concentration-independent manner (Supplemental Fig. S5), consistent with previous studies showing that most PTSs exist as homodimers under physiological conditions (Guo et al., 2004; Sun et al., 2005; Chang et al., 2006; Kloer et al., 2006; Hsiao et al., 2008). The dimerization interface is mainly contributed by the respective helices F and G (Fig. 5A). Each subunit is composed of 16 antiparallel α-helices (A-P) that surround the active site, with the two conserved DD(X) n D motifs (D = aspartate; X = any residue; n = 2 or 4) facing each other on helices D and J (Fig. 5A). The electron density maps of a few loop regions (residues 1-7, 35-46, 68-81, and 110-125 for chain A; residues 1-8, 35-46, 68-81, and 111-125 for chain B) were not clearly visible (Fig. 5A). The active-site region of WT-AtPPPS contains highly conserved catalytic amino-acid residues that participate in its enzymatic reaction. The consensus catalytic mechanism of PTSs has been demonstrated to be a set of  Table I).
The SM-AtPPPS crystal solved at 2.65-Å resolution contains an octamer (chains A-H) as its asymmetric unit, comprising four identical dimers related by three orthogonal non-crystallographic two-fold axes and expressing a tetrahedral 222 symmetry (Fig. 5B and Table II). The four mutations generate additional intermolecular crystal contacts both within the asymmetric unit and between different octamers in the unit cell (Supplemental Fig. S7). Although the two monomers in each dimer adopt distinct conformations, a dimer as the basic assembly unit is consistent with WT-AtPPPS and other known structures of PTSs (Tarshis et al., 1996; Guo et al., 2004; Hosfield et al., 2004; Sun et al., 2005; Chang et al., 2006; Kavanagh et al., 2006; Kloer et al., 2006). Judging by its gel filtration chromatography profile, SM-AtPPPS also exists as a dimer in solution (data not shown and C 15 -FPP, in its active site (Fig. 5C; Supplemental Fig. S8). The crystal structure of homodimeric Sinapis alba GGPPS was found to contain different ligands bound to different subunits as well (Kloer et al., 2006). The N-terminal residues and some surface loops were disordered.
Further structural analyses show that the aliphatic tail of C 15 -FPP is located in a large hydrophobic cleft starting with the active site cavity (AC) and connecting with the elongation cavity (EC) adjacent to the dimer interface (Fig. 6A). Previous studies suggest that the regulation of product chain length specificity and substrate selectivity is determined by the size of the tunnel-shaped cleft of PTSs since the product elongation extends along the EC tunnel (Tarshis et al., 1996; Ohnuma et al., 1998; Guo et al., 2004; Sun et al., 2005; Chang et al., 2006). The bound substrate models and the critical amino-acid residues in the active site of AtPPPS are similar to those of all other known PTSs, as verified by biochemical and crystallographic studies (Liang et al., 2002; Liang, 2009). The residues located on helices D, E, and G that surround the EC tunnel are highly conserved among long-chain PTSs (Fig. 2).
Overall, the present structural studies of AtPPPS allow an unambiguous identification of a cavity to accommodate longer products beyond C 10 -GPP (

Comparison of apo form and ternary complex
Superposition of WT-AtPPPS and SM-AtPPPS allows the identification of three notable regions with significant conformational changes (Supplemental Fig. S9). First, the disordered region of helices D-F shows ligand binding-induced conformational changes to act as a gate for substrate entry and product release, consistent with previous studies (Sun et al., 2005; Kloer et al., 2006). Second, the region connecting helices A and C undergoes an extensive conformational change. Helix B becomes an ordered  Fig. S9). Therefore, this highly mobile region may be induced to become ordered by the binding of C 5 -IPP.
Third, the first N-terminal helix A protrudes into the top of the other subunit and seems to be involved in regulating the conformational change during the catalytic reaction (Supplemental Figs. S8 and S9). This is in accordance with an alternating catalytic mechanism in the dimer, i.e., when one subunit is in action, the active site of the other subunit is empty (Sun et al., 2005; Kloer et al., 2006). The alternating mechanism is also reflected in the asymmetric binding of different ligands to different protein subunits of the homodimeric enzyme. This kind of enzymatic regulation mechanism may be used to control the steps of substrate entry and product release. Hence, these observations in the crystal structure further explain why the basic functional unit of PTSs is a dimer instead of a monomer.

The mechanism of product elongation
To investigate how the EC tunnel accommodates the long-chain products, two generates C 20 -GGPP as the major product, plus a small amount of farnesylgeranyl pyrophosphate (C 25 -FGPP) (Fig. 6B and Table I). The shorter product chain length is consistent with the reduced size of the EC tunnel as a result of the mutations. As expected, the mutant enzyme showed lower activity toward C 20 -GGPP as the allylic substrate for the chain-elongation reaction, while the activity for C 15 -FPP was largely unaffected (Fig. 6B). The altered active-site structure might have an unfavorable effect on retaining the short-chain intermediate, and therefore the enzymatic activity for reacting C 10 -GPP with C 5 -IPP was reduced.  (Zhu et al., 1997; Engprasert et al., 2004; Chang et al., 2010) was employed to investigate whether this mutant exhibits the GGPPS activity in vivo.

DISCUSSION
The plant GPPS-encoding genes have been identified in both gymnosperms and angiosperms (Burke et al., 1999; Bouvier et al., 2000; Burke and Croteau, 2002; Tholl et al., 2004; van Schie et al., 2007; Schmidt and Gershenzon, 2008; Wang and Dixon, 2009; Schmidt et al., 2010). Interestingly, enzymes exhibiting this catalytic activity can be further classified into homo- and heteromeric proteins. In contrast to the earlier studies of homomeric proteins (Chang et al., 2006; Liang, 2009), the crystal structure of a heteromeric GPPS and its enzymatic regulation mechanism were elucidated only very recently (Chang et al., 2010; Hsieh et al., 2010). The production of C 10 -GPP is the key branching point in C 10 -monoterpene biosynthesis, by which plant volatiles with critical bioactivities in plant growth, development, and defense are made (Pichersky et al., 2006; Gershenzon and Dudareva, 2007).
The model plant A. thaliana is generally believed to be self-pollinating. However, several pieces of evidence indicate that insect-mediated cross-pollination also happens in wild populations (Jones, 1971; Snape and Lawrence, 1971; Davis et al., 1998). Arabidopsis has been confirmed to synthesize plant volatiles and to emit a range of these compounds from its flowers (Aharoni et al., 2003; Chen et al., 2003). Remarkably, the emission of volatiles is a major feature of most insect-pollinated flowers (Dudareva and Pichersky, 2000). In addition, the C 10 -monoterpenes could also protect the reproductive organs from pathogen attack or oxidative damage (Wu et al., 2006). Consequently, a homomeric GPPS in A. thaliana could be responsible for providing the C 10 -GPP required for the metabolism of C 10 -monoterpenes.
On the other hand, Wang and Dixon have also identified a new plastidic A. thaliana heteromeric GPPS, composed of SSU (AtSSU) and GGPPS isoform 11 (AtGGPPS11) (Wang and Dixon, 2009 Here we showed that the homomeric GPPS in Arabidopsis should be a novel enzyme that generates multiple products with medium/long-chain lengths, rather than a GPPS as previously reported (Bouvier et al., 2000). The previous study used C 5 -DMAPP and C 5 -[ 14 C]IPP in a ratio of 2 to 1. A homolog from tomato was also proposed to possess the GPPS activity when the ratio of C 5 -DMAPP to C 5 -[ 14 C]IPP was 2.5 to 1 (Van Schie et al., 2007). In contrast, we used ratios of 1 to 15, 1 to 14, 1 to 13, and 1 to 12 for C 5 -DMAPP to C 5 -[ 14 C]IPP, C 10 -GPP to C 5 -[ 14 C]IPP, C 15 -FPP to C 5 -[ 14 C]IPP, and C 20 -GGPP to C 5 -[ 14 C]IPP, respectively, to assure sufficient homoallylic substrate (C 5 -IPP) for the condensation reaction (see MATERIALS AND METHODS) and determined the preferred allylic substrate of AtPPPS. Even in this assay condition, two unexpected products of C 15 -FPP and C 20 -GGPP were detected by using radio-gas chromatography. Hence, if sufficient C 5 -[ 14 C]IPP is provided for the continued enzymatic reaction, the short-chain products would be elongated into the medium/long-chain products.
Although AtPPPS showed some GPPS activity when the isotope signal was measured prior to performing thin-layer chromatography, it was barely detectable and insignificant. When C 5 -[ 14 C]IPP was reacted with the other three allylic substrates (C 10 -GPP, C 15 -FPP, and C 20 -GGPP), similar multiple medium/long-chain product distribution patterns were observed (Fig. 4). This is also consistent with our phylogenetic analysis, which shows that AtPPPS is closely related to long-chain PTSs, and with previous studies showing that long-chain PTSs generally prefer C 15 -FPP to C 5 -DMAPP as the allylic substrate (Kellogg and Poulter, 1997; Ogura and Koyama, 1998; Liang et al., 2002). The long-chain PTSs possess a long hydrophobic tunnel, which has higher affinity for intermediates with a longer aliphatic tail. Because C 5 -DMAPP has the shortest tail, it would be harder for C 5 -DMAPP to remain in the long hydrophobic tunnel and easier to escape from the active site into the bulk solvent than the other   (Soll et al., 1985). Judging by these findings, it is therefore tempting to suggest that AtPPPS may play a role in Arabidopsis ubiquinone and plastoquinone biosynthesis. question regarding the physiological roles of such multiple medium/long-chain products, generated by AtPPPS. It remains to be investigated whether these multiple products can serve as precursor pools to precisely balance di-, tri-, tetra- and polyterpene metabolisms, because dysfunction in terpene biosynthesis has been reported to have a deleterious effect on plant growth and development (Orlova et al., 2009). We also surveyed sequences similar to AtPPPS by using a conventional sequence homology search to provide insight for future investigation (Supplemental Table S3). In the end, the role played by AtPPPS remains to be clarified, and our findings encourage re-evaluating its enzymatic function in the complex system of metabolite biosynthesis.

CONCLUSION
Taken together, the crystal structures of AtPPPS, in combination with phylogenetic analysis and both in vitro and in vivo biochemical assays, have shown that this enzyme, which was thought to be a GPPS in previous studies (Bouvier et al., 2000), is an unusual PTS that synthesizes multiple medium/long-chain products. Our results, along with the identification of a heteromeric GPPS from A. thaliana (Wang and Dixon, 2009), suggest that the precursor C 10 -GPP for C 10 -monoterpene biosynthesis in A. thaliana may be provided only by heteromeric GPPS. The integrated approach described here can also serve as an example of how gene functions in plant terpene biosynthesis are unraveled.

Cloning and mutagenesis
The sequence information of AtPPPS was downloaded from the NCBI data library with the accession number of AT2G34630. The AtPPPS sequence without its plastid targeting sequences was amplified by PCR from the A. thaliana cDNA libraries using the primers LIC-TEV-At-F and LIC-At-R, and cloned into pET-32 Xa/LIC (Novagen) (Supplemental Table S2). The forward primer LIC-TEV-At-F includes a Tobacco Etch Virus (TEV) protease cleavage site allowing for removal of the N-terminal fusion tag. To enhance the Ni-resin binding affinity, the vector WT-AtPPPS/pET-32 was constructed by inserting one additional His 6 -tag by site-directed mutagenesis. The TEV protease was cloned into pET-51 Ek/LIC (Novagen) by using the primers LIC-TEV-F and LIC-TEV-R (Table S2). The other mutant constructs, SM-AtPPPS/pET-32 and AtPPPS(I99F/V62F)/pET-32, were prepared by site-directed mutagenesis using WT-AtPPPS/pET-32 as the template. The primers are shown in Supplemental Table S2.   SPring-8 (Hyogo, Japan) and BL 5A at the Photon Factory (Tsukuba, Japan). Data were processed and scaled by using the HKL2000 package (Otwinowski and Minor, 1997).

Protein expression and purification
A randomly selected 5% of the diffraction data were used for calculating R free (Brunger, 1993).  Table I. High-quality images of the molecular structures were created with PyMOL (http://www.pymol.org/).
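As a rough illustration of the cross-validation idea behind R free, the Python sketch below sets aside a random 5% of reflections as a test set and evaluates the conventional crystallographic R-factor, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, on any chosen subset. It is not the refinement software used in this work; the function names, the random seed, and the toy amplitudes are assumptions made purely for the example.

```python
import random


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the free (test) set."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


def r_factor(f_obs, f_calc, indices):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    num = sum(abs(abs(f_obs[i]) - abs(f_calc[i])) for i in indices)
    den = sum(abs(f_obs[i]) for i in indices)
    return num / den


if __name__ == "__main__":
    # Toy amplitudes, invented for illustration only.
    rng = random.Random(1)
    f_obs = [rng.uniform(10.0, 100.0) for _ in range(1000)]
    f_calc = [f * rng.uniform(0.8, 1.2) for f in f_obs]
    work, free = split_free_set(len(f_obs))
    print("R work:", r_factor(f_obs, f_calc, work))
    print("R free:", r_factor(f_obs, f_calc, free))
```

The free reflections are excluded from the working set, so an R-factor computed on them is not biased by fitting the model to those same data.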

Enzymatic assay
The biochemical assays followed similar protocols as described previously (Guo were separated on thin-layer chromatography using acetone:water (29:1) as the mobile phase.

Genetic complementation assay
The construct pACCAR25ΔcrtE, containing the crt gene cluster except for the deleted crtE encoding GGPPS, was prepared for identification of GGPPS activity (Zhu et al., 1997; Kainou et al., 1999; Misawa et al., 1990; Engprasert et al., 2004; Ye et al., 2007; Chang et al., 2010

Phylogenetic analysis
The full-length amino acid sequences were aligned by using ClustalW (Thompson et al., 1994). The evolutionary history was inferred using the neighbor-joining method (Saitou and Nei, 1987). The percentage of replicate trees, evaluated by using a bootstrap test with 1000 replicates, is shown next to the branches (Felsenstein, 1985
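The bootstrap percentages mentioned above can be illustrated with a minimal Python sketch: alignment columns are resampled with replacement, a grouping is inferred from each pseudo-replicate, and the percentage of replicates that recover a given group is reported. To keep the example short, the tree-building step is replaced by a toy "closest pair" rule and is not the neighbor-joining method; the sequences and all function names are invented for illustration.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement (one bootstrap pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Toy stand-in for tree building: the pair of sequences with fewest mismatches."""
    def mismatches(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    pairs = combinations(sorted(alignment), 2)
    return frozenset(min(pairs, key=lambda p: mismatches(*p)))


def bootstrap_support(alignment, clade, replicates=1000, seed=0):
    """Percentage of replicates in which `clade` is recovered as the closest pair."""
    rng = random.Random(seed)
    target = frozenset(clade)
    hits = sum(
        closest_pair(resample_columns(alignment, rng)) == target
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


if __name__ == "__main__":
    # Invented toy alignment: A and B are identical, C differs at every column.
    toy = {"A": "MKTLLVA", "B": "MKTLLVA", "C": "QRSWYFG"}
    print(bootstrap_support(toy, {"A", "B"}))
```

In a real analysis each pseudo-replicate would be passed to a full tree-building method, and support would be counted per branch rather than for a single pair.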

Accession codes
Coordinates and structure factors of WT-AtPPPS and SM-AtPPPS were deposited in the Protein Data Bank (www.rcsb.org) with codes 3APZ and 3AQ0, respectively.

Supplemental Data
The following materials are available in the online version of this article.         Table I.