Phl p 5, a 29 kDa major allergen from timothy grass pollen, is one of the most reactive members of group 5 allergens. Its sequence comprises two repeats of a novel alanine-rich motif (AR) whose structure and allergenic response are still mostly unknown. We report here a structural characterization of an immunodominant fragment of Phl p 5, Phl p 5(56–165), which comprises the first AR repeat. Recombinant (r)Phl p 5(56–165) was expressed in Escherichia coli, purified to homogeneity and shown to be sufficient to react with serum IgE from 90% of grass pollen allergic patients. Using NMR spectroscopy, we show conclusively that the fragment forms a compact globular domain which is, however, prone to degradation with time. The rPhl p 5(56–165) fold consists of a four-helix bundle held together by hydrophobic interactions between the aromatic rings and aliphatic side chains. This evidence gives clear indications about the structure of the full-length Phl p 5 and provides a rational basis for finding ways to stabilize the fold and designing therapeutic vaccines against grass pollen allergy.
Grasses are among the most potent sources of allergens both because they produce large amounts of pollen and because they are widely distributed in temperate climate areas, where population density is usually higher (Esch, 1999). Allergens from various grasses and corn pollens have been classified according to their cross-reactive potential into different groups in order of their discovery (King et al., 1995). Group 5 contains some of the most important grass pollen allergens which are recognized by IgE antibodies of almost all grass pollen allergic patients and induce strong IgE antibody responses and severe clinical reactions (e.g. asthma) in sensitized patients (Suphioglu et al., 1992; Vrtala et al., 1993, 1998; Bufe et al., 1998). cDNA cloning and sequence comparison of group 5 allergens from various grass and corn species have revealed high sequence homology as the molecular basis for the cross-reactivity of IgE antibody and T cell responses to group 5 allergens (Valenta et al., 1996; Müller et al., 1998; Niederberger et al., 1998; Burton et al., 1999). Phl p 5, a group 5 allergen from timothy grass, is a protein of 29 kDa whose cDNA was first identified by Vrtala et al. (Vrtala et al., 1993). Since then, it has been shown that Phl p 5 exists as different isoforms (isoallergens) which can be classified into two major groups, phl p 5a and b, with high sequence homology and similar allergenic activities (van Neerven et al., 1999). Recombinant phl p 5 (rphl p 5) proteins have been produced and their immunological properties shown to be comparable to those from natural sources (Niederberger et al., 1998). Ribonuclease activity has been attributed to phl p 5b and localised in the C-terminus of the protein (Bufe et al., 1996).
Understanding the structural rules that govern allergen–antibody recognition is a crucial step for designing recombinant versions of hypoallergenic variants which could then help to develop therapeutic strategies and/or specific vaccines against Type I allergies (Valenta et al., 1999). The molecular basis of the phl p 5/IgE recognition can therefore be clarified only by a detailed description of the structure of this allergen both as an isolated protein and in its IgE complex. However, only very little structural information is currently available for phl p 5. This may be partially due to the instability of certain recombinant group 5 allergens: in a crystallization trial, the full-length rphl p 5b degraded into smaller fragments, only one of which, spanning the last 133 C-terminal residues, would crystallize (Bufe et al., 1996). Circular dichroism studies have shown that phl p 5a has prevalently a helical secondary structure (Flicker et al., 2000).
In this work, we characterized an N-terminal fragment of phl p 5a, phl p 5(56–165), comprising about one-third of the full-length phl p 5. The fragment was identified by a monoclonal IgE isolated from a combinatorial IgE library that had been constructed from lymphocytes of a grass pollen allergic patient (Steinberger et al., 1996). phl p 5(56–165) has been shown to represent an immunodominant portion of phl p 5 and to contain IgE and T cell epitopes (Steinberger et al., 1996; Burton et al., 1999). The corresponding region of the rye grass homologue (Lol p 5) is also a major IgE-reactive fragment (Vrtala et al., 1998). A recombinant version of the fragment, rphl p 5(56–165), strongly induces basophil histamine release and immediate type skin reactions in grass pollen allergic individuals (Flicker et al., 2000). Here, we analysed IgE and IgG subclass recognition of purified rphl p 5(56–165) using sera from grass pollen allergic patients and provided, for the first time, direct evidence, supported by extensive NMR structure analysis combined with molecular modelling, that rphl p 5(56–165) forms a compact structural domain with a four-helix bundle fold.
Materials and methods
Expression and purification of rPhl p 5(56–165)
rphl p 5(56–165) was produced as described previously (Flicker et al., 2000). In short, the construct was amplified by PCR, using the full-length cDNA clone as a template, and expressed in Escherichia coli BL21 (DE3) (Novagen). Cells were suspended in 25 mM imidazole, pH 7.4, 0.1% Triton X-100, and lysed by lysozyme treatment (20 μg/g cells) and freeze–thawing. Bacterial DNA was digested with DNase I (0.1 mg/g cells) at room temperature for 20 min. After centrifugation of the lysate (20000 g for 20 min), ammonium sulphate (50%, w/v) was added to the supernatant. The precipitate was resuspended in 10 mM Na2PO4 at pH 7.0 and desalted with a Sephadex PD10 column (Pharmacia). The eluate was loaded on an Sp-Sepharose column (Pharmacia) and eluted with a 0–0.5 M NaCl gradient. Fractions enriched for the protein were dialysed against 10 mM Tris buffer at pH 10, loaded on a diethylaminoethylcellulose-Sepharose column (Pharmacia) and eluted with a 0–0.5 M NaCl gradient. rphl p 5(56–165), judged pure by both 12% SDS–PAGE and mass spectrometry, was desalted using a Sephadex PD10 column. Protein concentrations were determined by UV spectrophotometry using the protein extinction coefficient calculated according to von Hippel and Gill (von Hippel and Gill, 1989).
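The concentration estimate mentioned above can be illustrated with a minimal sketch. It assumes the commonly quoted 280 nm per-residue values associated with the calculated-extinction-coefficient method (5690 per Trp, 1280 per Tyr and 120 per cystine, in M^-1 cm^-1) and the Beer-Lambert law. The function names, residue counts, absorbance and path length are hypothetical placeholders and are not values from this study.

```python
# Sketch: protein concentration from A280 using a calculated extinction coefficient.
# Per-residue values are the commonly quoted ones (M^-1 cm^-1 at 280 nm).
# All inputs in the example at the bottom are hypothetical, not data from the paper.

EPS_TRP = 5690.0      # tryptophan
EPS_TYR = 1280.0      # tyrosine
EPS_CYSTINE = 120.0   # disulphide-bonded cystine


def extinction_coefficient(n_trp: int, n_tyr: int, n_cystine: int = 0) -> float:
    """Sum the per-residue contributions to the molar extinction coefficient."""
    return n_trp * EPS_TRP + n_tyr * EPS_TYR + n_cystine * EPS_CYSTINE


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/l."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical composition: 1 Trp, 4 Tyr, no disulphides.
    eps = extinction_coefficient(n_trp=1, n_tyr=4)
    conc = molar_concentration(a280=0.54, epsilon=eps)
    print(f"epsilon = {eps:.0f} M^-1 cm^-1, concentration = {conc * 1e6:.1f} uM")
```

For a real sample the residue counts would be taken from the construct sequence, which is not reproduced here.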
IgE and IgG subclass reactivity to rPhl p 5(56–165) and IgE competition experiments
Enzyme-linked immunosorbent assay (ELISA) plates (Nunc, Roskilde, Denmark) were coated with freshly purified rphl p 5(56–165) (5 μg/ml PBS) at 4°C overnight. The plates were first washed twice with PBS, 0.05% Tween-20 and blocked for 3 h with PBS, 1% BSA, 0.05% Tween-20 at room temperature. They were then incubated with sera from grass pollen allergic patients which were diluted 1:10 and 1:50 in PBS, 0.5% BSA, 0.05% Tween-20 for detection of IgE and IgG1–4 antibodies, respectively, at 4°C overnight. The plates were finally washed five times with PBS, 0.5% BSA, 0.05% Tween-20. Bound immunoglobulins were detected with mouse monoclonal anti-human IgE and IgG1–4 antibodies (PharMingen, San Diego, CA) and a horseradish peroxidase-coupled sheep anti-mouse antiserum (Amersham, Amersham, Buckinghamshire, UK) by ELISA (Vrtala et al., 1996). For the competition experiments, serum samples (50 μl aliquots) were diluted 1:1 with PBS and pre-incubated with 3 μg of freshly purified rphl p 5(56–165) or, for control, with 3 μg of BSA (diluted in PBS to a concentration of 1 μg/μl) overnight at 4°C. The levels of specific IgE directed against the complete rphl p 5 allergen from timothy grass, rye grass and Kentucky bluegrass pollen extracts were quantified by CAP-fluoro enzyme immunoassay (CAP-FEIA) measurements (Pharmacia, Uppsala, Sweden) as described (Kazemi-Shirazi et al., 2000). The percentage inhibition of IgE binding was calculated under the assumption that pre-incubation with a protein that is immunologically unrelated to grass pollen proteins (i.e. BSA) will not reduce reactivity of IgE to grass pollen allergens and hence corresponds to 100% binding. The reduction of binding achieved by pre-incubation of sera with rphl p 5(56–165) versus the binding obtained after pre-incubation of sera with a control protein (BSA) is expressed as percentage inhibition. The percentage inhibition was calculated as follows: % inhibition = 100 − (IgE binding after pre-incubation with rphl p 5(56–165) / IgE binding after pre-incubation with BSA) × 100.
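The inhibition calculation described in this paragraph can be sketched as follows, taking the BSA pre-incubation as 100% binding as stated in the text. The function names and the example readings are hypothetical; they are not measurements from this study.

```python
# Sketch: percentage inhibition of IgE binding relative to a BSA control.
# The BSA-pre-incubated serum defines 100% binding (assumption stated in the text).
# Function names and example values are illustrative only.

def percent_inhibition(binding_with_fragment: float, binding_with_bsa: float) -> float:
    """Return 100 * (1 - fragment/BSA), the fraction of IgE binding that was blocked."""
    if binding_with_bsa <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (1.0 - binding_with_fragment / binding_with_bsa)


def mean_inhibition(readings: list[tuple[float, float]]) -> float:
    """Average the per-serum inhibition over (fragment, BSA) reading pairs."""
    if not readings:
        raise ValueError("at least one serum is required")
    values = [percent_inhibition(frag, bsa) for frag, bsa in readings]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical CAP readings for three sera: (after fragment, after BSA).
    sera = [(7.0, 10.0), (16.0, 20.0), (4.5, 5.0)]
    for frag, bsa in sera:
        print(f"{percent_inhibition(frag, bsa):.1f}% inhibition")
    print(f"mean: {mean_inhibition(sera):.1f}%")
```

Averaging per-serum percentages, rather than pooling raw readings, gives each patient equal weight regardless of the absolute IgE level; whether the authors averaged in this way is an assumption.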
NMR spectroscopy
NMR samples typically consisted of 0.5–1 mM protein dissolved in H2O–D2O (9:1) with 20 mM phosphate buffer at pH 7.0. The spectra were recorded at 25°C using either a Varian 600 MHz UNITY-PLUS or a Bruker 800 MHz spectrometer. 2D NOESY and TOCSY experiments were performed using 80 and 70 ms mixing times, respectively. A 3D NOESY-HSQC experiment with 80 ms mixing time was performed on a 15N-labelled sample. The water signal was suppressed using a WATERGATE sequence (Piotto et al., 1992). The spectra were transformed and analysed using FELIX software.
Structure prediction and modelling
A multiple alignment of the homologous allergen sequences was obtained using CLUSTALX (Thompson et al., 1997) and modified manually when necessary. Secondary structure predictions were attempted by submitting the multiple alignment to the PHD server (Rost and Sander, 1993). The two sequence repeats were identified by psi-Blast (Altschul and Koonin, 1998).
The following procedure was adopted to identify a suitable template for modelling. First, an extensive search through the sequence databases was carried out by psi-Blast (Altschul and Koonin, 1998) but no homology could be detected outside the phl p 5 family. The sequence of phl p 5(56–165) was then used as a query sequence to the UCLA–DOE structure prediction server at URL http://www.doe-mbi.ucla.edu/∼frsvr/frsvr.html (Fischer and Eisenberg, 1996). This query gave a fold prediction of cytochrome c′ from Rhodopseudomonas palustris (PDB code 1A7V) (Shibata et al., 1998) together with an initial alignment of the sequences (Z score 2.6).
A structural model of phl p 5(56–165) was built on 1A7V using QUANTA (Release 4.0, MSI). The experimental NMR restraints were analysed with respect to this model, which suggested how the sequence alignment could be amended. This procedure was iterated, changing the alignment at each step in an attempt to produce a model which satisfied all the NMR restraints.
rPhl p 5(56–165) represents a major IgE epitope-containing grass pollen allergen domain
The sera from 48 grass pollen allergic patients were analysed by ELISA for the presence of IgE antibodies against both the full-length protein and the rphl p 5(56–165) fragment. Forty-three of these sera (89.6%) exhibited IgE reactivity to both molecules whereas shorter synthetic peptides (e.g. a peptide comprising residues 97–128 of phl p 5) did not bind IgE (data not shown). When IgE and IgG subclass reactivity to the rphl p 5 (56–165) fragment was analysed, an interesting diversification of the IgE and IgG subclass recognition of the allergen domain was observed as exemplified in Table I for six representative patients. Patient 1 exhibits IgE, IgG1 and IgG2 subclass but almost no IgG3 or IgG4 reactivity to the molecule. Certain patients contained IgE and IgG antibodies of each of the subclasses to rphl p 5 (56–165) (e.g. patient 11), whereas others (e.g. patients 25 and 29) lacked specific IgG4 or (e.g. patient 14) had very low specific IgG3 antibodies. This finding demonstrates that IgE and IgG subclass responses in allergic patients can be directed to different epitopes on the allergen molecule. The latter can be explained by the activation of a great variety of B cell clones by different epitopes of the allergen during the sensitization process. Since allergen-specific IgE antibodies react to epitopes different from those recognized by IgG, it is unlikely that they have evolved by sequential class switch from B cells producing initially allergen-specific IgG.
Next, we determined the percentage of the phl p 5-specific IgE response which can be pre-adsorbed from sera with rphl p 5(56–165). Quantitative IgE competition experiments performed with 29 phl p 5-reactive sera showed that rphl p 5(56–165) accounts on average for 29% of the phl p 5-reactive IgE antibodies. Moreover, rphl p 5(56–165) cross-reacted with allergens present in pollens of other grass species. Preadsorption with the domain depleted on average 21% of rye grass pollen-specific and 14% of Kentucky bluegrass pollen-specific IgE from the sera (Table II).
These results prove that the rphl p 5(56–165) fragment binds a significant percentage of allergen-specific IgE of grass pollen allergic patients and thus the fragment contains a relevant percentage of the full-length phl p 5 IgE reactivity.
rPhl p 5(56–165) has a well defined tertiary structure
Sequence analysis of phl p 5 reveals that the protein, like all group 5 allergens, is assembled from two repeats of an AR motif (alanines account for ∼30% of the amino acid composition) (Figure 1). Only one AR repeat is observed in group 6 allergens. In phl p 5, the two repeats comprise approximately residues 55–174 and 183–284 and are probably the consequence of gene duplication. Apart from close phl p 5 homologues, the AR motif does not share detectable sequence homology with any other known protein, making it difficult to predict its structure. However, alanines have a very strong tendency to form helices (Vila et al., 2000) and such an unusual composition is found for instance in trans-membrane helices and in antifreeze proteins (Davies and Sykes, 2000; Eilers et al., 2000). Accordingly, the far-UV circular dichroism spectra of both rphl p 5 and rphl p 5(56–165) are typical of α-helical proteins (Flicker et al., 2000), with an estimated helical content of ∼60%. Secondary structure prediction based on a multiple alignment of group 5 allergens from different species suggests the approximate position of the helical regions along the sequence (Figure 1).
The presence of tertiary structure was probed by NMR spectroscopy. The proton spectrum of rphl p 5(56–165) shows excellent signal dispersion typical of globular domains (Figure 2a). In particular, the presence of several resonances shifted into the region 0.5–0.0 ppm is typically caused by the persistent interaction of aliphatic chains with aromatic rings expected for a well-folded protein (Wüthrich, 1986). We can therefore conclude that rphl p 5(56–165) is an independently folded globular domain.
Like the full-length protein, rPhl p 5(56–165) degrades with time
Although thermal denaturation curves show that rphl p 5(56–165) is thermally stable and that its unfolding is mostly reversible (Flicker et al., 2000), the protein degrades quickly with time. Four samples from independent batches of preparation were tested. The average lifetime of each sample was 1–2 weeks. During this time, the solutions became progressively cloudy and a precipitate appeared. The samples were analysed by mass spectrometry and N-terminal sequencing, typically at time 0 and 1–2 weeks after sample preparation. The pattern observed by mass spectrometry was consistent with progressive degradation from the N-terminus (data not shown). This behaviour, which could not be prevented by varying the experimental conditions, was reflected directly in the NMR spectra. 2D NMR experiments recorded on independently prepared batches of protein showed the same general features, but spectra recorded on a batch a few days old contained progressively more additional peaks which were absent in fresh samples (data not shown).
Direct NMR evidence shows that rPhl p 5(56–165) is a four-helix bundle
Because of the progressive degradation, spectral assignment constituted a real challenge. A set of homonuclear 2D NMR experiments and a 15N NOESY-HSQC spectrum were recorded on fresh aliquots of three independent batches of unlabelled protein and on a 15N-labelled sample, respectively. The quality of the heteronuclear spectrum can be appreciated in Figure 2b. Reliable resonance assignment could be achieved unambiguously only for the regions L66–A76, A87–A100, K128–A140 and E146–G165 (52% of the total sequence) for which a clear match to NOESY and TOCSY connectivities was possible. The assignment was interrupted by peak overlap and could not be confidently extended further. All the residues in the assigned regions are in an α-helical conformation as directly supported by the Hα secondary chemical shifts and by the presence of HN–HN(i, i + 1), Hα–HN(i, i + 3), Hα–Hβ(i, i + 3) and Hα–HN(i, i + 4) connectivities (Figure 2c). These results are in excellent agreement with the predicted location of secondary structure elements (Figure 1). The unassigned regions are in predicted loops and at the N-terminus of the predicted helix 3.
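The sequence coverage quoted for the assigned regions can be checked with a short calculation. The residue ranges are those given in the text; the function names are illustrative.

```python
# Sketch: fraction of the rPhl p 5(56-165) fragment covered by the assigned regions.
# Residue ranges are taken from the text; both ends of each range are inclusive.

ASSIGNED_REGIONS = [(66, 76), (87, 100), (128, 140), (146, 165)]
FRAGMENT = (56, 165)


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


def coverage(regions: list[tuple[int, int]], fragment: tuple[int, int]) -> tuple[int, int, float]:
    """Return (assigned residues, fragment length, percentage assigned)."""
    total = span_length(*fragment)
    assigned = sum(span_length(start, end) for start, end in regions)
    return assigned, total, 100.0 * assigned / total


if __name__ == "__main__":
    assigned, total, percent = coverage(ASSIGNED_REGIONS, FRAGMENT)
    print(f"{assigned} of {total} residues assigned ({percent:.1f}%)")
```

The four regions add up to 58 of the 110 residues in the fragment, a little over half, in line with the figure quoted above.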
Direct indication about the tertiary fold could be obtained from the unambiguous assignment of several long-range contacts (38 restraints) involving eight out of the nine aromatic side chains (Figure 2c). These contacts are consistent with the domain being formed by four interrupted helices which fold into a globular four-helix bundle domain held together by hydrophobic interactions between the aromatic protons and other aliphatic side chains.
A molecular model of the first AR motif can be inferred from experimental evidence
The modelling of the phl p 5(56–165) fold was based on the following facts. (a) The domain folds into a compact (although prone to degradation) structure formed by not fewer than four helices, as supported by the experimentally identified long-range contacts. (b) Of the several topologies in principle possible for a four-helix bundle, we can exclude those that require a long connection between helices 1–2 and 3–4 (Figure 3a). These linkers are not long enough to allow inversion of the helix direction. (c) Further restraints on the possible topologies are imposed by the length of the linker between helices 2 and 3 in the second AR repeat (Figures 1 and 3a). We must therefore conclude that the domain consists of a four-helix bundle with an up–down–up–down topology, a fold never observed before for an allergen. A helix wheel representation based on the NMR restraints was used to predict the relative orientation of the helices and their packing (Figure 3b). Orienting the helices so as to satisfy the experimentally observed long-range contacts also maximizes the number of buried hydrophobic residues.
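The helix wheel reasoning used above can be illustrated with a minimal sketch. It assumes an ideal α-helix with a 100° rotation per residue and a simple binary classification of hydrophobic residues; both are simplifying assumptions, and the peptide in the example is hypothetical rather than a segment of phl p 5. The resultant vector of the hydrophobic positions points towards the face of the helix that is expected to pack into the core.

```python
# Sketch: helical wheel projection and direction of the hydrophobic face.
# Assumes an ideal alpha-helix (100 degrees per residue) and a crude binary
# hydrophobicity set; the example sequence is hypothetical.
import math

HYDROPHOBIC = set("AVLIMFWY")  # illustrative choice, not the authors' scale


def wheel_angles(sequence: str, step_deg: float = 100.0) -> list[tuple[str, float]]:
    """Place residue i at angle (i * step) mod 360 when viewed down the helix axis."""
    return [(res, (i * step_deg) % 360.0) for i, res in enumerate(sequence)]


def hydrophobic_face(sequence: str, step_deg: float = 100.0) -> tuple[float, float]:
    """Return (direction in degrees, magnitude) of the summed hydrophobic unit vectors."""
    x = y = 0.0
    for res, angle in wheel_angles(sequence, step_deg):
        if res in HYDROPHOBIC:
            x += math.cos(math.radians(angle))
            y += math.sin(math.radians(angle))
    magnitude = math.hypot(x, y)
    direction = math.degrees(math.atan2(y, x)) % 360.0
    return direction, magnitude


if __name__ == "__main__":
    peptide = "LKELLKKLEELLKK"  # hypothetical amphipathic helix
    direction, magnitude = hydrophobic_face(peptide)
    print(f"hydrophobic face near {direction:.0f} degrees, magnitude {magnitude:.2f}")
```

In an alanine-rich sequence, counting alanine as hydrophobic will dominate the sum, so a graded hydrophobicity scale may be a better choice in practice.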
Although no known structure shows appreciable sequence homology with phl p 5, a sound, albeit low-resolution 3D model of the domain can be prepared using as a template the structure of cytochrome c′ from Rhodopseudomonas palustris (Shibata et al., 1998). This structure was the highest hit suggested by a threading algorithm and, since its secondary structure elements are of length comparable to those of phl p 5(56–165), the alignment of the two sequences requires relatively few insertions/deletions. The resulting model, which derives from the alignment shown in Figure 3a, satisfies all the NMR restraints (Figure 4a). Interestingly, when adopting this alignment, while the hydrophobic core is buried in the domain interior, two adjacent hydrophobic residues (Leu135 and Ile162) remain exposed, suggesting a potential surface of interaction with the second AR repeat.
We have characterized rphl p 5(56–165), a fragment of phl p 5 which comprises about one-third of the full-length protein and contains one of the two repeats of a novel sequence AR motif. The allergenic properties of rphl p 5(56–165) were measured quantitatively and compared with those of the full-length protein. We show that almost all grass pollen allergic patients contain IgE antibodies against rphl p 5(56–165) and a high percentage of phl p 5-specific IgE is recognized also by rphl p 5(56–165), in agreement with previous work which demonstrated that rphl p 5 induces effector cell activation, basophil histamine release and immediate-type skin reactions (Flicker et al., 2000). The rphl p 5(56–165) fragment is therefore a major allergenic agent on its own.
We have further shown that rphl p 5(56–165) is folded and behaves as an independent structural unit. phl p 5 provides another example of preferential recognition of conformational epitopes by IgE antibodies, as demonstrated for an increasing number of respiratory allergens (Vrtala et al., 1997, 1999; Focke et al., 2001): only the complete rphl p 5(56–165) domain is recognized by IgE antibodies, whereas smaller phl p 5-derived peptides fail to react with IgE antibodies (data not shown).
Structure determination of the full-length protein and of its fragments has proved difficult. In our hands, rphl p 5(56–165) is unstable and prone to degradation. This behaviour did not, however, come as a surprise: already in 1996, it was observed that recombinant phl p 5 is unstable in solution and degrades into fragments of variable length between 10 and 29 kDa (Bufe et al., 1996). Of these, only the fragment rphl p 5(180–313) was able to crystallize, while the fragment described in the present study was not identified as a direct product of degradation. These observations strongly suggest that the instability is a property intrinsic to recombinant phl p 5 rather than being accidentally related to the experimental conditions. The behaviour observed for rphl p 5(180–313) is strongly reminiscent of apolipophorin III, a lipid-associated four-helix bundle that is also prone to degradation in solution (Wang et al., 1997). It is likely that, in vivo, phl p 5 also needs to be stabilized either by a cofactor or by a specific environment, which is lost in the recombinant protein.
We have presented the first direct experimental evidence to allow reconstruction of the main features of the phl p 5(56–165) fold, using an approach that combines experimental restraints with molecular modelling. The NMR observables (i.e. secondary chemical shifts, short-range NOE connectivities), together with a number of unambiguous long-range side chain contacts, conclusively show that rphl p 5(56–165) forms a compact four-helix bundle with an up–down–up–down topology. The fold is held together by an intricate network of aromatic and hydrophobic interactions which form the hydrophobic core (Figures 3b and 4a). This conclusion is in excellent agreement with a preliminary report on the crystal structure of a rphl p 5b(180–313) fragment (which spans the second AR repeat), also described to form a four-helix bundle with similar topology (Bufe et al., 2000). We can therefore conclude that phl p 5 is an all-α protein mostly assembled from two four-helix bundles packing against each other or intertwined by domain swapping (Figure 4b). Interactions between the two repeats are supported by the reported capability of rphl p 5(180–313) to dimerize (Bufe et al., 2000) and by the presence of conserved exposed hydrophobic side chains on the surface of phl p 5(56–165) (Figure 4). Interestingly, despite the fold duplication, the N- and C-terminal domains seem to have a different IgE response (Bufe et al., 1994). A helical bundle fold, never described before in allergens, will therefore need to be added to the list of observed topologies (Aalberse, 2000). Because of the high sequence homology, group 5 and 6 grass pollen allergens are all expected to share the same fold.
While further studies are still necessary to explore and characterize the structural basis of the allergenic response of phl p 5, the information presented in this paper should provide useful guidelines for designing more stable and hypoallergenic versions of phl p 5 or its parts, which can then be used as vaccines.
This study was supported by grant Y078GEN from the Austrian Science Fund and by the ICP project of the Austrian Ministry of Education and Science. O.M. was supported by a NATO fellowship.