A novel family of lectins evolutionarily related to class V chitinases: an example of neofunctionalization in legumes.

A lectin has been identified in black locust (Robinia pseudoacacia) bark that shares approximately 50% sequence identity with plant class V chitinases but is essentially devoid of chitinase activity. Specificity studies indicated that the black locust chitinase-related agglutinin (RobpsCRA) preferentially binds to high-mannose N-glycans comprising the proximal pentasaccharide core structure. Closely related orthologs of RobpsCRA could be identified in the legumes Glycine max, Medicago truncatula, and Lotus japonicus but in no other plant species, suggesting that this novel lectin family most probably evolved in an ancient legume species or possibly an earlier ancestor. This identification of RobpsCRA not only illustrates neofunctionalization in plants, but also provides firm evidence that plants are capable of developing a sugar-binding domain from an existing structural scaffold with a different activity and accordingly sheds new light on the molecular evolution of plant lectins.

Flowering plants express a whole battery of carbohydrate-binding proteins commonly known as lectins or agglutinins. Despite the apparent heterogeneity in molecular structure and sugar specificity, virtually all known plant lectins can be classified into seven families of structurally and evolutionarily related proteins (Van Damme et al., 1998). Taking into account the obvious differences in both the overall fold and structure of the carbohydrate-binding sites, it seems likely that each of these seven sugar-binding domains is the final result of a unique evolutionary pathway. Several modern plant lectins belong to protein families with an obvious prokaryotic origin. For example, proteins sharing reasonable sequence similarity with plant lectins comprising a ricin B domain (cd00161; pfam00652) or GNA domains (cd00028; pfam01453) have been identified in bacteria as well as in various nonplant eukaryotes (for a quick overview, see the National Center for Biotechnology Information [NCBI] conserved domains [http://www.ncbi.nlm.nih.gov/Structure/cdd] and the Pfam Protein Families database [http://www.sanger.ac.uk/Software/Pfam]). Other plant lectins have no counterparts in prokaryotes but are clearly related to homologous proteins or protein domains found in animals, fungi, or some lower eukaryotes. Hevein domains (cd00035; pfam00187), for instance, are not confined to plants but are quite common in fungi, indicating that this carbohydrate-binding unit was already present in an early common eukaryotic ancestor. The same applies to the legume lectin domain (pfam00139), which is classified in the same protein superfamily as the animal and fungal vesicular integral membrane protein 36 (VIP36) and the endoplasmic reticulum-Golgi-intermediate compartment. For jacalin-related lectins (pfam01419), the situation is less clear.
Recent reports claimed that the zymogen granule membrane protein 16 found in mouse, rat, and a few other vertebrates, as well as the lectin from the mushroom Grifola frondosa, belong to the jacalin family (Nagata et al., 2005). However, the residual sequence identities are low, suggesting that, even if a common ancestral domain occurred in an early eukaryote, parallel evolution eventually led to distantly related modern animal and fungal homologs of the plant jacalins. No homologous proteins or corresponding genes have hitherto been identified outside the plant kingdom for the amaranthins (pfam07468) and Cucurbitaceae phloem lectins. Although no definitive conclusions can be drawn on the basis of currently available data, it is tempting to speculate that the latter carbohydrate-binding domains evolved in plants. One of the major problems in unraveling the underlying mechanisms is the fact that no potential template protein or peptide can be traced.
Here we report the identification of a novel family of plant lectins that is structurally and evolutionarily closely related to proteins that were originally described as class V chitinases but, according to the generally accepted CAZY classification system (http://afmb.cnrs-mrs.fr/CAZY/index.html), are placed in the glycoside hydrolase 18 (GH18) family (Henrissat and Bairoch, 1993), which is an ancient chitinase family found in all kingdoms from bacteria to fungi, animals, and plants. Only a relatively small number of plant GH18 chitinases have previously been identified. Most plant chitinases are classified, indeed, in the GH19 family, which so far has been found only in higher plants. GH18 and GH19 plant chitinases differ not only in sequence but also in hydrolytic mechanism, because they operate with retention and inversion, respectively, of the anomeric configuration (Iseli et al., 1996). According to available sequence data, none of the plant GH18 enzymes comprises a putative chitin-binding domain in addition to the canonical catalytic domain. Cloning and characterization of the purified protein revealed that the bark of black locust (Robinia pseudoacacia) contains a lectin that shares high sequence identity with class V chitinases but is essentially devoid of chitinase activity. Closely related expressed orthologs and/or corresponding genes were found in several other legumes but could not be identified outside the family Fabaceae, indicating that the novel lectin might have arisen in an evolutionarily recent past in an ancestor of modern legumes. The identification of the novel black locust agglutinin provides evidence that plants are capable of developing a domain with specific sugar-binding activity from a structural scaffold found in an existing protein and, accordingly, provides a well-defined example of neofunctionalization. Similar conversion of a chitinase into a lectin has been reported in mammalian systems.
However, although both plants and mammals used a homologous chitinase as a structural scaffold, there are important differences in the conversion of a carbohydrate-modifying into a carbohydrate-binding protein.
[...] (Van Damme et al., 1995b; Ina et al., 2005) revealed the presence of an additional lectin. Preliminary experiments indicated that this lectin was associated with a protein that eluted with a lower apparent molecular mass (70 kD) than RPbAI/robiniagrin (120 kD) from a gel filtration column and consisted almost exclusively of a polypeptide that was at least 5 kD larger than the RPbAI and robiniagrin subunits. Mass spectrometry of the purified lectin yielded a major peak of 36,790 D (see Supplemental Fig. S1). Gel filtration of the purified protein confirmed that the native lectin eluted with an apparent molecular mass of approximately 70 kD. Taking into account that a previously isolated homolog of class III chitinase from banana (Musa spp.; which is also classified in the GH18 family) behaved as a monomeric 30-kD protein when run under identical conditions (Peumans et al., 2002), one can reasonably assume that the novel lectin is a homodimer. This conclusion is supported by the hemagglutinating activity of the lectin (because crosslinking of cells requires multivalency). No covalently bound carbohydrate could be detected by the phenol sulfuric acid method, indicating that the 70-kD lectin is not glycosylated. N-terminal sequencing revealed that the subunit of the 70-kD lectin shares no sequence similarity with RPbAI or any other legume lectin, but can readily be aligned with the N terminus of a class V chitinase from tobacco (Nicotiana tabacum; Heitz et al., 1994; Melchers et al., 1994; Fig. 1).
Edman degradation of a cyanogen bromide cleavage fragment yielded a sequence that could be aligned with an internal sequence of the tobacco class V chitinase, further supporting the idea that the 70-kD black locust agglutinin is not related to the legume lectins but shares a striking sequence similarity with class V chitinases.
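The homodimer inference above rests on simple arithmetic: the apparent native mass divided by the subunit mass should be close to a whole number. A minimal Python sketch of that check follows, using only the two masses quoted in the text; the function name is hypothetical, and gel filtration masses are approximate, so the result is an estimate rather than a measurement.

```python
# Estimate the number of subunits per native molecule from the two masses
# reported in the text. The helper name is illustrative only.
native_mass_kd = 70.0      # apparent native mass from gel filtration (approximate)
subunit_mass_kd = 36.790   # major MALDI-TOF peak, 36,790 D


def subunit_count(native_kd: float, subunit_kd: float) -> int:
    """Return the whole number of subunits nearest to the mass ratio."""
    return round(native_kd / subunit_kd)


ratio = native_mass_kd / subunit_mass_kd
n_subunits = subunit_count(native_mass_kd, subunit_mass_kd)
print(f"native/subunit mass ratio = {ratio:.2f}; nearest whole number = {n_subunits}")
```

A ratio close to 2 is consistent with a dimer, but it does not by itself exclude an elongated monomer, which is why the text also relies on the monomeric banana homolog as a reference and on the observed hemagglutination.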

Molecular Cloning Confirms That the Novel Lectin Is Closely Related to Class V Chitinases
BLAST searches revealed that several other legumes express proteins comprising sequences nearly identical to the N-terminal and internal sequences of the 70-kD black locust agglutinin. For Medicago truncatula, a complete contig could be assembled that comprises an open reading frame of 1,095 nucleotides encoding a 365-amino acid residue polypeptide (see Supplemental Fig. S2). Removal of a 28-residue signal peptide yields a 337-amino acid protein with an N terminus nearly identical to that of the black locust lectin and an internal sequence almost identical to the cyanogen bromide cleavage product of the lectin. To check whether the 70-kD black locust agglutinin is a genuine ortholog of the expressed M. truncatula protein, the corresponding genomic sequence was cloned. Sequencing of the PCR product confirmed that this fragment contains an open reading frame encoding a 337-amino acid residue polypeptide that shares 78.6% and 90% sequence identity and similarity, respectively, with the expressed M. truncatula protein (see Supplemental Fig. S2).
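The lengths quoted for the M. truncatula contig can be cross-checked with a few lines of Python. This is a sketch under one stated assumption: the 1,095-nucleotide figure is taken to cover the coding codons only (no stop codon), which is what makes it correspond to 365 residues. Variable names are illustrative.

```python
# Cross-check the reported ORF, precursor, and mature protein lengths.
# Assumption: the 1,095 nt exclude the stop codon.
orf_nucleotides = 1095
signal_peptide_residues = 28

codons, leftover = divmod(orf_nucleotides, 3)
mature_residues = codons - signal_peptide_residues

print(f"codons: {codons} (leftover nucleotides: {leftover})")
print(f"mature protein: {mature_residues} residues")
```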
BLASTp searches using the deduced sequence of the black locust agglutinin as a query yielded class V chitinases from Arabidopsis (Arabidopsis thaliana; At4g19810) and tobacco (CAA54374; NtChi) as best matches. The new black locust agglutinin shares approximately 54% identity and 80% similarity with both At4g19810 and CAA54374 (Fig. 1), leaving no doubt that it is a structural homolog of class V chitinases. Accordingly, the protein will further be referred to as black locust chitinase-related agglutinin (RobpsCRA). Although RobpsCRA is undoubtedly a homolog of a class V chitinase, there is apparently a major difference in the molecular structure of the native proteins: RobpsCRA is a homodimer, whereas all class V chitinases are monomeric proteins. This suggests that, unlike class V chitinases, the RobpsCRA subunits contain some structural features that promote dimerization and hence allow formation of a divalent carbohydrate-binding protein that behaves as a genuine agglutinin.
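For readers unfamiliar with how a percent identity figure is derived from a pairwise alignment, the following Python sketch shows one common convention: count identical residues over the aligned columns in which neither sequence has a gap. The two short sequences are invented toy strings, not RobpsCRA or chitinase sequences, and the function is not the procedure used by BLASTp or by the authors; other tools use different denominators, so values are only comparable within one convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue (no gap).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


# Hypothetical toy alignment (8 columns, one gap, 5 identical residues).
toy_a = "MKT-AYLV"
toy_b = "MRTWAYIV"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over gap-free columns")
```

Similarity percentages additionally require a substitution matrix or residue grouping to decide which non-identical pairs count as homologous, which this sketch deliberately leaves out.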
Molecular cloning also yielded additional information about the biosynthesis and processing of RobpsCRA. On the analogy of the M. truncatula ortholog, one can reasonably assume that RobpsCRA is synthesized with a signal peptide and follows the secretory pathway. The calculated molecular mass of the protein (36,747.9 D) is nearly identical to that measured by matrix-assisted laser-desorption ionization (MALDI)-time-of-flight (TOF) mass spectrometry (36,790 D). Taking into account that RobpsCRA is not glycosylated, it seems that no posttranslational processing takes place.
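The agreement between the calculated and measured masses can be quantified directly. The short Python sketch below uses only the two values given in the text; what counts as "nearly identical" depends on the accuracy of the instrument and calibration, which the text does not state, so no threshold is assumed here.

```python
# Compare the mass calculated from the sequence with the MALDI-TOF value.
calculated_mass_d = 36747.9   # from the deduced sequence
measured_mass_d = 36790.0     # major MALDI-TOF peak

difference_d = measured_mass_d - calculated_mass_d
relative_percent = 100.0 * difference_d / calculated_mass_d

print(f"difference: {difference_d:.1f} D ({relative_percent:.2f}% of calculated mass)")
```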

RobpsCRA Exhibits No Chitinase Activity But Is a Genuine Lectin
Unlike class I chitinases, class V chitinases from plants do not possess a genuine chitin-binding domain corresponding to a hevein domain. Accordingly, the agglutinating activity of RobpsCRA cannot be ascribed to the presence of a genuine or modified hevein domain.
A number of control experiments were set up to rule out the possibility that the observed agglutination activity might be due to contamination by the legume lectin-type bark lectin RPbAI. First, SDS-PAGE, using increasing amounts of the purified protein, yielded no additional protein band in the 29- to 32-kD range. Second, the chromatograms of the automated Edman degradation were indicative of a single sequence. Third, gel filtration experiments confirmed that the agglutination activity coeluted with RobpsCRA and hence cannot be ascribed to the larger tetrameric legume lectin-type bark lectins. Fourth, western-blot analysis indicated that RobpsCRA does not show any cross-reactivity with antibodies raised against RPbAI. Moreover, as is demonstrated below, the specificity of RobpsCRA does not match that of RPbAI and RPbAII.
Figure 1. Sequence alignment of RobpsCRA and the most closely related homologous proteins identified thus far. NtChi (CAA54374) is a catalytically active class V chitinase isolated from tobacco. At4g19810 is an expressed ortholog of Arabidopsis, but the protein has not yet been isolated and assayed for chitinase activity. In the top row, the N-terminal sequence of the native lectin (Lec-Nter) and a cyanogen bromide fragment (Lec-CNBr) are aligned with the tobacco chitinase. Identical and homologous residues are indicated by black and white boxes, respectively. Residues involved in the catalytic cleavage of chitin are indicated by black triangles. The Ser residue specifically involved in the catalytic activity of family 18 chitinases (Synstad et al., 2004) is indicated by a black square.
Because RobpsCRA shares high sequence identity with class V chitinases, the possible enzymatic activity of the protein was checked. Concentrated solutions of the protein (final concentration 2 mg/mL) were incubated with carboxymethyl-chitin-Remazol-Brilliant-Violet 5R at different pH values ranging between 4.0 and 7.0. Even upon incubation for 72 h, no acid-soluble fragments were generated, indicating that RobpsCRA is devoid of chitinase activity. It should be mentioned here that two genuine class V chitinases isolated from tobacco leaves inoculated with Tobacco mosaic virus exhibited readily measurable catalytic activity when assayed with the same substrate (Melchers et al., 1994).
Agglutination assays with animal red blood cells demonstrated that RobpsCRA is a genuine lectin. Trypsin-treated human erythrocytes (type A) were agglutinated at a lectin concentration of approximately 20 mg/mL. Hapten inhibition assays indicated that the agglutination activity of RobpsCRA is not affected by any simple sugar. Chito-oligosaccharides with chain lengths up to 4 GlcNAc units also could not prevent agglutination, indicating that the lectin activity of RobpsCRA does not rely on binding to chitin-like compounds. Only some animal glycoproteins, like thyroglobulin, inhibited the agglutination of human erythrocytes by RobpsCRA. Although indicative, the results of these preliminary inhibition assays did not allow any conclusion to be drawn with respect to the carbohydrate-binding specificity of the lectin. Therefore, more appropriate techniques, based on direct measurements of lectin-glycan interactions, were employed to unravel the fine specificity of RobpsCRA.

RobpsCRA Specifically Binds High-Man N-Glycans
Glycan array analysis revealed that RobpsCRA binds exclusively to some, but not all, high-Man-type N-glycans. As shown in Table I, the lectin reacted most strongly with Man5-9mix, which is a mixture of high-Man N-glycans differing in the number of Man residues and the nature of the bonds between the individual Man units. Besides the Man5-9mix, RobpsCRA also reacted well with individual high-Man N-glycans. Man6, Man5, and Man8 were approximately 30% less reactive than the Man5-9mix, whereas Man7 and Man9 were roughly 5 times less active than the mixture (Table I). None of the oligomannosides tested showed any reactivity. The same applies to chitotriose. These findings clearly indicate that the specificity of RobpsCRA is directed toward the core pentasaccharide of N-glycans.
Although the results of glycan array screening experiments are only semiquantitative, they clearly demonstrate that the specificity of the lectin is directed toward high-Man N-glycans comprising the core pentasaccharide of N-glycans. Therefore, it is important to realize that the results of glycan arrays are based on a direct binding assay and, accordingly, give a fairly good idea of the relative affinity of the lectin for a very large set of glycans. At present, no conclusions can be drawn with respect to the affinity of RobpsCRA for the N-glycans. The figures obtained with the different high-Man N-glycans are relatively low (<4,000 relative fluorescence units [RFU]) as compared to other lectins (up to >50,000 RFU). However, these low values might partly be due to poor coupling of the fluorochrome to the lectin.
It should be emphasized here that the specificity of RobpsCRA differs from that of the previously described legume lectin-type bark agglutinins. The agglutination activity of RobpsCRA cannot be inhibited, indeed, by any simple sugar, whereas RPbAI and the self-aggregatable lectin lose their hemagglutinating activity in the presence of GalNAc and GlcNAc/Man, respectively. Moreover, RPbAI-type isolectins interact strongly with complex-type N-glycans, but are nonreactive toward high-Man N-glycans (Van Damme et al., 1995b). This difference in specificity confirms that the observed lectin activity of RobpsCRA is not due to contamination by another bark lectin.

Molecular Modeling of RobpsCRA and Tobacco Class V Chitinase
To find possible clues for the obvious lack of chitinase activity, the overall fold and structure of RobpsCRA was tentatively determined by molecular modeling. Because previously no structure was available for plant class V chitinase, the model was built using the coordinates of a human chitotriosidase (hMChi), which, of all resolved GH18 proteins, shares the highest sequence identity/similarity with RobpsCRA (Fusetti et al., 2002). Sequence alignments indicated that RobpsCRA and hMChi share 33% and 51% identity/similarity, respectively, over a 305-amino acid residue overlap (spanning residues 17-322 of RobpsCRA; Fig. 2), which is reasonably high considering the relatively low overall sequence identity/similarity within the GH18 family. Moreover, it should be emphasized here that, in spite of the apparent low overall sequence identity, the residues involved in binding of the substrate, as well as those involved in the catalytic reaction, are markedly conserved between all members of the GH18. Therefore, hMChi can be considered a suitable model to predict the structure of RobpsCRA. Because the modeling studies were primarily intended to explain the lack of chitinase activity of RobpsCRA, a parallel set of modeling experiments was set up with the tobacco class V chitinase (NtChi; Melchers et al., 1994) as an example of a class V chitinase with enzymatic activity.
Hydrophobic cluster analysis (HCA) yielded similar plots for RobpsCRA and the enzymatically active chitinases NtChi and hMChi (see Supplemental Fig. S3), providing additional support for hMChi as a suitable model for predicting the structure of plant homologs. The three-dimensional models built for RobpsCRA and NtChi could adopt a very similar TIM-barrel fold as hMChi. This TIM-barrel fold consists of an inner crown of β-sheet strands surrounded by an outer crown of α-helices (Fig. 3). An additional hairpin-shaped loop built from three antiparallel strands of the β-sheet protrudes from one edge of the TIM-barrel structure. A major functional feature of family 18 chitinase proteins is a central groove that accommodates a chitin chain through stacking interaction between the pyranose rings of the GlcNAc units and hydrophobic residues lining the groove over its entire length (Fig. 3). The catalytic site is located at one end of the groove, where it forms a strong electronegatively charged area (Fig. 3). Structural studies combined with mutational analysis of hevamine (Bokma et al., 2002) and chitinase B from Serratia marcescens (Synstad et al., 2004), for example, allowed the unambiguous identification of the amino acid residues involved in the catalytic activity of GH18 chitinases. All these chitinases possess the canonical DxDxE motif in the core of their catalytic site. In addition, several other motifs/residues (e.g. YD motif and a Ser residue, Ser-69 in NtChi and RobpsCRA, involved in the substrate binding in family 18 chitinases) located at different positions in the polypeptide chain are essential for activity (Figs. 1 and 2).
Despite the obvious overall structural similarity with genuine GH18 chitinases, the structure of RobpsCRA exhibits striking differences especially with respect to the solvent-exposed hydrophobic residues that are positioned along the chitin-binding groove and ensure proper stacking of the chitin chain onto the chitinases. Most of the hydrophobic residues lining the −5, −4, −3, +1, and +3 subsites of hMChi (Rao et al., 2005) are replaced by hydrophilic residues in RobpsCRA. Trp residues Trp-10, 24,25,11, and 13 of hMChi are replaced by Lys-3, Ser-45, Gly-75, and Asp-191, respectively, in RobpsCRA. As a result of the replacement in RobpsCRA of the Trp residues by hydrophilic residues, the overall conformation and physicochemical properties (hydrophilicity, charges) of the groove are strongly altered as compared to those of the chitin-binding groove of hMChi. Moreover, due to the lack of hydrophobic residues, the chitin chain cannot properly stack into the groove of RobpsCRA. Most probably, the inability to accommodate a chitin chain in the catalytic groove can explain why RobpsCRA, in spite of the presence of the canonical catalytic acidic residues, is completely devoid of chitin-binding activity. It should be emphasized, indeed, that the DxDxE catalytic motif is perfectly conserved in RobpsCRA (Asp-112, Asp-114, and Glu-116; Fig. 3). Moreover, the hydrophobic environment of this catalytic region is also conserved in RobpsCRA. Therefore, RobpsCRA might still be capable of cleaving the scissile glycosidic bond linking sugars bound to subsites −1 and +1. However, the protein does not act as a chitinase because the substrate cannot be positioned in the catalytic groove. Failure to properly bind chitin as the underlying mechanism for the lack of chitinase activity of RobpsCRA is further supported by the results of parallel modeling of the catalytically active tobacco homolog NtChi.
Unlike in RobpsCRA, most of the hydrophobic residues found in the groove of hMChi are conserved in NtChi. Only Trp-10 and Trp-78 (of hMChi) are replaced by Lys-4 and Gly-74, respectively, in NtChi (Fig. 3F). Accordingly, the groove of NtChi is fully capable of properly positioning a chitin chain for cleavage by the catalytic motif Asp-111, Leu-112, Asp-113, Trp-114, Glu-115.

Expressed Orthologs of RobpsCRA Are Common in Legumes But Not Found in Other Plants
As already mentioned above, several other legumes express closely related orthologs of RobpsCRA. Complete or nearly complete contigs could be assembled for M. truncatula, Glycine max, and Lotus japonicus. All four proteins share 62.5% identity and 82.4% similarity, respectively, within a 301-amino acid residue overlap (see Supplemental Fig. S2).
In addition, a previously described, but only partially characterized, 67-kD homodimeric lectin from bean (Phaseolus vulgaris) seeds (Ye et al., 2001) also might represent a RobpsCRA ortholog. However, no genuine orthologs of RobpsCRA could be identified in protein, expressed sequence tag, or genomic databases of any other plant species. Although no definitive conclusions can be drawn from the available sequence data, it is evident that RobpsCRA-type lectins are far less widespread among flowering plants than class V chitinases and most likely are confined to a relatively small taxonomic group. Within the Fabaceae, RobpsCRA orthologs occur in at least four different tribes of the subfamily Papilionoideae (black locust, G. max, M. truncatula, and L. japonicus belong to the Robinieae, Phaseoleae, Trifolieae, and Loteae, respectively), indicating that they are not confined to a small taxon of the Fabaceae.

DISCUSSION
Biochemical analyses and molecular cloning demonstrated that a minor lectin from the bark of black locust is a catalytically inactive homolog of class V chitinases. The lectin behaves as a genuine hemagglutinin and specifically binds, albeit with a relatively low affinity, to high-Man N-glycans. Activity assays using dye-labeled substrates indicated that the protein is devoid of chitinase activity. Molecular modeling and sequence comparisons indicated that the apparent lack of catalytic activity most probably has to be ascribed to the protein's inability to accommodate the chitin substrate in its catalytic groove as a consequence of an extensive replacement of hydrophobic by hydrophilic amino acids.
Figure 2. Alignment of the amino acid sequences of RobpsCRA, tobacco class V chitinase NtChi, human chitotriosidase hMChi, and the chitinase-related murine protein Ym1. Identical and homologous residues are indicated by black and white boxes, respectively. Residues involved in the catalytic cleavage of chitin are indicated by black triangles, and residues lining up the catalytic groove of genuine chitinases are indicated by black circles (hydrophobic residues) and asterisks (hydrophilic residues). The Ser residue specifically involved in the catalytic activity of family 18 chitinases (Synstad et al., 2004) is indicated by a black square.
Although there is no doubt that RobpsCRA is structurally and evolutionarily related to class V chitinases, the available sequence data are insufficient to trace the details of the conversion of a plant chitinase into a lectin. However, even in the absence of full details, one can reasonably assume that RobpsCRA evolved from a GH18 chitinase and not the other way around. GH18 represents an ancient chitinase family because it is found in all kingdoms from bacteria to fungi, animals, and plants (Iseli et al., 1996). Moreover, sequence data clearly indicate that all GH18 chitinases, including those from plants, have a common ancestor. This, taken together with the fact that GH18 chitinases or corresponding genes have been found in numerous plant species, implies that the GH18 structural scaffold is quite common in higher plants. Considering the apparent confinement of RobpsCRA orthologs to the legume family, it is tempting to speculate that a catalytically active chitinase from an ancient legume species or possibly an earlier ancestor served as a structural scaffold for the development of a small family of carbohydrate-binding proteins. This evolutionary process most likely involved gene duplication followed by neofunctionalization. The RobpsCRA orthologs represent a documented example of how plants managed to develop a domain with a specific sugar-binding activity from a functionally unrelated protein in general or an enzyme in particular. Therefore, it should be emphasized that the novel lectin no longer recognizes the substrate of the original hydrolase (in casu chitin or chito-oligosaccharides) but a structurally unrelated glycan (namely, high-Man N-glycans). Interestingly, a similar conversion of a glycosyl hydrolase into a carbohydrate-binding, but catalytically inactive, homolog also occurred in higher animals. 
A so-called eosinophil chemotactic cytokine has been identified in mouse and human (Boot et al., 1995) that shares approximately 50% sequence identity with the respective conspecific chitotriosidase but is devoid of chitinase activity. Instead, the mouse protein (referred to as ECF-L or secretory protein Ym1) is a lectin with binding specificity to glucosamine and heparin/heparan sulfate (Sun et al., 2001). Although there is certainly a parallel between the conversion of a chitinase into a lectin in animals and plants, there are two major differences. First, in contrast to RobpsCRA, the canonical DxDxE motif is replaced in Ym1 by a motif in which two catalytic acidic residues are substituted (Asn-115, Leu-116, Asp-117, Trp-118, Gln-119; Tsai et al., 2004; Fig. 2). Second, Ym1 binds glucosamine and heparin/heparan sulfate, whereas RobpsCRA interacts exclusively with high-Man N-glycans.
Identification of RobpsCRA orthologs as lectins not only adds a novel carbohydrate-binding domain to the existing list of documented plant lectin families but also illustrates that plants are capable of developing sugar-binding domains from an existing structural scaffold with different activity. At present, the binding affinity of RobpsCRA is relatively low. However, RobpsCRA might just be an intermediate in an evolutionary pathway that eventually will yield lectins with a high affinity. Even if there is no evidence of whether analogous evolutionary mechanisms might have given rise to other carbohydrate-binding domains that are confined to plants, the discovery of a lectin ortholog of class V chitinases puts the evolution of plant lectins in a novel perspective. In addition, legume RobpsCRA orthologs represent a novel example of well-defined neofunctionalization in plants.
It is also worth noting in this context that at least two different cases have been reported of neofunctionalization-related evolutionary events whereby plants used the structural scaffold of a lectin domain for the development of a protein that lost sugar-binding activity but acquired totally different biological activity. Curculin from Curculigo latifolia fruits is a homolog of the GNA-related lectins that possesses no sugar-binding activity but has sweet-tasting properties (Harada et al., 1994). Seeds of several Phaseolus species contain structural homologs of the bean lectin that have no carbohydrate-binding activity but are potent α-amylase inhibitors or insecticidal proteins (called arcelins; Mirkov et al., 1994). It should be emphasized that these lectin homologs are not just binding-defective mutants but proteins with well-defined biological activity. Other binding-defective lectins have been identified in the bark of the legume tree Cladrastis lutea and Sambucus nigra, but the respective legume lectin and type 2 ribosome-inactivating protein homologs have no known biological activity other than a presumed storage function (Van Damme et al., 1995a).
Figure 3. [...] Acidic residues responsible for the electronegative character of the binding groove of chitinases are labeled in white. E to G, Enlarged ribbon diagrams of the catalytic groove of hMChi (E), NtChi (F), and RobpsCRA (G). Catalytic residues (pink) and residues lining up the catalytic groove (orange for hydrophobic and blue for hydrophilic residues) are in stick representation.

Plant Material
Bark was stripped from the stems of 4- to 5-year-old black locust (Robinia pseudoacacia) trees at the end of the winter (beginning of April). All trees were from the same clone (growing in the garden of W. Peumans). The inner bark was collected using a knife, taking care to remove the outer corky bark tissue, cut in small pieces, and stored at −20°C until use.

Isolation of RobpsCRA
Because RobpsCRA is only a very minor bark protein and has a relatively low affinity for carbohydrates, the lectin could not be purified from crude or partially purified bark extracts by affinity chromatography alone. Therefore, a purification scheme was developed based on a combination of conventional protein purification techniques and affinity chromatography. In a first step, a partially purified protein fraction was isolated by cation-exchange chromatography. In a second step, this protein fraction was depleted of the major bark lectin (RPbAI) by affinity chromatography on immobilized Gal and concentrated by hydrophobic interaction chromatography. Further purification was achieved by gel filtration and affinity chromatography on immobilized thyroglobulin. Gal and thyroglobulin were coupled to Sepharose 4B by the divinylsulfone method.

Extraction and Removal of the Major Legume Lectin-Type Bark Agglutinins
Batches of 1 kg of frozen bark were transferred into 5 L of a solution of 20 mM acetic acid and, after thawing, homogenized in a blender. The homogenate was passed through a sieve (mesh size approximately 1.5 mm) and centrifuged (3,000g; 10 min). To the supernatant, solid CaCl 2 was added (1 g/L) and the pH adjusted to 9.0 with 1 N NaOH. After standing overnight in the cold (2°C-4°C), the precipitate was removed by centrifugation (3,000g; 10 min). The cleared extract was adjusted to pH 3.2, centrifuged (3,000g; 10 min), and filtered through filter paper. The filtrate was diluted with an equal volume of water and applied onto a column (5 cm 3 10 cm; approximately 200-mL bed volume) of S Fast Flow (Amersham Biosciences) and preequilibrated with 20 mM acetic acid. Loading was continued until the column was 70% saturated with protein. Then the column was washed with 50 mM sodium formate (pH 3.8) until the A 280 fell below 0.1 and the bound proteins eluted with 500 mL 0.2 M NaCl in 0.1 M Tris-HCl (pH 8.7). After regeneration of the column, ionexchange chromatography was repeated with new batches of the crude extract. The eluates of the different runs were pooled, adjusted to 1.8 M ammonium sulfate with solid salt, and centrifuged (9,000g for 15 min). Aliquots of the supernatant (equivalent to approximately 10,000 A 280 units) were applied onto a column (5 cm 3 15 cm; approximately 300-mL bed volume) of Gal-Sepharose 4B. Under these conditions, the major legume lectin-type bark agglutinin (RPAbI) was quantitatively retained on the column. After passing the protein fraction, the column was washed with a solution of 1.8 M ammonium sulfate (adjusted to pH 7.5 with 1 N HCl) until the A 280 fell below 0.1. The run-through and wash fractions were collected and pooled. After washing, the lectin was desorbed with 20 mM unbuffered 1,3diaminopropane and the Gal-Sepharose 4B column regenerated for the next run. 
The pass-through and wash fractions of the different runs were pooled and rechromatographed on the same Gal-Sepharose 4B column to remove any remaining RPAbI.
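The approximate bed volumes quoted for the columns can be checked from their dimensions. The sketch below assumes that each pair of dimensions is given as inner diameter by bed height and that the bed is a plain cylinder; the function name `bed_volume_ml` is hypothetical and used only for illustration.

```python
import math


def bed_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Volume of a cylindrical column bed in mL (1 cm^3 = 1 mL).

    Assumes the quoted dimensions are inner diameter x bed height.
    """
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm


# Dimensions as quoted for the two columns used in this step.
columns = {
    "S Fast Flow": (5.0, 10.0),
    "Gal-Sepharose 4B": (5.0, 15.0),
}

for name, (diameter, height) in columns.items():
    print(f"{name}: {bed_volume_ml(diameter, height):.0f} mL")
```

Under these assumptions the computed volumes lie within a few percent of the stated values of approximately 200 mL and 300 mL.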

Purification of RobpsCRA from the RPAbI-Depleted Fraction
The RPAbI-depleted fraction was applied onto a column (5 cm × 5 cm; approximately 100-mL bed volume) of phenyl-Sepharose preequilibrated with 1.8 M ammonium sulfate. After loading, the column was washed with 1.8 M ammonium sulfate until the A280 fell below 0.01 and the bound proteins eluted with 0.1 M Tris-HCl (pH 10.0). Fractions with an A280 > 3 were pooled, adjusted to pH 7.5 with 1 N HCl, and applied in 25-mL aliquots on a column (5 cm × 50 cm; approximately 1,000-mL bed volume) of Sephacryl 200 (Pharmacia) equilibrated with 0.2 M NaCl in 20 mM Tris-HCl (pH 7.5). Gel filtration yielded two major peaks. Agglutination assays and SDS-PAGE indicated that the first peak contained predominantly RobpsCRA. The fractions eluting in the first peak of the different runs were pooled, concentrated on a small column (1.5 cm × 10 cm; approximately 14-mL bed volume) of phenyl-Sepharose, and rechromatographed on Sephacryl 200 using a longer column (2.5 cm × 70 cm; approximately 350-mL bed volume). The peak fractions, which consisted almost exclusively of RobpsCRA, were pooled, brought to 1.5 M ammonium sulfate with solid salt, and applied onto a column (2.5 cm × 10 cm; approximately 50-mL bed volume) of thyroglobulin-Sepharose 4B. After loading, the column was washed with 1.5 M ammonium sulfate (adjusted to pH 7.5 with HCl) until the A280 fell below 0.01 and the lectin desorbed with 0.1 M Tris-HCl (pH 10.0). The affinity-purified lectin was dialyzed against water or an appropriate buffer and stored at −20°C until use.
The overall yield of RobpsCRA was approximately 10 mg/kg bark tissue. Starting from the same extracts, about 2 g RPAbI were recovered per kilogram of bark tissue. This implies that RobpsCRA is roughly 200 times less abundant than the major bark lectin.
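The abundance comparison follows directly from the two yields once they are expressed in the same unit. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def fold_difference(major_mg_per_kg: float, minor_mg_per_kg: float) -> float:
    """Ratio of two yields expressed in the same unit (mg per kg bark)."""
    return major_mg_per_kg / minor_mg_per_kg


rpabi_yield = 2.0 * 1000.0  # about 2 g/kg, converted to mg/kg
robpscra_yield = 10.0       # approximately 10 mg/kg

print(fold_difference(rpabi_yield, robpscra_yield))
```

Because both yields are approximate recoveries after purification, the ratio is an order-of-magnitude estimate of relative abundance rather than an exact figure.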

Analytical Methods
RobpsCRA was analyzed by SDS-PAGE using 15% (w/v) acrylamide gradient gels. For N-terminal amino acid sequencing, purified RobpsCRA was separated by SDS-PAGE, electroblotted on a polyvinylidene difluoride membrane, and sequenced on an Applied Biosystems model 477A protein sequencer interfaced with an Applied Biosystems model 120A online analyzer. Cyanogen bromide cleavage of RobpsCRA (2 mg) was done in 0.1 mL of 70% formic acid containing 10 mg of cyanogen bromide. After incubation for 15 h at 37°C (in the dark), peptides were recovered by evaporation under vacuum, separated by SDS-PAGE, and electroblotted on a polyvinylidene difluoride membrane for subsequent sequencing.
Total neutral sugar was determined by the phenol/H2SO4 method (Dubois et al., 1956), with D-Glc as a standard. Analytical gel filtration was performed on a Pharmacia Superose 12 column (Amersham Biosciences) using phosphate-buffered saline as running buffer at a flow rate of 20 mL/h.
For MALDI-TOF mass spectrometry, samples (0.75 μL) of a 0.5 mg/mL solution of RobpsCRA in 50 mM phosphate buffer (pH 7.5) containing 50 mM NaCl were cocrystallized on the MALDI plate with 0.75 μL of a 0.6 mM 3,5-dimethoxy-4-hydroxy cinnamic acid (sinapinic acid) solution made up in 50% aqueous acetonitrile. Desorption and ionization of crystallized samples were carried out on a Voyager-DE STR (PerSeptive Biosystems) mass spectrometer in positive linear mode using an accelerating voltage of 25 kV, a grid voltage of 93%, and an extraction delay time of 800 ns. Mass acquisition was performed between 4,000 and 50,000 D using a mixture of proteins of known molecular mass for internal calibration.

PCR Amplification of Sequence Encoding RobpsCRA
Genomic DNA was isolated from seeds of black locust using the FastDNA spin kit in an automatic homogenizer (FastPrep instrument; MP Biomedicals and Qbiogene) following the manufacturer's recommendations. DNA fragments encoding RobpsCRA were amplified using degenerate PCR primers derived from the N-terminal sequence of RobpsCRA and the C-terminal sequence of MedtrCRA (see Supplemental Fig. S4). The reaction mixture for amplification of genomic sequences contained 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 100 mg/L gelatin, 0.4 mM of each dNTP, 2.5 units of Taq polymerase (Invitrogen), 5 μL of cDNA, and 20 μL of the appropriate primer mixtures (5 μM), in a 25-μL reaction volume. After denaturation of the DNA for 5 min at 95°C, amplification was performed for 30 cycles through a regime of 15-s template denaturation at 92°C, followed by 30-s primer annealing at 45°C to 50°C and 1-min primer extension at 72°C. The PCR fragments were cloned in the pCR2.1-TOPO cloning vector using the TOPO cloning kit from Invitrogen. Plasmids were isolated from purified single colonies on a miniprep scale using the alkaline lysis method (Mierendorf and Pfeffer, 1987) and sequenced by the dideoxy method (Sanger et al., 1977).

Hemagglutination Activity
Agglutination assays were carried out in small glass tubes or in the wells of U-welled 96-well microtiter plates in a final volume of 50 μL containing 40 μL of a 1% (v/v) suspension of trypsin-treated human erythrocytes and 10 μL of lectin solution. Agglutination was monitored visually after 1 h of incubation at room temperature. To determine the specific agglutinating activity, the lectin was serially diluted with 2-fold increments and the dilution end point determined.
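The end-point determination can be illustrated with a short sketch. It assumes that the end point is the last dilution in an unbroken run of positive wells and that concentrations refer to the lectin solution added to the assay, not to the final concentration in the well. The function name and the example readings are hypothetical and are not data from this study.

```python
from typing import Optional, Sequence


def endpoint_concentration(
    start_conc: float, agglutinated: Sequence[bool]
) -> Optional[float]:
    """Lowest lectin concentration that still agglutinates in a 2-fold series.

    agglutinated[i] is the visual score of dilution step i, where step 0 is
    the undiluted solution and each later step halves the concentration.
    Returns None if even the undiluted solution is negative.
    """
    last_positive = None
    for step, positive in enumerate(agglutinated):
        if not positive:
            break  # the first negative well ends the titration
        last_positive = step
    if last_positive is None:
        return None
    return start_conc / (2 ** last_positive)


# Hypothetical readings: positive at steps 0 to 2, negative afterwards.
readings = [True, True, True, False, False]
print(endpoint_concentration(1000.0, readings))
```

Stopping at the first negative well is a conservative choice: an isolated positive well further down the series is treated as noise and does not extend the end point.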