Galectins are a family of proteins first identified as galactoside-binding lectins in extracts of vertebrate tissue (Barondes et al., 1994a; Various authors, 1997). Sequencing of such proteins isolated from amphibians, birds, fish, and mammals revealed extensive sequence similarity, and in 1994, the galectin family was formally defined (Barondes et al., 1994b) on the basis of both shared sequence and galactoside binding. Four human galectins, which had been discovered in various contexts, and which bore multiple names, were renamed galectin-1 through -4. Since that time, advances in molecular genetics have led to the identification of many new members of the galectin gene family, which have been discovered on the basis of sequence similarity. Here we call attention to these newly established galectins and to additional genes, whose sequence suggests that they too are galectin family members.
Galectin family features
All galectins share a core sequence consisting of about 130 amino acids, many of which are highly conserved. Crystallography has been used to determine the structure of several galectins, most recently for galectin-7 (Leonidas et al., 1998). The portion of the core sequence which represents the carbohydrate recognition domain (CRD) (Figure 1) is contained between about residues 30 and 90, a segment generally encoded by a single exon. Of the original mammalian members, galectins 1–3 include just one CRD, whereas galectin-4 is composed of two nonidentical tandem core structures with two separate CRDs. Another shared feature was surprising: although galectins are found both in the cytoplasm and extracellularly, none has a secretion signal peptide. Instead, several galectins have been shown to be secreted by an unorthodox mechanism (Hughes, 1997).
Since the formal naming of the galectin family, seven more mammalian galectins (-5 through -11) have been discovered (Table I), sharing the basic structural features and galactoside binding of the original four. Four of these have one CRD (Ackerman et al., 1993; Gitt et al., 1995; Magnaldo et al., 1995; Madsen et al., 1995; Ogden et al., 1998), and the other three, like galectin-4, have two tandem CRDs (Hadari et al., 1995; Leal-Pinto et al., 1997; Tureci et al., 1997; Wada and Kanwar, 1997; Gitt et al., 1998; Matsumoto et al., 1998). Unlike the original members of this family, which were discovered on the basis of their lectin activity, the new galectins have primarily been identified in other ways, some of them in several different contexts. Only when sequenced were they found to be members of the galectin family.
For one of these galectins, an apparent lack of carbohydrate binding activity resulted in its initial exclusion from the galectin family. The Charcot-Leyden crystal protein, an abundant lysophospholipase of eosinophils, could only be named galectin-10 after it was shown to weakly bind to the same galactoside affinity columns that avidly bound other galectins (Leonidas et al., 1995).
A galectin-like protein apparently specific to the lens of the eye, GRIFIN (galectin-related interfiber protein), was recently discovered, but this candidate cannot be numbered as an official galectin, because it does not have detectable lectin activity in the standard assays used for other galectins (Ogden et al., 1998).
Novel candidate galectins
Using search algorithms based on the structure of these known galectins, we have screened the GenBank databases and identified seven additional mammalian candidates for membership in this family (Figure 1; only the exon-II encoded CRDs are shown, but there is also considerable sequence similarity in other parts of each protein). All but one of these sequences (AC005515-II) appear not only in human genomic DNA, but also in expressed messages (Table I), indicating that they are not pseudogenes. Based on sequence comparison, it seems quite likely that most of these galectin-like sequences have galactoside-binding activity, but one (N90645) lacks the tryptophan residue otherwise conserved in all galectins with established carbohydrate binding activity. Four of the novel candidate galectins (N30757, R31311, AI138230, and AC005515-II) are also candidates for lysophospholipase activity, because they are very similar in sequence to galectin-10; they are also located close to the galectin-10 gene on chromosome 19q13.1. Two additional sequences like galectin-10 appear to have stop codons interrupting the CRD. One of these appears to be expressed (EST = N40740), but no cDNA sequence has yet been recorded for the other gene. We have established a web address (URL: http://www.sacs.ucsf.edu/home/cooper/galectins.htm) giving further documentation for each of these candidate galectins, including complete deduced protein sequence.
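As an illustration of the kind of sequence check described above, the following minimal sketch tests whether a protein fragment contains a tryptophan in a galectin-like context. It is not the search procedure used for the GenBank screen. The pattern `TRP_MOTIF`, the function name `find_trp_motif`, and both test fragments are assumptions introduced here for illustration only; the fragments are short toy strings, not database records, and the pattern is a deliberately simplified stand-in for a curated CRD signature.

```python
import re

# Hypothetical, simplified pattern (an assumption, not a curated signature):
# tryptophan, then G/E/K, any residue, E/Q, any residue, K/R/E.
TRP_MOTIF = re.compile(r"W[GEK].[EQ].[KRE]")


def find_trp_motif(protein_seq: str):
    """Return the 0-based start of the first motif match, or None if absent."""
    match = TRP_MOTIF.search(protein_seq.upper())
    return match.start() if match else None


# Toy fragments for illustration only (not real database entries).
with_trp = "HFNPRFNAHGDANTIVCNSKDGGAWGTEQREAVF"
# Same fragment with the tryptophan replaced, mimicking a candidate
# that lacks the conserved residue.
without_trp = with_trp.replace("W", "L")

for name, seq in [("with_trp", with_trp), ("without_trp", without_trp)]:
    pos = find_trp_motif(seq)
    if pos is None:
        print(f"{name}: no tryptophan-containing motif found")
    else:
        print(f"{name}: motif starts at position {pos}")
```

A candidate that fails such a check, as N90645 would on the tryptophan criterion, is not thereby excluded from the family; the check only flags sequences whose carbohydrate binding should be tested experimentally.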
Similar searches for novel galectins in other genomes are also productive. In the worm Caenorhabditis elegans, two galectins have been isolated and shown to bind galactosides (Hirabayashi et al., 1996; Arata et al., 1997). By searching the GenBank databases we have identified 26 more candidate galectin genes (not shown), for a tentative total of 28 galectins among the ∼20,000 genes in the C. elegans genome. Candidate galectins are also apparent in the genomes of other important model organisms (Table II), including Drosophila (LP06039), zebrafish (AI384777 and G47571), and Arabidopsis (AC000348, T7N9.14). The galectin-like sequence in Arabidopsis represents the first evidence for galectins in plants, where the whole class of lectin proteins was first discovered. Candidate galectin genes are even evident in two viruses, an adenovirus (Perillo et al., 1998; U25120) and a lymphocystis disease virus (L63545, 26549–27313 = 053R).
Evidence for multiple galectin functions
The presence of galectins in so many evolutionarily divergent species suggests that they participate in basic cellular functions. On the other hand, the evidence that there may be dozens of galectins within a single species suggests that they have evolved to participate in a variety of more specific functions. Indeed, there is abundant evidence that members of this family interact with glycoconjugates on or around cells and influence adhesion, migration, chemotaxis, proliferation, apoptosis, and neurite elongation (Barondes et al., 1994a; Puche et al., 1996; Hughes, 1997; Various authors, 1997; Matsumoto et al., 1998).
Even a single galectin can apparently affect cells in a variety of ways depending on the cell type and circumstances. For instance, galectin-1 can either stimulate or inhibit cell proliferation (Wells and Mallucci, 1991; Adams et al., 1996; Yamaoka et al., 1996) and can either stimulate or inhibit cell adhesion to extracellular matrix (Cooper et al., 1991; Van Den Brule et al., 1995). There is also evidence that galectins can simultaneously have distinct intracellular and extracellular functions. For instance, both galectin-1 and galectin-3 have been implicated in pre-mRNA splicing (Dagher et al., 1995; Vyakarnam et al., 1997).
Several recent studies have focused attention on possible galectin functions in regulating immune responses. For example, it has been found that galectin-1 or galectin-9 can induce apoptosis of activated T cells by binding to cell surface oligosaccharides (Perillo et al., 1995; Wada et al., 1997; Allione et al., 1998; Vespa et al., 1998; Rabinovich et al., 1998; Novelli et al., 1999), that galectin-3 can activate neutrophils (Yamaoka et al., 1995; Karlsson et al., 1998), and that galectin-9 is a potent and specific chemoattractant for eosinophils (Matsumoto et al., 1998).
Another approach to studying galectin function is to knock out expression of individual galectin genes in mice. Mice lacking galectin-1 have so far been shown to have intriguing deficits in olfactory axon pathfinding (Puche et al., 1996). Mice lacking galectin-3 have so far been shown to have abnormalities in neutrophil accumulation during inflammation (Colnot et al., 1998). Although this approach can fail to detect normal biological functions of the missing protein, apparently because many functions can be performed by alternative or redundant systems, these initial positive results are very encouraging for further analysis of these mice and of mice engineered to eliminate other galectin family members.
This work was supported in part by Grant R01-HL56199 from the USPHS to D.N.W.C.