Increases in the Number of SNARE Genes Parallels the Rise of Multicellularity among the Green Plants

The green plant lineage is the second major multicellular expansion among the eukaryotes, arising from unicellular ancestors to produce the incredible diversity of morphologies and habitats observed today. In the unicellular ancestors, secretion of material through the endomembrane system was the major mechanism for interacting with and shaping the external environment. In a multicellular organism, the external environment can be made of other cells, some of which may have vastly different developmental fates, or be part of different tissues or organs. In this context, a given cell must find ways to organize its secretory pathway at a level beyond that of the unicellular ancestor. Recently, sequence information from many green plants has become available, allowing an examination of the genomes for the machinery involved in the secretory pathway. In this work, the SNARE proteins of several green plants have been identified. While little increase in gene number was seen in the SNAREs of the early secretory system, many new SNARE genes and gene families have appeared in the multicellular green plants with respect to the unicellular plants, suggesting that this increase in the number of SNARE genes may have some relation to the rise of multicellularity in green plants.

The flowering plants are the most recent of a long lineage of green plants that first moved onto land nearly 450 million years ago in the Ordovician period (for review, see Sanderson et al., 2004). In turn, the early land plants arose from a group of aquatic green algae related to Chara, which themselves are related to familiar unicellular green algae like Chlamydomonas. Thus, the multicellular plants arose from unicellular ancestors completely independently of the animal lineage, and represent a chance to examine a distinct way to assemble a multicellular organism. Multicellularity requires the ability to perform both as a cell and as part of a larger entity. One major requirement for both of these abilities is the proper management of the endomembrane system, since within a multicellular organism, the neighbors of a given cell can have a different developmental fate, be part of a different tissue, or even a different organ. Different neighbors may require different aspects of cell-to-cell signaling (developmental signals, cell wall components, etc.), and thus it becomes important to control the secretory aspects of the cell. Thanks to the large-scale sequencing and annotation of many green plant genomes, we can now examine some of the molecular basis of how one major player in the endomembrane system, the SNARE family of proteins, has evolved during the rise of multicellular plants.
Proteins of the SNARE family function as molecular machines, self-assembling into a cluster of four coiled-coil helices that drive membrane fusion (for review, see Hong, 2005). The helices, termed Qa, Qb, Qc, and R, are each contributed by individual SNARE proteins, except in the SNAP25 family, which carries two SNARE helices (Qb and Qc) in a single protein. As a vesicle forms at the donor membrane, a single type of SNARE is incorporated into the vesicle membrane. Termed a vesicle (v)-SNARE, this protein has some level of specificity in its ability to interact with a cognate set of target (t)-SNAREs that contribute the other helices. The classic example is the mammalian synaptic complex, where the v-SNARE of the synaptic vesicle, R-synaptobrevin/VAMP-2, interacts with the presynaptic t-SNARE complex of Qa-Syntaxin and Qb + Qc-SNAP25 (Hong, 2005). Similarly, the v-SNARE of the endoplasmic reticulum (ER)-derived COPII vesicle, Qc-Bet1p, interacts with the cis-Golgi t-SNARE complex of Qa-Sed5p, Qb-Bos1p, and R-Sec22p (Hong, 2005). This theory has been well supported by both in vivo and in vitro systems, and likely represents the typical situation throughout the endomembrane system of eukaryotes.
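The composition rule described above (one Qa, one Qb, one Qc, and one R helix per fusion complex, with SNAP25 supplying both Qb and Qc) can be illustrated with a minimal Python sketch. This is only an illustration of the bookkeeping, not a model of SNARE specificity: the dictionary name `HELICES` and the function `is_complete_bundle` are hypothetical names introduced here, and the helix assignments are limited to the two example complexes named in the text.

```python
from collections import Counter

# Helix types contributed by each SNARE named in the two example complexes
# (assignments as given in the text; SNAP25 carries both Qb and Qc).
# HELICES and is_complete_bundle are hypothetical names for illustration.
HELICES = {
    "VAMP-2": ("R",),
    "Syntaxin": ("Qa",),
    "SNAP25": ("Qb", "Qc"),
    "Bet1p": ("Qc",),
    "Sed5p": ("Qa",),
    "Bos1p": ("Qb",),
    "Sec22p": ("R",),
}

FOUR_HELIX_BUNDLE = Counter({"Qa": 1, "Qb": 1, "Qc": 1, "R": 1})


def is_complete_bundle(proteins):
    """Return True if the proteins together supply exactly one Qa, Qb, Qc, and R helix."""
    counts = Counter(helix for p in proteins for helix in HELICES[p])
    return counts == FOUR_HELIX_BUNDLE


# Mammalian synaptic complex: three proteins, four helices.
synaptic = ["VAMP-2", "Syntaxin", "SNAP25"]
# ER-to-Golgi complex: four proteins, one helix each.
er_golgi = ["Bet1p", "Sed5p", "Bos1p", "Sec22p"]

print(is_complete_bundle(synaptic))
print(is_complete_bundle(er_golgi))
print(is_complete_bundle(["Syntaxin", "SNAP25"]))  # t-SNAREs alone lack the R helix
```

Counting helices rather than proteins is what lets a three-protein complex (with SNAP25) and a four-protein complex satisfy the same rule.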
Full genome sequence assemblies have been produced for three angiosperms (two eudicots and one monocot): Arabidopsis (Arabidopsis thaliana; eudicot: eurosids: malvids: Brassicales), Populus trichocarpa (eudicot: eurosids: fabids: Malpighiales), and rice (Oryza sativa; monocot: commelinids: Poales). In addition, four chlorophyte algae have been sequenced: Chlamydomonas reinhardtii (Chlorophyta: Chlorophyceae), Volvox carteri (Chlorophyta: Chlorophyceae), Ostreococcus tauri, and Ostreococcus lucimarinus (Chlorophyta: Prasinophyceae). Recently, these extremes of the green plant lineage were connected with the release of the genome sequence of the moss Physcomitrella patens (Bryophyta). For each of these genome sequences, all genes of the SNARE family were identified and classified with respect to the major clades identified in previous analyses of the Arabidopsis SNAREs (Pratelli et al., 2004; Uemura et al., 2004), the rice SNAREs (Sutter et al., 2006), and a recent algorithm-based analysis of SNAREs for several sequenced eukaryotes (Yoshizawa et al., 2006). In addition to those clades, some additional SNAREs were identified from new information in other organisms, and by comparison among the green plant sequences.
Overall, the number of SNARE-encoding genes increases among the land plants compared to the unicellular plants. In addition to an increase in the net number of genes, new gene families and subfamilies appear among the SNARE genes with products that are associated with secretion and with the endosomal system. The increase in gene number for the endosomal SNAREs might be associated with the development of the typical large central vacuole morphology of the plant cell. Meanwhile, the large increase in genes encoding secretory SNAREs seems to occur in multicellular plants in comparison to their presumed unicellular ancestors. Others have hypothesized that these kinds of increases in gene number may be associated with the rise of multicellularity (e.g. Dacks and Doolittle, 2002). For example, once the cell above is different from the cell beside, it may become important to be able to deliver different material (cell-cell signals, extracellular matrix components, etc.) to the top of the cell than to the side. One mechanism for this polarization of secretion can be provided by duplication and specialization of the secretory SNAREs. Already, some evidence supports the specialization of secretory SNAREs in distinct processes (Lukowitz et al., 1996; Collins et al., 2003; Sutter et al., 2006). Collecting and identifying the potentially specialized SNAREs of the green plants with particular achievements in multicellularity may lead to more efficient studies of this process.

RESULTS AND DISCUSSION
Examining the sequences available from many eukaryotes has indicated that most eukaryotes have a common set of SNARE proteins, a core set that likely represents the genes from the last common ancestor of the extant eukaryotes (Table I; Doolittle, 2002, 2004; Yoshizawa et al., 2006). Often, SNAREs that are single genes in unicellular organisms are present as small gene families, or even seem to have diverged into novel gene families that have gained specialized functions in several organisms. In particular, an increase in the net number of SNARE-encoding genes is seen in at least two places, among the green plants and among the animals (Fig. 1). This observation that the increase in SNAREs may be associated with the rise of multicellularity in animals and land plants has been made before (Dacks and Doolittle, 2002), though the increase in the sequence information from the presumed unicellular ancestors of these groups has produced more support for this hypothesis.
Overall, the green plants examined in this work have a remarkably consistent and homologous set of SNARE proteins that generally recapitulate their presumed evolutionary relationships (see below). In contrast, based upon the genome sequences of two unicellular thermoacidophilic red algae and EST evidence from the multicellular red alga Porphyra yezoensis, the SNARE proteins of the red algae are no more similar to those of the green plants than either is to those of the heterokont diatoms or oomycetes (see below for examples). Therefore, even though red algae are often considered to be a part of a larger plant kingdom along with the green and glaucophyte algae (defined by those organisms that are descendants of the primary plastid symbiosis; see Cavalier-Smith, 1998; Adl et al., 2005), this work will focus on the more clearly monophyletic green lineage (i.e. green plants).

The SNAREs of the Early Secretory Pathway Have Changed Little in Land Plants
Green plants have proteins similar to the yeast (Saccharomyces cerevisiae) or mammalian SNAREs that operate between the ER and Golgi apparatus (Fig. 2; Supplemental Figs. S1, S2, and S6; Table I). In yeast cells, the SNARE complex that mediates retrograde traffic back to the ER seems to consist of Qa-Ufe1p + Qb-Sec20p + Qc-Use1p + R-Sec22p (for review, see Hong, 2005). As in mammals, plants have proteins that are similar to each of these SNAREs: Qa-SYP81, Qb-SEC20, Qc-USE1, and R-SEC22. These proteins are found in all angiosperms examined, and also in the chlorophyte algae, indicating that this complex was inherited vertically through the green plant lineage. In some angiosperms, multiple copies of some of these genes have been identified in completely sequenced genomes, though any specialized functions for the additional copies have not been examined. Due to a misannotation in an early version of the Arabidopsis genome, the gene encoding Qc-USE11 (At1g54110) was fused to an adjacent gene that encodes a cation exchanger (CAX10; At1g54115). Though this gene model was later split, the CAX10 name stayed with the USE1-encoding gene, and has subsequently been passed through annotation of several other plant genomes (and the gene was missed in the analysis of Yoshizawa et al., 2006). Nonetheless, this gene does encode a protein with a structure similar to other SNARE proteins, groups with the USE1-like genes of animals and fungi in phylogenetic analysis (Supplemental Fig. S1), and is found in all green plants (Table I).
Similarly, proteins similar to those of the yeast or animal Golgi SNARE complex (Hong, 2005) are also found in green plants: Qa-SYP3, Qb-MEMB1, Qc-BET1, and R-SEC22. Investigations into a few of these genes have indicated that the Arabidopsis proteins function similarly to the yeast and animal homologs (Chatre et al., 2005). Again, these proteins are present in all green plants (Fig. 2; Supplemental Figs. S2, S3, S5, and S6). In Arabidopsis, two of these genes (SYP31 and SYP32) may have specialized functions: Instead of the cis-Golgi, SYP31 is found on the peripheral ER and may participate in a specialized role in cytokinesis (Rancour et al., 2002).
Later in the Golgi stacks, it has been shown that some of the SNAREs in the early Golgi complex are replaced by other SNAREs (Parlati et al., 2002; Volchuk et al., 2004). In mammals, Qb-Membrin is replaced by Qb-GS28 and Qc-mBet1 is replaced by a related protein (Qc-GS15; Volchuk et al., 2004). Plants also have homologs of the Qb-GS28/Gos1p and a BET1-like protein called Qc-SFT1 (Table I; Fig. 2; Supplemental Figs. S2, S3, and S5). An annotation error led to the association of the Arabidopsis SFT1-family genes with the KOG (clusters of orthologous groups for eukaryotic complete genomes; Tatusov et al., 2003) for the mitochondrial termination factor (KOG1267). This led to the spread of this name for the SFT1-like genes throughout subsequent autoannotations of genomes. As with USE1, the plant SFT1-like genes have a clear SNARE-like structure and group within the brevin-like Qc-SNAREs (Supplemental Fig. S5), suggesting that it is reasonable to identify these as a green plant-specific clade of SNARE.
Table I abbreviations: As, angiosperms; Gs, gymnosperms; Bp, bryophytes (mosses); Ch, Chlorophyceae; Pr, Prasinophyceae; Rp, rhodophytes; Hk, heterokonts; Am, amoebozoa; Fn, fungi; An, animals; *, at least one SNARE of this type is present. When one has been assigned, a KOG number is given. Group names in bold represent green-plant-specific groups or subgroups.


Additional Endosomal SNARE Genes May Be Associated with Development of the Large Central Vacuole
At the far end of the Golgi complex are those complexes of the trans-Golgi network (TGN) that function in anterograde traffic from earlier Golgi stacks or in retrograde traffic from the endosomes. Plants again have SNAREs similar to the mammalian and fungal proteins that are posited to be involved in these steps (Fig. 2; Supplemental Figs. S2-S4 and S6). In green plants, Qa-SYP4 is similar to mammalian syntaxin 16 or fungal Tlg2p, and at least in Arabidopsis, has been shown to localize to the TGN like the mammalian equivalents (Bassham et al., 2000). Some plants have multiple copies of Qa-SYP4, and in Arabidopsis, SYP41 and SYP42 have been shown to localize to distinct domains of the same TGN (Bassham et al., 2000), indicating some functional distinction. Qa-SYP41 has been shown to interact with Qb-VTI12 and Qc-SYP61 (Bassham et al., 2000), similar to homologous proteins from yeast and mammals (Mallard et al., 2002), and to also interact with R-YKT6 in Arabidopsis (Chen et al., 2005). It was also shown that Qa-SYP41 can distinguish between two members of the Qb-VTI1 gene family, indicating that some specialization has occurred among the multiple copies of these SNAREs in angiosperms (Bassham et al., 2000; Sanderfoot et al., 2001).
Defining the SNARE complexes that function in the late endosomes and vacuole/lysosomes has been difficult and may be very lineage specific. For example, yeast has two Qa-SNAREs that are functionally distinct: Pep12p that functions on the late endosome/prevacuolar compartment and Vam3p that functions on the vacuole (for review, see Burri and Lithgow, 2004). Meanwhile, most other fungi have only a single Pep12p-like Qa-SNARE. Mammals also have two related Qa-SNAREs, syntaxin 7 and syntaxin 13, that may also have a functional distinction between the lysosome and endosomes (Mullock et al., 2000; Sun et al., 2003); though, again, some animals have only one Qa-SNARE that presumably does both. Similarly, while chlorophyte algae have a single Qa-SYP2, angiosperms often have multiple genes that appear to have functional differences. For example, in Arabidopsis, SYP21 appears to function on the late endosomes, while SGR3/SYP22 functions on the vacuole (Rojo et al., 2003; Yano et al., 2003). Regardless, these Qa-SNAREs each make complexes with Qb-VTI11 and Qc-SYP5 (homologous to mammalian syntaxin 8; Supplemental Fig. S4), and like Qa-SYP41, can distinguish between Qb-VTI11 and -VTI12 (Sanderfoot et al., 2001). The R-SNARE partner has yet to be conclusively demonstrated, but preliminary indications are that R-YKT61 can interact with each of these SNARE complexes (A. Sanderfoot, unpublished data). Moreover, members of the R-VAMP71 clade have been shown to localize to the vacuolar membrane (Carter et al., 2004; Uemura et al., 2004), suggesting that these may also be involved with vacuolar/late endosomal trafficking similar to some of the roles indicated for VAMP7 in mammals (Ward et al., 2000; Pryor et al., 2004). This role may have been elaborated in the land plants because a second clade of VAMP71-type SNAREs (the VAMP714-like subgroup) was found in angiosperms (Table I; Fig. 2), though in the absence of experimental evidence, the role of this angiosperm-specific subclade is unclear. Ultimately, this very complicated and essential (Rojo et al., 2001) region of the plant cell will require further study to clearly identify the roles for the different complexes in traffic to and from the endosomes and the vacuole.
Figure 1. SNARE genes in several eukaryotic genomes. SNAREs are divided into three basic modules based upon the site of action: ER/Golgi, TGN/endosomal (including vacuole), and secretory/PM. Among the unicellular eukaryotes, this division is based upon homology to SNAREs with known functions and does not exclude multiple roles for some proteins (especially in Cyanidioschyzon where the Qb, Qc, and R roles in secretion must be played by proteins from other modules). SNARE genes are indicated by individual boxes with the type of SNARE labeled with a color (Qa, orange; Qb, purple; Qb + Qc, violet; Qc, red; R, blue). Each box indicates a single genomic locus and does not include alternatively spliced isoforms of SNARE genes that are common in some lineages (especially vertebrates). Lineages are specified to the left with arrows indicating relationships among the major groups (Adl et al., 2005): Green, Green plants; embryo., embryophytes (land plants); chloro., chlorophytes (green algae); red, red algae; hetero., heterokonts (brown algae, diatoms, oomycetes); amoeba, amoebozoa (slime molds); fungi; animals. Species abbreviations: Arath, Arabidopsis; Poptr, P. trichocarpa; Orysa, O. sativa; Phypa, P. patens; Chlre, C. reinhardtii; Volca, V. carteri; Ostta, O. tauri; Ostlu, O. lucimarinus; Cyame, Cyanidioschyzon merolae; Thaps, Thalassiosira pseudonana; Phatr, Phaeodactylum tricornutum; Physo, Phytophthora sojae; Phyra, Phytophthora ramorum; Dicdi, Dictyostelium discoideum; Sacce, S. cerevisiae; Schpo, Schizosaccharomyces pombe; Caeel, Caenorhabditis elegans; Drome, Drosophila melanogaster; Homsa, Homo sapiens.

Is the Expansion of Secretory SNAREs Associated with Multicellularity?
The land plants have greatly expanded the simpler complement of secretory SNAREs present in their unicellular ancestors (Fig. 1; Supplemental Figs. S1 and S6). Among green plants, it is likely that SYP1 (and related proteins) represents the Qa-SNAREs of the plasma membrane (PM), which work along with a Qb + Qc-SNARE similar to Arabidopsis SNAP33 (or alternately with the Qb-NPSN1 and Qc-SYP7). Lacking brevins, green plants are widely believed to use a special branch of the VAMP7-type R-SNAREs, the VAMP72 clade, which appears to be specific to green plants (Table I; Supplemental Fig. S6). Many members of the VAMP72 type of SNARE have been localized to the PM in Arabidopsis (Uemura et al., 2004), though it has yet to be shown conclusively that these proteins act as the R-SNARE for a secretory complex. Since both the Qa-SYP1 and the VAMP7 clades have greatly expanded during the transition from chlorophytes to land plants (Fig. 2), these groups deserve more investigation.

The SYP1 Lineage and Specialization and Differentiation
Among the vertebrates, many distinct types of Qa-SNAREs (syntaxins) reside on the PM (syntaxins 1, 2, 3, 4, 11, and 19) that have diverse and specialized functions in delivery of vesicles to various domains of the PM (Hong, 2005). Other animals typically have a syntaxin 1-like protein and then at least one other SNARE distantly related to this syntaxin 1 to 4 clade (Bock et al., 2001). Fungi tend to have a single PM-type syntaxin or a paralogous pair (e.g. Burri and Lithgow, 2004), as do most other unicellular eukaryotes that have been examined (Fig. 3). Others have argued that this diversity of vertebrate syntaxins reflects their complex multicellular lifestyle in contrast to their simpler unicellular ancestors (Bock et al., 2001; Dacks and Doolittle, 2002).
Based on phylogenetic analysis, the single-copy chlorophyte SYP1 lies at the base of the embryophyte SYP1 clade (Fig. 3). Among the embryophytes, the SYP1 clade splits into two large, well-supported groups: SYP12 and SYP13 (Fig. 3). A common feature of the genes of the SYP13 group is that all have multiple introns, almost all of which occur at equivalent positions within the open reading frames (Supplemental Fig. S7). Little work has been done on this type of syntaxin in the land plants, aside from the finding that the protein is found on the PM when overexpressed in Arabidopsis protoplasts (Uemura et al., 2004) and that it plays some role in Rhizobium symbiogenesis in legumes (Catalano et al., 2007). The second group of PM syntaxins, the SYP12 group, includes members from all the embryophytes examined (Fig. 3). The SYP12 group is distinct from the SYP13 group in that all the genes are encoded by a single exon, or have only one intron, typically in an equivalent position within the ORF (Supplemental Fig. S7). Like the SYP13 group, members of the SYP12 group have been shown to be found on the PM and be involved in aspects of secretion (Lauber et al., 1997; Geelen et al., 2002). The SYP12 group is itself split into three reasonably well-supported subgroups (Fig. 3).
[Figure caption fragment] ... Table S1 for a list of individual genes for each organism.
A preangiosperm subgroup includes all of the SYP12-like proteins from moss and SYP12-like proteins found encoded in the EST collections of gymnosperms (Fig. 3). The angiosperm SYP12 subgroup is represented in two further subgroups, the PEN1/ROR2 and the SYP124-like subgroups. In addition to a role in general secretion (Geelen et al., 2002), the PEN1/ROR2 subgroup may have taken on a specialized role in defense against fungal pathogens (Collins et al., 2003; Nühse et al., 2003; Assaad et al., 2004). Arabidopsis mutants that lack both members of the PEN1/ROR2 subgroup (i.e. syp121/pen1, syp122 double mutants) are severely dwarfed and show necrosis (Assaad et al., 2004), suggesting that the function of the subgroup cannot be completely replaced by the SYP124 subgroup. Nonetheless, any specialized role for the SYP124 subgroup syntaxins among the angiosperms has yet to be shown. The third subgroup of SYP12 syntaxins is represented by the SYP11 group. A SYP11-like protein is encoded among the gymnosperm ESTs (Fig. 3), suggesting that this group may be common to all the seed plants, though no SYP11-like protein is found in the moss genome. Among the angiosperms, this class of syntaxin is best represented by the KNOLLE-like proteins. In Arabidopsis, KNOLLE protein is produced only during cell division, is found on the cell plate, and is essential for completion of cytokinesis (Lukowitz et al., 1996; Lauber et al., 1997). Interestingly, the cell-plate-mediated form of cytokinesis is found throughout all the land plants and is even found in many green algae (Lopez-Bautista et al., 2003), suggesting that other PM syntaxins must mediate this process in plants that lack the SYP11 group, and that KNOLLE was a later invention of the seed plants. A further subgroup of PM syntaxins (SYP112) is found only among the eudicots Arabidopsis and poplar (Fig. 3), and no obvious relative to SYP112 has been found in the EST databases of other eudicots. Because this clade is found in only two complete genomes and no function has been assigned to it (Muller et al., 2003), the role of this SNARE remains unclear.
Figure 3. Qa-SNAREs of the PM have greatly expanded in the land plants. Full-length protein sequences of the Qa-PM orthologs from various organisms were aligned by ClustalW, distances were estimated with a neighbor-joining algorithm and visualized in a phylogram. The tree was rooted using the Qa-PM SNAREs of heterokonts. Bootstrap support is indicated to the left of branches. The clades and subclades of land plant syntaxins are also indicated as described in the text (also see Supplemental Fig. S7). See Figure 1 for species abbreviations.
The presence of three major groups (and several subgroups) of PM syntaxins suggests a complexity and specialization of roles that occurs at a point in evolutionary time coincident with the rise of land plants and the associated multicellularity (see Fig. 5). The hypothesis that the expansion of the plant PM syntaxins may be linked to multicellularity has been made before (e.g. Dacks and Doolittle, 2002), but the new information in this work now suggests this point clearly lies closer to the radiation of the land plants. One could hypothesize that the SYP13 group represents the basal PM syntaxin inherited vertically from the chlorophyte ancestors, and would be involved in general housekeeping roles (e.g. constitutive secretion). The SYP12 group would arise in the mosses (or earlier) to support more specialized roles of secretion (e.g. defense related, or perhaps cell-plate formation). Finally, a specialized syntaxin (SYP11/KNOLLE) would evolve to exclusively operate the essential process of cytokinesis in the seed plants (or earlier). Later specializations (PEN1/ROR2 and SYP124) would arise to further differentiate some additional functions. Intensive research into members of these clades in particular model systems may support this hypothesis, as could additional sequence information from other plants that lie at evolutionarily significant points.

The VAMP7 Lineage in the Green Plants
Within the animal and fungal lineages, the brevins serve the functions associated with secretion (Hong, 2005), and some other lineages (e.g. Dictyostelium) have similar brevins that may act similarly (Supplemental Table S1). On the other hand, the only potential brevins in the green plant lineage are found in Chlamydomonas, where a paralogous pair of genes (VAMP-like, or VMPL) with unknown function is found. Instead, plants only have the longin types of R-SNARE (Fig. 2; Supplemental Fig. S6). Longin proteins have a conserved N-terminal domain that contains a profilin-related fold that is also found in other non-SNARE proteins (Rossi et al., 2004). In particular, plants have a large group of longin SNAREs similar to the mammalian VAMP7 (Table I). In mammals, VAMP7 (also called TI-VAMP) is involved in intraendosomal trafficking as well as specialized aspects of exocytosis (Pryor et al., 2004). Green plants seem to have two major groups of VAMP7-like proteins (Table I; Supplemental Fig. S6). The VAMP71 group is most similar to the mammalian VAMP7 and likely plays a similar role to the mammalian VAMP7 in the endosomal system (see above). On the other hand, the second group (VAMP72) appears to be specific to the green plant lineage (Table I; Fig. 4) and likely represents the R-SNARE component for secretion among the green plants.
As in the VAMP71 group (see above), there has been a divergence among the VAMP72 group in the seed plants, where VAMP724 and VAMP727 subgroups can be identified (Table I; Fig. 4). Some results have indicated that the VAMP727 of Arabidopsis is localized to the early endosomes while other VAMP72 gene family members from angiosperms have been found on the PM at steady state (Marmagne et al., 2004; Uemura et al., 2004). The VAMP727-like proteins of the seed plants have a common 20-residue insertion in the N-terminal domain that distinguishes these proteins from the other VAMP72 proteins (Supplemental Fig. S8). The split to make the VAMP724 subgroup seems to have happened in the seed plants and perhaps in the angiosperms, suggesting a more recent specialization of function. Importantly, together with the PM syntaxins, this parallel enlargement of the VAMP72 group seems to have occurred during the time of the rise of complex multicellularity within the green plant lineage (Fig. 5), and suggests that specializations among the secretory machinery may have played an essential role in the rise of multicellularity.

The Strange Case of Qb + Qc-SNAREs (SNAP25 Like)
SNAP25, a protein with two tandem SNARE domains (an N-terminal Qb and a C-terminal Qc domain), was one of the original proteins to be defined as a SNARE, and is also a member of the prototypical SNARE complex of the mammalian synapse (for review, see Hong, 2005). In SNAP25, a stretch of Cys residues following the N-terminal Qb-SNARE domain is palmitoylated in vivo and serves to increase the membrane interactions of these proteins (Gonzalo et al., 1999). A third protein, SNAP29, is also found in mammalian cells, and is thought to be involved in similar fusion events among endosomal membranes (Steegmaier et al., 1998). A fourth type of Qb + Qc-SNARE, SNAP47, has recently been identified in mammals (Holt et al., 2006). Fungi also have a Qb + Qc-SNARE, Sec9p, which operates in the secretory SNARE complex similar to SNAP25 in their cells (for review, see Burri and Lithgow, 2004). Though the fungal proteins are highly divergent from their animal equivalents, it is generally thought that they are each derived from a common opisthokont ancestor.
Among the green plants, the SNAP33-type proteins are another example of a Qb + Qc-SNARE and have been shown to play similar roles to that of the mammalian SNAP25. SNAP33-like proteins have been shown to be essential in such roles as general secretion (Kargul et al., 2001), pathogen defense (Collins et al., 2003), and for cell plate formation during cytokinesis (Heese et al., 2001). Searches in the sequences of other green plants have identified proteins similar to SNAP33 in gymnosperms and mosses (Fig. 2), indicating that these proteins have been in green plants since the adaptation to land. Compared to SNAP25, the land plant SNAP33 proteins have an N-terminal extension (that does not bear any strong resemblance to the N-terminal extensions of SNAP29, SNAP47, or Sec9p), and lack the palmitoylation site of the SNAP25-like proteins. Outside of the land plants, the chlorophyte algae Chlamydomonas and Volvox also encode SNAP25-like proteins (SNAP34). Oddly enough, these proteins have an N-terminal extension similar to that of the land plant SNAP33, yet these proteins also have a Cys-rich sequence after the Qb-SNARE domain similar to the sequence that is palmitoylated in SNAP25 (A. Sanderfoot, unpublished data). Whether this domain is palmitoylated in these algae has not been tested. The prasinophyte green algae Ostreococcus, two thermoacidophilic red algae (Cyanidioschyzon and Galdieria), and all the heterokonts examined do not have any SNAP25-like SNAREs (Table I; Fig. 2).
Since the divergence between the animal and fungal lineages and the green plant lineage spans the many SNAP25-lacking unicellular protist groups (Table I; Dacks and Doolittle, 2001), the presence of this gene in these disparate branches is difficult to reconcile by strict parsimony. Furthermore, that Chlamydomonas and Volvox could have a Qb + Qc-SNARE that is palmitoylated at a remarkably similar sequence to that of the vertebrate SNAP25 proteins is strange, especially considering that this trait is not shared with organisms within their separate lineages. Did the green plants invent SNAP33 distinct from the animals and fungi, or have many other lineages lost the ancestral gene retained in the plants, animals, and fungi? Finally, considering the essential role of SNAP25-like proteins in the secretory process of those cells that possess them, what SNAREs have replaced SNAP25 in those eukaryotes that lack this class of SNARE?
Figure 4. R-SNAREs of the VAMP7 group have greatly expanded in the land plants. Full-length protein sequences of the R-VAMP7 orthologs from various organisms were aligned by ClustalW, distances were estimated with a neighbor-joining algorithm and visualized in a phylogram (top). The tree was rooted using the VAMP7 SNAREs from heterokonts. Bootstrap support is indicated to the left of branches. The ubiquitous VAMP71 and the green-plant-specific VAMP72 clades are indicated at right, along with the subclades of the land plants (see text and Supplemental Fig. S8). See Figure 1 for species abbreviations.

Additional SNAREs First Identified in Plants: Qb-NPSN1 and Qc-SYP7
Through searches in the genomic sequence of the model plant Arabidopsis, new types of Qb- and Qc-SNAREs that were not found in animal or fungal genomes were identified. As more sequence information later became available from nonanimal/fungal unicellular eukaryotes, homologs of SYP7 and NPSN have turned up in alveolates, chromists/heterokonts, and amoebae, though never in an animal or fungus (Table I; Fig. 2; Supplemental Figs. S3 and S4). Since these proteins do not exist in the heavily studied animal or fungal systems, they did not receive as much attention as other types of SNAREs, but recent work has begun to indicate that these proteins may be worthy of more study.
Outside of green plants, these types of Qb- and Qc-SNAREs are found in most eukaryotes that lack the Qb + Qc-SNAP25-type SNARE that seems to be involved in secretion (see above). This leads to the hypothesis that these SNAREs have replaced (or were replaced by) SNAP25-like SNAREs in secretion/exocytosis in these important groups of eukaryotes. Evidence for a role in secretion comes from experiments in the angiosperm Arabidopsis, where it has been shown that Qb-NPSN11 and Qc-SYP71 interact with the cell-plate-specific Qa-KNOLLE and have a role in the secretion-related process of cytokinesis (Zheng et al., 2003; L. Conner and A. Sanderfoot, unpublished data). Moreover, these proteins are found on various organelles in the late secretory system in nondividing cells (Zheng et al., 2003; Marmagne et al., 2004; Mongrand et al., 2004; Morel et al., 2006). When members of the Qc-SYP7 group were fused to a fluorescent protein and overexpressed in protoplasts, these fusion proteins were reported to localize to the ER (Uemura et al., 2004). This result is at odds with the results obtained by others in tobacco (Nicotiana tabacum) cells (Marmagne et al., 2004; Mongrand et al., 2004; Morel et al., 2006). Whether this result is simply an overexpression artifact or a reflection of a specialized role for these proteins under certain conditions remains to be seen. Still, it seems possible that these proteins may be the major mediators of secretion in organisms that lack SNAP25-like Qb + Qc-SNAREs, which include many medically relevant organisms such as the unicellular pathogens and parasites. Obviously, this requires more work in both plants and other nonanimal/fungal eukaryotes, where the mechanical aspects of secretion have not been thoroughly investigated.
Meanwhile, as an additional set of secretory Qb- and Qc-SNAREs in the green plant lineage (along with the Qb + Qc-SNAP33 group), these SNAREs provide additional flexibility for the SNARE complexes that can mediate secretory vesicle trafficking, and thus can be considered part of the enlarged complement of secretory SNAREs among the multicellular green plants.

CONCLUSION
The SNARE family of proteins has greatly enlarged among the land plants with respect to their unicellular ancestors. This is apparent in Figures 1 and 2, as well as in the ordered list of SNAREs for each organism given in Supplemental Table S1. The chlorophyte algae, perhaps representing the state of unicellular green plants, have the basic core of the eukaryotic SNARE proteins among their 30 to 35 SNARE-encoding genes. The moss Physcomitrella, arguably representing the state of the first land plants, increases the number of SNAREs to a total of 63 that represent almost all of the major groups and subgroups of the angiosperm SNAREs. Among the angiosperms, the numbers of SNAREs vary from 62 to 76 with only a small number of new groups with respect to the mosses. Clearly, the doubling of SNARE-encoding genes that occurs between the chlorophyte algae and the mosses spans several morphological innovations. Given the observation that most of these new SNAREs are related to secretory function, and the importance of a specialized secretory pathway in multicellular organisms, it seems reasonable to hypothesize that this increase in SNARE number is related to multicellularity. Others have made this suggestion in the past, both in relation to the green plants and the animals Doolittle, 2002, 2004), and the increased genomic information available today has supported and extended this hypothesis. Other lineages, such as some groups of red algae and the brown algal kelps, have also undergone well-documented transitions to multicellularity from unicellular ancestors. Unfortunately, there is no genome sequence currently available from these groups to test whether a comparable expansion of SNARE proteins is found among these multicellular eukaryotes. Additional sequence information may one day be available from representatives of multicellular brown and red algae, as well as some of the charophyte ancestors of plants. Such data may provide more evidence on the link between SNAREs and multicellularity in the near future.
Even with eight green plants, we have only examined the tips of plant diversity, and many evolutionarily significant green plants are not yet available to be examined. In the next few years, three relatives of Arabidopsis (Arabidopsis lyrata, Capsella rubella, and Thellungiella halophila), fabids like Medicago truncatula, and asterids like monkey flower (Mimulus) and tomato (Solanum lycopersicum) will become available for more thorough examination of the dicot SNARE gene families. There will also be several grass and nongrass monocots available soon to better define the angiosperm clades, as well as the lycophyte Selaginella moellendorffii to aid in broader comparisons among the basal plants. Based upon the relatively small differences between the already available angiosperms and moss, there may not be great differences in the final numbers of SNAREs revealed in these plants, but each may help to narrow down the appearance of particular gene families and better define how each has helped to establish the unique aspects of the very successful green plant lineage.

Sequence Databases
The genome assemblies searched for this work are listed in Table II. In addition to the green plant sequences, SNAREs were also examined in several other nongreen plant algae for which sequence assemblies were available (see Supplemental Table S1). The Volvox assembly has not been publicly released, but can be accessed by searches through the Joint Genome Institute (JGI) Chlamydomonas site. The plant gene indices (currently available through the Dana Farber Computational Biology and Functional Genomics portal: http://compbio.dfci.harvard.edu/tgi/) were also used to identify ESTs for several related or taxonomically relevant plants. Though ESTs will definitely underestimate the number of SNAREs in a given organism, the encoded protein sequences were often useful for preventing long-branch issues (barley [Hordeum vulgare] as a second representative of grasses), and for bridging gaps in the evolutionary series (loblolly pine [Pinus taeda] for the gymnosperms). All other sequences were acquired from GenBank accessions of the respective assemblies. The tables and figures in this work refer to either the unique protein ID of the organism-specific genome databases (see Table II), or to a GenBank accession.

Searches and Annotations
The seed used for searches was the annotated list of Arabidopsis (Arabidopsis thaliana) SNAREs kept by the author based upon prior work (Sanderfoot et  information from other similar works (Bock et al., 2001; Hong, 2005; Yoshizawa et al., 2006) and unpublished information. These proteins were used to iteratively search through the genome databases using BLASTp, tBLASTn, and related searches (including keyword searches of autoannotations when possible) to ensure that all possible SNARE-encoding genes were identified in the databases. When possible, EST information was used to confirm exon structure; otherwise, the best possible model was chosen for each gene based upon homology with a related genome sequence (either an internal paralog or a homolog in a related organism). Each predicted SNARE protein was iteratively compared (by BLASTp or PSI-BLAST) to the nonredundant protein database at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov), the predicted proteins of the individual genome assemblies (including that of the source organism to identify gene family members and help identify pseudogenes and poor gene predictions), the Conserved Domain Database (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml), and the KOGnitor (http://www.ncbi.nlm.nih.gov/COG/grace/kognitor.html). Through such analysis, typically a single group of SNAREs (e.g. a KOG) was identified as the most likely candidate based upon a highly significant score (more significant than 1e-20) and a significant distinction from other groups. In cases where such a KOG (or other similar group) was not identified, a temporary group was created to facilitate identification of groups that lie outside the previously established groups. Finally, each protein was subjected to phylogenetic analysis with members of well-known SNARE groups (i.e. Sanderfoot et al., 2000; Bock et al., 2001; Pratelli et al., 2004) to ensure that the assignment of the particular SNARE group based upon the profile-based searches is consistent with the well-known phylogeny. As new genome information was added, each subsequent set of predicted SNARE proteins was added to the analysis until the groups indicated in this work were apparent.
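The group-assignment rule described above (one group with a highly significant score that is clearly separated from all other groups, otherwise a temporary group) can be sketched in a few lines. This is only an illustration and not the procedure actually used: the `Hit` record, the `assign_group` function, the group labels in the usage lines, and the requirement that the best group beat the runner-up by five orders of magnitude are all assumptions introduced here. Only the 1e-20 cutoff comes from the text.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hit:
    group: str     # label of the SNARE group (e.g. a KOG) the hit belongs to
    evalue: float  # E-value reported by the search; smaller is more significant


def assign_group(hits: Iterable[Hit],
                 max_evalue: float = 1e-20,       # cutoff quoted in the text
                 min_log10_margin: float = 5.0    # ASSUMED separation from runner-up
                 ) -> Optional[str]:
    """Return the best-scoring group if it is both significant and distinct.

    Returns None when no group qualifies, in which case the protein would be
    placed in a temporary group for later inspection.
    """
    # Keep only the most significant E-value seen for each group.
    best = {}
    for h in hits:
        if h.group not in best or h.evalue < best[h.group]:
            best[h.group] = h.evalue
    if not best:
        return None

    ranked = sorted(best.items(), key=lambda kv: kv[1])
    top_group, top_e = ranked[0]
    if top_e > max_evalue:
        return None  # best hit is not significant enough

    if len(ranked) > 1:
        runner_e = ranked[1][1]
        # Compare on a log scale; clamp to avoid log10(0) for reported zeros.
        gap = math.log10(max(runner_e, 1e-300)) - math.log10(max(top_e, 1e-300))
        if gap < min_log10_margin:
            return None  # two groups score too similarly to choose between them
    return top_group


# Hypothetical hits for three query proteins.
clear = [Hit("Qa-SYP1", 1e-45), Hit("Qa-SYP1", 1e-30), Hit("Qa-SYP3", 1e-22)]
ambiguous = [Hit("Qb-NPSN1", 1e-25), Hit("Qb-VTI1", 1e-24)]
weak = [Hit("Qc-SYP7", 1e-5)]
print(assign_group(clear), assign_group(ambiguous), assign_group(weak))
```

Proteins for which the function returns None correspond to the cases that were placed in a temporary group and then resolved by the phylogenetic analysis.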
Because this work was based upon the information present in the genome assemblies, it was common to identify poor gene predictions produced by the automated gene prediction algorithms. Some of these predictions could be fixed by additional cDNA/EST information, by changing to a different algorithm model, or by making adjustments to the model in attempts to create a gene prediction with a typical SNARE structure (through addition or adjustments to exons). In some cases, this was not possible, and the most likely predictions appeared to be fragmentary genes or pseudogenes. Gene fragments and pseudogenes represent cases where the most similar sequence (both protein and nucleotide) would be from a related gene in the same genome, but the predicted gene would only represent a truncated protein or a protein that would be missing functionally relevant domains found in SNARE proteins. In this analysis, none of the predicted gene fragments had any evidence of expression. Some pseudogenes are represented in EST databases, but in many cases this expression was very limited. For example, Arabidopsis VAMP723 (At2g33110) contains several mutations in the third exon (with respect to other gene family members; this exon encodes the conserved central core of the SNARE motif), and has evidence of only very low expression across many tissues (as derived from AtGenExpress data; Schmid et al., 2005). Even if such a protein were produced, it would be very unlikely to be a functional SNARE protein. Although such genes were included in Supplemental Table S1, they were generally not used in phylogenetic analysis since pseudogenes and gene fragments can significantly affect alignment and distance estimates.
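The bookkeeping implied by this paragraph, in which every gene model is listed but only intact models enter the alignments, might look like the following sketch. The `GeneModel` record, its `status` field, and the `split_for_phylogeny` helper are hypothetical names used for illustration. The status values stand for a curator's manual judgment and are not computed here. At2g33110 is the VAMP723 example from the text, while the second identifier is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneModel:
    gene_id: str
    status: str  # curator's call: "intact", "fragment", or "pseudogene"


def split_for_phylogeny(models: List[GeneModel]) -> Tuple[List[str], List[str]]:
    """Separate models used for alignment from those that are only tabulated."""
    # Intact models go into the alignments and trees.
    aligned = [m.gene_id for m in models if m.status == "intact"]
    # Fragments and pseudogenes stay in the gene list but are left out of trees.
    table_only = [m.gene_id for m in models if m.status != "intact"]
    return aligned, table_only


models = [
    GeneModel("At2g33110", "pseudogene"),   # VAMP723, as described in the text
    GeneModel("placeholder_gene_1", "intact"),  # hypothetical identifier
]
aligned, table_only = split_for_phylogeny(models)
print(aligned, table_only)
```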

Alignment and Phylogeny
The SNARE protein sequences, either full length or just the SNARE domain, were aligned by ClustalW (MEGA3.1 software; Kumar et al., 2004). Alignments of only the SNARE domains were created by hand through deletion of all residues except the traditionally defined heptad-repeat motif common to all SNAREs. Trees were produced using a bootstrapped neighbor-joining algorithm (Saitou and Nei, 1987; as implemented in MEGA3.1) using standard settings. Other methods of phylogenetic analysis did not result in any significant changes to the topology identified by neighbor joining (data not shown), nor did such methods help to resolve some poorly supported branches. Among the green plant sequences, very little change in topology occurred between analyses using full-length sequences and those using just the SNARE domain. The latter were only used in cases where truncated SNAREs (brevins, or the BET1/SFT1 groups) that lack the N-terminal extensions were the object of study, to prevent spurious alignments, and in cases where all-SNARE alignments were used, to allow useful distance estimations. Examination of the gene structure (intron position and number, etc.) for several SNARE groups also helped support some weakly supported clusters. Other information, such as the well-established evolutionary series of green plants and algae (e.g. Sanderson et al., 2004), also helped to support some of the phylogenetic analysis.
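For readers unfamiliar with the neighbor-joining step, the following is a minimal sketch of the Saitou and Nei (1987) procedure applied to a matrix of pairwise distances. It is not the MEGA3.1 implementation: the function names are invented for illustration, the uncorrected p-distance is an assumed distance measure (the text states only that standard settings were used), and bootstrapping is omitted.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("sequences share no ungapped columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Build an unrooted tree (Newick string) from pairwise distances.

    names: unique taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    d = dict(dist)
    nodes = list(names)
    newick = {n: n for n in nodes}  # subtree text for each active node
    counter = 0

    def D(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each active node from all the others.
        r = {a: sum(D(a, b) for b in nodes) for a in nodes}
        # Join the pair that minimizes Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * D(p[0], p[1]) - r[p[0]] - r[p[1]])
        dij = D(i, j)
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))  # branch from i to new node
        lj = dij - li                                   # branch from j to new node
        u = f"_node{counter}"
        counter += 1
        newick[u] = f"({newick[i]}:{li:.4f},{newick[j]}:{lj:.4f})"
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (D(i, k) + D(j, k) - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    a, b = nodes
    # The last two nodes are connected by a single branch.
    return f"({newick[a]}:{D(a, b):.4f},{newick[b]}:0.0000);"


# Small worked example with five hypothetical taxa and made-up distances.
taxa = ["a", "b", "c", "d", "e"]
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
matrix = {frozenset(k): float(v) for k, v in raw.items()}
tree = neighbor_joining(taxa, matrix)
print(tree)
```

In practice the distance matrix would be filled by applying a distance function such as `p_distance` to every pair of aligned sequences, and the whole procedure would be repeated on resampled alignment columns to obtain bootstrap support.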

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. Phylogenetic tree of ER Qb- and Qc-SNAREs.
Supplemental Figure S2. Phylogenetic tree of Qa-SNAREs from five major clades.
Supplemental Figure S3. Phylogenetic tree of Qb-SNAREs of the Golgi, endosomes, and PM.
Supplemental Figure S4. Phylogenetic tree of Qc-SNAREs of the TGN, endosomes, and PM.
Supplemental Table S1. List of predicted SNARE proteins from green plant genomes.