An increasing number of plant-insect studies using phylogenetic analysis suggest that cospeciation events are rare in plant–insect systems. Instead, nonrandom patterns of phylogenetic congruence are produced by phylogenetically conserved host switching (to related plants) or tracking of particular resources or traits (e.g., chemical). The dominance of host switching in many phytophagous insect groups may make the detection of genuine cospeciation events difficult. One important test of putative cospeciation events is to verify whether reciprocal speciation is temporally plausible. We explored techniques for double-dating of both plant and insect phylogenies. We use dated molecular phylogenies of a psyllid (Hemiptera)–Genisteae (Fabaceae) system, a predominantly monophagous insect–plant association widespread on the Atlantic Macaronesian islands. Phylogenetic reconciliation analysis suggests high levels of parallel cladogenesis between legumes and psyllids. However, dating using molecular clocks calibrated on known geological ages of the Macaronesian islands revealed that the legume and psyllid radiations were not contemporaneous but sequential. Whereas the main plant radiation occurred some 8 million years ago, the insect radiation occurred about 3 million years ago. We estimated that >60% of the psyllid speciation has resulted from host switching between related hosts. The only evidence for true cospeciation is in the much more recent and localized radiation of genistoid legumes in the Canary Islands, where the psyllid and legume radiations have been partially contemporaneous. The identification of specific cospeciation events over this time period, however, is hindered by the phylogenetic uncertainty in both legume and psyllid phylogenies due to the apparent rapidity of the species radiations.
Insect–plant interactions are temporally, spatially, and ecologically dynamic, resulting in complicated patterns of association that are extremely challenging to analyze (Mitter et al., 1991; Thompson, 1994; Funk et al., 1995; Becerra and Venable, 1999). There is evidence that host phylogeny, biogeography, chemistry, and within-population (and even within-individual) variation influence host selection, specificity, and speciation in phytophagous insects (Whitham et al., 1984; Bernays and Chapman, 1994; Becerra, 1997; Janz and Nylin, 1998; Berenbaum, 2001). Modern phylogenetic methods and molecular techniques have greatly increased the ability of researchers to acquire comparatively sampled data sets and to generate phylogenetic hypotheses for multiple interacting lineages. Most methods for analyzing cospeciation have been developed around animal–parasite systems (e.g., Page, 1994, 1996, 2002; Charleston, 1998; Legendre et al., 2002) because the clearest demonstration of cospeciation was found in a few classic studies of animals and their parasitic lice (Hafner and Nadler, 1988; Page and Hafner, 1996). The term coevolution, however, which is often used to describe processes behind cospeciation, was first used in an analysis of an insect–plant system by Ehrlich and Raven (1964). Their original use of coevolution was closer to coadaptation (reciprocal adaptive selection, e.g., an arms race resulting in speciation events that are not necessarily contemporaneous) than to cospeciation (parallel cladogenesis with contemporaneous speciation events). Whereas cospeciation is specifically concerned with events leading to reciprocal and contemporaneous speciation in the interacting lineages, coevolution involves the broader implications of ecological processes, adaptative and maladaptive landscapes, and “escape and radiate” models (Thompson, 1999a, 1999b).
The detection of cospeciation events appears, theoretically at least, to be more straightforward than the detection of coevolutionary processes. Cospeciation events are identified by the presence of two patterns: phylogenetic branching that is identical and speciation events that are contemporaneous. Because of the limitations of using only extant taxa, cospeciation followed by lineage sorting and extinction may give patterns identical to those of host switching within a host lineage. Levels of host specificity among phytophagous insects may vary according to feeding type, biogeographic region, and diversity of the host group (Novotny et al., 2002). In groups that are predominantly host species specific, there appears to be a contradiction because host specificity is often combined with high levels of host switching (Menken, 1996; Schoonhoven et al., 1998). Because switches are often phylogenetically conserved (i.e., switches between related plants), we encounter significantly nonrandom patterns of association between plants and insects when reconciling the two phylogenies (Charleston and Robertson, 2002). To illustrate graphically some potentially confounding processes that are evident in plant–insect systems, consider a hypothetical plant lineage that fluctuates over time (Fig. 1). The insect lineage responds in an adaptive manner that is both opportunistic and constrained. The objective is to show how the changing host evolutionary dynamic presents a “moving target” for an insect lineage. Host switching is promoted in those cases where an insect lineage colonizes preexisting host plant diversity with a large number of unoccupied hosts rather than a history of cospeciating contemporaneously with the host. This colonization scenario may be difficult to distinguish from the cospeciation scenario because of phylogenetic tracking due to preferential host switching to closely related species.
It is difficult to determine the degree of cospeciation and host switching from host and parasite phylogenies alone because many different scenarios can plausibly give rise to the same patterns. The program TreeMap (Page, 1994) implements a simple model to reconcile tree topologies by minimizing host switching. However, in situations where host switching is known to be common, such as in plant–insect systems (Funk et al., 1995; Weintraub et al., 1995; Futuyma and Mitter, 1996; Moran et al., 1999; Ronquist and Liljeblad, 2001), this methodology may be inappropriate. All current methods for analyzing the extent of cospeciation (e.g., TreeMap, Jungles, TreeFitter, ParaFit) become less optimal as cospeciation becomes less common. Weights can be assigned to different sorting and cospeciation phenomena and events can be optimized, but these manipulations lead to increasingly difficult optimization problems (Charleston, 1998). The most obvious test of whether a host and parasite node may be considered as a cospeciation event is whether they are plausibly contemporaneous (Moran et al., 1999; Farrell, 2001). Accurately dated phylogenies therefore may be very useful for distinguishing among various modes of speciation in host–parasite relations. However, methods for estimating divergence times are still in development, and models that consider localized variation within taxonomic groups and particular genes will result in increasingly refined molecular dating methods (Sanderson and Doyle, 2001; Arbogast et al., 2002). We have used a double-dating comparison in a psyllid (Hemiptera)–Genisteae (Leguminosae) system to examine the plausibility of cospeciation events identified by reconciling the two phylogenies.
The Psylloidea (jumping plant-lice) are a group well known for their strict host specificity (Hodkinson, 1974, 1980). These lice are usually monophagous, with a one-to-one relationship with a host plant at the species level (exhaustive taxon sampling in the Macaronesian islands indicates that 70% of species occur on a single host and 26% occur on two hosts; only one species was found on three hosts; Percy, 2003a). One psyllid may occur on two or more host plants that are closely related (only in rare instances is the same psyllid species found on unrelated hosts). More frequently, one plant may host more than one psyllid, but psyllids sympatric on the same host are rarely closely related (Percy, 2003b). A group of monophyletic psyllids (five genera in the Arytaininae) is restricted to a group of leguminous plants, Genisteae (gorses and brooms) (Hodkinson and Hollis, 1987; Percy, 2002, 2003a). Both psyllids and brooms appear to have undergone an older continental radiation in Europe and North Africa and a more recent island radiation in Madeira and the Canary Islands in the Macaronesian archipelago (Percy and Cronk, 2002; Percy, 2003b). These islands are well known for the evolutionary radiations of their biotas, both animals (Juan et al., 2000) and plants (Francisco-Ortega et al., 1996). The islands are volcanic, and their geology and ages have been accurately established (Ancochea et al., 1994) and have been used to date evolutionary events in the Macaronesian archipelago (Emerson et al., 2000). In this study, the age of La Palma (2 MY [million years]) was used to provide calibration points for psyllid and legume molecular phylogenies.
We collected legume-feeding psyllids (Percy, 2002, 2003a) and their Genisteae hosts (Percy and Cronk, 2002) from Europe, North Africa, and Macaronesia and used molecular sequence data to construct trees (Fig. 2). Species referred to here follow published taxonomic descriptions based on morphology. The legume tree is based on an internal transcribed spacer (ITS) nuclear ribosomal data set, and the psyllid tree is from a combined 12S ribosomal and cytochrome oxidase (COI and COII) mitochondrial data set (Percy and Cronk, 2002; Percy, 2003b). Taxon sampling was extensive for Iberia and Morocco, including psyllid taxa from all known host groups in each psyllid clade, and sampling was complete for Macaronesia. We also examined in detail the ecology (phenology, oviposition, and feeding sites) of each Macaronesian psyllid throughout its range to determine the ecological context of recent evolutionary events (e.g., conditions associated with host switching) in the island clades.
DNA Sequencing and Phylogenetic Analyses
The DNA sequences for the phylogenetic analyses were generated on an ABI 377 sequencer using standard procedures (Percy and Cronk, 2002; Percy, 2003b). Nine additional sequences from GenBank (Käss and Wink, 1997; Aïnouche and Bayer, 1999) were used to supplement the sampling in the legume data set. The length of the 12S plus COI-tRNA-COII aligned sequence is 961 characters (mean of 875 base pairs [bp]; 474 variable sites with 378 parsimony informative). The length of the ITS1-5.8S-ITS2 aligned sequence is 646 characters (mean of 586 bp, 211 variable sites with 116 parsimony informative). Three small regions (3–11 bp) of ambiguous alignment in the 12S data were excluded from the analyses. Gaps incurred in alignment were not treated as an additional character. Further sequence details were reported by Percy and Cronk (2002) and Percy (2003b). Maximum parsimony (MP) and maximum likelihood (ML) analyses were performed using the program PAUP* (Swofford, 2001). The following heuristic search parameters were employed for the parsimony analyses: 1,000 random stepwise addition replicates with tree bisection–reconnection branch swapping, saving multiple trees (MULTREES), and collapsing zero-length branches (COLLAPSE). Nucleotide homogeneity tests were performed in PAUP* to check for nonhomogeneity, which might affect topology. Optimal model parameters for the ML analyses were calculated with the program Modeltest (Posada and Crandall, 1998). In both cases (legume and psyllid data sets), the model specified was the general time reversible model with invariable sites and a gamma distribution. ML branch lengths based on ITS and COI and COII were used with the MP topology generated with the full data sets (ITS for legumes and 12S, COI, and COII for psyllids). We used PAUP* to generate the ML branch lengths with the specified model and topology. 
Information regarding the DNA sequences and phylogenetic analyses, including GenBank accession numbers, full taxon names, alignment, and the trees, is available from TreeBASE (http://www.treebase.org/), accession number S975.
Phylogeny Reconciliation and Plant–Insect Associations
Nonrandom host–parasite associations were determined by reconciling the phylogenies using a heuristic search in TreeMap and by performing 1,000 randomizations of the psyllid tree with the “proportional to distinguishable” option. Other tree reconciliation programs, such as TreeFitter (Ronquist, 1997) or Jungles (Charleston, 1998) as implemented in TreeMap 2, offer more appropriate means of dealing with host switching, but presently these programs are limited in the size and complexity of data sets that can be input and computed. We then calculated the number and type of events that could explain the presence of each psyllid on its host given that asynchronous nodes rule out cospeciation over most of the association. We characterized these events, based on tree topology and temporal plausibility, into three categories: cospeciation, near host switching to related hosts within legume groups, and wide host switching between legume groups. When the divergence of the legume and psyllid taxa was asynchronous, we assumed that the psyllid arrived on its host via a host switch. Whether this was a near or wide host switch was determined by mapping the host groups onto the psyllid phylogeny (i.e., determining the likely ancestral host group).
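The three-way categorization described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the `AgeRange` type, the `classify_event` function, the overlap criterion for "temporally plausible," and the example values are all assumptions made for illustration. Overlapping age estimates are treated as a necessary, not sufficient, condition for cospeciation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeRange:
    """Estimated node age as a (youngest, oldest) interval in million years (MY)."""
    youngest: float
    oldest: float

    def overlaps(self, other: "AgeRange") -> bool:
        # Two intervals overlap when neither lies entirely after the other.
        return self.youngest <= other.oldest and other.youngest <= self.oldest


def classify_event(psyllid_node: AgeRange,
                   legume_node: AgeRange,
                   ancestral_host_group: str,
                   current_host_group: str,
                   topologically_congruent: bool) -> str:
    """Assign one association to one of the three categories (hypothetical rule set)."""
    # Cospeciation requires matching topology AND plausibly contemporaneous nodes.
    if topologically_congruent and psyllid_node.overlaps(legume_node):
        return "cospeciation"
    # Asynchronous (or incongruent) nodes are treated as host switches; the
    # inferred ancestral host group decides whether the switch was near or wide.
    if ancestral_host_group == current_host_group:
        return "near host switch"
    return "wide host switch"


# Illustrative values only: an old legume node paired with a young psyllid node.
old_legume = AgeRange(6.9, 9.5)
young_psyllid = AgeRange(2.9, 3.4)
print(classify_event(young_psyllid, old_legume, "Genista", "Genista", True))
```

In this sketch a congruent topology alone never yields "cospeciation"; the dated nodes must also overlap, which mirrors the logic of using asynchrony to rule cospeciation out.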
A molecular clock hypothesis was rejected for all legume and psyllid genes sequenced, based on the χ² likelihood ratio test with and without the molecular clock enforced; that is, evolutionary rates are variable across both legume and psyllid lineages. We therefore imposed rate homogeneity using nonparametric rate smoothing (NPRS) (Sanderson, 1997) of the ITS and CO data. The program r8s implements NPRS by relaxing the assumption of a molecular clock (given a phylogenetic tree and estimated branch lengths) by using a least-squares smoothing of local estimates of substitution rates. Age calibration is then performed by assignment of a fixed age to one or more nodes. One node in each of the legume and psyllid phylogenies was fixed, with a second node fixed for a comparison of age estimates in separate analyses. The branch lengths were age calibrated using geological data (Ancochea et al., 1994). Branch lengths from 100 bootstrap replicates for both legume and psyllid data sets were then subjected to rate smoothing as a comparison and indication of sensitivity. The branch lengths from each bootstrap replicate were then age calibrated using the date of the geological formation of La Palma Island (Canary Islands).
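The likelihood ratio test of the molecular clock can be sketched in a few lines of Python. This is an illustration, not the PAUP* procedure used here. It assumes the conventional setup in which the test statistic is twice the difference in log likelihood between the unconstrained and clock-constrained trees, compared against a χ² distribution with n − 2 degrees of freedom for n taxa. The function names and the log-likelihood values in the example are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df.

    Uses the closed-form finite series for integer degrees of freedom,
    so no external statistics library is needed.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*Z(chi) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    z = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # standard normal density at chi
    return math.erfc(chi / math.sqrt(2)) + 2 * z * total


def clock_lrt(lnl_unconstrained: float, lnl_clock: float, n_taxa: int):
    """Return (statistic, df, p-value) for a molecular clock likelihood ratio test."""
    statistic = 2.0 * (lnl_unconstrained - lnl_clock)
    df = n_taxa - 2  # assumed: standard degrees of freedom for the clock test
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical log likelihoods for a 30-taxon data set.
stat, df, p = clock_lrt(-5000.0, -5050.0, 30)
print(f"LR = {stat:.1f}, df = {df}, clock rejected at 0.05: {p < 0.05}")
```

A small p-value, as in this made-up example, indicates that enforcing a clock fits the data significantly worse, which is the situation that motivates rate smoothing.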
Both the psyllids and legumes have single-island endemic species on the island of La Palma, and the deepest La Palma endemic node (base of the La Palma stem lineage) was therefore calibrated with the age of this island (2 MY; Ancochea et al., 1994) to give minimum ages. Using maximum ages only increases the asynchrony between legume and psyllid lineages. Calibrating with young dates introduces high date sensitivity into the legume data set, but using dates from the oldest islands results in impossibly old dates for the colonization of the younger islands.
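The calibration step can be illustrated with a short Python sketch. It assumes that rate smoothing has already produced an ultrametric tree whose node ages are in arbitrary relative units, so that fixing one node at a known age rescales every other node by the same factor. The node names, the relative ages, and the helper functions are hypothetical and are not output from r8s.

```python
import statistics


def calibrate_ages(relative_ages: dict[str, float],
                   calibration_node: str,
                   calibration_age_my: float) -> dict[str, float]:
    """Rescale relative node ages so the calibration node has the given age (MY)."""
    scale = calibration_age_my / relative_ages[calibration_node]
    return {node: age * scale for node, age in relative_ages.items()}


def summarize_node(replicates: list[dict[str, float]], node: str) -> tuple[float, float]:
    """Mean and sample SD of one node's calibrated age across bootstrap replicates."""
    ages = [rep[node] for rep in replicates]
    return statistics.mean(ages), statistics.stdev(ages)


# Hypothetical relative ages from three bootstrap replicates of a smoothed tree.
bootstrap_trees = [
    {"la_palma_stem": 0.25, "radiation_root": 1.00},
    {"la_palma_stem": 0.20, "radiation_root": 1.00},
    {"la_palma_stem": 0.25, "radiation_root": 0.75},
]
# Fix the deepest La Palma endemic node at the island age of 2 MY.
calibrated = [calibrate_ages(tree, "la_palma_stem", 2.0) for tree in bootstrap_trees]
mean_age, sd_age = summarize_node(calibrated, "radiation_root")
print(f"radiation_root: mean {mean_age:.2f} MY, SD {sd_age:.2f} MY")
```

Because the calibration node sits near the tips, small changes in its relative age are multiplied across the whole tree, which is one way to see why calibrating with young dates makes deeper age estimates sensitive.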
MP analysis resulted in a single most-parsimonious tree for the psyllid combined 12S, COI, and COII data sets. MP analysis of the legumes recovered eight trees, and a reweighted analysis using the rescaled consistency index resulted in a single most-parsimonious tree (identical to one of the eight MP trees) for the legumes. Homogeneity of nucleotide composition across taxa was not rejected for the legume ITS data set (P = 1.000) or for the psyllid 12S (P = 0.999) and CO (P = 0.074) data sets. The ML analyses (based on single genes for the psyllids) produced trees with topologies similar to those of the trees produced by the MP analyses, differing in the placement of some individual outlying taxa not included in the TreeMap analysis. The legume branch lengths (and therefore age estimates) are much more sensitive to bootstrapping than are the psyllid branches because of the relatively short ITS branch lengths and calibration high in the tree, as shown by the SDs in Figure 3. However, the psyllids are sufficiently younger than the legumes for the difference in ages to be marked, even taking into account the large SD. The Canary Islands and Madeira appear to have been colonized by legumes and psyllids a long time after these islands first formed. Calibration based on La Palma gives dates that are highly consistent with the likely colonization dates of all younger islands. Two independent La Palma calibrations in different clades of both legumes and psyllids give nearly identical results for both trees.
Figure 2 shows the patterns of host–parasite association as a tanglegram. The four major legume lineages (Adenocarpus, Retama, Cytisus, and Genista groups) are strongly and nonrandomly associated with corresponding psyllid lineages. Six of the eight clades (four legume and four psyllid) treated as the major groups in this study have > 80% bootstrap support (the lack of strong support for the monophyly of one legume group and one psyllid group does not change the overall interpretation of host colonization). When the legume–psyllid associations were analyzed by TreeMap (which minimizes host-switching events), the program reconciled the tree topologies without inferring any host switches but assumed 16 cospeciation events (parallel cladogenesis), 29 duplications (parasite speciation without host speciation), and 220 sorting events (parasite loss from host lineage), suggesting that the nonrandom pattern (P < 0.005) is plausibly the result of a significant amount of cospeciation followed by extensive sorting events. However, the lack of host switching in this analysis would be unusual for a plant–herbivore system (Funk et al., 1995; Weintraub et al., 1995; Futuyma and Mitter, 1996; Moran et al., 1999; Ronquist and Liljeblad, 2001), and comparison of the ages of the putative cospeciation nodes (Fig. 3) shows that cospeciation is temporally impossible in nearly all cases. All but one of the psyllid clades are considerably younger than their associated legume clades.
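The arithmetic behind a randomization test of this kind can be sketched in Python. This is not TreeMap's implementation: the function name is hypothetical, the null counts are invented for illustration, and the add-one correction in the estimate is an assumption rather than a documented feature of the program. The sketch compares an observed number of cospeciation events with the numbers obtained when the parasite tree is randomized.

```python
def randomization_p_value(observed: int, null_counts: list[int]) -> float:
    """One-sided p-value: share of randomized trees scoring at least the observed value.

    The +1 in numerator and denominator counts the observed tree itself as one
    member of the null set, which keeps the estimate from ever being exactly zero.
    """
    as_extreme = sum(1 for count in null_counts if count >= observed)
    return (as_extreme + 1) / (len(null_counts) + 1)


# Invented null distribution: 1,000 randomized psyllid trees, none of which
# reaches the 16 cospeciation events inferred for the observed trees.
null_counts = [5] * 600 + [8] * 300 + [11] * 100
p = randomization_p_value(16, null_counts)
print(f"P = {p:.4f}")
```

With 1,000 randomizations the smallest attainable value under this estimator is 1/1001, so a result of this form can only be reported as falling below a threshold such as 0.005 rather than as an exact probability.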
Wider phylogenetic analysis of the Psyllidae revealed that the taxa most closely related to the Genisteae-feeding group are members of the genus Cacopsylla (feeding on Rhamnaceae and Rosaceae), and the initial colonization of the Genisteae may have involved a wide host switch from these plant families (Percy, 2003b). Other psyllids feeding on legumes of subfamilies Mimosoideae and Caesalpinioideae are not closely related. The dated phylogenies suggest that the Genisteae radiation dates from 6.9–9.5 MY ago (MYA). The Genisteae are monophyletic and are predominantly plants of the Mediterranean region adapted to semiarid conditions (Käss and Wink, 1997; Aïnouche and Bayer, 1999). This date corresponds to the commencement of drier conditions in Europe and North Africa during the Pliocene (Quézel, 1978) and is consistent with other evidence for the origin of Mediterranean biomes both in southern Europe and southern Africa (Richardson et al., 2001b). By contrast, the Genisteae-feeding psyllid radiation is not contemporaneous with that of its hosts but is dated at 2.9–3.4 MYA. Two alternative calibrations of the legume and psyllid trees using Canary Island dates give the same result, and this calibration is consistent with available data on the rates of ITS evolution in other plants (Richardson et al., 2001a) and insect mitochondrial molecular evolution (Brower, 1994). Rates of evolution for the ITS sequences are 2.5–7.0 × 10⁻⁹ substitutions/site/year, those for the 12S sequences are 0.95–1.90 × 10⁻⁸ substitutions/site/year, and those for the CO sequences are 2.35–3.15 × 10⁻⁸ substitutions/site/year.
Groups of related monophagous herbivorous insects are often found associated with groups of related plants, but perhaps the reason why cospeciation analysis has been focused on animal parasites is that parallel cladogenesis may be more common in animal–parasite systems (e.g., Page, 1996; Clark et al., 2000; Paterson et al., 2000) than in plant–insect systems (e.g., Mitter et al., 1991; Funk et al., 1995; Futuyma and Mitter, 1996; Menken, 1996). In recent studies of fig wasps, Weiblen and Bush (2002) found that differences in biology and life cycles of parasitic and mutualistic wasps may have resulted in different degrees of cospeciation with their fig hosts.
Molecular phylogenetic evidence suggests that the original colonization of the Genisteae by psyllids is likely to have been from the Rosaceae or Rhamnaceae (Percy, 2003b), but it is clear that this host switch occurred well after (ca. 4–5 MY) the initial diversification of the Genisteae. Psyllid diversification was therefore not contemporaneous with but was sequential to legume diversification. Why were the Genisteae not colonized earlier? The high toxicity of these plants, which are rich in quinolizidine alkaloids (Wink, 1992), may have presented a significant barrier to colonization. Once successful adaptation to overcome this barrier had been achieved, psyllid diversification could proceed rapidly, and the phylogenetic data are consistent with an initial radiation in the Genisteae-feeding psyllids. Under this scenario, genistoid herbivore defenses resulted in about 5 MY free from psyllid herbivory, and the breach of genistoid defenses may then have resulted in a significant new herbivore load on this plant group. The late colonization of the Genisteae rules out parallel cladogenesis over much of the legume tree. Psyllids are found on all major lineages of the Genisteae not because of contemporaneous codiversification but because of extensive host switching between and within Genisteae clades. However, this pattern does not explain the apparent phylogenetic tracking between psyllid and host and the nonrandom association between legume and psyllid clades.
The loose phylogenetic tracking pattern exhibited by the psyllid–legume system can be explained by a predominance of phylogenetically constrained host switching with a lesser degree of phylogenetically wide host switching. For the four major lineages of the Genisteae, Figure 4 shows a breakdown of how the associated psyllid species arrived on their hosts based on tree topology (i.e., ancestral host group) and temporal plausibility: by cospeciation (< 2%), wide host switching (extralineage, 38%), or related host switching (intralineage, 61%). In psyllids, therefore, speciation appears to have been driven mainly by switching to preexisting closely related hosts rather than by cospeciation with the host. This study is therefore in agreement with the theory of sequential colonization (Jermy, 1976; Menken, 1996) as a dominant pattern in plant–insect interactions.
The only period in which potential cospeciation events are consistent with the dating is in the initial stages of the island radiation because of the near contemporaneous colonization of these islands by one of the island legume and psyllid groups. Psyllids from this group first appeared in the islands very shortly (2.5 MYA) after the legumes (2.9 MYA), followed by significant radiations in both the legumes and associated psyllids (2.5–2.7 MYA). Thus, two distinct phases of phylogenetic tracking can be seen: (1) the initial psyllid radiation onto preexisting legume diversity involving host switching onto available hosts and (2) an island psyllid radiation driven by a contemporaneous island legume radiation and involving a mixture of possible cospeciation events and host switching. The identification of specific cospeciation events during early island diversification is problematic though, given the level of phylogenetic uncertainty in both legume and psyllid phylogenies during this period. The lack of resolution appears to be due to rapid speciation after colonization of the islands (internal branches are very short; Percy, 2003b), or to recent speciation (terminal branches are short; Percy and Cronk, 2002).
There are clear examples of host switching in the island radiation, and we examined the ecology of these cases in the field. This examination suggested the following three susceptibility factors for host switching.
Host population size. The few legumes that have multiple psyllids are generally abundant (e.g., Chamaecytisus proliferus with numerous wild and cultivated populations on five islands). It seems that host switching onto abundant hosts, possibly due to increased frequency of contact, is more likely. Conversely, most of the Canary Island legumes without an associated psyllid are rare (e.g., Teline nervosa with ca. 700 individuals and T. gomerae with ca. 2,000 individuals). These host population sizes may be below a critical threshold for maintaining a host specific psyllid population.
Geographical proximity. There are two examples of host switching between unrelated but sympatric legumes in the putative host races of Arytinnis modica on La Palma and El Hierro, where sympatry may be promoting wide host switching.
Unoccupied host. Most legumes have only one psyllid, and legumes that are already colonized may be more resistant to incoming host-switching psyllids. Most of the examples of host switching appear to be to unexploited legumes (e.g., Laburnum, Genista segonnei, Teline splendens). Over time, hosts that become rare may lose their psyllid, but if these hosts become common again, they may present an unexploited host (as illustrated in Fig. 1).
Phytophagous insects appear to adapt to the challenge of fluctuations in plant lineages by opportunistic colonization of new hosts. Specialization to particular hosts and even individual plants appears to be selective in the face of complex plant chemistries and assortative mating behavior (Feder et al., 1994; Bernays, 1998; Agrawal, 2000) but in turn restricts the range of opportunistic colonization of other plants. Much of the localized spatial and geographic complexity evident in extant plant–insect associations cannot be recovered through phylogenetic analysis, but dating the plant and insect lineages considerably simplifies the search for periods during the association when cospeciation could have occurred. The effects of tree topology, ecology, and biogeography on determining cospeciation can then be analyzed in more detail. A novel use of dating plant–insect lineages is the interpretation of historical biogeography of both the taxa and the interaction (Pellmyr et al., 1998). Double-dating, as used here, can provide an important source of additional information for distinguishing among competing hypotheses and for detecting cryptic host switching.
This work was supported by the Natural Environment Research Council (NERC GTO4/97/109/TS and NERC GR3/11075), the Carnegie Trust for the Universities of Scotland, and the Louise Hiom Award (University of Glasgow). We thank Brian Farrell, Kevin Johnson, Michael Sanderson, Chris Simon, John Thompson, John Trueman, George Weiblen, and an anonymous reviewer for comments on earlier drafts of the manuscript.