Rodentia is the largest order of placental mammals, with approximately 2,050 species divided into 28 families. It is also one of the most controversial with respect to its monophyly, relationships between families, and divergence dates. Here, we have analyzed and compared the performance of three nuclear genes (von Willebrand Factor, interphotoreceptor retinoid-binding protein, and Alpha 2B adrenergic receptor) for a large taxonomic sampling, covering the whole rodent and placental diversity. The phylogenetic results significantly support rodent monophyly, the association of Rodentia with Lagomorpha (the Glires clade), and a Glires + Euarchonta (Primates, Dermoptera, and Scandentia) clade. The resolution of relationships among rodents is also greatly improved. The currently recognized families are divided here into seven well-defined clades (Anomaluromorpha, Castoridae, Ctenohystrica, Geomyoidea, Gliridae, Myodonta, and Sciuroidea) that can be grouped into three major clades: Ctenohystrica, Gliridae + Sciuroidea, and a mouse-related clade (Anomaluromorpha, Castoridae + Geomyoidea, and Myodonta). Molecular datings based on these three genes suggest that the rodent radiation took place at the transition between Paleocene and Eocene. The divergence between rodents and lagomorphs is placed just at the K-T boundary and the first splits among placentals in the Late Cretaceous. Our results thus tend to reconcile molecular and morphological-paleontological insights.
Rodents form the most abundant and diversified order of living mammals, representing about 40% of the total number of mammalian species. Their tremendous diversity has always been a challenge for those interested in their origins, ways of radiation, and times of diversification. A major debate was raised by the claim that “the guinea pig is not a rodent” (Graur, Hide, and Li 1991<$REFLINK> ; D'Erchia et al. 1996<$REFLINK> ). This proposal not only contradicted the conventional view of a monophyletic origin of the rodents but also conflicted with the familiar association of rodents and lagomorphs into a clade called Glires. Like Rodentia, the Glires are well supported by morphological synapomorphies (Luckett and Hartenberger 1993<$REFLINK> ). As a consequence, these questions relating to rodent relationships are currently disputed between morphology and some molecular approaches. On the one hand, molecular results—mainly based on complete mitochondrial genomes—have been criticized because only a few rodent and lagomorph species were included in the analyses, i.e., mouse, rat, guinea pig (D'Erchia et al. 1996<$REFLINK> ), more recently complemented with dormouse (Reyes, Pesole, and Saccone 1998<$REFLINK> ), squirrel (Reyes et al. 2000<$REFLINK> ), cane rat (Mouchaty et al. 2001<$REFLINK> ), and rabbit (Gissi, Gullberg, and Arnason 1998<$REFLINK> ). It has been suggested that the proposed paraphyly of rodents is an artifact that might be corrected by the analysis of a larger taxonomic sampling (Luckett and Hartenberger 1993<$REFLINK> ; Philippe 1997<$REFLINK> ) and the use of appropriate models of sequence evolution (Sullivan and Swofford 1997<$REFLINK> ). On the other hand, morphological synapomorphies for rodents are mainly based on dental and cranial characters (Luckett and Hartenberger 1993<$REFLINK> ) that could be the result of ecological constraints and homoplasies (e.g., Li et al. 1992<$REFLINK> ). Recently, the growing number of molecular markers—in combination with a broader species sampling within the order Rodentia—indeed provided weak to moderate support for rodent monophyly, both with mitochondrial (Nedbal, Honeycutt, and Schlitter 1996<$REFLINK> ) and nuclear sequences (Huchon, Catzeflis, and Douzery 1999<$REFLINK> , 2000<$REFLINK> ; Adkins et al. 2001<$REFLINK> ; DeBry and Sagel 2001<$REFLINK> ). The combination of numerous independent nuclear markers now even led to a robust support for rodent monophyly (Murphy et al. 2001a<$REFLINK> ).
The relationships between rodent families are also disputed. Morphological approaches have been frustrated by convergent evolution of characters (e.g., Jaeger 1988<$REFLINK> ), and the current intraorder classification is therefore largely unsatisfactory. For example, the long-standing division of rodents into Hystricomorpha, Myomorpha, and Sciuromorpha, on the basis of the insertion patterns of masseter muscles, or alternatively into Hystricognathi and Sciurognathi, on the basis of the plane of incisor insertions, have both been shown to be inadequate (Hartenberger 1985<$REFLINK> ; Nedbal, Honeycutt, and Schlitter 1996<$REFLINK> ; Huchon, Catzeflis, and Douzery 2000<$REFLINK> ; Adkins et al. 2001<$REFLINK> ). However, the monophyly of most rodent families seems well established (Hartenberger 1985<$REFLINK> ). Few molecular studies have until now investigated the relationships among rodent families, and none have included all rodent lineages or the same taxonomic sampling (or both), thus limiting the comparison of the phylogenetic results (Catzeflis et al. 1995<$REFLINK> ; Nedbal, Honeycutt, and Schlitter 1996<$REFLINK> ; Matthee and Robinson 1997<$REFLINK> ; Robinson et al. 1997<$REFLINK> ; Huchon, Catzeflis, and Douzery 1999<$REFLINK> , 2000<$REFLINK> ; Robinson-Rechavi, Ponger, and Mouchiroud 2000<$REFLINK> ; Adkins et al. 2001<$REFLINK> ; DeBry and Sagel 2001<$REFLINK> ; Murphy et al. 2001a;<$REFLINK> Montgelard et al. 2002b<$REFLINK> ). For example, singular families like Ctenodactylidae or Anomaluridae are rarely sampled. In the present study we include for the first time all major rodent lineages—namely all sciurognath families and hystricognath superfamilies—the two lagomorph families, and representatives of all other eutherian orders, in order to investigate rodent monophyly as well as the phylogenetic relationships between rodent families.
The timing of the origin of rodents is also controversial. Fossil evidence indicates a radiation of the rodents 55 MYA (Hartenberger 1998<$REFLINK> ), whereas molecular clocks based on a few rodent species tend to support a Cretaceous origin and diversification of rodents, 89–125 MYA for the divergence of murids or hystricognaths (Janke, Xu, and Arnason 1997<$REFLINK> ; Kumar and Hedges 1998<$REFLINK> ; Cao et al. 2000<$REFLINK> ). Recent analyses, including a broader rodent sampling, but based on single sequences, suggest that the radiation of rodent families is older than the K-T limit (75 Myr) (Huchon, Catzeflis, and Douzery 2000<$REFLINK> ; Adkins et al. 2001<$REFLINK> ). One should note that all molecular datings for rodents are based on different genes, different sampling, and different methods. We compare here the dating performance of three genes using quartet dating, a method to estimate molecular ages of divergence that allows for rate heterogeneity between lineages (Rambaut and Bromham 1998<$REFLINK> ).
We analyze three nuclear genes that are not genetically linked and code for proteins that do not display any known biological interactions: the alpha 2B adrenergic receptor (A2AB), the exon 1 of interphotoreceptor retinoid-binding protein (IRBP), and the exon 28 of von Willebrand factor (vWF). The choice of these nuclear markers was based on three considerations. First, all three sequences have successfully been used to reconstruct the phylogeny of eutherians at various taxonomic levels (Porter, Goodman, and Stanhope 1996<$REFLINK> ; Springer et al. 1997a,<$REFLINK> 1997b;<$REFLINK> DeBry and Sagel 2001<$REFLINK> ; Delsuc et al. 2001<$REFLINK> ; Madsen et al. 2001<$REFLINK> ). Second, they display similar sizes and numbers of variable sites, which favors the comparison of their phylogenetic performance. Third, nuclear genes have been suggested to perform better than mitochondrial ones (Springer et al. 2001<$REFLINK> ). The separate and combined phylogenetic analyses of A2AB, IRBP, and vWF allow to (1) define new clades among rodents, (2) strongly confirm the monophyly of Rodentia and Glires, (3) evaluate the properties of independent markers for dating purposes, and (4) suggest a Tertiary radiation of rodent families, a K-T split between rodents and lagomorphs, and a Late Cretaceous origin for placental orders.
Materials and Methods
Amplification and Sequencing
Twenty-two rodent taxa were selected to cover the current diversity of the order, with at least one representative per family or superfamily (See Supplementary Material at MBE Web site: http://www.molbiolevol.org). All 10 sciurognath lineages were sampled (Muridae, Dipodidae, Geomyidae, Heteromyidae, Gliridae, Sciuridae, Aplodontidae, Castoridae, Anomaluromorpha, Ctenodactylidae) as well as the eight lineages that represent the whole hystricognath diversity (Thryonomyidae, Petromuridae, Bathyergidae, Hystricidae, Chinchilloidea, Octodontoidea, Cavioidea, Erethizontoidea) (Huchon and Douzery 2001<$REFLINK> ). The vWF exon 28 (1236 aligned positions), the 5′ third of the IRBP exon 1 (1227 positions), and the A2AB gene (1170 positions) were newly determined for five (Tachyoryctes, Dipodomys, Thomomys, Castor, and Anomalurus), 21 (all but Mus), and 18 (all but Mus, Rattus, and Cavia) rodent species, respectively. Additionally, the two lagomorphs Lepus and Ochotona were sequenced for IRBP and A2AB.
DNA extractions and PCR reactions were conducted as described before (Huchon, Catzeflis, and Douzery 1999<$REFLINK> ), with slight modifications: 1 M betaine was included in the PCR mixture (Henke et al. 1997<$REFLINK> ), and annealing was performed at 50°C. The PCR primers for amplifying the A2AB, IRBP, and vWF sequences have been described in Stanhope et al. (1992)<$REFLINK> , Springer et al. (1997b<$REFLINK> ), and Huchon, Catzeflis, and Douzery (1999,<$REFLINK> 2000)<$REFLINK> , respectively. When the amount of amplified DNA was insufficient for direct sequencing, smaller overlapping DNA fragments were obtained by reamplification of the initial PCR product and then sequenced. Additional sequencing primers were designed when required. All vWF sequences, the A2AB sequences of Anomalurus, Aplodontia, Bathyergus, Castor, Chinchilla, Dipodomys, Echimys, Glis, Massoutiera, Petromus, Thomomys, Trichys, and the IRBP sequences of Dinomys, Lepus, Macropus, Pedetes, and Tachyoryctes were obtained manually with [α33P]ddNTP. The other IRBP sequences were determined using dye-terminator cycle-sequencing reactions and an Applied Biosystems 373A automatic sequencer. Despite several attempts, it was not possible to amplify the A2AB of Pedetes.
PCR amplifications of A2AB from Dipus, Dryomys, Erethizon, Lepus, Marmota, Tachyoryctes, and Thryonomys were as previously described (Springer et al. 1997b), except for Dryomys and Lepus where 1 M betaine and 1.3% DMSO were added to the reaction mixture. PCR products were cloned into the pGEM-T easy vector (Promega), and sequences were determined using the Big dye-terminator cycle-sequencing kit and an ABI Prism 3700 DNA Analyser (Applied Biosystems). Clones from at least two independent PCR amplifications were sequenced to detect ambiguity caused by the PCR reaction.
Accession numbers of the newly determined sequences (AJ427226–AJ427270) are given in the Supplementary Material, together with those for the other species used in this study.
A2AB, IRBP, and vWF sequences of 40 placentals and two marsupials were manually aligned. Only few gaps had to be introduced, which were coded as missing data. A glutamic acid repeat of variable length in the A2AB gene was excluded from subsequent analyses. Five nucleotide data sets were analyzed: each gene separately, vWF and IRBP combined (to assess the position of Pedetidae among rodents), and the three genes in concatenation. One protein data set comprising the concatenation of the A2AB, IRBP, and vWF amino acid sequences was also analyzed.
The models of sequence evolution used were HKY85 for nucleotides and JTT for amino acids. The HKY85 model was favored, relative to more complex models (i.e., GTR), because of computation time limitations. However, additional Bayesian analyses (data not shown) indicated that the use of either GTR or HKY85 did not have an impact on the phylogenetic conclusions. Rate heterogeneity among DNA and protein sites was described by a discrete Gamma distribution with eight categories (Γ8).
Neighbor-Joining (NJ) analyses with weighted average (WAVE) maximum likelihood distances were performed with PHYLIP 3.573 (Felsenstein 1995<$REFLINK> ) and WAVEBOOT 1.2 (Krajewski et al. 1999<$REFLINK> ). In these analyses the nucleotide data matrix was partitioned as follows: each codon position in each of the three genes was allowed to have its own rate, base frequency, and transition-to-transversion ratio.
Maximum parsimony (MP) and maximum likelihood trees were inferred using PAUP* (Swofford 1999<$REFLINK> ), version 4, releases beta 4 and 8; and TREE-PUZZLE 4.0.2 (Strimmer and von Haeseler 1996<$REFLINK> ). MP nucleotide sequence analyses were conducted with equal or differential weighting of character state changes. In the latter case, the six possible nucleotide substitutions were weighted at each codon position for each of the three genes, according to their consistency index, excluding uninformative characters (Hassanin, Lecointre, and Tillier 1998<$REFLINK> ).
Before running individual and combined heuristic ML PAUP* searches on nucleotide sequences, the program TREE-PUZZLE 4.0.2 was used to estimate the transition-transversion parameter (κ) and the parameter (α) of the Gamma distribution of the HKY + Γ8 model. For amino acid sequences, ML reconstructions under the JTT + Γ8 model were obtained using the quartet-puzzling method with TREE-PUZZLE 4.0.2.
Bayesian phylogenetic analyses were performed with MrBayes 2.1 (Huelsenbeck and Ronquist 2001<$REFLINK> ). The Metropolis-coupled Markov chain Monte Carlo sampling approach was used to calculate posterior probabilities. Prior probabilities for all trees were equal, starting trees were random, tree sampling was done every 20 generations, and burn-in values were determined empirically from the likelihood values. To check for consistency of results, four Markov chains were run simultaneously, twice for 200,000 and twice for 500,000 generations.
For nucleotide sequence analysis, third codon positions were excluded from ML and Bayesian analyses because of their base compositional heterogeneity between taxa (as evaluated by TREE-PUZZLE 4.0.2 and PAUP*), whereas they were kept in NJ-WAVE—where each codon position had its own rate—and MP analyses.
Robustness of the nodes of the phylogenetic trees was assessed by bootstrap (Felsenstein 1985<$REFLINK> ) with PAUP* and reliability percentages (RP) with TREE-PUZZLE 4.0.2. For NJ-WAVE and MP analyses, 1,000 bootstrap replicates were computed. For ML analyses, computing time limitations forced us to estimate bootstrap percentages (BP) after only 100 replicates, with HKY + Γ8 parameters set to the values estimated for the best tree, with NJ starting trees, and with 1,000 rearrangements of TBR branch swapping. RP were estimated after 10,000 puzzling steps.
Comparison of Alternative Phylogenies
The first step was to reconstruct alternative tree topologies. PAUP* heuristic searches under a single HKY + Γ8 model and incorporating a topological constraint were conducted in order to identify the highest-likelihood topology that satisfied a given hypothesis (e.g., the paraphyly of rodents). Second, the alternative topologies previously identified were evaluated and compared relative to the best ML topology found for the nucleotide sequences. Statistical comparisons were conducted with partitioned maximum likelihood on the nucleotide matrix and on the protein matrix of combined data. For the nucleotide sequences, to account for differences in evolutionary substitution processes between codon positions and between the three genetically independent nuclear markers, first and second codon positions were distinguished for A2AB, IRBP, and vWF. This resulted in the definition of six partitions of sites. Six independent HKY + Γ8 models were thus assumed for each of these six partitions. The topologies previously identified after PAUP* heuristic searches were evaluated and compared under the more complex 6-partitions model with PAML (Yang 1997<$REFLINK> ), version 3.0d. In the latter case, all ML parameters—i.e., 6 transition-transversion rate parameters, 6 α parameters, and 6 × 81 branch parameters—were reestimated by PAML for each evaluated topology. Partitioned log-likelihoods were then compared using the Kishino and Hasegawa (1989)<$REFLINK> test with the Shimodaira and Hasegawa (1999)<$REFLINK> correction.
Because of computation time limitations, a similar approach could not be applied for the protein sequences. A single partition was considered, and topology comparisons were done under a JTT + Γ8 model assumed for the concatenated protein data set. The 81 branch parameters were reestimated by PAML for each evaluated topology, and the Kishino and Hasegawa (1989)<$REFLINK> test with the Shimodaira and Hasegawa (1999)<$REFLINK> correction was performed.
Divergence times were estimated using quartet dating (Rambaut and Bromham 1998<$REFLINK> ), a method allowing each lineage to have a different rate of evolution, and implemented for nucleotide sequence analysis in the program QDATE 1.1. This ML method calculates the divergence date between two pairs of calibrating lineages, each one being represented by two species for which the time of divergence is known. Four molecular datings were estimated: the first split within Rodentia (“R” in fig. 2 ), the divergence between the two Glires lineages (“G”), the divergence between Laurasiatheria and [Glires + Euarchonta] (“L/E + G”), and the first split among placental mammals (“P”). These molecular datings were derived from all combinations of, respectively (1) two time-calibrated pairs of rodents (“D1“ or “D2“ vs. “D3“ or “D4“; fig. 2 ), (2) one pair of rodents versus one pair of lagomorphs (“D5“), (3) one pair of glires versus one pair of cetartiodactyls (“D6“), and (4) one pair of paenungulates (“D7“) versus the remaining pairs of placentals. The seven pairs of calibrating taxa are detailed in table 1 .
Rate constancy was tested with likelihood ratio tests (LRT) (Felsenstein 1988<$REFLINK> ), and only quartets that fitted a two-rate constrained model (i.e., each dating pair of the quartet does have its own rate), relative to a free-rate model (i.e., each branch of the quartet has a different rate), were kept in the dating estimations. Because rate heterogeneity between sites can affect estimation of divergence date (Rambaut and Bromham 1998<$REFLINK> ), sequence evolution was described by the HKY + Γ8 model. Quartet dating results were described by the median of the distribution of the quartet date estimates, and the associated lower and upper limits of the 95% confidence interval were those of the median quartet.
Results and Discussion
Base composition homogeneity of all codon positions was evaluated at the 1% level of chi-square tests for the five data sets, A2AB, IRBP, vWF, IRBP + vWF, and A2AB + IRBP + vWF, by comparing the nucleotide composition of each sequence with the frequency distribution assumed in the ML model. These five data sets actually displayed a significant base compositional heterogeneity for 5, 4, 7, 10, and 13 species, respectively—including 2, 1, 2, 5, and 7 Glires, respectively—among the 42 mammals. A closer examination revealed that the base composition at third codon positions was responsible for this heterogeneity. For each codon position of the three genes, we computed the difference in GC content between the GC-richest sequence and the GC-poorest sequence. The difference in GC content ranges between 8.1% and 16.9% for the first codon position, 6.3% and 10.3% for the second codon position, and 24.7% and 35.9% for the third codon position, with vWF showing the most heterogeneous base composition. Some sequences presented high (e.g., Bradypus) or low (e.g., Mus, Orycteropus, Marsupialia) GC content at third codon positions for the three genes. Other sequences showed extreme GC values at third codon positions for one gene but presented average GC levels for the two other genes (e.g., Geomyidae for vWF vs. A2AB and IRBP). After exclusion of third codon positions, base composition became homogeneous for all placental taxa. Because preliminary results indicated that ML reconstructions might be affected by the base composition heterogeneity at third codon positions, all subsequent ML nucleotide analyses were conducted on first + second codon positions only. No compositional heterogeneity was detected in the protein sequences.
Phylogenies Reconstructed from Individual Genes
All tree-building methods either on nucleotide or on protein sequences (NJ-WAVE, standard and differentially weighted MP, ML, and Bayesian approach) provided the same overall phylogenies, with minor topological variations involving only weakly supported nodes. Considering ML analysis on nucleotide sequences as best representing the results of the individual genes, only those results will be detailed here. For comparative purposes, quartet puzzling reliability percentage observed on protein sequences (RPPROT) and Bayesian posterior probabilities for nucleotides (PPNUC) and proteins (PPPROT) are however indicated for the combined data set.
The trees reconstructed from first plus second codon positions of each of the three genes are given in figure 1 . All three trees suggest the monophyly of rodents and that of Sciuroidea (Marmota + Aplodontia), Geomyoidea (Dipodomys + Thomomys), Hystricognathi (Bathyergus, Thryonomys, Petromus, Trichys, Cavia, Chinchilla, Echimys, and Erethizon), and Ctenohystrica (Massoutiera + hystricognaths). Discrepancies between the trees always involve weakly supported nodes (i.e., not involving two conflicting nodes with BPML > 50), except for the position of Dipodidae and the relationships among Cetartiodactyla. According to A2AB and IRBP, Dipodidae are the sister clade of Muridae (BPML = 67 and 74, respectively), whereas vWF clusters Dipodidae with Geomyoidea (BPML = 65). The grouping of Dipodidae with Geomyidae may be an artifact resulting from similar base compositions and rapid rates of evolution of their vWF sequences (data not shown). Within Cetartiodactyla, A2AB and IRBP place Lama in the most basal position relative to other cetartiodactyls (BPML = 51 and 81, respectively), whereas vWF clusters Lama with Sus (BPML = 66).
In spite of having similar lengths, the three nuclear genes do not contain the same phylogenetic signal. Each gene strongly supports some nodes, but these nodes might be different from one gene to another, and none is able to solve the whole rodent phylogeny. For example, only A2AB provides high support for rodent monophyly (BPML = 83), IRBP for Myodonta monophyly (i.e., Muridae + Dipodidae, BPML = 74), and vWF for Ctenohystrica monophyly (BPML = 91). The results also indicate that the resolving power of each gene is not restricted to a given taxonomic level. For example, A2AB, unlike vWF, is able to solve deep relationships like rodent monophyly as well as more recent relationships like Sciuroidea (BPML = 98) but not intermediate clades like Ctenohystrica (BPML = 41), whereas vWF does so.
Phylogenies Reconstructed from the Concatenated Genes
When codon positions 1 and 2 of the combined A2AB + IRBP + vWF genes were analyzed under a single HKY + Γ8 model, the log-likelihood of the best topology was lnL = −26,415.56. Assuming six independent HKY + Γ8 models for each of the six partitions yielded lnL = −25,988.88 and resulted in a significant increase of log-likelihood (LRT statistics = 853.36; df = 430; P < 0.0001). The ML parameters estimated for the six partitions are given in table 2 . All three genes exhibit similar base compositions, on first as well as on second codon positions. The slowest evolving partition is the second codon position of A2AB, followed by—with increasing relative rate—A2AB (first codon position), IRBP (second), vWF (second), IRBP (first), and vWF (first). Second codon positions are more heterogeneous than first positions in terms of substitution rates for the three markers, but they display a higher transition-transversion rate parameter for IRBP and vWF.
Combination of the Three Markers
Although the A2AB, IRBP, and vWF trees do not show any major topological incongruences, crossed Shimodaira-Hasegawa tests indicate that each nucleotide data set rejects the highest-likelihood topology of the two other data sets (table 3 ). However, none of the three genes rejects the ML topology obtained from the combined data set. Consequently, this a posteriori observation suggests that the three genes can be combined and that the combined data tree accordingly appears to be the “best provisional phylogenetic hypothesis” (Adkins et al. 2001<$REFLINK> ). The combination of A2AB, IRBP, and vWF leads to a topology that stabilizes the phylogenetic position of rodents among mammals and contributes to resolve most of the relationships between rodent families.
Position of Rodentia Among Mammals
Phylogenetic analyses based on first plus second codon positions of the three concatenated nuclear genes all indicate the monophyly of rodents, its support being the highest under the weighted MP (not shown) and the maximum likelihood approaches (BPML = 95: fig. 2 ; RPPROT = 74). Maximum likelihood tests of various phylogenetic hypotheses indicate that the alternative to the monophyly of rodents is significantly less likely (table 4 ). The Bayesian approach also provides a posterior probability of 1.00 for the monophyly of rodents for both nucleotide and protein sequences (trees not shown). Our results thus confirm statistically the monophyly of Rodentia with an extended sampling of this order—including all sciurognath families and hystricognath superfamilies—and with representatives of all other placental orders. In fact, with the recent increase of available complete mitochondrial genomes, rodent monophyly is no longer statistically rejected either (Cao et al. 2000<$REFLINK> ; Mouchaty et al. 2001<$REFLINK> ). It thus appears that the paraphyly of rodents will be difficult to defend with the broader taxonomic sampling within the order and with the growing number of molecular markers supporting their monophyly.
In agreement with the independent nuclear markers analyzed by Murphy et al. (2001a,<$REFLINK> 2001b)<$REFLINK> , the concatenation of A2AB, IRBP, and vWF suggests that Rodentia is the sister group of Lagomorpha, constituting the superorder Glires (PPNUC and PPPROT = 1.00; BPML = 84; RPPROT = 37). The best alternative hypothesis to the Glires monophyly appears to be significantly less likely with protein sequences but only marginally significant with nucleotide sequences (table 4 ). We also verify that the Euarchonta (Primates + Dermoptera, and Scandentia) are the sister clade of Glires (PPNUC and PPPROT = 1.00; BPML = 66: fig. 2 ; RPPROT < 20), and the best alternative hypothesis appears to be significantly less likely with the nucleotide sequences only (table 4 ). This superordinal clade “Euarchontoglires” has recently been proposed by molecular studies (Madsen et al. 2001<$REFLINK> ; Murphy et al. 2001a,<$REFLINK> 2001b<$REFLINK> ) but has never been suggested by morphology. Morphological studies generally cluster Glires with Macroscelidae, to form the Anagalida, and Primates, Dermoptera, and Scandentia with the Chiroptera, to form the Archonta (e.g., McKenna and Bell 1997, p. 295<$REFLINK> ). However, Anagalida and Archonta are based on only a few synapomorphies, and these clades have been rejected by molecular studies (e.g., Murphy et al. 2001a;<$REFLINK> Springer et al. 1997a,<$REFLINK> 1999<$REFLINK> ). It is interesting to note that a recent paleontological study clusters Macroscelidae with Proboscidea (Tabuce et al. 2001<$REFLINK> ) and that a relationship between Glires and Primates has been suggested (McKenna 1986<$REFLINK> ). Our results are thus not at odds with morphology and suggest that paleontological evidence should search for a sister clade relationship between Glires and Euarchonta. Relationships among mammals as shown in figure 2 are not further discussed because of insufficient species sampling.
Relationships Among Rodents
Our results suggest the division of Rodentia into three major infraordinal clades. The first one comprises squirrel- and dormouse-related animals: Sciuroidea (Sciuridae [squirrels] + Aplodontidae [mountain beavers]) and Gliridae (dormice). The second clade contains mouse-related rodents: Myodonta (Muridae [mouse, rats] + Dipodidae [jerboas]), Castoridae (beavers), Geomyoidea (Geomyidae [pocket gophers] + Heteromyidae [pocket mice]), and Anomaluromorpha (Anomaluridae [scaly-tailed flying squirrels] + Pedetidae [springhares]). The third clade (Ctenohystrica sensu Huchon, Catzeflis, and Douzery 2000<$REFLINK> ) contains gundi and Guinea pig–related rodents: Ctenodactylidae and Hystricognathi. The interrelationships between these three clades are poorly resolved. None of the three bifurcating topologies connecting them involves significantly different log-likelihood (0.28 < PSH < 0.32), and the results are sensitive to the method of reconstruction and the data set considered. Ctenohystrica clusters with the mouse-related clade for ML and Bayesian nucleotide analyses (BPML = 40: fig. 2 ; PPNUC = 0.83). However, Bayesian protein analysis clusters Ctenohystrica with the squirrel- and dormouse-related clade (PPPROT = 0.70), and quartet puzzling protein analysis clusters the mouse-related clade with the squirrel- and dormouse-related clade (RPPROT = 42).
The Squirrel- and Dormouse-Related Clade
The monophyly of Sciuroidea (Sciuridae and Aplodontidae) has been supported by morphological (e.g., Lavocat and Parent 1985<$REFLINK> ; Meng 1990<$REFLINK> ) and molecular data (Huchon, Catzeflis, and Douzery 1999<$REFLINK> , 2000<$REFLINK> ; Adkins et al. 2001<$REFLINK> ; DeBry and Sagel 2001<$REFLINK> ). The increase in taxonomic sampling did not reduce the support for this clade (BPML = 100; RPPROT = 85). In agreement with DeBry and Sagel (2001)<$REFLINK> , we observe that Sciuroidea are characterized by a unique insertion of three amino acids in the IRBP gene. The sister clade of Sciuroidea appears to be the Gliridae (fig. 2 ). The existence of this suprafamilial clade has been suggested by morphological (e.g., Lavocat and Parent 1985<$REFLINK> ; Meng 1990<$REFLINK> ) and mitochondrial (Reyes et al. 2000<$REFLINK> ) studies; however, similar to other molecular studies (e.g., Huchon, Catzeflis, and Douzery 1999<$REFLINK> ; Adkins et al. 2001<$REFLINK> ; DeBry and Sagel 2001<$REFLINK> ; Montgelard et al. 2002b<$REFLINK> ), the support here remains moderate to strong. It is noteworthy that the support is higher with protein sequences (PPPROT = 1.00; RPPROT = 75; PSH < 0.05) than with nucleotide sequences (PPNUC = 1.00; BPML = 58; PSH < 0.15).
The Mouse-Related Clade
The grouping of Anomaluromorpha, Castoridae, Geomyoidea, and Myodonta (fig. 2 ) has never been suggested by morphological and paleontological observations. Castoridae has usually been related to Sciuridae because both families share the sciuromorph and sciurognath states (Brandt 1855<$REFLINK> ; Tullberg 1899<$REFLINK> ). However, some morphological studies could not confirm this relationship (Bugge 1985<$REFLINK> ; Lavocat and Parent 1985<$REFLINK> ; Meng 1990<$REFLINK> ), and it has even been suggested that Castoridae might be more closely related to Muridae than to Sciuridae (Meng 1990<$REFLINK> ). Anomaluridae and Pedetidae were considered as enigmatic families, possibly related to Ctenodactylidae or Hystricognathi (e.g., Luckett and Hartenberger 1985<$REFLINK> ; Jaeger 1988<$REFLINK> ). Geomyoidea has, less ambiguously, been associated with the Muridae (e.g., Wahlert 1985<$REFLINK> ; Ryan 1989<$REFLINK> ).
The mouse-related clade is moderately to strongly supported (PPNUC and PPPROT = 1.00; BPML = 65: fig. 2 ; RPPROT = 43), and alternative hypotheses cannot be significantly rejected (table 4 ). Such a molecular support and lack of morphological evidence might reflect that we are dealing with a phylogenetic artifact, but independent molecular studies identified the same node, although it was not explicitly noticed and discussed (Adkins et al. 2001<$REFLINK> ; Murphy et al. 2001a<$REFLINK> ). Consequently, the naturalness of this new superfamilial arrangement must be seriously considered in future studies. Another interesting aspect of this mouse-related clade is that it includes animals displaying different jaw patterns and having different geographical origins. Myodonta has been suggested to have originated from Asian hystricomorph rodents, Anomaluromorpha might have originated from African hystricomorph rodents, and Geomyidae and Castoridae are sciuromorph rodents from North America (e.g., Vianey-Liaud 1985<$REFLINK> ; Hartenberger 1998<$REFLINK> ). This suggests a complicated biogeographical history and an ancient origin for this rodent clade. This group might have a Paleocene or an Early-Eocene origin because Muridae, Dipodidae, Anomaluridae, Geomyoidea, and Castoridae are rooted in—or related to—Early- and Middle-Eocene families: Cricetidae, Zapodidae, Zegdoumyidae, Eomyidae, and Eutypomyidae, respectively (Hartenberger 1998<$REFLINK> ).
The mouse-related clade is divided into three subclades of morphologically distinct rodents, of which the relationships are unclear. Two subclades have been recognized earlier: (1) Myodonta, i.e., murids and jerboas (Luckett and Hartenberger 1985<$REFLINK> ; Nedbal, Honeycutt, and Schlitter 1996<$REFLINK> ; Huchon, Catzeflis, and Douzery 1999<$REFLINK> ; DeBry and Sagel 2001<$REFLINK> ), and (2) Anomaluromorpha, i.e., scaly-tailed flying squirrels and springhares (Bugge 1985<$REFLINK> ; Lavocat and Parent 1985<$REFLINK> ; Montgelard et al. 2002a<$REFLINK> ). The third subclade unexpectedly clusters beavers with pocket mice and pocket gophers (Adkins et al. 2001<$REFLINK> ). Here, Myodonta is highly supported (PPNUC and PPPROT = 1.00; BPML = 81, PSH < 0.03: fig. 2 , table 4 ; RPPROT = 57). With the vWF + IRBP combined data set, Anomaluromorpha is highly supported (BPML > 86, whatever the reconstruction method used; data not shown). Alternative hypotheses do not appear significantly less likely (PSH < 0.10), but the analysis is based on a shorter data set. The Geomyoidea plus Castoridae clade is moderately to strongly supported (PPNUC and PPPROT = 1.00; BPML = 63; RPPROT = 41; PSH < 0.13). This enigmatic clade has never been clearly suggested by morphological studies but had already been evidenced in molecular studies based on the GHR gene (Adkins et al. 2001<$REFLINK> ), on a combined data set of 15 genes (Murphy et al. 2001a<$REFLINK> ), and on 12S rRNA and cytochrome b sequences (Montgelard et al. 2002b<$REFLINK> ). In contrast, DeBry and Sagel (2001)<$REFLINK> found different results with the IRBP gene: Geomyoidea are the sister clade of Ctenohystrica and Castoridae clusters with Sciuroidea—an observation which might be the consequence of a smaller rodent sampling (i.e., Anomaluromorpha are missing) because our IRBP tree clusters Geomyidae and Castoridae (BPML = 57; fig. 1 ).
The lack of resolution for the branching order of the three clades Anomaluromorpha, Castoridae + Geomyidae, and Myodonta is illustrated by the ML and Bayesian analyses. Anomalurus connects either with Castor + Geomyidae (PPNUC = 0.56 and PPPROT = 0.33; BPML = 29; fig. 2 ) or with Myodonta (PPNUC = 0.40 and PPPROT = 0.36; BPML = 37).
The grouping of Ctenodactylidae and Hystricognathi in a Ctenohystrica clade has been strongly supported by the vWF gene (Huchon, Catzeflis, and Douzery 2000<$REFLINK> ) and the GHR gene (Adkins et al. 2001<$REFLINK> ). Our results confirm these observations (PPNUC and PPPROT = 1.00; BPML = 100; RPPROT = 93), and breaking the monophyly of Ctenohystrica is a significantly worse alternative (PSH < 0.03; table 4 ), as is the case for constraining Ctenodactylidae to branch with other sciurognaths (PSH < 0.03). Among Hystricognathi, the relationships obtained agree with the conclusions of Huchon and Douzery (2001)<$REFLINK> . Hystricognathi is divided into three clades: Hystricidae, Phiomorpha s. s. (i.e., Bathyergidae, Thryonomyidae, and Petromuridae) and Caviomorpha (i.e., Octodontoidea, Cavioidea, Erethizontoidea, and Chinchilloidea). It is interesting to note that the increase of sequence length favors a basal position of Hystricidae (PPNUC and PPPROT = 1.00; BPML = 87; RPPROT = 74), but alternative hypotheses are not significantly less likely (PSH < 0.14; table 4 ).
Robustness of Alternative Topologies Suggested by Morphology
Following Tullberg (1899)<$REFLINK> , rodents have been divided into two suborders—Sciurognathi and Hystricognathi. According to our molecular data, Hystricognathi remains a valid clade (BPML = 100; fig. 2 , RPPROT = 93), but the monophyly of Sciurognathi is statistically rejected (see earlier; table 4 ). An alternative classification divided rodents into Myomorpha, Sciuromorpha, and Hystricomorpha (Brandt 1855<$REFLINK> ). The contents of these groups changed according to the authors, and we here follow the classification of McKenna and Bell (1997)<$REFLINK> . Constraining the monophyly of Myomorpha, i.e., clustering Muridae + Dipodidae with Geomyoidea and Gliridae, is significantly less likely than the best ML topology with our A2AB + IRBP + vWF nucleotide and protein data (PSH < 0.05; table 4 ). The grouping of Aplodontidae, Sciuridae, and Castoridae into Sciuromorpha is marginally rejected (PSH < 0.10). Hystricomorpha actually corresponds to hystricognaths and thus appears monophyletic. These results suggest that the current subordinal classification of rodents should be thoroughly revised.
Quartet Dating of Divergence Times
Impact of the Choice of Markers for Dating Purposes
The combination of the independent paleontological calibration points, D1–D7 (table 1 and fig. 2 ), allowed the construction and evaluation of 59 quartets of species for A2AB, IRBP, vWF, and their combination. Among them, only 11 quartets fitted the two-rate constrained model for the three genes. Divergence times were estimated by quartet dating for the splits between the mouse-related clade and Ctenohystrica, the rodent superfamilies, the glires, and the Laurasiatheria versus Euarchonta + Glires (fig. 3 ). The results illustrate the impact of the choice of calibration points on the date estimates (cf. Huchon, Catzeflis, and Douzery 2000<$REFLINK> ). For example, A2AB estimates the divergence between Laurasiatheria and Euarchonta + Glires at 79.5 MYA with the quartet Lama-Physeter versus Marmota-Aplodontia (fig. 3 , square in quartet 11), whereas it is 111.9 Myr with the quartet Lama-Sus versus Echimys-Cavia (fig. 3 , square in quartet 8). There also is an impact of the choice of the gene on the dating results. Two different genes can lead to very different date estimates, even for the same quartet of species. For example, Mus-Rattus versus Echimys-Erethizon gives a date of 42.2 Myr with vWF (fig. 3 , circle in quartet 1), against 94.2 Myr with A2AB (fig. 3 , square in quartet 1). In this case, even the confidence intervals do not overlap: 34.3–53.2 Myr (vWF) against 66.9–137.3 Myr (A2AB). Most of the A2AB estimates appear to give older dates and larger confidence intervals (fig. 3 ), but this might just reflect the small number of quartets considered. Even when the medians of the quartet dates are compared between genes, the dating results remain divergent. For example, the median date for the divergence between Laurasiatheria and Euarchonta-Glires is 98.9 Myr for A2AB, 81.8 Myr for IRBP, and 76.9 Myr for vWF over 15, 24, and 5 quartets, respectively (data not shown). Because the three genes have similar lengths, this indicates that they contain quite different dating information. Therefore, because dating estimation becomes more accurate with longer sequences (e.g., Rambaut and Bromham 1998<$REFLINK> ), we combined the three nuclear genes. As expected, the distribution of the quartet estimations clearly becomes narrower than with the single genes (fig. 3 ). Confidence intervals are also up to four times smaller when the three genes are concatenated, as for example illustrated by the comparison between A2AB versus combination of markers (fig. 3 , square vs. diamond in quartet 1).
Molecular Dating Using Combined Markers
The timing of the diversification of extant placental orders is heavily debated. According to paleontology, they diverged in the Paleocene, 65–55 MYA (e.g., Alroy 1999<$REFLINK> ). Molecules rather suggest that placentals were already diversified in the Cretaceous (review in Bromham, Phillips, and Penny 1999<$REFLINK> ). However, various molecular studies give deviating datings. Their estimations are generally based on few calibration points and use either distance (e.g., Hedges et al. 1996<$REFLINK> ; Janke, Xu, and Arnason 1997<$REFLINK> ; Springer et al. 1997b;<$REFLINK> Kumar and Hedges 1998<$REFLINK> ; Waddell et al. 1999<$REFLINK> ; review in Bromham, Phillips, and Penny 1999<$REFLINK> ), or Bayesian (Cao et al. 2000<$REFLINK> ), or ML (Eizirik, Murphy, and O'Brien 2001<$REFLINK> ) approaches.
Among the 59 quartets of species involving calibration points D1–D7, only 41 fitted the two-rates model for the concatenation of the three markers. Among them, respectively, 1, 3, 6, 7, 17, and 7 allowed to estimate the age of the split between Sciuroidea and Gliridae (not shown), Muridae and Ctenohystrica (not shown), rodent superfamilies, glires, Laurasiatheria and Euarchonta + Glires, and placentals (fig. 4 ). Our results, being based on ML estimates and several independent calibrating pairs of taxa, are congruent with paleontology about ages of rodent radiation and Glires divergence. They support a rodent diversification at the transition between Paleocene and Eocene, 55.8 MYA (median of confidence intervals: 49.4–63.7; fig. 4 ). Paleontological studies place the rodent radiation at 55 MYA at the Paleocene-Eocene boundary (Hartenberger 1998<$REFLINK> ). The split between rodents and lagomorphs is estimated to be 64.5 MYA (57.3–73.3), just at the Cretaceous-Tertiary boundary. These results indicate that rodents and lagomorphs diverged only at the beginning of the Tertiary (fig. 4 ).
Molecular dating indicates a Late Cretaceous divergence, 83.2 MYA (74.1–94.4; fig. 4 ), for the last common ancestor of Glires and Laurasiatheria (represented here by cetartiodactyls). This date is congruent with paleontological studies: ungulates have been linked to the 85 MYA zhelestids (Archibald 1996<$REFLINK> ), and Glires have been strongly related to the zalambdalestids whose oldest fossil is estimated to be 90 MYA (Archibald, Averianov, and Ekdale 2001<$REFLINK> ). This estimate also agrees with the quartet dating (63.5–95.3 Myr) of Eizirik, Murphy, and O'Brien (2001)<$REFLINK> using independent calibration points. First divergences within placentals seem to occur 101.2 MYA (88.5–116.4; fig. 4 ), at the transition between Early and Late Cretaceous, but there are huge variations between ages provided by different quartets. Additional calibration points within Laurasiatheria are likely to stabilize these estimates (Eizirik, Murphy, and O'Brien 2001<$REFLINK> ). These molecular dates should also be confirmed in the future using other dating methods, such as for example the Bayesian approach of Thorne, Kishino, and Painter (1998)<$REFLINK> .
Our dating results on Glires agree with the hypothesis that Mesozoic placental divergences only involved the first basal clades of the placental tree, whereas the diversification of extant lineages occurred after the K-T boundary (Hedges et al. 1996<$REFLINK> ; Alroy 1999<$REFLINK> ; Eizirik, Murphy, and O'Brien 2001<$REFLINK> ; Madsen et al. 2001<$REFLINK> ). Dating involving other placental clades are required to verify whether the pattern observed for rodents (i.e., a diversification only after the K-T boundary) is valid for other placental orders too.
Dan Graur, Reviewing Editor
Present address: Molecular Evolution Laboratory, Faculty of Bioscience and Biotechnology, Tokyo Institute of Technology, Nagatsuta-cho, Midori-ku, Yokohama, Japan
Present address: Bioinformatics, GlaxoSmithKline, UP1345, Collegeville Pennsylvania
Keywords: Rodentia Glires Eutheria vWF IRBP A2AB phylogeny molecular dating
Address for correspondence and reprints: Emmanuel J. P. Douzery, Laboratoire de Paléontologie, Paléobiologie et Phylogénie-CC064, Institut des Sciences de l'Evolution UMR 5554/CNRS, Université Montpellier II, Place E. Bataillon, 34 095 Montpellier Cedex 05, France. email@example.com
This work would not have been possible without the essential contribution of all collectors who provided mammalian tissues now housed in the collection of Montpellier: Marina Baskevich (for Dryomys), Dona L. Dittmann (Thomomys; Collection of Genetic Resources, Louisiana State University Museum of Natural Sciences), Jean-Marc Duplantier (Lepus), Chris G. Faulkes (Bathyergus), Piotr Gambarian (Dipus), Laurent Granjon (Thryonomys, Lepus), Jean-Claude Gautun (Anomalurus), Patrick Gouat (Massoutiera), Jack Hayes and Marilyn Banta (Dipodomys), Robert S. Hoffmann (Marmota), Reginald Hoyt and John Trupkiewicz (Petromus; Zoological Society of Philadelphia), John A. W. Kirsh (Erethizon), Vincent Laudet (Chinchilla), Conrad Matthee (Pedetes), David Nolta (Aplodontia), E. Pelé and Vitaly Volobouev (Tachyoryctes), Thierry Petit (Macropus; Zoo de La Palmyre, France), Philippe Perret (Cavia), Rob Stuebing (Trichys), and the Zoo du Lunaret in Montpellier (Castor).
We wish to thank Ron DeBry for sharing the Dipodomys spectabilis vWF sequence, Christophe Douady for providing the Ochotona IRBP sequence, Jean-Louis Hartenberger, Jean-Jacques Jaeger, and Laurent Marivaux for useful paleontological discussions and comments, Jacques Demaille and Denis Pugnère (Institut de Génétique Humaine de Montpellier, CNRS UPR 1142) for computing facilities, and Emma Teeling for bench advice with regard to the automatic sequencer.
This work was supported by ACC-SV7 (Réseau National de Biosystématique), ACC-SV3 (Réseau coordonné par D. Mouchiroud), European Community TMR Network “Mammalian phylogeny” FMRX-CT98-0221 (to M.S., F.C., and W. de J.), and the Genopole Région Montpellier Languedoc-Roussillon and the Action inter-EPST Bioinformatique 2000–2002 (to E.D.). D.H. acknowledges the financial support of a Lavoisier grant from the French Ministry of Foreign Affairs. This is contribution number 2002-008 of the Institut des Sciences de l'Evolution de Montpellier (UMR 5554-CNRS).