For the first time the phylogenetic relationships of early eureptiles, consisting of captorhinids, diapsids, and protorothyridids, are investigated in a modern phylogenetic context using both parsimony and Bayesian approaches. Ninety parsimony-informative characters and 25 taxa were included in the analyses. The Bayesian analysis was run with and without a gamma-shape parameter allowing for variable rates across characters. In addition, we ran two more Bayesian analyses that included 42 autapomorphies and thus parsimony-uninformative characters in order to test the effect of variable branch lengths. The different analyses largely converged to the same topology, suggesting that the “protorothyridid” Coelostegus is the sister taxon of all other eureptiles and that the remaining “protorothyridids” are paraphyletic. Also, there is a close relationship between diapsids and Anthracodromeus, Cephalerpeton, and Protorothyris, a grouping of Thuringothyris with captorhinids, and a variable position of the “protorothyridids” Brouffia, Hylonomus, and Paleothyris. The lack of resolution in some parts of the tree might be due to “hard polytomies” and short divergence times between the respective taxa. The tree topology is consistent with the hypothesis that the temporal fenestrations of diapsid reptiles appear to be the consequence of a more lightly built skeleton, indicating a significant ecological shift in the early stages of diapsid evolution. Bayesian analysis is a very useful additional approach in studies of fossil taxa in which more traditional statistical support like the bootstrap is often weak. However, the exclusive use of the Mk model appears suitable only if autapomorphic characters are included, whereas the Mk+gamma model performed well with or without autapomorphies.
The Eureptilia are one of the most important clades of Amniota. Among present-day vertebrates, lizards, snakes, the Tuatara, crocodiles, birds, and probably also turtles belong to Eureptilia. However, despite significant improvements in our understanding of the origin of amniotes (Reisz, 1997), little is known about the early evolutionary history of eureptiles. Basal eureptiles have traditionally been subdivided into three major assemblages (Carroll, 1988): (a) Diapsida, which include all modern reptiles, birds, and popular extinct groups such as dinosaurs; the fossil record of this major clade extends across 305 million years of history into the Late Carboniferous; (b) Captorhinidae, a Late Paleozoic clade that includes very large and possibly herbivorous animals; it is the first group of reptiles to diversify extensively and has a cosmopolitan distribution; (c) Protorothyrididae, a Late Carboniferous to Early Permian assemblage of relatively small and rather generalized reptiles from Laurasia that includes some of the oldest-known amniotes. Our understanding of the early history of Eureptilia is intimately tied to the interrelationships of these assemblages.
Because of their stratigraphic position and generalized morphology, protorothyridids were originally considered ancestral to all other amniote lineages (Carroll, 1982), but today this view is no longer accepted (Heaton and Reisz, 1986; Reisz, 1997). Alternatively, there have been suggestions that Protorothyrididae is the sister taxon of diapsids (Fig. 1; Heaton and Reisz, 1986; Laurin and Reisz, 1995; DeBraga and Rieppel, 1997). The problem remains that, in contrast to frequent investigations on the phylogeny of captorhinids and basal diapsids (Benton, 1985; Carroll and Currie, 1991; Evans, 1988; Dilkes, 1998; Laurin, 1991; Dodick and Modesto, 1995; Modesto and Smith, 2001; Müller, 2003, 2004; Müller and Reisz, 2005; Müller et al., 2006), the phylogenetic relationships of protorothyridids have been ignored. In fact, there is no large-scale investigation in which more than one protorothyridid taxon was included. As a result there is no information about the relationships within the group, let alone any support for their monophyly, and their affinities to other early eureptiles remain uncertain. When using paleontological data sets for phylogenetic studies, there are only limited possibilities of testing the consistency of the result. This is especially problematic when the relationships of basal clades are investigated, because the relatively plesiomorphic nature of many taxa and thus the low number of discriminating characters make a proper phylogenetic assessment difficult: these taxa had only a restricted time of independent evolution, so the splits are fairly recent relative to each other. The frequently poor fossil preservation adds further ambiguity and may contribute to low bootstrap support of the most parsimonious tree topology (Müller, 2004).
Alternative lines of evidence would be desirable, but until recently paleontologists were able to use only parsimony as a method of phylogenetic analysis, which is a strong disadvantage in comparison to other fields such as molecular biology. The recent advent of Bayesian analysis now allows for additional possibilities of estimating phylogenies (Yang and Rannala, 1997; Huelsenbeck and Ronquist, 2001), and might also be a promising approach for morphological studies (Lewis, 2001; Nylander et al., 2004). The present contribution specifically addresses the question of whether the Bayesian approach is also useful for phylogenetic analysis of basal fossil taxa.
Material and Methods
In order to investigate the relationships within early eureptiles, we constructed a data set of 90 morphological, parsimony-informative characters. The data set represents an expansion of previously used data matrices for the evaluation of captorhinid phylogeny (Dodick and Modesto, 1995; Modesto and Smith, 2001; Müller and Reisz, 2005; Müller et al., 2006) and was combined with new characters as well as modified characters from Laurin and Reisz (1995) (see Appendix 1, http://systematicbiology.org). The following taxa were entered in the analysis: Outgroups: 1) Diadectomorpha, 2) Seymouriamorpha, 3) Caseidae, 4) Mesosauridae, 5) Millerettidae, 6) Procolophonidae; Captorhinidae: 7) Romeria texana, 8) Protocaptorhinus, 9) Rhiodenticulatus, 10) Captorhinus laticeps, 11) Captorhinus aguti, 12) Labidosaurus, 13) Labidosaurikos, 14) Saurorictus, 15) Concordia; Protorothyrididae: 16) Protorothyris, 17) Paleothyris, 18) Cephalerpeton, 19) Anthracodromeus, 20) Brouffia, 21) Coelostegus, 22) Hylonomus; Eureptilia inc. sed.: 23) Thuringothyris; Diapsida: 24) Petrolacosaurus, 25) Araeoscelis. Scoring information was based on personal observations and on the literature (Berman et al., 2004; Brough and Brough, 1967; Carroll, 1963, 1969; Carroll and Baird, 1972; Clark and Carroll, 1973; Dilkes and Reisz, 1986; Gow, 1972; Heaton, 1979; Heaton and Reisz, 1980; Laurin, 1996; Laurin and Reisz, 1995; Modesto, 1999; Reisz, 1981; Reisz and Baird, 1983; Reisz et al., 1984; Sumida, 1987, 1989, 1991; Sumida et al., 1992; Vaughn, 1955). Whenever possible the relevant specimens were reexamined, and when new specimens of particular taxa were available, they were included in the character definitions and descriptions. In particular, new specimens of Protorothyris archeri were used in the analysis, which were helpful for corroboration of the description by Clark and Carroll (1973) and the evaluation of the relationships between the antorbital bones of the skull (no. 80).
The complete data matrix can be found in Appendix 2 (http://systematicbiology.org).
We used two different types of analyses for the investigation of the phylogeny. First, we ran a parsimony analysis in PAUP*4.0 (Swofford, 2001) using the branch-and-bound search option with multistate taxa interpreted as polymorphisms. Second, we analyzed our data set by using Bayesian analysis in MrBayes V3.1.1 (Huelsenbeck and Ronquist, 2005). In accordance with Lewis (2001), we applied the Mk model to our data set, which assumes that a character can change its state at any time with equal probability for all instantaneous time intervals along the branch (the datatype was set as “standard,” which allows for a variable number of character states as it is necessary for morphological data). In a second approach we also employed a gamma distribution within our model (four rate categories), thus allowing the rate of character change to be different across characters (Nylander et al., 2004; Wiens et al., 2005). We ran 5,000,000 generations (four chains, two independent runs) with a tree sampled every 100 generations; the first 5000 trees, the “burn-in,” were disregarded for the final evaluation of the results.
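The sampling settings reported above imply a simple count of sampled and retained trees. The following is a minimal sketch of that arithmetic, not a reproduction of the MrBayes run; the function name is hypothetical, and it assumes that the burn-in of 5000 trees applies to each run separately and that the starting tree at generation 0 is not counted.

```python
# Sampling arithmetic for one MCMC run, using the settings reported in the text.
GENERATIONS = 5_000_000   # generations per run (from the text)
SAMPLE_EVERY = 100        # one tree sampled every 100 generations (from the text)
BURN_IN_TREES = 5_000     # trees discarded as burn-in (from the text)


def sampling_summary(generations, sample_every, burn_in):
    """Return (sampled trees, retained trees, burn-in fraction) for one run.

    Assumption: the starting tree at generation 0 is not counted as a sample.
    """
    sampled = generations // sample_every
    retained = sampled - burn_in
    return sampled, retained, burn_in / sampled


sampled, retained, fraction = sampling_summary(
    GENERATIONS, SAMPLE_EVERY, BURN_IN_TREES
)
print(f"sampled per run:  {sampled}")
print(f"retained per run: {retained}")
print(f"burn-in fraction: {fraction:.0%}")
```

Under these assumptions the burn-in corresponds to the first 10% of the sampled trees in each run.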
In addition, we performed a third type of Bayesian analysis in accordance with recommendations by Lewis (2001), in which parsimony-uninformative characters (autapomorphies) were included in order to test whether the different branch lengths have an effect on the tree topology. Personal observations and the literature listed in the parsimony section (see above) were used to identify 42 autapomorphic characters (see Appendices 1 and 2, http://systematicbiology.org), which means that we used taxon diagnoses as well as autapomorphic features listed in phylogenetic analyses for determination. In the case of caseids and procolophonids, characters defining synapsids and procolophonoids were also included. Again, we ran the data set with and without a gamma shape parameter. Although the different settings for the Bayesian analyses led to slightly different topologies (see below), the independent runs of each Bayesian analysis converged to the same topology.
The nexus file of the data matrix and the trees are stored at www.treebase.org (study accession number = S1462, matrix accession number = M2628).
The parsimony analysis resulted in four most parsimonious trees (Fig. 2; TL = 252; CI = 0.4405; HI = 0.6032; RI = 0.6493; RC = 0.2860). Within eureptiles, Coelostegus is the sister taxon of all other taxa, and diapsids and the remaining “protorothyridids” form the sister clade of the monophyletic Captorhinidae and Thuringothyris. However, the relationships between diapsids and “protorothyridids” are variable in the four trees, with three favoring a sister-group relationship between diapsids and Cephalerpeton/Anthracodromeus/Protorothyris to the exclusion of the remaining taxa, whereas in one tree Brouffia, Hylonomus, and Paleothyris are nested together with Protorothyris and Anthracodromeus, forming the sister clade of diapsids and Cephalerpeton.
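The reported tree statistics can be cross-checked against one another, because the rescaled consistency index is by definition the product of the consistency index and the retention index. The short sketch below only verifies that relationship for the values given above; the function name is hypothetical, and nothing here recomputes the indices from the data matrix.

```python
def rescaled_consistency_index(ci, ri):
    """Rescaled consistency index, defined as the product CI * RI."""
    return ci * ri


# Values as reported for the most parsimonious trees.
CI, RI, RC_REPORTED = 0.4405, 0.6493, 0.2860

rc = rescaled_consistency_index(CI, RI)
print(f"RC = {rc:.4f} (reported: {RC_REPORTED:.4f})")
```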
The monophyly of eureptiles is supported by the following unequivocal synapomorphies: (1) the narrow iliac blade (60); (2) the absence of a supratemporal/postorbital contact (71); and (3) the strong ventrolateral constriction of the dorsal centra (81).
The monophyly of all remaining eureptiles other than Coelostegus received unequivocal support by (1) the bilaterally embayed posterior skull margin (16); (2) the small supratemporal (47); (3) the participation of both parietal and supratemporal in the formation of the posterolateral corner of the skull roof (48); (4) the squamosal contribution to the post-temporal fenestra (49); and (5) the nasal being shorter than the frontal (74).
The grouping of Anthracodromeus, Brouffia, Cephalerpeton, Hylonomus, Paleothyris, Protorothyris, and diapsids is unequivocally supported by (1) the slender and lightly built stylo- and zeugopodia (51), and (2) the long and slender manus and pes (52). The monophyly of diapsids received unequivocal support by (1) the narrow and tongue-like pterygoid transverse flange (18); (2) the presence of a deep ventral groove on the parasphenoid (19); (3) the presence of swollen dorsal neural arches (50); (4) the presence of alternation in dorsal neural spine height (56); (5) the presence of an upper temporal fenestra (62); (6) the equal length of humerus and radius (77) and (7) tibia and fibula (78); (8) the short 4th metatarsal relative to the tibia (82); (9) the short 5th metatarsal relative to the 4th metatarsal (84); (10) the elongate cervical centra (85); and (11) the well-developed suborbital fenestra (86).
The grouping of captorhinids with Thuringothyris is unequivocally supported by (1) the well-developed suture between lacrimal and jugal (4); (2) the anterior position of the pineal foramen (13); (3) the absence of posterolateral frontal processes (57); and (4) the anterior extent of the jugal beyond the anterior orbital margin (64). Relaxing parsimony by one step produces 72 trees in which most of the major clades remain stable: Coelostegus remains the basalmost taxon and Thuringothyris forms a clade with captorhinids, and only the relationships between diapsids and the remaining “protorothyridids” collapse in 3% of the trees. Bootstrap support (1000 replicates), however, is low for many clades, and only some more inclusive clades such as derived captorhinids or diapsids receive support values above 50% (Fig. 2).
For the Bayesian analysis, when the topology of the highest posterior probability was considered, the resulting topologies of all four Bayesian runs are relatively similar to the parsimony result. The only differences within eureptiles, though showing weak support, are a switch in the positions of Protocaptorhinus and Rhiodenticulatus, and a clustering of diapsids with Protorothyris/Anthracodromeus/Cephalerpeton to the exclusion of Hylonomus, Brouffia, and Paleothyris, which group with captorhinids and Thuringothyris. In all analyses, the monophyly of the latter two taxa is strongly supported with a posterior probability above 0.9, and the monophyly of diapsids with a value of 1 (Fig. 3).
Despite these overall similarities, the four Bayesian analyses show several differences. In the run without autapomorphies and gamma shape parameter (Fig. 3a), mesosaurs are the sister taxon of eureptiles, but the posterior probability for this grouping is very low (0.31). Also, this analysis shows only poor support for the monophyly of Coelostegus and remaining eureptiles, having a value of 0.41, and the dichotomy between diapsids/Protorothyris/Anthracodromeus/Cephalerpeton and remaining eureptiles shows a posterior probability of only 0.49. In the second run with a gamma distribution included (Fig. 3b), mesosaurs are the sister taxon of both parareptiles and eureptiles, as in all following runs. The basal position of Coelostegus shows a posterior probability of 0.6, and the dichotomy between diapsids/Protorothyris/Anthracodromeus/Cephalerpeton and remaining eureptiles is moderately well supported by a value of 0.92. The Bayesian run including autapomorphies but no gamma-shape parameter (Fig. 3c) is largely similar to the second run, but presents the highest posterior probability for the monophyly of diapsids and Protorothyris/Anthracodromeus/Cephalerpeton (0.56), as well as for Paleothyris and Thuringothyris/captorhinids (0.56). In the Bayesian run with both autapomorphies and a gamma distribution (Fig. 3d), the support for the monophyly of Coelostegus and all other eureptiles is low (0.45).
In order to decide which of the four Bayesian analyses fits the fossil data best, we compared the different harmonic means of the log-likelihood. The harmonic means are important to determine if the addition of rate variation improved the fit of the model to the data (see e.g., Wiens et al., 2005). We calculated a Bayes factor for the four analyses, which is two times the difference in the harmonic means of the log-likelihoods; a value of > 10 is usually considered strong support (Kass and Raftery, 1995). The total harmonic means and the resulting Bayes factors are: without/with gamma and without autapomorphies: −901.82 and −886.69, Bayes factor = 30.26; without/with gamma and with autapomorphies: −1142.54 and −1107.63, Bayes factor = 69.82. On the basis of these values, the implementation of a gamma shape parameter appears to be a better choice for the present data set.
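The Bayes factors quoted above follow directly from the stated rule of doubling the difference between the harmonic means of the log-likelihoods. The sketch below reproduces that arithmetic from the reported values; the function and variable names are illustrative only, and the threshold of 10 is the one cited from Kass and Raftery (1995).

```python
def bayes_factor(lnl_without_gamma, lnl_with_gamma):
    """Two times the difference in harmonic-mean log-likelihoods.

    A positive value favors the model with the gamma shape parameter.
    """
    return 2.0 * (lnl_with_gamma - lnl_without_gamma)


# Harmonic means of the log-likelihoods as reported in the text.
HARMONIC_MEANS = {
    "without autapomorphies": (-901.82, -886.69),
    "with autapomorphies": (-1142.54, -1107.63),
}
STRONG_SUPPORT = 10.0  # threshold cited in the text

results = {}
for label, (without_gamma, with_gamma) in HARMONIC_MEANS.items():
    bf = bayes_factor(without_gamma, with_gamma)
    results[label] = bf
    verdict = "strong" if bf > STRONG_SUPPORT else "not strong"
    print(f"{label}: Bayes factor = {bf:.2f} ({verdict} support for gamma)")
```

Both comparisons exceed the threshold by a wide margin, which is the basis for preferring the gamma model in either data set.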
Despite some variability in the branch lengths there are no dramatic differences between the four types of Bayesian analyses (Fig. 3). Also, there appears to be no general correlation between higher posterior probabilities and longer branches. For example, in the Bayesian run without autapomorphies and gamma-shape parameter, the poorly supported monophyly of diapsids/Protorothyris/Anthracodromeus/Cephalerpeton and Thuringothyris/captorhinids shows a higher mean branch length (0.126534) than in the analysis that included autapomorphies and provided much stronger support for the node (0.077037). By contrast, in the analysis without autapomorphies but with a gamma distribution the same node shows both strong support and a longer branch (0.153269), whereas in the run including autapomorphies and a gamma distribution the node has a low value despite high posterior probabilities (0.083179). These findings indicate that there is variation in the branch lengths depending on the implemented model, but that the inclusion of autapomorphies and/or a gamma distribution does not necessarily result in longer internal branches.
Using Bayesian Analysis for Paleontological Data Sets
There have been only a few investigations dealing with Bayesian analysis of morphological characters (Lewis, 2001; Nylander et al., 2004; Lee, 2005; Wiens et al., 2005), and there is only one study that deals explicitly with fossil taxa (Snively et al., 2004). Any result should therefore be treated with caution because the use of Bayesian methods is still in its infancy for morphological data. In contrast to the above investigations, the present study examined a very basal fossil clade in which the relationships are poorly known and have never been studied in a modern phylogenetic context.
The Bayesian analysis of our data set supports a topology that is relatively similar to the parsimony result. This could not be necessarily expected because Bayesian methodology, like any likelihood approach, is different from parsimony in using a specified model as optimality criterion. The Mk model used in the present investigation is a generalized JC69 model (Jukes and Cantor, 1969), which originally was developed for molecular data and assumes equal probabilities for all types of nucleotide substitutions. In contrast to some likelihood models whose results are virtually identical to those of parsimony analyses (G90 and TS97; Goldman, 1990; Tuffley and Steel, 1997), the Mk model does not essentially favor those trees that are also the most parsimonious. One of the most important differences between the Mk model and the two ‘parsimony models’ is that the former does not include incidental parameters, i.e., parameters that appear in the likelihood function for only some, but not all characters. Instead, the Mk model relies on structural parameters, which apply to all characters used in the likelihood analysis (Lewis, 2001).
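To make the relationship between the Mk model and JC69 concrete, the transition probabilities of a k-state character with equal rates between all states can be written in closed form. The following is a minimal sketch under the assumption that the branch length v is measured in expected changes per character; the function name is hypothetical and the code is not taken from MrBayes.

```python
import math


def mk_transition_probs(k, v):
    """Return (p_same, p_diff) for a k-state Mk character on a branch.

    k: number of character states (k >= 2).
    v: branch length in expected changes per character (assumption).
    p_same is the probability of ending in the starting state,
    p_diff the probability of ending in one particular other state.
    """
    if k < 2:
        raise ValueError("the Mk model needs at least two states")
    decay = math.exp(-k * v / (k - 1))
    p_same = 1.0 / k + (k - 1) / k * decay
    p_diff = 1.0 / k - decay / k
    return p_same, p_diff


# With k = 4 the expressions reduce to the JC69 formulas for nucleotides.
v = 0.1  # arbitrary illustrative branch length
p_same, p_diff = mk_transition_probs(4, v)
jc69_same = 0.25 + 0.75 * math.exp(-4.0 * v / 3.0)
print(f"Mk (k=4): p_same = {p_same:.6f}, JC69: {jc69_same:.6f}")

# A binary morphological character uses the same formula with k = 2.
print(mk_transition_probs(2, v))
```

Because the single rate applies to every character, it is a structural parameter in the sense described above; the gamma extension relaxes this by letting the effective branch length differ among rate categories.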
It should be noted that when translated into a parsimony tree the topologies of the different Bayesian runs are three to four steps longer than the topology of the parsimony analysis, which also results in a change of the apomorphies supporting each node. For example, the Bayesian tree without autapomorphies but with a gamma shape parameter (Fig. 3b) has a tree length of 256 steps, and the characters which in the parsimony tree support the grouping of Brouffia, Hylonomus, and Paleothyris with diapsids and the other “protorothyridids” (51, 52) turn out to be equivocal and positioned elsewhere in the tree depending on the optimization. Instead, the monophyly of the Brouffia, Hylonomus, and Paleothyris with captorhinids and Thuringothyris is unequivocally supported by the presence of low neural spines (53), which is a character that, like other vertebral features, is important for our understanding of early eureptile evolution (see also Müller et al., 2006, for a discussion) and should be taken seriously in hypotheses on the phylogeny of this clade. In the chosen Bayesian example, characters 51 and 52, which pull Brouffia, Hylonomus, and Paleothyris away from captorhinids in the parsimony analysis, have a lower consistency index than in the parsimony tree (0.333 as compared to 0.5), and it seems that they were reconstructed with a higher evolutionary rate and thus became less decisive. Clearly, Bayesian analysis can help to decide if slightly longer trees of a parsimony analysis should be disregarded or not, and which of the longer trees should be selected for comparisons.
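The drop in consistency index from 0.5 to 0.333 can be read as one additional step for the character concerned. The sketch below illustrates this with the standard definition of the per-character consistency index (minimum possible steps divided by observed steps); it assumes, purely for illustration, that the characters are binary with a minimum of one step, which the text does not state.

```python
from fractions import Fraction


def character_ci(min_steps, observed_steps):
    """Per-character consistency index: minimum steps / observed steps."""
    return Fraction(min_steps, observed_steps)


MIN_STEPS = 1  # assumption: a binary character needs at least one change

ci_parsimony = character_ci(MIN_STEPS, 2)  # two steps on the parsimony tree
ci_bayesian = character_ci(MIN_STEPS, 3)   # three steps on the Bayesian tree
print(f"parsimony tree: CI = {float(ci_parsimony):.3f}")
print(f"Bayesian tree:  CI = {float(ci_bayesian):.3f}")

# Tree lengths reported in the text: 252 steps (parsimony), 256 (Fig. 3b).
extra_steps = 256 - 252
print(f"extra steps on the Bayesian topology: {extra_steps}")
```

Under this assumption each of the two characters changes once more on the Bayesian topology than on the most parsimonious trees.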
Several nodes in the analysis present high posterior probabilities but low bootstrap support. For example, the nodes Thuringothyris/Captorhinidae and Concordia/remaining captorhinids show bootstrap values of 34 and 47, respectively, but posterior probabilities above 0.9. On the other hand, the bootstrap support for the monophyly of Captorhinus laticeps and C. aguti is higher than the posterior probabilities (98/0.85–0.94). There have been several studies in which the dramatic differences between bootstrap values and posterior probabilities were examined (Suzuki et al., 2002; Wilcox et al., 2002; Alfaro et al., 2003; Cummings et al., 2003; Lewis et al., 2005; Yang and Rannala, 2005). Bootstrap values are often considered to be too conservative whereas posterior probabilities are interpreted as overinflated, but so far there is no clear explanation for this discrepancy (Kelly, 2005). However, it should be emphasized that posterior probabilities and bootstrap values are very different measures of statistical support, and it is questionable whether they can be properly compared to each other. Furthermore, the above studies focused on molecular data only, mostly comparing maximum likelihood with Bayesian analysis, whereas a thorough treatment of morphological data still has to be completed. In addition to a critical evaluation of potential problems associated with the way MrBayes searches for trees and calculates the posterior probabilities, such a study should also consider missing characters in incompletely preserved fossil taxa and their influence on the statistical support.
In a likelihood approach like the Bayesian analysis, the length of the branch and thus the overall amount of evolutionary change is crucial for estimating the phylogenetic relationships (Lewis, 2001). As a result, autapomorphies can be an important factor for predicting a phylogeny under the likelihood criterion because it takes into account the evolutionary distance of a taxon from the node in which it is nested. This means that autapomorphies will always help provide a better estimate of the terminal branch lengths, which affects the likelihood for other characters in relation to a certain tree topology, and thus influences the preferred tree. In parsimony, on the other hand, it is not the branch length but the shortest number of steps leading to clades supported by shared derived character states that is most important. Autapomorphies are therefore discarded as uninformative. In the present investigation, the additional inclusion of autapomorphies in the Bayesian analysis resulted in sometimes dramatically different support values (Fig. 3), indicating, for example, that the monophyly of Coelostegus and all remaining eureptiles is not very stable and requires further investigation. It cannot be excluded that in different studies the inclusion of parsimony-uninformative characters might have an even stronger effect on the final result (see also Lewis, 2001, for a theoretical example). Morphologists and paleontologists have to deal frequently with clades showing a high number of convergent features that can obscure the ‘true’ phylogeny, well-known examples being fossil marine reptiles or burrowing squamates (Rieppel and Reisz, 1999; Rieppel and Kearney, 2001). Using autapomorphies in addition to the parsimony-informative characters within a Bayesian analysis might allow for a more thorough phylogenetic assessment of highly convergent taxa. 
Even though we are aware that the consideration of autapomorphies in a phylogenetic analysis using morphological characters is controversial, we think that the inclusion of all the available data can provide many useful insights, especially with respect to paleontological studies where the number of suitable characters is limited by preservational bias.
The Bayes factors calculated above indicated that the implementation of a gamma-shape parameter fits our data set best. However, the analysis without a gamma distribution but with autapomorphies was very similar to the two gamma runs in both topology and posterior probabilities. On the other hand, the analysis without gamma distribution and autapomorphies not only shows weaker support for several important nodes (Fig. 3a), but also suggests a sister-group relationship between mesosaurs and eureptiles that differs from what is generally accepted about early amniote relationships. Mesosaurs are usually considered either to group with parareptiles or to be the sister taxon of all other reptiles (Laurin and Reisz, 1995; Modesto, 1999). We therefore conclude that at least in the present case, the Mk model alone might not be suitable for the analysis of basal fossil taxa as long as there is no additional inclusion of parsimony-uninformative characters. In the long term, maximum likelihood approaches might be added as another line of evidence to complicated phylogenies involving morphological characters, but currently a true ML analysis for morphological data sets is impossible (see Lee et al., 2006). In light of the recent studies by Steel and Penny (2004, 2005) showing that under certain circumstances the trees derived from maximum likelihood and parsimony are similar, it might turn out that the results of a maximum likelihood analysis will resemble a parsimony tree more than a Bayesian approach. In contrast to the latter, maximum likelihood is not only an approximation to the ‘true topology’ consisting of a number of plausible trees, but presents a single, most likely tree topology depending on the implemented model. Future studies will hopefully shed more light on this issue.
Eureptilian Relationships and the Origin of Diapsids
An unexpected but very interesting result of our analyses is the position of Coelostegus as sister taxon of all other eureptiles. Coelostegus displays several important morphological characters that are different from the remaining in-group taxa, such as the elongated nasal and squamosal and the relatively large supratemporal, features that are absent in other basal eureptiles but not uncommon in synapsids or parareptiles. However, the lack of a contact between the supratemporal and the postorbital, which is also listed as one of the synapomorphies of eureptiles, is in our opinion strongly indicative of the eureptilian affinities of Coelostegus, because it is a feature that is not found elsewhere among basal amniotes. The present result emphasizes the necessity for a detailed re-investigation of Coelostegus.
In a recent study (Müller et al., 2006), Thuringothyris turned out to be the sister taxon of captorhinids, and in the present investigation the grouping of Thuringothyris and captorhinids is again well supported. The combination of plesiomorphic and derived character states makes this taxon a good example of how the origin of captorhinids must have occurred anatomically. For example, Thuringothyris shows unswollen neural arches, which is the plesiomorphic condition for eureptiles that became modified in captorhinids, but it also possesses the stout limb morphology characteristic of Captorhinidae.
Although Carroll (1982) considered protorothyridids to be ancestral to all other amniotes and thus to be paraphyletic, other studies (e.g., Heaton and Reisz, 1986; Boy and Martens, 1991) also suggested a potential paraphyly of Protorothyrididae, albeit nested within eureptiles. In the present study, the latter idea is supported due to the basal position of Coelostegus and the placement of diapsids within “protorothyridids.” In this context, Brouffia, Hylonomus, and Paleothyris appear to be the most difficult taxa with regard to their position in the tree. In the parsimony analysis they showed a variable placement within a clade including Anthracodromeus, Cephalerpeton, Protorothyris, and diapsids, whereas the Bayesian run provided support for a position outside of this node and a grading into the clade Thuringothyris/captorhinids. We selectively deleted the six “protorothyridid” taxa under consideration using the heuristic search option in PAUP* (random-stepwise addition, 10 replicates) in order to test their effect on the tree topology (Müller, 2004), and found that Anthracodromeus, Cephalerpeton, and Protorothyris always fall with diapsids, whereas this is not true for the other three taxa, which often grouped closer to Thuringothyris and captorhinids. One might be tempted to relate this ambiguous pattern to the number of missing parsimony-informative characters in each taxon, assuming that the three variable taxa also show the highest amount of missing information. However, the numbers differ significantly from each other, with low percentages in Paleothyris (5.6%) and Protorothyris (8.8%), and moderate to very high percentages in Brouffia (16.6%), Hylonomus (25.6%), Cephalerpeton (38.8%), and Anthracodromeus (50.0%). In addition, there are two other taxa with high numbers of missing data, Coelostegus (44.4%) and Saurorictus (54.5%, which is the highest percentage of all taxa), but both show a very stable placement in either analysis.
This finding is consistent with previous studies (Kearney, 2002; Müller, 2004), in which the specific character distribution of a taxon is considered to be more influential than the number of missing characters. Thus, Brouffia, Hylonomus, and Paleothyris appear to be a typical example of problematic basal clades; specifically, they apparently diverged from each other shortly before they became documented in the fossil record. In the present case, this view is supported by the close stratigraphic and regional association of Paleothyris and Hylonomus, both coming from Upper Carboniferous localities of Nova Scotia, Canada, and by the stratigraphic proximity of Brouffia from the Upper Carboniferous of Nyrany, Czech Republic, the overall difference in age being 7 to 10 million years. Anatomical reexaminations might help to clarify this issue, but it cannot be excluded that we are dealing with a “hard polytomy” whose effect on the tree topology is further worsened by the restricted number of morphological characters for which the respective taxa can be scored. However, if future investigations corroborate the close association of Anthracodromeus, Cephalerpeton, Protorothyris, and diapsids to the exclusion of Brouffia, Hylonomus, and Paleothyris, it might be possible to redefine the Protorothyrididae in a proper phylogenetic sense.
Diapsid (araeoscelid) reptiles are one of the most strongly supported clades in either analysis. As mentioned above, previous investigations (e.g., Heaton and Reisz, 1986; Gauthier et al., 1988; Laurin and Reisz, 1995; DeBraga and Rieppel, 1997) already suggested close affinities between “protorothyridids” and diapsids (Fig. 1). However, this assumption was only based on Paleothyris, because no other basal “protorothyridid” was included in the respective analyses. As shown in the apomorphy listings, one of the major characteristics of basal eureptiles is the presence of slender, elongate limbs. However, basal diapsids are the only eureptiles in which the lower limb also becomes significantly elongated. The lightly built skeleton and the gracile appearance of araeoscelids are caused by the modified limbs, the elongated neck, and the fenestrated skull, all characters being autapomorphic for the clade. Thus, the lateral openings in the skull so typical for diapsids might be a result of evolutionary changes that initially affected the entire skeleton, indicating a significant ecological shift in early diapsid/eureptilian evolution. Unfortunately, there is no functional investigation of the araeoscelid skeleton; trackways in the Permo-Carboniferous of Germany, which are most probably of araeoscelid origin (J. Boy, personal communication), could prove useful for a better understanding but still await detailed examination. The present evidence indicates that the fenestrated diapsid skull has resulted from evolution toward a lighter and less heavily ossified skeleton, which is in accordance with the hypothesis of Reisz (1981) and Carroll (1982), who suggested that the temporal fenestrations initially evolved to lighten the skull. Similar morphological phenomena can be found in snakes and theropod dinosaurs, which either reduced bones or evolved additional openings in order to decrease skull weight (Rieppel, 1993; Witmer, 1997).
In addition, the fenestrations offer a more advantageous mode of muscle attachment: muscle tendons merge with the periosteum and spread tensile forces around the rim of the fenestra, making the attachment site less susceptible to being torn loose from the bone (Kardong, 2002). Alternatively, the explanation that the temporal openings provided additional space for the jaw musculature (Frazzetta, 1968) may apply to synapsids (Reisz, 1972) but is less likely for diapsids; this is indicated by Araeoscelis, which seems to have obliterated the lower temporal fenestra secondarily in order to gain stronger muscle attachment sites and a stronger bite (Reisz et al., 1984). Despite its novelties, the diapsid condition can be regarded as a functional continuation of the initial eureptilian morphology of a light skeleton. Captorhinids, on the other hand, took a different path by evolving a heavily ossified skeleton with a large and massive skull lacking any fenestrations.
The current distribution of taxa within the tree does not permit an unequivocal interpretation of the biogeographic origin of the Eureptilia because the European and the North American taxa are equivocally distributed in the tree. The problem is further complicated by the Parareptilia, the sister group of eureptiles, whose origin is still a matter of debate but has been suggested to be in Gondwana (Modesto, 2000). However, the basalmost parareptiles are still unknown (Reisz et al., in preparation), and thus any definite statement must remain doubtful. Interestingly, the oldest-known amniote, Hylonomus, does not occupy the basalmost position within eureptiles, and the Lower Permian Thuringothyris is stratigraphically younger than the closest captorhinid, Concordia from the Pennsylvanian of Kansas (Müller and Reisz, 2005). As already suggested by studies on the large-scale phylogenetic relationships of early amniotes (Laurin and Reisz, 1995; DeBraga and Rieppel, 1997), the present pattern indicates that a significant amount of evolution had already taken place by the time the oldest-known amniotes/eureptiles became fossilized.
Phylogenetic investigations of basal fossil taxa have always been hampered by a low number of discriminating characters and preservational biases, resulting in poorly supported or highly contradictory tree topologies. We regard the use of model-based approaches such as Bayesian analysis, in addition to the more traditional parsimony approach, as very helpful for inferring the phylogeny of extinct taxa for which we know only morphological characters. Because branch lengths are included in the phylogeny estimate, Bayesian analysis emphasizes the evolutionary process of a taxon more than parsimony does, and can contribute to our understanding of evolutionary history from a different perspective. However, a better knowledge of the influence of different model parameters is surely needed. The present study shows that the combination of Bayesian analysis and parsimony provides important insights into the early evolution of eureptiles; in future studies this approach might help to decipher other problems of reptile phylogeny, such as the relationships of basal diapsids or the origin of turtles.
We wish to thank Belinda Chang (Toronto) and Karen Cranston (Edmonton) for critically reading earlier drafts of this paper, and David Evans (Toronto) for stimulating discussions. Michael Lee (Adelaide), Mike Steel (Christchurch), and Stuart Sumida (San Bernardino) made useful comments improving the manuscript. Diane Scott (Toronto) provided helpful technical support. This study was financially supported by the Deutsche Forschungsgemeinschaft (MU 1760/2-1) and a Discovery Research Grant from NSERC (Canada).