Abstract

Summary: MATICCE is a new software package in the R language for mapping phylogenetic transitions in organismal traits that have continuous distributions. MATICCE integrates over phylogenetic and model uncertainty and provides simulation functions for visualizing evolutionary scenarios based on estimated parameter values.

Availability and Implementation: MATICCE is written in the open source R language and freely available through the Comprehensive R Archive Network (http://cran.r-project.org/web/packages/maticce).

Contact: ahipp@mortonarb.org

1 INTRODUCTION

Inferring the evolutionary history of species traits is an issue of practical concern for any researcher testing biological hypotheses using phylogenetic data. Maximum likelihood and Bayesian approaches are commonly used to map phylogenetic transitions in the evolution of organismal traits that can be described in terms of discrete states (Huelsenbeck et al., 2003; Pagel, 1999). However, there is no standardized approach for mapping phylogenetic transitions in the distribution of continuous traits of ecological and evolutionary relevance such as body size and shape. Several models have been introduced to describe phylogenetic transitions in the evolutionary dynamics of such continuous traits, given an a priori specification of where on the phylogeny those transitions may have occurred (e.g. Butler and King, 2004; Hansen et al., 2008; O'Meara et al., 2006; Revell and Collar, 2009). A generalized framework is needed for identifying where on a phylogeny such transitions may have occurred as a guide to hypothesis testing and to evaluate whether a priori transitions in continuous character distribution are supported as strongly as other possible transitions.

In this article, we describe a new software package for mapping phylogenetic transitions in the evolution of any continuous character on a molecular or morphological phylogeny with branch lengths scaled proportional to time (i.e. ultrametric). The software, MATICCE, implements an information-theoretic approach to quantifying the statistical support for continuous character transitions. The software is written in the cross-platform R statistical environment (R Development Core Team, 2009) and integrates with the phylogenetic comparative methods in OUCH (Butler and King, 2004) and APE (Paradis et al., 2004). We describe the approach and implementation in this article. Worked examples and sample data are available in the online documentation and vignette.

2 DESCRIPTION

MATICCE models transitions in continuous character distribution according to an Ornstein–Uhlenbeck (O–U) process (Butler and King, 2004; Martins and Hansen, 1997), utilizing likelihood calculations implemented in OUCH. Under an O–U process, a continuous character is modeled as evolving stochastically toward a stationary distribution with mean θ and (at stationarity) variance σ²/2α, where α determines the rate of evolution toward the stationary distribution (Butler and King, 2004). The approach implemented in MATICCE evaluates the relative support for alternative models of continuous character distribution shifts by (i) specifying n nodes at which a change may have occurred, (ii) specifying and evaluating support for models that allow all combinations of change at those nodes up to a maximum number of nodes defined by the user, and (iii) estimating the relative support for a shift in character distribution at each node as the cumulative information criterion (e.g. small-sample Akaike information criterion, AICc, or Bayesian information criterion, BIC) weight for all models entailing a shift at that node. The method is reminiscent of stepwise AIC model selection (e.g. Alfaro et al., 2009) but, by taking a model-averaging approach, avoids sensitivity to AIC significance thresholds.
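Steps (i) to (iii) can be illustrated with a minimal sketch in Python. This is not MATICCE code and does not call OUCH: the node names and BIC scores are invented, and the function names are hypothetical. The weights use the standard formula of Burnham and Anderson (2002), in which each model's weight is proportional to exp(-Δ/2), where Δ is the difference between its information criterion score and the best score in the set.

```python
import math
from itertools import combinations


def enumerate_models(nodes, max_shifts):
    """All sets of shift nodes with at most max_shifts members (step ii)."""
    models = []
    for k in range(max_shifts + 1):
        models.extend(frozenset(c) for c in combinations(nodes, k))
    return models


def ic_weights(ic_values):
    """Convert information criterion scores (lower is better) to weights."""
    best = min(ic_values)
    rel = [math.exp(-0.5 * (v - best)) for v in ic_values]
    total = sum(rel)
    return [r / total for r in rel]


def node_support(models, weights, nodes):
    """Cumulative weight of all models with a shift at each node (step iii)."""
    return {n: sum(w for m, w in zip(models, weights) if n in m) for n in nodes}


nodes = ["n1", "n2", "n3"]                      # hypothetical candidate nodes (step i)
models = enumerate_models(nodes, max_shifts=2)  # 1 + 3 + 3 = 7 models
# Invented BIC scores, one per model, in the order produced above;
# in a real analysis these would come from fitting each model.
bic = [110.0, 104.0, 109.5, 111.0, 105.5, 106.0, 112.0]
weights = ic_weights(bic)
support = node_support(models, weights, nodes)
for n in nodes:
    print(n, round(support[n], 3))
```

Because every model contributes to the support of each node it includes, no single "best" model has to be chosen, which is the sense in which the approach avoids significance thresholds.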

MATICCE makes it easy to generate the potentially large number of model specifications required to test character transition hypotheses and provides a set of tools for flexibly defining nodes for analysis. To relax computational limitations, users can limit the maximum number of nodes at which a character is allowed to change distribution (e.g. 30 candidate nodes could be investigated for models allowing up to a maximum of three character distribution transitions; this is a manageable 4526 models, compared with the approximately 1.07 × 10⁹ models defined by all possible combinations of 30 candidate nodes). Model specifications are stored within each MATICCE analysis object alongside analysis results and are viewable in OUCH or the simulation functions of MATICCE. MATICCE performs and summarizes analyses over sets of trees, integrating over phylogenetic and model uncertainty and allowing analyses to be performed even for nodes that are not found in all trees in a set. Because sets of phylogenetic trees (e.g. bootstrap sets) are not typically identical in topology, each node is defined by the set of taxa descendant from it. MATICCE utilizes a user-defined list of taxa describing a set of nodes to create the subset of models applicable to each tree. Analyses are conducted and summarized conditional on the models possible for each tree. User-defined model sets can also be created to test specific character evolution scenarios, or individual models created for testing in OUCH. Model specifications are compatible with the analysis, simulation and visualization functions in OUCH, and objects created in the course of analysis can be analyzed directly in MATICCE or used as input for discrete-time character simulations. Running a MATICCE simulation on a MATICCE analysis object, for example, returns a simulation of character evolution using model-averaged parameter values for σ, α and the character mean (θ) for each branch (Fig. 1B).
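The model counts quoted above, and the idea of identifying a node by its descendant taxa, can be checked with a short Python sketch. The trees, taxon labels and function names here are hypothetical illustrations, not MATICCE objects; trees are written as nested tuples purely for convenience.

```python
from math import comb


def count_models(n_nodes, max_shifts):
    """Number of models with at most max_shifts shifts among n_nodes candidates."""
    return sum(comb(n_nodes, k) for k in range(max_shifts + 1))


def clades(tree):
    """Return the tip set below every internal node of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a tip label
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    walk(tree)
    return found


print(count_models(30, 3))    # 4526: at most three shifts
print(count_models(30, 30))   # 1073741824: every subset of 30 nodes (2**30)

# Two hypothetical trees that disagree on the placement of taxon "C".
tree1 = ((("A", "B"), "C"), ("D", "E"))
tree2 = (("A", "B"), ("C", ("D", "E")))

# Candidate nodes, each defined by the set of taxa descended from it.
candidates = {
    "AB": frozenset("AB"),
    "ABC": frozenset("ABC"),
    "DE": frozenset("DE"),
}
for name, tree in (("tree1", tree1), ("tree2", tree2)):
    present = sorted(k for k, tips in candidates.items() if tips in clades(tree))
    print(name, present)
```

In this toy case the node defined by taxa A, B and C exists only in the first tree, so models involving a shift at that node would apply to that tree alone.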

Fig. 1.

(Carex: Cyperaceae) based on Bayesian analysis of nuclear ribosomal DNA data, with diploid chromosome numbers (2n) for each tip (Hipp, 2007). Mean 2n was log-transformed for analysis. Analysis of 256 models over 100 trees sampled in a Bayesian phylogenetic analysis took 1 h 55 min on an Intel Core 2 Duo at 3.00 GHz. Shaded portions of the pie charts at the nodes indicate relative support for a shift in distribution of ln(2n) at that node, based on BIC weights. Gray branches indicate one clade at which there is support for a shift in character distribution (cumulative BIC weight=0.876). (B) Character evolution on the same phylogeny in MATICCE; clear circles denote nodes, vertical scale indicates simulated ln(2n). Simulation parameters (θ=4.22 for gray branches; θ=4.37 for black branches; σ²=6.3 and α=380 for the entire tree) were estimated from the data, using the model averages over 2⁸=256 models evaluated. Gray branches correspond to the gray clade in (A).
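As a rough illustration of the kind of discrete-time simulation shown in panel (B), the following Python sketch simulates independent O–U lineages using the parameter values reported in the caption. It is not the MATICCE simulation function and it ignores the tree entirely: the run length of one time unit, the starting value, the number of steps and all function names are assumptions made for this example. Each step draws from the exact O–U transition distribution, so the simulated values should settle around θ with variance close to σ²/2α.

```python
import math
import random


def ou_step(x, theta, alpha, sigma2, dt, rng):
    """Advance an O-U process by dt using its exact transition distribution."""
    decay = math.exp(-alpha * dt)
    mean = theta + (x - theta) * decay
    var = sigma2 / (2.0 * alpha) * (1.0 - decay ** 2)
    return rng.gauss(mean, math.sqrt(var))


def simulate_lineage(x0, theta, alpha, sigma2, total_time, n_steps, rng):
    """Simulate one lineage in n_steps equal time slices; return the path."""
    dt = total_time / n_steps
    path = [x0]
    for _ in range(n_steps):
        path.append(ou_step(path[-1], theta, alpha, sigma2, dt, rng))
    return path


rng = random.Random(1)
alpha, sigma2 = 380.0, 6.3          # values reported in the caption
# Assumptions: every lineage starts at 4.37 and runs for one time unit.
ends = {}
for label, theta in (("gray", 4.22), ("black", 4.37)):
    ends[label] = [
        simulate_lineage(4.37, theta, alpha, sigma2, 1.0, 100, rng)[-1]
        for _ in range(2000)
    ]

stationary_var = sigma2 / (2.0 * alpha)
for label, values in ends.items():
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    # Sample mean, sample variance and the theoretical stationary variance.
    print(label, round(m, 3), round(v, 5), round(stationary_var, 5))
```

With α this large relative to the run length, lineages forget their starting value almost immediately, so the two regimes differ in their end-point means but share the same stationary variance.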

MATICCE provides straightforward functions for summarizing analyses over large sets of models and trees, using BIC, AIC or AICc weights to estimate model support and model-averaged parameter values (Burnham and Anderson, 2002). Analyses are summarized for each node both over all trees and over just the trees in which the node occurs (Fig. 1A), so that the evidential support for a shift in character distribution at a given node may be interpreted as the relative support for that particular shift conditioned on the existence of the node. Based on these analyses, users will often want to evaluate the relative support for different models of character evolution at a particular node. MATICCE provides functions for evaluating and summarizing the relative support for alternative models at a single node. These functions lend themselves to customization and can easily accommodate additional models as desired.
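The two ways of summarizing a node across trees can be sketched in Python as follows. This is an illustration only, with invented weights and a hypothetical function name; in particular, treating a node that is absent from a tree as contributing zero weight to the all-trees average is an assumption of this sketch rather than a documented detail of MATICCE.

```python
def summarize_node(per_tree_weights):
    """Average a node's cumulative weight over a set of trees.

    per_tree_weights holds one entry per tree: the node's cumulative
    information criterion weight on that tree, or None when the node
    (defined by its descendant taxa) is absent from the tree.
    """
    n_trees = len(per_tree_weights)
    present = [w for w in per_tree_weights if w is not None]
    return {
        "proportion_of_trees": len(present) / n_trees,
        # Assumption: absent nodes count as zero support.
        "over_all_trees": sum(present) / n_trees,
        "over_trees_with_node": sum(present) / len(present) if present else None,
    }


# Invented cumulative BIC weights for one node on five trees;
# the node is missing from the third and fifth trees.
weights = [0.90, 0.85, None, 0.80, None]
summary = summarize_node(weights)
for key, value in summary.items():
    print(key, value)
```

The gap between the two averages reflects phylogenetic uncertainty about whether the node exists at all, whereas the conditional average isolates the support for a shift given that it does.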

3 CONCLUSIONS

While there has been a substantial increase in statistical analysis of continuous character evolution in the past several years, an easy-to-use general framework for reconstructing shifts in continuous character evolution across a phylogeny has been lacking. MATICCE uses existing tools to implement such a framework, providing a needed approach for exploratory analysis of continuous character evolution on a phylogeny.

ACKNOWLEDGEMENTS

We thank A. King, M. Butler, A. Platt, B. O'Meara and two anonymous reviewers for helpful comments on this work.

Funding: NESCent visiting scholar award (to A.L.H.); the National Science Foundation (0743157 to A.L.H.); the Spanish Government (FPU AP2005–3715 to M.E.).

Conflict of Interest: none declared.

REFERENCES

Alfaro ME, et al. Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates. Proc. Natl Acad. Sci. USA, 2009, vol. 106 (pg. 13410-13414)

Burnham KP, Anderson DR. Model Selection and Multimodel Inference: a Practical Information-Theoretic Approach, 2002. New York: Springer

Butler MA, King AA. Phylogenetic comparative analysis: a modeling approach for adaptive evolution. Am. Nat., 2004, vol. 164 (pg. 683-695)

Hansen TF, et al. A comparative method for studying adaptation to a randomly evolving environment. Evolution, 2008, vol. 62 (pg. 1965-1977)

Hipp AL. Nonuniform processes of chromosome evolution in sedges (Carex: Cyperaceae). Evolution, 2007, vol. 61 (pg. 2175-2194)

Huelsenbeck JP, et al. Stochastic mapping of morphological characters. Syst. Biol., 2003, vol. 52 (pg. 131-158)

Martins EP, Hansen TF. Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data. Am. Nat., 1997, vol. 149 (pg. 646-667)

O'Meara BC, et al. Testing for different rates of continuous trait evolution using likelihood. Evolution, 2006, vol. 60 (pg. 922-923)

Pagel M. The maximum likelihood approach to reconstructing ancestral character states of discrete characters on phylogenies. Syst. Biol., 1999, vol. 48 (pg. 612-622)

Paradis E, et al. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics, 2004, vol. 20 (pg. 289-290)

R Development Core Team. R: a language and environment for statistical computing, 2009. Vienna, Austria: R Foundation for Statistical Computing

Revell LJ, Collar DC. Phylogenetic analysis of the evolutionary correlation using likelihood. Evolution, 2009, vol. 63 (pg. 1090-1100)

Author notes

Associate Editor: Martin Bishop
