Summary: The popularity of the ape (analysis of phylogenetics and evolution) software package has grown steadily over the years, reflecting its continuously increasing versatility and functionality. Among its features is a strong distance-based component that allows the user to compute distances from aligned DNA sequences using most methods from the literature and to build phylogenetic trees from them. However, even data generated with modern genomic approaches can fail to yield sufficiently reliable distance estimates. One way to overcome this problem is to exclude such estimates from the analysis, which gives rise to an incomplete distance dataset (as opposed to a complete one). Until now, the analysis of such incomplete datasets has been out of reach for ape. To remedy this, we have incorporated into ape several methods from the literature for phylogenetic inference from incomplete distance matrices. In addition, we have extended ape's repertoire for phylogenetic inference from complete distances, added a new object class to efficiently encode sets of splits of taxa, and extended the functionality of some of its existing functions.
Availability: ape is distributed through the Comprehensive R Archive Network (http://cran.r-project.org/web/packages/ape/index.html). Further information may be found at http://ape.mpl.ird.fr/pegas/
Supplementary information: Supplementary data are available at Bioinformatics online.
Actively responding to requirements of evolutionary biologists to be able to analyze new types of data as well as larger datasets, these features have been improved upon steadily over the years and new functions and object classes have been added by numerous contributors. At the same time,
Despite these additions, one of
Although the methods we have added to
1 NEW FEATURES
An attractive feature of distance-based phylogenetic reconstruction is that a tree can be constructed in a relatively short amount of time. Thus, there has been considerable interest in developing these methods both from complete and incomplete distances. As an alternative to the NJ, BIONJ and FastME methods mentioned above, which all take complete distances as input, we have added the triangles method (Guénoche and Leclerc, 2001) as
One way to deal with incomplete distances is to first restrict attention to a subset of the data for which complete distance information is available, and then to somehow fit the remaining data into the phylogenetic tree constructed from that subset. This is the philosophy underpinning, for example, the triangles method. An alternative to this is to directly estimate the missing distances from the data without first constructing a phylogenetic tree. Two methods that rely on this idea are the ultrametric and the additive procedure (Makarenkov and Lapointe, 2004), respectively, which we have implemented in
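To make the idea of estimating missing distances directly from the data concrete, the following Python sketch fills each missing entry using the ultrametric three-point condition: a missing d(i, j) is set to the smallest value of max(d(i, k), d(j, k)) over all taxa k for which both distances are known. This is an illustration under stated assumptions, not ape's implementation; the function name `complete_ultrametric`, the use of `None` for missing entries and the example matrix are all hypothetical.

```python
def complete_ultrametric(d):
    """Fill missing (None) entries of a symmetric distance matrix.

    Assumption: a missing d[i][j] is estimated as
    min over k of max(d[i][k], d[j][k]), using only taxa k for which
    both distances are known. Passes are repeated so that newly filled
    entries can help to fill others.
    """
    n = len(d)
    d = [row[:] for row in d]  # work on a copy, leave the input untouched
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(i + 1, n):
                if d[i][j] is not None:
                    continue
                candidates = [
                    max(d[i][k], d[j][k])
                    for k in range(n)
                    if k != i and k != j
                    and d[i][k] is not None
                    and d[j][k] is not None
                ]
                if candidates:
                    d[i][j] = d[j][i] = min(candidates)
                    changed = True
    return d


# Hypothetical example: taxa A, B, C, D with d(A, C) missing.
example = [
    [0, 2, None, 6],
    [2, 0, 4, 6],
    [None, 4, 0, 6],
    [6, 6, 6, 0],
]
filled = complete_ultrametric(example)
```

In this example the missing entry is estimated as 4, because taxon B gives max(2, 4) = 4 and taxon D gives max(6, 6) = 6. Entries that cannot be reached through any third taxon remain `None`.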
As a partial response to the criticism that supertree reconstruction methods only use tertiary data (i.e. phylogenetic trees obtained from sets of distance matrices), consensus distance matrix approaches have been introduced in the literature. Starting from several overlapping taxa sets, each with complete distance information, this boils down to finding ways to compute the distance between any two taxa that are in the union of all taxa sets but not in the same set. A tool that allows one to do this is the superdistance matrix (SDM; Criscuolo et al., 2006) method which we have incorporated into
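The problem statement above can be illustrated with a short Python sketch that lists the pairs of taxa that occur in the union of the taxa sets but never together in the same set, that is, the pairs whose distance a consensus approach has to supply. The sketch does not implement SDM itself; the function name `uncovered_pairs` and the example taxa sets are hypothetical.

```python
from itertools import combinations


def uncovered_pairs(taxon_sets):
    """Return the pairs of taxa that never occur together in one set.

    Each element of taxon_sets is the taxa set of one complete distance
    matrix. Pairs are returned as alphabetically ordered tuples.
    """
    union = sorted(set().union(*taxon_sets))
    covered = set()
    for taxa in taxon_sets:
        # Every pair within one taxa set has a directly observed distance.
        covered.update(combinations(sorted(taxa), 2))
    return [pair for pair in combinations(union, 2) if pair not in covered]


# Hypothetical example: two overlapping taxa sets.
missing = uncovered_pairs([{"A", "B", "C"}, {"B", "C", "D"}])
```

Here only the pair (A, D) lacks a directly observed distance, so it is the only one that would have to be derived from the overlap between the two matrices.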
To take advantage of a combinatorial description of a phylogenetic tree in terms of a collection of weighted splits, i.e. weighted bipartitions of the tree's leaf set (see, e.g., Semple and Steel, 2003), we have developed a new class,
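As an illustration of the split description of a tree, the Python sketch below derives one weighted split per edge and encodes each split as an integer bitmask over the sorted leaf labels. It is only a sketch of the general idea and makes no claim about the internal layout of the class added to ape; the nested-list tree representation, the function names and the example tree are hypothetical, and the root is assumed to have at least two children.

```python
def leaf_labels(node):
    """Collect the leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    labels = []
    for child, _length in node:
        labels.extend(leaf_labels(child))
    return labels


def weighted_splits(tree):
    """Map each split of the leaf set to its weight.

    An internal node is a list of (child, branch_length) pairs. A split
    is stored as the bitmask of the side that does NOT contain the first
    (alphabetically smallest) taxon, so the two edges at a binary root,
    which induce the same split, have their lengths added together.
    """
    labels = sorted(leaf_labels(tree))
    index = {name: i for i, name in enumerate(labels)}
    full = (1 << len(labels)) - 1
    splits = {}

    def visit(node):
        # Returns the bitmask of the leaves below node.
        if isinstance(node, str):
            return 1 << index[node]
        mask = 0
        for child, length in node:
            child_mask = visit(child)
            key = child_mask if not (child_mask & 1) else full ^ child_mask
            splits[key] = splits.get(key, 0.0) + length
            mask |= child_mask
        return mask

    visit(tree)
    return labels, splits


# Hypothetical example: the tree ((A:1,B:2):0.5,(C:1,D:1):0.5).
tree = [
    ([("A", 1.0), ("B", 2.0)], 0.5),
    ([("C", 1.0), ("D", 1.0)], 0.5),
]
labels, splits = weighted_splits(tree)
```

With the labels ordered A, B, C, D, the key 12 (binary 1100) stands for the split {A, B} | {C, D} and carries weight 1.0, the sum of the two root edges. The four remaining keys are the trivial splits that separate a single leaf from the rest.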
The authors would like to thank Alexis Criscuolo, Olivier Gascuel and Klaus Schliep for their feedback and suggestions, and the referees for their helpful comments.
Funding: A.-A.P. was supported by the National Evolutionary Synthesis Center (NESCent), NSF #EF-0905606 as part of the 2011 Google Summer of Code program. Constant support has been provided to E.P. by the Scientific Information Service of the Institut de Recherche pour le Développement (IRD) in Montpellier. This is publication ISEM 2012-040.
Conflict of Interest: none declared.