TNT: Tree Analysis Using New Technology. Version 1.0, Beta test v. 0.2. Program and documentation available at http://www.zmuc.dk/public/phylogeny/TNT/. Pablo A. Goloboff, James S. Farris, and Kevin Nixon. 2003. Instituto Miguel Lillo, San Miguel de Tucumán, Argentina. $80.
Analyses of data under the parsimony criterion have been considered difficult when large numbers of terminal taxa are involved. Implementations such as Hennig86 (Farris, 1988), PAUP (Swofford, 1990, 2002), and Nona (and its descendants; Goloboff, 1994, 1996a, 1996b, 1998) have made our lives more or less easy, but current data sets require more than these programs can give us. To keep up with ever larger data sets, new algorithms have been developed in areas such as simulated annealing (Goloboff, 1999; Nixon, 1999), genetic algorithms (Goloboff, 1999; Lemmon and Milinkovitch, 2002; Moilanen, 1999, 2001), and divide-and-conquer strategies (Goloboff, 1999; Nakhleh et al., 2001). Developments in neglected areas such as the efficient calculation of consensuses (Goloboff and Farris, 2001), improvements to resampling measures of group support (Goloboff and Farris, 2001; Goloboff et al., 2003a), and directed tree searches (drivers and the like), together with faster swappers, are now all available in a new interactive program, TNT (Goloboff et al., 2003b). Its name stands for Tree Analysis Using New Technology.
TNT version 1 has been available since October 2003 for download at http://www.zmuc.dk/public/phylogeny/TNT/ (and can also be obtained through the webpage http://www.cladistics.com/). The program can be executed up to 10 times for demonstration purposes, but a license needs to be purchased afterwards. The cost of the software is $80. Examples and a PowerPoint tutorial are available free at the download site; I recommend that the beginning user study the tutorial carefully. However, the program is GUI based, interactive, and easy to explore by playing with the menus. TNT runs on Microsoft Windows computers (different Windows versions), but can also be run on a Macintosh from a virtual PC environment. There are also Linux and Mac OSX versions of TNT. Commands can also be executed from a command line. Furthermore, the program allows storage of batch commands that can be called upon to reproduce an analysis automatically, and it offers a fairly sophisticated scripting system that allows automation of almost anything that one wants to do with the program, including a series of dialogs provided in the executable zipdruns.exe. Although I have not familiarized myself with this scripting system, I know that it has enormous possibilities.
The beauty of TNT is the integration of fast swappers (TBR in TNT is about 10 to 50 times faster than in PAUP*, depending on data set size: the larger the data set, the greater the difference) with the “New Technology” algorithms that allow another level of tree searching. These algorithms include Ratchet (Nixon, 1999), Tree Drifting (Goloboff, 1999), Sectorial Searches (Goloboff, 1999), and Tree Fusing (Goloboff, 1999). The program allows customization of each of these techniques and makes it easy to integrate the different algorithms (for a clear explanation of the algorithms and how to combine them, see Goloboff, 2002). These fast swappers combined with the newest algorithms in tree searching justify the use of TNT, provided that one deals with large analyses (i.e., more than 80 taxa). These functions are all found under the “New Technology Search” menu option. I have also found the program excellent for teaching purposes because it allows students to see the importance of using fast and efficient algorithms. As in previous software by the authors, memory handling is flawless, and impeccable RAM management allows fast access to data.
Another marvel of TNT is the range of options for specifying driven searches (intelligent searches) under the “New Technology Search” menu option. The most obvious driver is to specify a fixed number of times that a minimum tree length has to be found during the search; for example, one can ask to do at least 10 full replicates and then stop after minimum tree length (defined as the minimum length the program is able to find) is hit 5 times. My favorite driver is the one that involves consensus techniques, where one searches until minimum tree length is found a certain number of times and then a consensus is estimated. A second round of searching starts and a new consensus is generated and compared to the previous one, and so on until the consensus stabilizes. The number of hits to minimum tree length, as well as the number of times that the consensus needs to be identical to be considered stable, is defined by the user. This method works extremely well for data sets with thousands or millions of equally parsimonious trees, as is typical of some morphological data sets. Such drivers make it possible to reach a stable consensus after finding just a few trees, without wasting computational resources on finding all the most parsimonious trees (MPTs), which will be collapsed anyway. Consensus-based drivers have required important research on quick collapsing methods (Goloboff and Farris, 2001). The drivers are thus another important component of TNT, although perhaps not as well known as the new tree searching algorithms.
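The consensus-stabilization driver described above can be sketched in a few lines of Python. This is only an illustration of the stopping rule, not TNT's implementation: `search_once`, `hits_per_round`, `stable_rounds`, and the toy pool of trees are all hypothetical names and data introduced here, and trees are simplified to sets of clades.

```python
from itertools import cycle

def strict_consensus(trees):
    """Clades present in every tree; each tree is a frozenset of clades."""
    return frozenset.intersection(*trees)

def driven_search(search_once, hits_per_round=5, stable_rounds=2,
                  max_replicates=10_000):
    """Search until the strict consensus of the best trees stops changing.

    search_once() is a hypothetical stand-in for one search replicate; it
    returns (tree_length, tree), with the tree given as a frozenset of clades.
    """
    best_len, best_trees = None, set()
    hits = 0                        # hits to best_len since the last check
    previous, unchanged = None, 0   # last consensus and its stability count
    for replicate in range(1, max_replicates + 1):
        length, tree = search_once()
        if best_len is None or length < best_len:
            # A shorter tree invalidates everything found so far.
            best_len, best_trees = length, {tree}
            hits, previous, unchanged = 1, None, 0
        elif length == best_len:
            best_trees.add(tree)
            hits += 1
        if hits >= hits_per_round:
            consensus = strict_consensus(best_trees)
            unchanged = unchanged + 1 if consensus == previous else 0
            previous, hits = consensus, 0
            if unchanged >= stable_rounds:
                return best_len, consensus, replicate
    return best_len, previous, max_replicates

def clades(*groups):
    """Build a toy tree from strings such as "AB" (the clade A + B)."""
    return frozenset(frozenset(g) for g in groups)

# Toy stand-in for a real search: three trees of length 10, one of length 12.
POOL = cycle([
    (10, clades("AB", "ABC", "DE")),
    (10, clades("AB", "ABC", "ABCD")),
    (10, clades("AB", "DE", "CDE")),
    (12, clades("AC", "ACD", "ACDE")),
])
length, consensus, used = driven_search(lambda: next(POOL))
```

In this toy run the search stops once the consensus (here the single clade A + B) has been recovered unchanged in two successive checks, without ever enumerating every equally short tree.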
However, these are not the sole reasons for using TNT, which incorporates many other functions useful to systematists. TNT also has an implicit enumeration algorithm and an implied weights algorithm (Goloboff, 1993) that improves on early versions of PeeWee (Goloboff, 1997) by using floating-point (i.e., exact) calculations instead of integer estimation. It also calculates many support measures, including the commonly used resampling techniques of bootstrapping (Felsenstein, 1985) and jackknifing (Farris et al., 1996). Bootstrapping is actually done in two different ways: by the standard resampling with replacement, or following a Poisson independent reweighting (Goloboff et al., 2003a). Furthermore, resampling can be done by symmetric resampling, a method not distorted by differential costs (Goloboff et al., 2003a). The results of the resampling methods can be displayed in different fashions and with different cutoff frequencies. The default resampling technique used by TNT is symmetric resampling.
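The resampling schemes mentioned above differ only in how each pseudoreplicate reweights the characters, which a short Python sketch can make concrete. It is not TNT code: the function names are illustrative, the jackknife deletion probability of e^-1 follows Farris et al. (1996), and the symmetric scheme assumes that a character is doubled or deleted with the same probability `p` (0.33 here is an assumed value, not one taken from the program).

```python
import math
import random

def bootstrap_weights(n_chars, rng):
    """Standard bootstrap: draw n_chars characters with replacement."""
    weights = [0] * n_chars
    for _ in range(n_chars):
        weights[rng.randrange(n_chars)] += 1
    return weights

def poisson_weights(n_chars, rng, mean=1.0):
    """Independent reweighting: each character gets its own Poisson weight."""
    limit = math.exp(-mean)
    weights = []
    for _ in range(n_chars):
        k, product = 0, rng.random()
        while product > limit:       # Knuth's multiplication method
            k += 1
            product *= rng.random()
        weights.append(k)
    return weights

def jackknife_weights(n_chars, rng, p_delete=math.exp(-1)):
    """Jackknife: delete each character independently with p_delete."""
    return [0 if rng.random() < p_delete else 1 for _ in range(n_chars)]

def symmetric_weights(n_chars, rng, p=0.33):
    """Symmetric resampling (assumed form): double or delete with equal p."""
    weights = []
    for _ in range(n_chars):
        u = rng.random()
        weights.append(2 if u < p else 0 if u < 2 * p else 1)
    return weights

rng = random.Random(1)
boot = bootstrap_weights(20, rng)
pois = poisson_weights(20, rng)
jack = jackknife_weights(20, rng)
symm = symmetric_weights(20, rng)
```

Each function returns one weight per character; a tree search on the reweighted matrix would then give one pseudoreplicate, and group frequencies across many pseudoreplicates give the support values.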
TNT also allows exploration of support through character-based techniques, such as Bremer support (Bremer, 1988), which the authors call absolute Bremer support, and relative Bremer support (Goloboff and Farris, 2001), which attempts to correct for the amount of evidence contradicting a given node instead of looking only at the absolute Bremer value. A node with Bremer support of 2 can be supported by 2 characters and contradicted by none, or supported by 100 characters and contradicted by 98. Although the absolute Bremer support cannot differentiate between these cases, the relative support for the first case is much higher. Relative Bremer support was employed for the first time using a beta version of TNT (Giribet and Boyer, 2002). The program also allows comparison of tree topologies and multiple consensus techniques, and for each calculation it allows exclusion of a specified group of taxa without the need to rerun the data. Multiple measures of nodal support can be displayed simultaneously on trees by using the Multiple Tags option under the Trees menu.
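The two cases above can be worked through numerically. The sketch below assumes that relative support is the relative fit difference (F - C)/F, where F is the amount of evidence favoring a group and C the amount contradicting it, and it follows the simplification used in the example of counting characters; the function names are illustrative and are not TNT commands.

```python
def absolute_support(favorable, contradicting):
    """Difference between favorable and contradicting evidence."""
    return favorable - contradicting

def relative_support(favorable, contradicting):
    """Relative fit difference (F - C) / F, assumed form; requires F > 0."""
    if favorable <= 0:
        raise ValueError("favorable evidence must be positive")
    return (favorable - contradicting) / favorable

# The two nodes from the text: same absolute value, very different relative value.
uncontradicted = (absolute_support(2, 0), relative_support(2, 0))
contested = (absolute_support(100, 98), relative_support(100, 98))
```

Both nodes have an absolute value of 2, but under this definition the uncontradicted node has a relative value of 1.0 and the heavily contested node only 0.02.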
Character optimizations can be done over different types of trees, and such optimizations can be printed or saved in different formats. The menus in TNT are easy to use and self-explanatory, but the aesthetics of its icons and trees are not the best. However, the Trees menu gives access to multiple functions through the mouse buttons, such as rerooting trees, excluding or including taxa, moving branches, and defining constraints for subsequent tree searches.
TNT offers different tree buffer options, two modes for storing trees (parenthetical and compact), several output possibilities for buffers, and support for Hennig/Nona data matrices. TNT also reads and exports simple Nexus files (Maddison et al., 1997) and allows merging of data files. It also incorporates a data matrix editor where one can edit by taxon or by character, one cell at a time. The help file and the available tutorial really help to make efficient use of the program.
TNT is impressively fast and efficient at the task it was designed for, the analysis of data sets under parsimony, and it incorporates much of the recent research in algorithmic development for the analysis of systematic data. What do I miss from this program? A PICT tree export function like the one found in the Mac version of PAUP, and the possibility of analyzing unaligned data under direct optimization as in POY (Wheeler, 1996; Wheeler et al., 2002). I am sure that others may miss the availability of maximum likelihood as an optimality criterion, but because the program was designed to work with large data sets this is not a surprise (e.g., Sanderson and Kim, 2000). Bayesian approaches (e.g., Huelsenbeck et al., 2001), which can also handle large data sets, have not been incorporated into TNT because they are found in other programs. Overall, TNT is an excellent research tool and an impressive program for teaching systematics. The lead author will also provide free copies for teaching purposes.
Gonzalo Giribet, Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, 16 Divinity Avenue, Cambridge, Massachusetts 02138, USA; E-mail: email@example.com