Benjamin Linard, Nikolai Romashchenko, Fabio Pardi, Eric Rivals, PEWO: a collection of workflows to benchmark phylogenetic placement, Bioinformatics, Volume 36, Issue 21, November 2020, Pages 5264–5266, https://doi.org/10.1093/bioinformatics/btaa657
Abstract
Phylogenetic placement (PP) is a process of taxonomic identification for which several tools are now available. However, it remains difficult to assess which tool is better suited to particular genomic data or a particular reference taxonomy. We developed Placement Evaluation WOrkflows (PEWO), the first benchmarking tool dedicated to PP assessment. Its automated workflows can evaluate PP at many levels, from parameter optimization for a particular tool to the selection of the most appropriate genetic marker when PP-based species identifications are targeted. Our goal is for PEWO to become a community effort and a standard support for future developments and applications of PP.
Supplementary data are available at Bioinformatics online.
1 Introduction
When a reference phylogeny is available, taxonomic identification of biological sequences can be achieved with phylogenetic placement (PP). PP provides the most informative type of classification because each query sequence is assigned to its putative origin in the tree. PP can be applied in many contexts, including community ecology, species diversity or medical studies. Several PP tools were developed for these purposes (Berger et al., 2011; Matsen et al., 2010; Mirarab et al., 2012; Zheng et al., 2018), with four recent tools capable of processing larger sequence volumes (Balaban et al., 2020; Barbera et al., 2019; Czech and Stamatakis, 2019; Linard et al., 2019). In the preliminary phase of experimental design, assessing which tools answer the needs of a given application remains a tedious task, often involving manual tests (Mangul et al., 2019). Strikingly, PP has a broad range of applications but lacks user guidelines and benchmarking. Some procedures to evaluate PP accuracy were proposed (Matsen et al., 2010), but they were never automated in dedicated software. Benchmarking is essential to determine which tool better suits a given metagenomic task or a specific dataset (Sczyrba et al., 2017).
To fill this gap, we developed Placement Evaluation WOrkflows (PEWO), the first tool dedicated to PP benchmarking. PEWO automates existing evaluation procedures (which had not been implemented for the community) and introduces novel ones. Beyond benchmarking, PEWO can help decision making in any metagenomic or metabarcoding project that relies on PP-based taxonomic identification. With applications ranging from parameter optimization on particular genomic data to the selection of the most appropriate genetic marker, PEWO provides the user community with standardized workflows for easy and reproducible assessment of PP analyses.
2 Overview
PEWO implements evaluation workflows in Python and Snakemake (Köster and Rahmann, 2012), whose framework ensures flexibility, platform independence and reproducibility. Each workflow automatically performs multiple steps from query generation up to summary plots/tables, and can be tailored via Snakemake configuration files. PEWO and its dependencies are easily installed via a conda virtual environment. Currently, PEWO incorporates five state-of-the-art PP tools, which cover a majority of PP uses: EPA(RAxML), PPlacer, EPA-ng, RAPPAS and APPLES. Four are alignment-based tools, while RAPPAS is alignment-free. As input, each workflow takes a phylogenetic tree and the reference multiple sequence alignment from which it was built (Fig. 1). Optionally, the user can provide a set of query sequences. Below, we describe the workflows and some of their applications.

Fig. 1. (A) Overview of PEWO inputs and outputs. (B) An example of plots dynamically generated by the PAC procedure on a 16S rRNA bacterial reference. Measured mean eND values are reported (lower value = better accuracy). Panels report selected conditions for PPlacer and RAPPAS, e.g. different parameter values tested in different rows and columns. For PPlacer, varying parameters are ms (max-strikes, X axis) and sb (strike-box, Y axis). Parameter mp (max-pitches, gray box) is fixed. For RAPPAS, varying parameters are k (phylo-kmer size) and o (omega threshold). Parameters red (alignment reduction) and ar (software used for ancestral reconstruction) are fixed. (C) Four PAC procedures were run for different Coleopteran mitogenome loci (rows) and compiled. Average eND is measured for three tools (columns) using default parameters. For each locus, the lowest average eND is highlighted in bold. For RAPPAS, the last column shows that accuracy can be improved when increasing k-mer size (default is k = 8). Examples B and C are discussed more extensively in the Supplementary Materials. (Color version of this figure is available at Bioinformatics online.)
2.1 PEWO procedures
Pruning-based accuracy evaluation (PAC): in this standard procedure for assessing placement accuracy (Berger et al., 2011; Matsen et al., 2010), a subset of sequences is randomly pruned from the reference phylogeny and alignment. Each pruned sequence then serves to generate queries for placement, and the accuracy of each tool is measured in number of nodes separating predicted from true placement. PEWO offers two versions of this topological metric: Node Distance and expected Node Distance (eND). The eND accounts for placement uncertainty (e.g. likelihood weight ratios). All selected tools are compared for a user-selected combination of parameters.
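As a rough illustration of the two metrics, the sketch below computes ND and eND for a single query. It assumes the node distance from each candidate branch to the true branch is already known, and it reads eND as the mean node distance weighted by likelihood weight ratio (LWR). The function names and the input layout are hypothetical and are not PEWO's own API.

```python
from typing import Sequence, Tuple

# One candidate placement: (node distance to the true branch, LWR).
# This layout is an assumption made for the example.
Placement = Tuple[int, float]


def node_distance(placements: Sequence[Placement]) -> int:
    """ND: node distance of the single best placement (highest LWR)."""
    best = max(placements, key=lambda p: p[1])
    return best[0]


def expected_node_distance(placements: Sequence[Placement]) -> float:
    """eND: node distances averaged over all candidates, weighted by LWR."""
    total_weight = sum(lwr for _, lwr in placements)
    if total_weight <= 0:
        raise ValueError("LWRs must sum to a positive value")
    return sum(nd * lwr for nd, lwr in placements) / total_weight


# Hypothetical query with three candidate branches.
query = [(0, 0.6), (2, 0.3), (5, 0.1)]
nd = node_distance(query)
expected_nd = expected_node_distance(query)
```

In this made-up case the best placement is correct (ND of 0), while eND is greater than 0 because 40% of the likelihood weight sits on branches away from the true one.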
Likelihood-based accuracy evaluation (LAC) is a new, faster evaluation procedure introduced in PEWO to assess the relative accuracy of PP. It iterates the following process for a set of queries: place the query, extend the phylogeny to include that query, optimize the branch lengths of this extended tree and return its log-likelihood (LL). The user can then compare the LL values obtained with different tools, or with different settings of the same tool (e.g. by inspecting the distribution of the differences between LL values obtained with two different tools). See the Supplementary Materials for a more detailed description.
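The comparison step at the end of LAC can be sketched as follows, assuming the per-query LL values of the extended trees have already been collected for two tools. The tool labels, query names and LL values are invented for illustration, and the helper functions are not part of PEWO.

```python
from statistics import mean, median


def ll_differences(ll_a: dict, ll_b: dict) -> dict:
    """Per-query LL(A) - LL(B), restricted to queries placed by both tools."""
    shared = ll_a.keys() & ll_b.keys()
    return {q: ll_a[q] - ll_b[q] for q in sorted(shared)}


def summarize(diffs: dict) -> dict:
    """Summary of the distribution of LL differences."""
    values = list(diffs.values())
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        # A positive difference means tool A gave the higher (better) LL.
        "a_better": sum(v > 0 for v in values),
    }


# Hypothetical LL values for three queries and two tools.
ll_tool_a = {"q1": -1000.0, "q2": -1500.5, "q3": -1200.0}
ll_tool_b = {"q1": -1002.0, "q2": -1500.0, "q3": -1200.0}

diffs = ll_differences(ll_tool_a, ll_tool_b)
summary = summarize(diffs)
```

A real analysis would cover many more queries and would probably plot the differences as a histogram rather than reduce them to a few numbers.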
Resource evaluation: outputs the runtime and memory usage of selected tools, with details for each placement step (e.g. profile alignment, database construction, placement, etc.). One can compare the impact on time and memory for tool-specific parameter combinations, while searching for an appropriate accuracy/resource trade-off, or evaluate the tools’ scalability with respect to input size.
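The idea of per-step measurements can be sketched with the Python standard library. This records wall-clock time and peak Python-level memory for named steps that run in the same process. It is only an illustration: PP tools are external programs, so a real measurement would have to be taken at the operating-system level, and nothing here reflects how PEWO collects its figures. The step names and workloads are placeholders.

```python
import time
import tracemalloc
from contextlib import contextmanager

# step name -> {"seconds": wall time, "peak_bytes": peak traced memory}
records = {}


@contextmanager
def measure(step: str):
    """Record wall time and peak traced memory for one (non-nested) step."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        records[step] = {"seconds": elapsed, "peak_bytes": peak}


# Placeholder workloads standing in for two placement steps.
with measure("database construction"):
    db = [i * i for i in range(100_000)]

with measure("placement"):
    hits = sum(x % 7 == 0 for x in db)

for step, r in records.items():
    print(f"{step}: {r['seconds']:.4f} s, peak {r['peak_bytes'] / 1e6:.2f} MB")
```

Keeping one record per step makes it possible to see which step dominates the cost when parameters or input sizes change.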
2.2 Applications
PEWO procedures cover numerous use cases arising with PP, as illustrated by six exemplar applications provided on GitHub (two are reported in Fig. 1B and C). As new PP tools can be incorporated in PEWO, PEWO procedures enable comparing existing and future tools on resource usage, scalability, or accuracy in a reproducible way. With PEWO, users can optimize their PP pipeline design. For instance, for a given reference (tree and alignment), determine which tool and parameter combination will maximize placement accuracy, and at which computational cost. PEWO facilitates such tests, as in Figure 1B, which shows two plots automatically generated by the PAC procedure running PPlacer and RAPPAS for nine and six parameter combinations, respectively.
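To make the notion of parameter combinations concrete, the sketch below expands two small grids into nine and six combinations. The parameter names follow the Figure 1 caption (ms and sb for PPlacer, k and o for RAPPAS), but the values are hypothetical and the helper function is not part of PEWO.

```python
from itertools import product

# Hypothetical values; the grids actually used for Figure 1B are not listed here.
pplacer_grid = {"ms": [3, 6, 9], "sb": [1, 3, 5]}
rappas_grid = {"k": [6, 8, 10], "o": [1.0, 1.5]}


def expand_grid(grid: dict) -> list:
    """Return every combination of the listed parameter values as a dict."""
    names = list(grid)
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


pplacer_runs = expand_grid(pplacer_grid)  # 3 x 3 combinations
rappas_runs = expand_grid(rappas_grid)    # 3 x 2 combinations
```

Each resulting dictionary corresponds to one tool run, so the number of runs grows multiplicatively with every parameter that is varied.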
As a second example, we show how PEWO can be used to compare different genetic markers available for the same taxa, as the choice of the marker may impact the accuracy of placement. We evaluated the placements for four loci (16S, 12S, cox1 and cyt) on their associated phylogeny for 900 Coleopteran mitochondrial genomes (Linard et al., 2018). Figure 1C displays the results (reproducible via GitHub example 4), highlighting that: (i) 12S yields the most accurate placements, despite being the second shortest locus; (ii) the tool achieving the best accuracy depends on the marker; and (iii) with RAPPAS, a larger k-mer size is required to obtain accuracy similar to or better than that of alignment-based methods.
2.3 Availability and implementation
PEWO, with full documentation and example workflows, is freely available at https://github.com/phylo42/PEWO. Its modular, well-documented and evolvable source code enables the community to easily extend it by adding new tools, procedures or metrics. Notably, users can develop their own evaluation procedures by using PEWO Snakemake rules as templates for their own workflows. Any PP tool can be integrated as long as it outputs results in jplace format [a JSON specification, standard in PP, see Matsen et al. (2012)], can be parameterized via the command line, and is available on a conda or pip repository (see the documentation for guidelines).
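For readers unfamiliar with jplace, the following sketch parses a tiny, made-up jplace document and keeps the candidate with the highest likelihood weight ratio for each query. The tree, the numbers and the function name are invented for illustration; only the overall layout (tree, placements, fields, version, metadata) follows the jplace specification.

```python
import json

# A minimal hypothetical jplace document with one query and two candidates.
jplace_text = """
{
  "tree": "((A:0.1{0},B:0.2{1}):0.05{2},C:0.3{3}){4};",
  "placements": [
    {"p": [[1, -1234.5, 0.7, 0.01, 0.02],
           [2, -1236.0, 0.3, 0.03, 0.02]],
     "n": ["query_1"]}
  ],
  "fields": ["edge_num", "likelihood", "like_weight_ratio",
             "distal_length", "pendant_length"],
  "version": 3,
  "metadata": {"invocation": "hypothetical example"}
}
"""


def best_placements(doc: dict) -> dict:
    """Map each query name to its candidate with the highest LWR."""
    fields = doc["fields"]
    best = {}
    for entry in doc["placements"]:
        # Each row of "p" lists values in the order given by "fields".
        rows = [dict(zip(fields, row)) for row in entry["p"]]
        top = max(rows, key=lambda r: r["like_weight_ratio"])
        # Names are stored under "n", or under "nm" as [name, multiplicity] pairs.
        names = entry.get("n") or [name for name, _ in entry.get("nm", [])]
        for name in names:
            best[name] = top
    return best


result = best_placements(json.loads(jplace_text))
print(result["query_1"]["edge_num"], result["query_1"]["like_weight_ratio"])
```

Because the column order is declared in `fields`, reading values by field name rather than by position keeps the parser valid for tools that emit additional or reordered columns.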
3 Conclusion
Reproducibility of computational analyses in the life sciences is a crucial issue, even more so when large-scale data come into play, as in metagenomics. With PEWO, we provide a resource that facilitates the evaluation and comparison of PP tools under a unified framework. It combines flexibility and extensibility with ease of use, and inherits a standardized installation procedure from the conda framework. The set of workflows in PEWO is intended to grow as a community effort, and extensions are welcome. In PEWO, we introduce the LAC procedure, which is complementary to existing procedures (Matsen et al., 2010). PEWO will help the community in its efforts to develop future PP tools and will facilitate experimental decisions when PP is chosen as a means of species identification. With the help of future contributors, we hope that PEWO will evolve into a standard for PP benchmarking and accommodate applications that are not yet foreseen.
Acknowledgements
The authors thank Vincent Lefort for technical assistance, the ATGC bioinformatic platform and the Institut Français de Bioinformatique [ANR-11-INBS-0013].
Funding
This work was supported by France Génomique [ANR-10-INBS-0009] and an MNERT fellowship to N.R.
Conflict of Interest: B.L. is a research scientist at a private company specializing in the use of eDNA for species detection.
References