-
PDF
- Split View
-
Views
-
Cite
Cite
Liisa Holm, Benchmarking fold detection by DaliLite v.5, Bioinformatics, Volume 35, Issue 24, December 2019, Pages 5326–5327, https://doi.org/10.1093/bioinformatics/btz536
- Share Icon Share
Abstract
Protein structure comparison plays a fundamental role in understanding the evolutionary relationships between proteins. Here, we release a new version of the DaliLite standalone software. The novelties are hierarchical search of the structure database organized into sequence based clusters, and remote access to our knowledge base of structural neighbors. The detection of fold, superfamily and family level similarities by DaliLite and state-of-the-art competitors was benchmarked against a manually curated structural classification.
Database search strategies were evaluated using Fmax with query-specific thresholds. DaliLite and DeepAlign outperformed TM-score based methods at all levels of the benchmark, and DaliLite outperformed DeepAlign at fold level. Hierarchical and knowledge-based searches got close to the performance of systematic pairwise comparison. The knowledge-based search was four times as efficient as the hierarchical search. The knowledge-based search dynamically adjusts the depth of the search, enabling a trade-off between speed and recall.
Supplementary data are available at Bioinformatics online.
1 Introduction
Protein folds display a natural clustering due to physical convergence and evolutionary descent from common ancestors. Newly solved protein structures are routinely searched against the database of known structures in the hope of discovering hidden evolutionary relationships (e.g. Valleau et al., 2018) or defining a new fold. DaliLite (Holm and Park, 2000; Holm et al., 2008) is a standalone implementation of the Dali methodology (Holm and Sander, 1993). Here, we explore the trade-off between speed and classification performance using different database search strategies.
The hierarchical strategy systematically tests the query structure against a representative subset of the Protein Data Bank (PDB), which is followed by expansion to sequence neighbors of the highest scoring representatives. The knowledge-based strategy first has to connect the query structure to a sparse graph of pre-computed structural similarities between PDB structures. The initial step uses fast approximate comparison methods with moderate recall. If no structural similarity is found, the strategy falls back to a systematic comparison against a representative subset of the PDB. Once the query has been connected to the graph, the depth of the search is controlled by a priority queue and dynamic adjustment of the structural similarity threshold for following transitive links.
The performance of the different database search strategies is evaluated on a benchmark that uses a manual structural classification, SCOP (Fox et al., 2014), as ground truth. For a given query domain, the evaluated structure comparison programs return a result list of PDB structures (‘hits’) ordered by decreasing structural alignment score. A hit is incorrect, if the structure does not have any domain from the same SCOP fold class as the query. All hits which share a domain from the same SCOP fold class are counted as correct. Correct hits are stratified by difficulty to fold, superfamily and family level targets (Supplementary Fig. S1).
Classification performance is evaluated using Fmax, which balances recall and precision. Calculating Fmax involves scanning an ordered list of hits for the optimal threshold that maximizes the harmonic mean of precision and recall. We compute four variants of Fmax values: using the full PDB or a non-redundant subset (PDB70), and allowing query-specific thresholds or enforcing uniform thresholds in the calculations. Query-specific thresholds suit Dali, because it quantifies structural similarity using a sum-of-pairs score, which has a strong dependence on the size of the query structure. We compute Fmax separately for each query and report the average query-wise Fmax over the benchmark as the main metric. Sometimes the classification task is formulated so as to require a uniform threshold applied to all queries, such as a probability of same-fold membership. We call this version the pooled Fmax. DaliLite v.5 is compared with state-of-the-art competitors (Dong et al., 2018; Wang et al., 2013; Zhang and Skolnick, 2005). The results (Supplementary Fig. S3) show large differences between evaluation metrics and between different measures of structural similarity.
2 Materials and methods
The DaliLite v.5 standalone software supports data import, pairwise alignment, hierarchical clustering (structural dendrograms) and database searches by hierarchical and knowledge-based strategies. The software runs on Linux and requires gfortran, optionally Open MPI, and Perl. Database searches require BLAST (Camacho et al., 2009). Users must download their own copy of the PDB. The knowledge base contains links between similar structures in the PDB. It is maintained by our group and is accessed remotely. Benchmark results (Table 1) used default parameters of DaliLite described in Supplementary Methods.
Program . | Strategy . | Fold . | Superfamily . | Family . |
---|---|---|---|---|
DaliLitea | Systematic | 0.38 (0.36) | 0.83 (0.81) | 0.96 (0.97) |
DaliLitea | Hierarchical | 0.36 (0.31) | 0.82 (0.77) | 0.94 (0.92) |
DaliLitea | Knowledge-based | 0.36 (0.28) | 0.79 (0.73) | 0.96 (0.96) |
DeepAlignb | Systematic | 0.28 (0.25) | 0.78 (0.76) | 0.97 (0.97) |
mTMalignc | Knowledge-based | 0.13 (0.13) | 0.55 (0.55) | 0.91 (0.93) |
TMalignd | Systematic | 0.12 (0.14) | 0.39 (0.42) | 0.85 (0.87) |
Program . | Strategy . | Fold . | Superfamily . | Family . |
---|---|---|---|---|
DaliLitea | Systematic | 0.38 (0.36) | 0.83 (0.81) | 0.96 (0.97) |
DaliLitea | Hierarchical | 0.36 (0.31) | 0.82 (0.77) | 0.94 (0.92) |
DaliLitea | Knowledge-based | 0.36 (0.28) | 0.79 (0.73) | 0.96 (0.96) |
DeepAlignb | Systematic | 0.28 (0.25) | 0.78 (0.76) | 0.97 (0.97) |
mTMalignc | Knowledge-based | 0.13 (0.13) | 0.55 (0.55) | 0.91 (0.93) |
TMalignd | Systematic | 0.12 (0.14) | 0.39 (0.42) | 0.85 (0.87) |
Note: Fmax values for PDB70 (PDB).
This work.
Program . | Strategy . | Fold . | Superfamily . | Family . |
---|---|---|---|---|
DaliLitea | Systematic | 0.38 (0.36) | 0.83 (0.81) | 0.96 (0.97) |
DaliLitea | Hierarchical | 0.36 (0.31) | 0.82 (0.77) | 0.94 (0.92) |
DaliLitea | Knowledge-based | 0.36 (0.28) | 0.79 (0.73) | 0.96 (0.96) |
DeepAlignb | Systematic | 0.28 (0.25) | 0.78 (0.76) | 0.97 (0.97) |
mTMalignc | Knowledge-based | 0.13 (0.13) | 0.55 (0.55) | 0.91 (0.93) |
TMalignd | Systematic | 0.12 (0.14) | 0.39 (0.42) | 0.85 (0.87) |
Program . | Strategy . | Fold . | Superfamily . | Family . |
---|---|---|---|---|
DaliLitea | Systematic | 0.38 (0.36) | 0.83 (0.81) | 0.96 (0.97) |
DaliLitea | Hierarchical | 0.36 (0.31) | 0.82 (0.77) | 0.94 (0.92) |
DaliLitea | Knowledge-based | 0.36 (0.28) | 0.79 (0.73) | 0.96 (0.96) |
DeepAlignb | Systematic | 0.28 (0.25) | 0.78 (0.76) | 0.97 (0.97) |
mTMalignc | Knowledge-based | 0.13 (0.13) | 0.55 (0.55) | 0.91 (0.93) |
TMalignd | Systematic | 0.12 (0.14) | 0.39 (0.42) | 0.85 (0.87) |
Note: Fmax values for PDB70 (PDB).
This work.
3 Results
The hierarchical and knowledge-based database searches are much faster than systematic pairwise comparison with small losses in classification performance (Table 1). The performance of hierarchical search and knowledge-based search drops by only 0.01–0.08 at all levels relative to the baseline given by systematic pairwise comparison. The hierarchical and knowledge-based strategies searched the PDB in 2–3 min per query domain (using 40 parallel processes), yielding an estimated throughput of 500 query domains per day. In terms of CPU time, the knowledge-based search was four times as efficient as the hierarchical search (Supplementary Table S3). The hierarchical search always scales linearly with database size. On the other hand, the knowledge-based walk continues until transitive closure under a Z-score threshold, and it can be made arbitrarily fast by setting a more stringent threshold.
The reference classification, SCOP, is based on visual inspection and manual curation. Thus, the benchmark evaluates to what extent computational methods are able to capture ‘biologically significant’ similarities. DaliLite and DeepAlign outperform TM-score based methods in query-wise Fmax evaluation (Table 1) at all levels of the benchmark. mTMalign has excellent precision at the cost of lower recall (Supplementary Table S2). DaliLite performs better than the other methods at fold level. Pooled Fmax evaluation (Supplementary Table S1) shows lower performance overall with smaller differences between methods (Supplementary Fig. S3). In contrast to TM-score based methods, DaliLite and DeepAlign showed a large difference between query-wise and pooled Fmax evaluation (Supplementary Fig. S3). This means that they recognize structural similarities in agreement with SCOP, but class boundaries occur at different Z-scores (DaliLite) or bitscores (DeepAlign) for different queries. The margin was largest at the superfamily level, the level where the discovery of remote homology can lead to biological insights.
Funding
This work was supported by HiLife, University of Helsinki.
Conflict of Interest: none declared.
References