Elijah A MacCarthy and others, GPU-I-TASSER: a GPU accelerated I-TASSER protein structure prediction tool, Bioinformatics, Volume 38, Issue 6, March 2022, Pages 1754–1755, https://doi.org/10.1093/bioinformatics/btab871
Abstract
Accurate and efficient predictions of protein structures play an important role in understanding their functions. Iterative Threading Assembly Refinement (I-TASSER) is one of the most successful and widely used protein structure prediction methods in the recent community-wide CASP experiments. Yet, the computational efficiency of I-TASSER is one of the limiting factors that prevent its application for large-scale structure modeling.
We present I-TASSER for Graphics Processing Units (GPU-I-TASSER), a GPU-accelerated version of I-TASSER for fast and accurate protein structure prediction. Our implementation uses OpenACC to parallelize the replica-exchange Monte Carlo simulations, extending I-TASSER to the GPU architecture. On a benchmark dataset of 71 protein structures, GPU-I-TASSER achieves on average a 10× speedup with structure prediction accuracy comparable to that of the CPU version of I-TASSER.
The complete source code for GPU-I-TASSER can be downloaded and used without restriction from https://zhanggroup.org/GPU-I-TASSER/.
Supplementary data are available at Bioinformatics online.
1 Introduction
The development of computational methods to accurately model protein structures from sequences is one of the important problems in structural bioinformatics. Protein structure prediction has advanced considerably in recent years, as witnessed in the community-wide CASP experiments (Kryshtafovych et al., 2019, 2021), in which several excellent methods have stood out, including Iterative Threading Assembly Refinement (I-TASSER) (Yang et al., 2015), Rosetta (Rohl et al., 2004), QUARK (Xu and Zhang, 2012), RaptorX (Kallberg et al., 2014) and most recently AlphaFold (Jumper et al., 2021). In I-TASSER, the modeling process starts with template identification from the PDB. Full-length structure models are then constructed by reassembling structural fragments from the template alignments using Replica Exchange Monte Carlo (REMC), which runs n (=40) replicas of the MC system concurrently, each at a different temperature.
REMC is the most time-consuming step in the I-TASSER pipeline and becomes the bottleneck, particularly for large proteins. We have therefore developed a fast version of I-TASSER for Graphics Processing Units (GPU-I-TASSER) that executes the REMC method in parallel by porting its compute-intensive operations to the GPU using OpenACC. The availability of GPU-I-TASSER is likely to facilitate its widespread adoption by the structural bioinformatics community. It will complement existing end-to-end protein structure prediction packages such as AlphaFold2 (Jumper et al., 2021) and RoseTTAFold (Baek et al., 2021), especially when restraints from multiple structural templates need to be imposed for structure prediction.
2 Materials and methods
The I-TASSER pipeline for protein structure prediction consists of three main steps: threading template identification, iterative REMC structure assembly simulation, and clustering and structure refinement (Yang et al., 2015). The REMC step within structure assembly consists of expensive energy calculations, Monte Carlo moves/updates and replica swaps, which together take up approximately 75% of the total computational time. This makes it the ideal target for porting to the GPU. GPUs accelerate calculations by executing the same instructions simultaneously on different data (i.e. the SIMD paradigm). Here the replicas are the multiple data, and the energy calculations, Monte Carlo moves and swaps are the single instruction stream applied to each of them. Reduction kernels are also used for operations that combine results across threads and require synchronization. We describe the porting steps in detail below.
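The data-parallel pattern described above can be illustrated with a minimal Python sketch. It is not the GPU-I-TASSER code: the energy function is a hypothetical toy (sum of squared distances between consecutive points), and the function names are assumptions made for illustration. It shows only the shape of the computation, in which one function is applied independently to every replica and a reduction then combines the per-replica results.

```python
def replica_energy(coords):
    """Toy energy (hypothetical, not the I-TASSER force field):
    sum of squared distances between consecutive points."""
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(coords, coords[1:]):
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return total


def energies_for_all_replicas(replicas):
    # The same instructions are applied to each replica's data. The calls are
    # independent, so on a GPU they could run concurrently instead of in a loop.
    return [replica_energy(coords) for coords in replicas]


def lowest_energy_replica(energies):
    # Reduction: combine the per-replica results into a single index.
    return min(range(len(energies)), key=energies.__getitem__)


# Two hypothetical replicas, each a short chain of 3D points.
replicas = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0)],
]
energies = energies_for_all_replicas(replicas)
best = lowest_energy_replica(energies)
```

Because no replica's energy depends on another's, the per-replica work needs no synchronization; only the final reduction does.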
2.1 Energy computations
The knowledge-based force field of I-TASSER consists of three major components: (i) general statistical potential, (ii) hydrogen-bonding energy and (iii) threading-based restraints from LOMETS (Zheng et al., 2019). Computing these three components takes a substantial amount of time. Hence, we evaluate each component on the GPU and use its massive number of threads to speed up the computation. The data needed by a compute region are copied onto the device (GPU) before the compute kernel is invoked, and we define a data region that spans several compute regions to prevent frequent data transfers between the host (CPU) and the device (GPU).
2.2 Monte Carlo updates
Several Monte Carlo moves are attempted in the REMC steps of I-TASSER. These updates/moves are accepted or rejected based on the Metropolis criterion (Metropolis et al., 1953). Each move is executed in parallel on the GPU by invoking OpenACC parallel and kernels regions for its compute portions, with each data region spanning several compute regions. This keeps data on the device across kernels and reduces the time spent on host-device communication.
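For readers unfamiliar with the Metropolis criterion, a minimal Python sketch of the standard acceptance rule follows. It assumes energies expressed in units where the Boltzmann constant is 1. The function name and signature are hypothetical and are not taken from the I-TASSER source.

```python
import math
import random


def metropolis_accept(delta_energy, temperature, rng=random):
    """Decide whether to accept a move that changes the energy by delta_energy.

    Assumes k_B = 1, so temperature and energy share the same units.
    """
    if delta_energy <= 0.0:
        # Moves that lower (or keep) the energy are always accepted.
        return True
    # Uphill moves are accepted with probability exp(-dE / T).
    return rng.random() < math.exp(-delta_energy / temperature)


# Usage: estimate the acceptance rate of an uphill move with dE = T = 1.
rng = random.Random(0)
trials = 10000
accepted = sum(metropolis_accept(1.0, 1.0, rng) for _ in range(trials))
acceptance_rate = accepted / trials  # should be close to exp(-1)
```

Higher temperatures accept uphill moves more often, which is why REMC runs replicas over a range of temperatures: the hot replicas cross energy barriers while the cold ones refine low-energy conformations.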
2.3 Replica exchange
In the REMC simulation of I-TASSER, replica exchange takes place after a set of Monte Carlo updates. The swaps by themselves do not require significant time. However, when multiple replicas are simulated and each swap requires an energy comparison between the replicas in the pair, the computational time becomes significant. To lower the computational time for replica exchange, we follow the approach of Gross et al. (2011): each replica is allocated its own compute region involving multiple thread blocks, and updates from each replica are made available to the host. The Metropolis criterion is then used to assess whether the exchange/swap should be accepted.
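The swap decision can be sketched with the standard replica-exchange acceptance probability, min(1, exp[(1/T_i - 1/T_j)(E_i - E_j)]), again assuming k_B = 1. This is a generic textbook formulation, not code from GPU-I-TASSER. The function names, the neighbour-pair sweep and the example temperatures are assumptions made for illustration.

```python
import math
import random


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of exchanging the conformations held at
    temperatures temp_i and temp_j (assumes k_B = 1)."""
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swaps(energies, temperatures, rng=random):
    """Sweep over neighbouring temperatures and attempt one exchange per pair.

    energies[k] is the energy of the conformation currently held at
    temperatures[k]; swapping two entries stands for swapping conformations.
    """
    energies = list(energies)
    for i in range(len(energies) - 1):
        p = swap_probability(energies[i], energies[i + 1],
                             temperatures[i], temperatures[i + 1])
        if rng.random() < p:
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
    return energies


# Usage with two hypothetical replicas: the colder replica (T = 1) holds the
# higher-energy conformation, so the exchange is accepted with probability 1.
swapped = attempt_swaps([10.0, 5.0], [1.0, 2.0])
```

Each pair needs only two energies and two temperatures, so the comparison itself is cheap. The cost noted above comes from having the energies of all replicas up to date when the swaps are attempted.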
3 Results
We performed a comparative analysis of GPU-I-TASSER and CPU-I-TASSER (original I-TASSER) using a benchmark dataset of 71 proteins (Supplementary Table S1). Running on a Tesla P100 PCIe GPU, GPU-I-TASSER achieved an average speedup of 10.27× over the CPU version running on an Intel Xeon E5-2680v3 processor (Fig. 1). Comparing the predicted models with the native structures (Supplementary Table S2) shows that GPU-I-TASSER predicts structures with accuracy comparable to that of I-TASSER in terms of TM-score (Zhang and Skolnick, 2004). We anticipate that GPU-I-TASSER will be useful in large-scale protein structure modeling applications in which the speed of the pipeline is essential. The complete source code is available from https://zhanggroup.org/GPU-I-TASSER/.
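As background on the accuracy metric, the TM-score of Zhang and Skolnick (2004) can be sketched in Python as below. The sketch assumes the model has already been superimposed on the native structure and that the distances between aligned residue pairs are given in Å. The full TM-score additionally searches for the superposition that maximizes the score, which is omitted here. The function name and the input check are assumptions made for illustration.

```python
def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distances in angstroms between aligned residue pairs.
    target_length: number of residues in the native (target) structure.
    """
    if target_length <= 18:
        # For very short chains the length-dependent scale d0 below is not
        # positive; this sketch simply rejects such inputs.
        raise ValueError("this sketch requires target_length > 18")
    # Length-dependent distance scale d0.
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Usage: a hypothetical 100-residue model that matches the native exactly.
perfect = tm_score([0.0] * 100, 100)
```

The score lies in (0, 1], and each aligned pair contributes less the further apart its two residues are. Comparing the TM-scores of GPU and CPU models against the same native structure is therefore a direct check that acceleration did not degrade accuracy.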
Fig. 1. Average execution time (in seconds) for GPU-I-TASSER and CPU-I-TASSER (original I-TASSER) for various protein sizes, binned according to the number of protein residues.
Acknowledgement
The authors thank Mathew Colgrove of Nvidia for his tremendous help on this project and Dr Sunita Chandrasekaran for discussion.
Funding
This work was supported in part by the National Science Foundation [DBI1564606 to D.K., IIS1901191, DBI2030790, MTM2025426 to Y.Z.]; the National Institute of General Medical Sciences [GM136422, S10OD026825 to Y.Z.] and the National Institute of Allergy and Infectious Diseases [AI134678 to Y.Z.].
Conflict of Interest: none declared.
