I-TASSER server: new development for protein structure and function predictions

The I-TASSER server (http://zhanglab.ccmb.med.umich.edu/I-TASSER) is an online resource for automated protein structure prediction and structure-based function annotation. In I-TASSER, structural templates are first recognized from the PDB using multiple threading alignment approaches. Full-length structure models are then constructed by iterative fragment assembly simulations. Finally, functional insights are derived by matching the predicted structure models with known proteins in the function databases. The server has been widely used for various biological and biomedical investigations, and the user community has reported numerous comments and suggestions. In this article, we summarize recent developments on the I-TASSER server, which were designed to address the requirements of the user community and to increase the accuracy of modeling predictions. The focus is on new methods for atomic-level structure refinement, local structure quality estimation and biological function annotation. We expect that these new developments will improve the quality of the I-TASSER server and further facilitate its use by the community for high-resolution structure and function prediction.

We randomly selected half of the proteins for training and the remainder for testing. I-TASSER was used to generate structure predictions for the 635 test proteins, which are non-redundant with the other 635 proteins used to train ResQ. Of the test proteins, 483 are categorized as Easy and 152 as Hard targets according to the significance score of the LOMETS alignments (3). After excluding homologous templates with a sequence identity >30% from the template library, the I-TASSER simulations generated first models with an average TM-score of 0.71 and an RMSD of 5.7 Å; these results are largely consistent with those of I-TASSER modeling in the recent CASP experiments (4,5).

Assessment criteria of the residue-specific quality and B-factor profile predictions
Three measures are used to evaluate the accuracy of the residue-specific quality (RSQ) prediction. The first is the Pearson's correlation coefficient (PCC) between the predicted ($d_p$) and observed ($d_o$) distances of the model to the native structure. The second is the area under the curve (AUC) of the receiver operating characteristic (ROC), which evaluates the ability of ResQ to discriminate between well and badly modeled regions; a residue is defined as 'well modeled' (positive) if its distance from the model to the native structure is <3.8 Å after the TM-score superposition, and as 'badly modeled' (negative) otherwise. These two metrics were also used by the CASP assessors for evaluating the accuracy of model quality estimation (6). Following Kryshtafovych et al., we converted $d_p$ into the range (0, 1) by $d_p' = 1/[1 + (d_p/5)^2]$ in the AUC calculation, so that a fixed number of divisions can be used to draw the ROC curves for different data samples.
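The third metric for evaluating the RSQ prediction is the average absolute difference (∆d) between $d_p$ and $d_o$, i.e.
$$\Delta d = \frac{1}{L}\sum_{i=1}^{L}\left|d_p^{i} - d_o^{i}\right|$$
where L is the length of the protein and the superscript i indexes the residues.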
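The three RSQ measures can be sketched in a few lines of Python. This is an illustrative sketch, not the ResQ code: the function names are hypothetical, ∆d is taken as the mean absolute difference, and the AUC is computed with the rank-based (Mann-Whitney) formulation instead of the fixed-division ROC curve described in the text. Because that formulation depends only on the ordering of the scores, the conversion of $d_p$ to (0, 1) is kept only to mirror the text.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def to_unit_score(d_p):
    """Map a predicted distance (in Angstrom) to (0, 1]: 1 / [1 + (d_p / 5)^2]."""
    return 1.0 / (1.0 + (d_p / 5.0) ** 2)


def auc(scores, labels):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5).

    Assumes both classes are present in `labels`.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mean_abs_diff(d_pred, d_obs):
    """Delta-d: mean absolute difference over the L residues of one protein."""
    return sum(abs(p - o) for p, o in zip(d_pred, d_obs)) / len(d_pred)


def rsq_metrics(d_pred, d_obs, cutoff=3.8):
    """Return (PCC, AUC, delta_d) for one model."""
    labels = [o < cutoff for o in d_obs]          # 'well modeled' = positive
    scores = [to_unit_score(p) for p in d_pred]   # a larger score means a smaller predicted error
    return pearson(d_pred, d_obs), auc(scores, labels), mean_abs_diff(d_pred, d_obs)
```

Calling `rsq_metrics(d_pred, d_obs)` with the per-residue predicted and observed distances of one model returns the three values for that model; per-protein values could then be averaged over the test set.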
The B-factor profile (BFP) prediction is evaluated by the Pearson's correlation between the predicted and the experimental B-factors, which was also used in previous B-factor prediction studies (7,8).
As in the RSQ evaluation, we also use the AUC to measure the ability to discriminate between stable and flexible residues in structures, where a residue is defined as stable (positive) if its normalized B-factor is below 0 and as flexible (negative) otherwise. Similarly, to allow even ROC divisions, we renormalized the predicted B-factor values ($b$) to the range (0, 1) by $1/[1 + \exp(-b)]$.
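The stable/flexible labeling and the logistic rescaling can be sketched in the same way. Several points here are assumptions and not statements from the text: "normalized" is read as a per-protein z-score, the population standard deviation is used, and, because a larger predicted B-factor indicates a more flexible residue, the score for the positive (stable) class is taken as one minus the logistic value. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore(values):
    """Per-protein normalization (assumed): subtract the mean, divide by the std."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def logistic(b):
    """Rescale a normalized B-factor to (0, 1): 1 / [1 + exp(-b)]."""
    return 1.0 / (1.0 + math.exp(-b))


def auc(scores, labels):
    """Rank-based AUC (ties count 0.5); assumes both classes are present."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bfactor_auc(b_pred_norm, b_exp_raw):
    """AUC for separating stable (positive) from flexible (negative) residues."""
    labels = [b < 0 for b in zscore(b_exp_raw)]            # stable: normalized B-factor < 0
    stable_scores = [1.0 - logistic(b) for b in b_pred_norm]  # assumed orientation
    return auc(stable_scores, labels)
```

Here `b_pred_norm` holds predicted normalized B-factors and `b_exp_raw` the experimental values for the same residues.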

Test results of residue-specific quality prediction
ResQ was applied to the first I-TASSER models of the 635 test proteins to estimate the distance of each residue from the native structure. As shown in Table S1, the average distance predicted by ResQ ($d_p$) is 3.4 Å, which is lower than the average observed distance ($d_o$) between the residues in the models and those in the native structures (4.3 Å); the average difference between $d_p$ and $d_o$ is ∆d=2.4 Å. This systematic underestimation of the distance to the native structure is mainly due to underestimated distances for residues with large modeling errors (1).
We further split the test proteins into two groups according to the I-TASSER confidence score (C-score), i.e. high- and low-confidence groups with a C-score above or below -1.5, a cutoff that was shown to give the lowest false-positive and false-negative rates for I-TASSER modeling (9). As expected, the I-TASSER models with a higher C-score have much better quality (TM-score=0.8) than those with a lower C-score (TM-score=0.4). Accordingly, the RSQ prediction for the high C-score proteins is much more accurate (∆d=1.4 Å) than that for the low C-score proteins (∆d=6.4 Å), and the average PCC and AUC are 30% and 14% higher, respectively, for the high C-score models than for the low C-score models (Table S1). In Tables S2-S4, we also list the results of ResQ on the CASP9 and CASP10 models in comparison with the top-performing model quality assessment programs (MQAPs). These data show that ResQ outperforms most of the MQAP methods in the local quality estimation of protein structure predictions.
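The C-score stratification amounts to a threshold split followed by per-group averaging. The sketch below assumes each test protein is stored as a dictionary with hypothetical keys `c_score`, `tm_score` and `delta_d`; a C-score exactly equal to the cutoff is assigned to the low-confidence group, a case the text does not specify.

```python
from statistics import mean

C_SCORE_CUTOFF = -1.5  # cutoff quoted in the text


def split_by_cscore(records, cutoff=C_SCORE_CUTOFF):
    """Split per-protein records into (high-confidence, low-confidence) lists."""
    high = [r for r in records if r["c_score"] > cutoff]
    low = [r for r in records if r["c_score"] <= cutoff]  # tie handling is an assumption
    return high, low


def group_mean(records, key):
    """Average one per-protein quantity (e.g. 'tm_score' or 'delta_d') over a group."""
    return mean(r[key] for r in records)
```

For example, `high, low = split_by_cscore(records)` followed by `group_mean(high, "delta_d")` and `group_mean(low, "delta_d")` gives the kind of per-group comparison reported in Table S1.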

Results of B-factor prediction
Three ResQ approaches were tested for generating B-factor predictions. The template-based prediction is generated by transferring the B-factors of the template proteins detected by threading, while the profile-based prediction is obtained by training on the BFP data with the sequence profile generated from the PSI-BLAST search. The third, combination-based approach trains the BFP predictor on a combination of threading templates and sequence profiles. A summary of the PCC and AUC between the observed and predicted B-factors for the three approaches is listed in Table S5.
The profile-based approach generated a slightly higher PCC value (0.59) than the template-based approach (0.54), while the combination of the threading templates and sequence profiles achieved the highest PCC (0.61). The difference between the profile-based and combined methods is statistically significant, with a p-value from the Student's t-test below $10^{-12}$. The AUC assessment follows a similar trend, with the combined prediction outperforming both the template-based and profile-based prediction methods.
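A significance test of this kind can be sketched as follows. The text does not say whether the t-test was paired; since both methods are evaluated on the same test proteins, the sketch assumes a paired test on per-protein PCC values. The function name is hypothetical, and only the t statistic and degrees of freedom are computed; the p-value would come from the t distribution with those degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b.

    Assumes len(a) == len(b) >= 2 and that the differences are not all identical.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))  # stdev is the sample std (n - 1)
    return t, n - 1
```

With `a` holding the per-protein PCC of the combined method and `b` that of the profile-based method, a large positive t indicates that the combined method is better on average.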