Abstract

Motivation

Single-residue mutations may reshape the binding affinity of protein–protein interactions, so predicting their effects is of great interest in biotechnology and biomedicine. Unfortunately, experimental data on binding affinity changes upon mutation are limited, which hampers the development of new and more precise algorithms. Here, we propose UEP, a classifier for predicting beneficial and detrimental mutations in protein–protein complexes, trained on interactome data.

Results

Despite the simplicity of the UEP algorithm, which is based on a three-body contact potential derived from interactome data, we report results that are competitive with the gold-standard methods in this field, with the advantage of a shorter computational time. Moreover, we propose a consensus selection procedure that combines three predictors that showed higher classification accuracy in our benchmark: UEP, pyDock and EvoEF1/FoldX. Overall, we demonstrate that the analysis of interactome data allows the impact of protein–protein mutations to be predicted using UEP, a fast and reliable open-source code.

Availability and implementation

The UEP algorithm is available at: https://github.com/pepamengual/UEP.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Almost all biological processes in the cell are governed by highly selective protein–protein interactions (PPIs), which are especially sensitive to non-synonymous nucleotide mutations; a single residue substitution can increase or decrease the affinity of the whole protein–protein complex. The evaluation of the impact of mutations has been the subject of significant interest at different levels. Recent studies highlight that disease-related mutations are over-represented at protein–protein interfaces, often perturbing the integrity of PPIs (David et al., 2015; Navío et al., 2019; Sahni et al., 2015; Yates and Sternberg, 2013). Moreover, predicting the effects of mutations is becoming a crucial step in protein engineering, and it is especially relevant for the design of PPIs such as potent antibody variants (Diskin et al., 2011; Rudicell et al., 2014; Warszawski et al., 2019).

From the computational side, prediction has been addressed by diverse approaches aiming to describe the affinity changes caused by mutations (ΔΔG). Rigorous approaches (such as alchemical free energy estimators) have the potential to be accurate, but at the expense of a high computational demand, so they are typically not used in screening efforts. The most widely used approaches can be classified as physical energy descriptors, statistical potentials, sequence conservation, shape complementarity and, more recently, machine learning-based techniques. Some of the methods considered state-of-the-art (and the ones evaluated in this work) are: pyDock (Cheng et al., 2007; Jiménez-García et al., 2013), FoldX (Guerois et al., 2002; Schymkowitz et al., 2005), EvoEF1 (Pearce et al., 2019), EvoEF2 (Huang et al., 2020), PRODIGY (Vangone et al., 2015; Xue et al., 2016), BeAtMuSiC (Dehouck et al., 2013) and mCSM (Pires et al., 2014). pyDock uses electrostatics and desolvation energy to perform ΔG calculations, and it has classically been linked to the scoring of docking poses. The solid performance of pyDock in the CASP13-CAPRI experiment (Lensink et al., 2019) motivated its evaluation in this work. FoldX offers a platform for modelling mutations and predicts the binding affinity of the complex through a linear combination of multiple energy terms weighted against experimental determinations. For the energy calculation, FoldX accounts for van der Waals contributions, solvation of polar and apolar groups, hydrogen bonds, electrostatic contributions of charged groups and entropic cost penalties. Both EvoEF1 and EvoEF2 also offer a platform for modelling mutations. The former is recommended for ΔΔG predictions, while the latter is recommended for de novo protein design. The EvoEF scoring function accounts for van der Waals energy, electrostatic interactions between partially charged atoms, hydrogen-bonding interactions, desolvation energy and a reference energy of the unfolded state ensemble.
PRODIGY predicts the binding affinity from the polarity features (charged, polar and apolar) of inter-residue contacts, corrected by properties of the non-interacting surface. BeAtMuSiC is only accessible via a web server, and it is based on statistical potentials describing correlations between amino acids, pairwise inter-residue distances, backbone torsion angles and solvent accessibility. mCSM is also only accessible through a web server, and it is a machine learning-based approach that uses graph-based signatures of the protein environment to predict changes in binding affinity caused by mutations. Hence, there are multiple ways to predict the effects of mutations in protein–protein complexes, and each approach has particular limitations linked to its fundamental basis. For instance, physical energy descriptors mainly rely on force field parameterization and accurate structural modelling of the mutation, while machine learning-based techniques depend on the availability of data (and often suffer in de novo predictions). Despite these limitations, these methods are regarded as the state of the art in this field because they offer a more realistic speed-accuracy trade-off than rigorous approaches. However, it has been reported that their overall performance is still limited (Geng et al., 2019a,b), which highlights the complexity of the prediction.

A major limitation for evaluating and developing new algorithms for predicting changes in binding affinity is the lack of (reliable) experimental data. Classically, the most widely used technique to experimentally determine the importance of residues in protein–protein binding is alanine scanning mutagenesis, which aims to identify side chains that are important for binding. This is reflected in the largest experimental database of free energy changes on mutation for structurally solved PPIs, SKEMPI 2.0 (Jankauskaitė et al., 2019), where more than half (55.5%) of single point mutations are to alanine. However, mutations to alanine do not (generally) provide information about how interactions could be improved, as saturation mutagenesis would; a mutation to alanine often reduces the affinity of the complex. Thus, for the development of algorithms aiming to improve the affinity of protein–protein complexes, one should probably use a small subset of the experimental determinations (those with mutations to residues other than alanine). Experimental data on free energy changes on mutation in SKEMPI 2.0 reveal that only a small number of mutations increase the binding affinity of the complex, which is often the main goal in PPI engineering. A screening procedure aiming to design an enhanced PPI requires the evaluation of all possible mutations in the protein–protein interface (from hundreds to thousands, depending on the interface size), and this step can be time-consuming and costly. Therefore, a fast and open-source predictive model for classifying mutations as improving or deleterious would be beneficial for this purpose, since it would reduce the number of candidates to evaluate further with expensive simulations.

Recent advances in solving crystal structures, together with the improvement of template-based homology modelling algorithms, have allowed the building of large PPI networks such as Interactome3D (Mosca et al., 2013). Those networks implicitly contain the intrinsic rules for protein–protein binding at a three-dimensional level, and their analysis should provide hints for predicting the effects of mutations on binding affinity. Following this assumption, we developed UEP, an open-source and fast classifier for predicting the impact of mutations in protein–protein complexes. The UEP algorithm consists of a simple but efficient three-body contact potential derived from the interactions in Interactome3D. Our results are comparable to those from the most competitive state-of-the-art methods, with the advantage of being much faster. Being able to accurately screen hundreds of mutations per second is relevant for pre-screening purposes in industry and at the biomedical level. Importantly, we propose a pipeline for boosting the predictive performance at the expense of more computational resources (not much more than those needed to run a single state-of-the-art method used in this work). This consists of making decisions based on the consensus and unanimous selection of three of the top-scoring methods: UEP, pyDock and EvoEF1/FoldX. Overall, we demonstrate that the impact of mutations in a protein–protein interface can be predicted by analysing the interactions observed in interactome data, which opens the door to the design of new, fast and precise algorithms that do not rely on experimental mutation data.

2 Materials and methods

2.1 Interactome3D data summary

The representative Interactome3D database (release 2019-01) was used for developing UEP. This release contains 33 607 unique three-dimensional protein–protein complexes from 18 species of different kingdoms: Plant (1), Eubacteria (8), Animal (6), Fungi (2) and Protist (1). Almost half of the database is composed of experimental structures (44.2%, 14 862 complexes), followed by homology models constructed from generic PDB templates (35.0%, 11 776 complexes) and from domain–domain structural templates (20.8%, 6969 complexes). Interactome3D contains protein–protein complexes that are also present in SKEMPI 2.0. To avoid this redundancy, we discarded the protein–protein complexes with high sequence similarity to those in SKEMPI 2.0. To this end, we performed global pairwise alignments of all proteins between both databases, using the BLOSUM62 matrix and the Needleman–Wunsch algorithm. Sequence identity analysis of those alignments revealed that most of the Interactome3D complexes (31 736 of 33 607) share less than 30% identity with SKEMPI 2.0 proteins (Supplementary Fig. S1). This identity threshold was used to select the training complexes for the UEP contact matrix. Hence, a total of 13 773 experimental structures, 11 044 models from generic PDB templates and 6919 models from domain–domain structural templates were selected.
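The identity filter described above can be illustrated with a minimal sketch. It implements a plain Needleman–Wunsch global alignment, but with a simple match/mismatch score standing in for BLOSUM62 and a linear gap penalty; the function names, the scoring values and the definition of identity (identical positions divided by alignment length) are assumptions made for illustration, not details taken from the UEP code.

```python
def global_identity(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns sequence identity in [0, 1].

    Assumption: identity = identical aligned positions / alignment length.
    A toy match/mismatch score is used here in place of BLOSUM62.
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning seq_a[i-1] with seq_b[j-1].
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in seq_b
                score[i][j - 1] + gap,            # gap in seq_a
            )

    # Trace back one optimal alignment, counting columns and identical pairs.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            matches += seq_a[i - 1] == seq_b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return matches / columns if columns else 0.0


def passes_identity_filter(candidate, reference_seqs, threshold=0.30):
    """Keep a candidate only if it is below the threshold against every reference."""
    return all(global_identity(candidate, ref) < threshold for ref in reference_seqs)
```

With toy sequences, `global_identity("ACDE", "ACE")` aligns three identical residues over four columns, so a candidate like this would be rejected by the 30% filter.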

2.2 Integrating interactome data

The UEP training process consists of creating a predictive protein–protein contact matrix (Fig. 1). We designed the algorithm so that it is trained only on the highly packed interface residues of Interactome3D. To locate them, we applied two restrictions to each protein–protein complex: (i) we used a heavy atom cut-off distance of 5 Å between proteins to determine interface residues, and (ii) we filtered out the interface residues that are not in contact with at least two residues of the other protein. In this way, the UEP contact network follows a three-body scheme, where one residue of one protein is in contact with a pair of residues of the other protein. If a residue has more than two contacts, all possible pairs of contacting residues (without repetition and regardless of order) are enumerated and added to the contact matrix. For example, if (protein A: ALA 50) is in contact with (protein B: SER 21, TRP 24 and HIS 25), the contacts added to the contact matrix for ALA are SER-TRP, SER-HIS and TRP-HIS. We have observed that this approach shows better prediction performance than using pairwise contacts (Supplementary Tables S1 and S2).
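The pair enumeration in the ALA 50 example can be sketched in a few lines. This is only an illustration of the three-body bookkeeping; the function name, the key layout and the use of a `Counter` as the contact matrix are assumptions, not the data structures of the UEP code.

```python
from collections import Counter
from itertools import combinations


def three_body_contacts(residue, partner_contacts):
    """Enumerate unordered pairs of partner residues in contact with `residue`.

    Returns keys of the form (residue, (partner_1, partner_2)), with the pair
    sorted so that order does not matter. Residues with fewer than two
    contacts yield no three-body contact.
    """
    if len(partner_contacts) < 2:
        return []
    return [
        (residue, tuple(sorted(pair)))
        for pair in combinations(partner_contacts, 2)
    ]


# Accumulate observed three-body contacts (hypothetical contact matrix).
contact_matrix = Counter()
contact_matrix.update(three_body_contacts("ALA", ["SER", "TRP", "HIS"]))
```

For the example in the text, the three keys added are ALA with SER-TRP, SER-HIS and TRP-HIS (stored here in alphabetical order within each pair).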

Fig. 1. UEP scheme. UEP algorithm is trained on the interactome data from the representative set of Interactome3D complexes. Three-dimensional complexes of the database were scanned to feed UEP contact matrix. UEP evaluates the mutant suitability for a given protein–protein structure by scanning the highly packed interface residues as observed in the interactome data.

2.3 Evaluating and classifying mutations

The UEP algorithm was designed to evaluate highly packed residues located in the protein–protein interface. Hence, positions that are not in contact with at least two residues of the other protein (within a heavy atom cut-off distance of 5 Å) cannot be statistically evaluated by UEP. Although this is a limitation, our approach follows the logical assumption that positions lacking contacts should (i) have a smaller influence on the ΔΔG on mutation and (ii) be more challenging to predict. In fact, we observed both phenomena: (i) highly packed residues have larger effects on ΔΔG than non-highly packed ones (Table 1), and (ii) non-highly packed residues are difficult to predict for all methods (Supplementary Tables S3–S6). These observations support the idea that positions with a minimum number of contacts have larger effects on ΔΔG on mutation and hence are easier to predict correctly. Moreover, we aimed to design an algorithm that does not require modelling mutations, which is often a time- and memory-consuming step for physical energy-based predictors. The number of contacts made by each amino acid strongly depends on its size, and therefore UEP accounts for a gain/loss of contacts by taking into account changes in amino acid size. Importantly, predictions cannot be performed if the wild-type or the mutant residue is predicted to have fewer than two amino acid contacts. After identifying the protein contacts of the wild-type and mutant residues, UEP constructs the three-body scheme described in the previous section. Then, the observed counts in Interactome3D for each three-body contact are extracted from the pre-trained UEP matrix and taken into account. Finally, those scores are normalized by the frequency of finding such residues in the UEP matrix. Thus, a potential ΔΔG_UEP can be obtained from the ratio of the mutant and wild-type scores (Eq. 1), as described in Moal et al. (2013).
Importantly, UEP does not perform any normalization based on physicochemical properties of the amino acids, such as hydrophobicity, polarity or charge. We expected that the network would implicitly capture them through the observed contact frequencies.

Table 1. Summary of ΔΔG changes in SKEMPI 2.0

| Mutation type | Alanine | Alanine | Other than alanine | Other than alanine |
|---|---|---|---|---|
| Highly packed interface | YES | NO | YES | NO |
| Increase ΔΔG (kcal/mol) | −0.79 | −0.44 | −2.24 | −1.28 |
| Decrease ΔΔG (kcal/mol) | 2.84 | 1.23 | 3.63 | 1.72 |

Note: Average ΔΔG changes upon mutation depending on: (i) mutation nature (to alanine or to other than alanine) and (ii) if they are placed in the highly packed interface.

2.4 SKEMPI 2.0 data summary

SKEMPI 2.0 is one of the largest databases of changes in protein–protein binding energy on mutation for which a structure of the complex has been solved and is publicly available in the Protein Data Bank. All single mutations with no discrepancies in their experimental binding energies were considered (i.e. all determinations for the same mutation report a binding energy that is consistently higher, or consistently lower, than that of the wild-type counterpart). In total, 2103 mutations to alanine and 1762 mutations to residues other than alanine were evaluated. Since UEP performs a statistical analysis of the highly packed interface, positions with fewer than two contacts cannot be scored. Hence, UEP scored 985 mutations to alanine and 1251 mutations to residues other than alanine, representing 264 out of 345 protein–protein complexes in SKEMPI 2.0. As mentioned in the previous section, mutations located in a highly packed interface have a larger impact on the experimental binding affinity. A summary of the SKEMPI 2.0 mutations can be found in Table 1. As can be observed, mutations to alanine that increase the binding energy have larger effects at highly packed positions (on average −0.79 kcal/mol) than at non-highly packed ones (on average −0.44 kcal/mol). The same pattern is found for mutations to residues other than alanine (−2.24 kcal/mol for highly packed and −1.28 kcal/mol for non-highly packed positions). The same trend was also observed for mutations decreasing the binding energy, where mutations located in the highly packed interface have larger effects on the ΔΔG (2.84 kcal/mol and 1.23 kcal/mol for alanine; 3.63 kcal/mol and 1.72 kcal/mol for mutations other than alanine). Moreover, mutations to residues other than alanine have a larger impact on the experimental ΔΔG than mutations to alanine.
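The consistency criterion for repeated measurements can be sketched as follows. This is a minimal illustration under stated assumptions: the input format (pairs of a mutation identifier and an experimental ΔΔG), the function name and the handling of exactly zero values (treated here as inconsistent with either sign) are hypothetical and not taken from the source.

```python
from collections import defaultdict


def consistent_mutations(measurements):
    """Keep mutations whose experimental ddG values all share the same sign.

    `measurements` is an iterable of (mutation_id, ddg_exp) pairs; a mutation
    may appear several times. Assumption: a value of exactly 0 matches
    neither sign, so such mutations are dropped.
    """
    by_mutation = defaultdict(list)
    for mutation_id, ddg in measurements:
        by_mutation[mutation_id].append(ddg)

    kept = {}
    for mutation_id, values in by_mutation.items():
        all_negative = all(v < 0 for v in values)
        all_positive = all(v > 0 for v in values)
        if all_negative or all_positive:
            kept[mutation_id] = values
    return kept
```

For example, a hypothetical mutation measured as 1.2 and 0.8 kcal/mol would be kept, while one measured as −0.3 and 0.5 kcal/mol would be discarded.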

2.5 Benchmarking SKEMPI 2.0 subset

A total of 985 mutations to alanine and 1251 mutations to residues other than alanine from SKEMPI 2.0 (see previous section) were benchmarked using UEP, five physical energy predictors (pyDock, FoldX, EvoEF1, EvoEF2 and PRODIGY), a statistical potential method (BeAtMuSiC) and a machine learning-based one (mCSM). We used the standalone versions when available; otherwise, we used the web server version. The five physical energy predictors require explicit structural models to make predictions, while the others (UEP, BeAtMuSiC and mCSM) do not. Physical energy predictors estimate the ΔG of a given complex, and the ΔΔG is obtained as the difference between the predicted ΔGs of the wild-type and the mutant. Hence, two predictions must be performed to obtain the ΔΔG: one for the wild-type and another for the mutant structure. FoldX and EvoEF1/2 are the methods that offer a set of tools for modelling mutations. Hence, the predictors lacking this modelling step (pyDock and PRODIGY) were evaluated using the structures generated by FoldX and EvoEF1. BeAtMuSiC and mCSM are specifically devoted to evaluating the impact of mutations and provide ΔΔG scores directly, without generating mutant structures.

Performance of the predictions was evaluated following two main criteria: (i) the ability to classify mutations as improving or decreasing the binding energy, and (ii) the ability to correlate with the experimental determinations. From the classification point of view, predictors were compared by examining their confusion matrices. The experimental and predicted thresholds used to construct the confusion matrices were ΔΔG_exp = 0 kcal/mol and ΔΔG_pred = 0 kcal/mol. These thresholds would be the most common ones in any experimental set-up aiming to decide qualitatively on the effects of a mutation. In the confusion matrices, rows represent the predicted class instances (P+ and P−) and columns represent the experimental conditions (C+ and C−). Each cell of the confusion matrix reports the number of True Positives (TP), False Positives (FP), True Negatives (TN) or False Negatives (FN). This allows the calculation of multiple statistical descriptors: Positive Predictive Value (PPV) (Eq. 2), Negative Predictive Value (NPV) (Eq. 3), True Positive Rate (TPR) (Eq. 4) and True Negative Rate (TNR) (Eq. 5). Moreover, the Matthews Correlation Coefficient (MCC) was computed to compare the global prediction performance (Eq. 6). The MCC is widely used to evaluate classifiers on unbalanced data, where the number of positive entries differs significantly from the number of negative ones. Thus, the MCC is a balanced measure of the overall predictive power here, since only 122 out of 985 (mutations to alanine) and 298 out of 1251 (mutations other than alanine) mutations increase the binding affinity compared to their wild-type counterparts. From the correlation point of view, we computed the Pearson Correlation Coefficient (PCC) and the Root Mean Square Error (RMSE).
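(1) ΔΔG_UEP, obtained from the ratio of the normalized mutant and wild-type scores (see Section 2.3)

$$\mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \tag{2}$$

$$\mathrm{NPV} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FN}} \tag{3}$$

$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \tag{4}$$

$$\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \tag{5}$$

$$\mathrm{MCC} = \frac{\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \tag{6}$$

Equations accounting for: (1) ΔΔG_UEP, (2) Positive Predictive Value (PPV), (3) Negative Predictive Value (NPV), (4) True Positive Rate (TPR), (5) True Negative Rate (TNR) and (6) Matthews Correlation Coefficient (MCC).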
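The classification descriptors can be computed directly from paired experimental and predicted ΔΔG values, as in the following sketch. It uses the standard definitions of PPV, NPV, TPR, TNR and MCC. The function names are hypothetical, and the sign convention (a mutation counts as positive, i.e. improving, when ΔΔG is below the 0 kcal/mol threshold, following the negative "increase" values in Table 1) and the treatment of values exactly at the threshold as negative are assumptions.

```python
from math import sqrt


def confusion_counts(ddg_exp, ddg_pred, threshold=0.0):
    """Return (TP, FP, TN, FN) for paired experimental and predicted ddG values.

    Assumption: ddG < threshold means the mutation improves binding (positive).
    """
    tp = fp = tn = fn = 0
    for exp_value, pred_value in zip(ddg_exp, ddg_pred):
        exp_positive = exp_value < threshold
        pred_positive = pred_value < threshold
        if pred_positive and exp_positive:
            tp += 1
        elif pred_positive:
            fp += 1
        elif exp_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Compute PPV, NPV, TPR, TNR and MCC; undefined ratios are returned as NaN."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "TPR": ratio(tp, tp + fn),
        "TNR": ratio(tn, tn + fp),
        "MCC": ratio(tp * tn - fp * fn, mcc_denominator),
    }
```

As a check against the text, 197 correctly predicted improving mutations out of 298 correspond to TPR = 197 / (197 + 101) ≈ 0.66.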

3 Results

Predictors were evaluated under several different criteria: (i) computational time needed to evaluate the benchmarks, (ii) ability to classify mutations into improving and decreasing the binding energy of the complex, (iii) ability to correlate ΔΔG predictions with experimental ΔΔG and (iv) their performance depending on two physicochemical properties: volume and hydrophobicity changes on mutation. Moreover, we evaluated the similarity between the predictors, and suggested two different selection procedures, involving the consensus and unanimous decision of three of the best classifiers to increase the prediction performance compared to the individual methods.

3.1 Benchmarking predictors: computational time

We evaluated the computational cost needed for each method to predict the entire benchmark of 2236 mutations (to alanine and to residues other than alanine together; we used 4 computing cores for the comparison). UEP is the fastest method and scored the entire benchmark in less than 1 min. The average time for modelling a single mutation using FoldX and EvoEF1/2 was ∼30 and ∼5 s per computing core, respectively. Therefore, modelling the entire benchmark would take ∼280 and ∼45 min for FoldX and EvoEF1/2, respectively. This step is necessary to obtain the final ΔΔG prediction of the five physical energy predictors (FoldX, EvoEF1, EvoEF2, pyDock and PRODIGY). The estimation of the ΔG itself takes only a few seconds per structure, requiring ∼15 min to evaluate the entire benchmark for each predictor. Time measurements are not provided for the web servers (BeAtMuSiC and mCSM), since the main bottleneck is manually filling in the web form for each protein–protein complex, rather than the prediction itself.
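The modelling times quoted above follow from simple arithmetic, sketched below. The constant and function names are illustrative, and the calculation assumes that the mutations are distributed evenly over the 4 cores with no overhead.

```python
N_MUTATIONS = 2236   # entire benchmark (alanine + other than alanine)
N_CORES = 4          # cores used for the comparison


def modelling_minutes(seconds_per_mutation, n_mutations=N_MUTATIONS, n_cores=N_CORES):
    """Wall-clock minutes to model all mutations, assuming perfect parallel scaling."""
    return n_mutations * seconds_per_mutation / n_cores / 60


foldx_minutes = modelling_minutes(30)  # ~30 s per mutation per core
evoef_minutes = modelling_minutes(5)   # ~5 s per mutation per core
```

Under these assumptions, FoldX modelling takes about 280 min and EvoEF1/2 modelling about 47 min, in line with the rounded figures given in the text.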

3.2 Benchmarking predictors: ability to classify

Regarding the ability to classify mutations as improving or decreasing the binding energy of the complex, mutations to alanine are generally difficult for all methods to classify correctly (Supplementary Figs S3–S6). This is because, as reported in previous sections, mutations to alanine have less impact on the experimental ΔΔG than mutations to residues other than alanine (Table 1). Overall, the MCC values for mutations to alanine are in the range 0.06–0.19 (Supplementary Fig. S2), with UEP reaching an MCC of 0.10. Interestingly, some predictors, such as BeAtMuSiC and mCSM, tend to classify most of the mutations to alanine as decreasing the ΔΔG.

For mutations to residues other than alanine, higher PPV, TPR and MCC values were observed than for mutations to alanine. Overall, the MCC values of the predictors (mutations to other than alanine) are in the range 0.09–0.26, with UEP reaching an MCC of 0.20 (Fig. 2). Our results indicate that, excluding the mCSM method (discussed later), PPV and NPV are relatively similar for all predictors: PPV ranges from 0.28 to 0.38 and NPV from 0.79 to 0.87. These results are consistent with previous studies (Geng et al., 2019a). However, TPR and TNR values differ substantially between predictors: TPR ranges from 0.27 to 0.73 and TNR from 0.56 to 0.87. In particular, UEP showed an MCC = 0.20, with a PPV = 0.33 (midway between the minimum and maximum values), an NPV = 0.84 (closer to the highest one), a TPR = 0.66 (closer to the highest one) and a TNR = 0.58 (closer to the lowest one). Two-thirds of the mutations that actually improve the binding affinity (197 out of 298, TPR = 0.66) were correctly predicted as such, with a PPV = 0.33. Other predictors showing higher PPV, such as BeAtMuSiC (PPV = 0.39), classify less than one-third of the mutations that actually improve the binding affinity as such (79 out of 293, TPR = 0.27). Taking this into account, together with the fact that the PPVs of all methods fall within a narrow range, PPV cannot be used as a fair metric to determine which is the best predictor.

Fig. 2. Performance of all tested protein–protein affinity predictors on mutation on the 1251 selected mutations other than alanine of the SKEMPI 2.0. Left panel shows the TPR, TNR, PPV and NPV patterns of all tested predictors. On the right panel, the confusion matrices of all predictors are depicted: experimental data condition is represented in vertical (C+ or C−, if mutation increases or decreases experimental binding affinity, respectively) while predictions are represented in horizontal (P+ or P−, if mutation is predicted to increase or decrease the binding affinity). MCC scores and the approximated time of analysis are also represented. Time is not depicted for web server-only-based methods. Unanimous and consensus selection were performed using UEP, EvoEF1 and pyDock-EvoEF1.

EvoEF1 showed better prediction performance than EvoEF2 for classifying mutations, and EvoEF1 also slightly outperformed FoldX (as reported in Huang et al., 2020). Moreover, pyDock and PRODIGY predictions were slightly better on models generated by EvoEF1 than on those generated by FoldX. Regarding mCSM, we split the benchmark into two groups depending on whether the mutation was described in its training data (mCSM trained group, mCSM TRA) or not (mCSM untrained group, mCSM UNT). The mCSM trained group shows an impressively high MCC = 0.58, with PPV = 0.80, NPV = 0.89, TPR = 0.52 and TNR = 0.97. However, the mCSM untrained group shows a notable decrease, to MCC = 0.03, with PPV = 0.35, NPV = 0.70, TPR = 0.12 and TNR = 0.90. We benchmarked UEP using the same mutations appearing in the mCSM untrained group (UEP mCSM untrained, UEP mCSM UNT), obtaining an MCC = 0.23, with PPV = 0.41, NPV = 0.81, TPR = 0.70 and TNR = 0.55. Since UEP classified this group with a performance similar to that on the entire benchmark, we ruled out the possibility that these mutations are intrinsically more challenging to predict than those of the other group. Therefore, the (substantial) drop in performance of mCSM may indicate that the training data used for the development of the algorithm were not representative enough to predict new mutations other than alanine in protein–protein complexes, suggesting possible overfitting to the training data, as also reported by Geng et al. (2019a).

Overall, our results indicate that, for mutations other than alanine, the main difference between the predictors evaluated in this work lies in their TPR and TNR, i.e. the percentages of improving and decreasing mutations that are correctly identified as such (Fig. 2 and Supplementary Tables S4 and S6). Hence, some predictors capture mutations that improve the binding affinity better than those that decrease it (higher TPR than TNR), and vice versa. We observed the three possible scenarios (Fig. 2): higher TPR (UEP and pyDock-FoldX), similar TPR and TNR (PRODIGY-FoldX, PRODIGY-EvoEF1, pyDock-EvoEF1 and EvoEF1) and higher TNR (FoldX, EvoEF2, BeAtMuSiC and mCSM).

3.3 Benchmarking predictors: ability to correlate

Regarding the ability to correlate with the experimental data, we observed a different scenario from that of the classification analysis. In this case, most of the predictors showed higher linear correlations for mutations to alanine than for mutations to other residues (Supplementary Figs S3 and S4). For mutations to alanine, the PCC ranged between 0.16 and 0.35. The highest performance was achieved by FoldX (PCC: 0.35, RMSE: 2.83) and EvoEF1 (PCC: 0.35, RMSE: 2.87). In this case, UEP reached a PCC of 0.16 with an RMSE of 2.97. For mutations other than alanine, the PCC ranged between 0.06 and 0.43. The highest performance was achieved by FoldX (PCC: 0.43, RMSE: 3.96), and UEP reached a PCC of 0.22 with an RMSE of 4.59. The performance of mCSM was also evaluated by splitting the mutation data into the mCSM trained and untrained groups. The mCSM trained group achieved a PCC of 0.72 (mutations to alanine) and 0.88 (mutations other than alanine). The mCSM untrained group achieved lower PCC values: 0.33 (mutations to alanine) and −0.34 (mutations other than alanine). This observation supports the previous hypothesis regarding the possible limitations of machine learning-based methods in this field, which seem to be especially relevant for mutations other than alanine. We also evaluated the untrained mCSM set of mutations with UEP, which achieved PCC values similar to those on the entire benchmark: 0.17 (mutations to alanine) and 0.18 (mutations other than alanine). EvoEF1 achieved better correlation values than EvoEF2, as reported by the authors (PCC: 0.35 and 0.26, respectively, for mutations to alanine; PCC: 0.33 and 0.16, respectively, for mutations other than alanine). Moreover, pyDock and PRODIGY showed slightly higher correlation values on models generated by EvoEF1 than on those generated by FoldX (Supplementary Figs S2 and S3).

3.4 Performance depending on mutation nature

Next, we evaluated whether performance differs when predicting different groups of amino acids. We grouped the 1251 mutations other than alanine according to their changes in amino acid size and hydrophobicity (ΔV and ΔH, respectively). Side chain volumes and hydrophobicity indices were extracted from the literature (Eisenberg et al., 1984; Lin et al., 2008). We used a threshold of 0.1 nm³ to determine whether the side chain of a mutation increases in volume (ΔV > 0.1 nm³, Gain group, 490 mutations), decreases (ΔV < −0.1 nm³, Loss group, 423 mutations) or shows a similar volume compared to its wild-type counterpart (|ΔV| ≤ 0.1 nm³, Neutral group, 338 mutations). For the hydrophobicity classification, we used a threshold of 0.3 Eisenberg units (Eu) to determine whether a mutation increases hydrophobicity (ΔH > 0.3 Eu, Gain group, 604 mutations), decreases it (ΔH < −0.3 Eu, Loss group, 466 mutations) or shows similar hydrophobicity (|ΔH| ≤ 0.3 Eu, Neutral group, 181 mutations). Confusion matrices of those subgroups can be found in Supplementary Tables S8.1–9 (changes in amino acid size) and Supplementary Tables S9.1–9 (changes in hydrophobicity).
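The grouping rule can be written as a small helper, shown here as a sketch. The function and constant names are hypothetical; the deltas are assumed to be computed beforehand as mutant minus wild-type values from the cited volume and hydrophobicity scales, which are not reproduced here.

```python
VOLUME_THRESHOLD_NM3 = 0.1      # threshold on the change in side chain volume
HYDROPHOBICITY_THRESHOLD = 0.3  # threshold on the change in Eisenberg units


def classify_change(delta, threshold):
    """Assign a mutation to the Gain, Loss or Neutral group.

    `delta` is assumed to be the mutant value minus the wild-type value.
    Changes with |delta| <= threshold are Neutral.
    """
    if delta > threshold:
        return "Gain"
    if delta < -threshold:
        return "Loss"
    return "Neutral"
```

For instance, a hypothetical mutation with ΔV = 0.15 nm³ and ΔH = −0.5 Eu would fall into the volume Gain group and the hydrophobicity Loss group. The reported group sizes are consistent with the benchmark: 490 + 423 + 338 = 1251 and 604 + 466 + 181 = 1251.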

Multiple prediction patterns can be observed when comparing the MCC scores of all subgroups (Fig. 3). The left panel depicts performance depending on changes in amino acid size, while the right panel shows performance depending on changes in hydrophobicity. Regarding amino acid size changes, five out of nine predictors (UEP, pyDock-EvoEF1, pyDock-FoldX, PRODIGY-EvoEF1 and BeAtMuSiC) showed better performance when evaluating neutral size mutations. FoldX and EvoEF1 showed better performance when evaluating non-neutral size mutations. A linear decrease in performance was observed for PRODIGY-FoldX when the mutation decreases the amino acid size compared to the wild-type counterpart. Regarding changes in hydrophobicity, three different patterns can be observed. All predictors except UEP and BeAtMuSiC showed better performance for neutral changes in hydrophobicity. UEP showed a linear increase in performance with hydrophobicity loss, and BeAtMuSiC showed very similar performance in all groups.

Fig. 3. Performance of UEP, FoldX, EvoEF1, EvoEF2, pyDock-FoldX, pyDock-EvoEF1, PRODIGY-FoldX, PRODIGY-EvoEF1 and BeAtMuSiC depending on changes in amino acid size (left panel) and hydrophobicity (right panel). Subgroups represent the MCC performance of the entire benchmark consisting of 1251 mutations other than alanine (All), an increase in amino acid size/hydrophobicity (Gain), similar amino acid size/hydrophobicity (Neutral) and a decrease in amino acid size/hydrophobicity (Loss).

3.5 Consensus and unanimous decisions

Having observed different prediction patterns for mutations other than alanine, we evaluated which predictors agree most often when classifying mutations as improving or decreasing the binding affinity. We excluded mCSM due to its poor performance on the untrained group (Table 2 and Supplementary Table S4). The most similar predictors are pyDock and PRODIGY when estimating the ΔΔG on models generated by FoldX and EvoEF1 [pyDock-FoldX and pyDock-EvoEF1 (80.2%); PRODIGY-FoldX and PRODIGY-EvoEF1 (84.3%)]. Despite this high similarity, the lack of complete agreement within these pairs highlights the importance of how the mutant models are generated for ΔΔG predictions. Pairwise comparison of all other predictors indicated relatively similar agreement rates, from 54.9% to 71.3% (Table 2), with a mean of 62.1%.

Table 2.

Percentage of pairwise similarity between all evaluated predictors

Predictors                    Percentage    Predictors                    Percentage
UEP/pyDock-FoldX              59.1 (739)    pyDock-EvoEF1/PRODIGY-FoldX   56.7 (709)
UEP/pyDock-EvoEF1             56.5 (707)    pyDock-EvoEF1/PRODIGY-EvoEF1  57.0 (713)
UEP/FoldX                     64.0 (801)    pyDock-EvoEF1/BeAtMuSiC       61.6 (771)
UEP/EvoEF1                    65.3 (817)    PRODIGY-FoldX/PRODIGY-EvoEF1  84.3 (1055)
UEP/EvoEF2                    55.6 (696)    FoldX/EvoEF1                  68.5 (857)
UEP/PRODIGY-FoldX             57.6 (721)    FoldX/EvoEF2                  71.3 (892)
UEP/PRODIGY-EvoEF1            57.6 (721)    FoldX/PRODIGY-FoldX           56.4 (705)
UEP/BeAtMuSiC                 56.7 (709)    FoldX/PRODIGY-EvoEF1          56.2 (703)
pyDock-FoldX/pyDock-EvoEF1    80.2 (1003)   FoldX/BeAtMuSiC               64.5 (807)
pyDock-FoldX/FoldX            60.5 (757)    EvoEF1/EvoEF2                 64.9 (812)
pyDock-FoldX/EvoEF1           63.5 (795)    EvoEF1/PRODIGY-FoldX          57.6 (721)
pyDock-FoldX/EvoEF2           53.9 (674)    EvoEF1/PRODIGY-EvoEF1         57.5 (719)
pyDock-FoldX/PRODIGY-FoldX    54.9 (687)    EvoEF1/BeAtMuSiC              59.2 (741)
pyDock-FoldX/PRODIGY-EvoEF1   55.1 (689)    EvoEF2/PRODIGY-FoldX          56.0 (700)
pyDock-FoldX/BeAtMuSiC        56.7 (709)    EvoEF2/PRODIGY-EvoEF1         55.5 (694)
pyDock-EvoEF1/FoldX           63.7 (797)    EvoEF2/BeAtMuSiC              69.4 (868)
pyDock-EvoEF1/EvoEF1          63.5 (795)    PRODIGY-FoldX/BeAtMuSiC       58.1 (727)
pyDock-EvoEF1/EvoEF2          61.6 (770)    PRODIGY-EvoEF1/BeAtMuSiC      57.0 (713)

Note: Percentage and number (in parentheses) of the 1251 mutations other than alanine that are predicted in the same way (i.e. improving or decreasing the binding energy of the protein–protein complex) by each pair of predictors.
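The pairwise similarity reported in Table 2 is the fraction of mutations that two predictors classify in the same direction. A minimal sketch of that calculation is shown below; the function name, the toy predictor names and their calls are hypothetical, and the boolean encoding (`True` meaning predicted to improve binding) is an assumption made for this example.

```python
from itertools import combinations


def pairwise_agreement(predictions):
    """Percentage and count of identical binary calls for each predictor pair.

    predictions: dict mapping predictor name -> list of binary calls,
    all listed in the same mutation order.
    Returns {(name_a, name_b): (percentage, count)}.
    """
    result = {}
    for a, b in combinations(sorted(predictions), 2):
        same = sum(x == y for x, y in zip(predictions[a], predictions[b]))
        total = len(predictions[a])
        result[(a, b)] = (round(100.0 * same / total, 1), same)
    return result


# Toy calls for three hypothetical predictors on five mutations.
toy = {
    "P1": [True, False, True, True, False],
    "P2": [True, False, False, True, False],
    "P3": [False, False, True, True, True],
}
for pair, (pct, n) in pairwise_agreement(toy).items():
    print(pair, pct, n)
```

With nine predictors this yields 36 pairs, matching the number of entries in Table 2. As a check on the arithmetic, 739 shared calls out of 1251 mutations corresponds to the 59.1% listed for UEP/pyDock-FoldX.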

Having observed relatively low agreement rates among the predictors, we aimed to determine whether aggregating three of the top-scoring methods (in terms of MCC) could increase the predictive power compared to the individual methods. Note that the aggregation can only be performed on the binary classification, since the range of predicted ΔΔG values differs substantially among predictors (Supplementary Fig. S4). Thus, only the binary decision on the effect of a mutation (that is, improving or decreasing the binding energy of the complex) can be combined across predictors. Evaluation of the entire benchmark indicated that the best classifiers are pyDock, EvoEF1, FoldX and UEP. Hence, we applied two decision criteria: consensus and unanimous selection. The consensus criterion takes the decision of the majority of the predictors (at least 2 of 3), while the unanimous criterion requires the agreement of all of them (3 of 3). The main limitation of the unanimous criterion is that mutations not unanimously predicted as improving or decreasing the binding affinity cannot be classified. Importantly, these decisions may be especially affected by overfitting of the selected individual methods, and should therefore be used with caution. The performance of five different combinations of three predictors is shown in Table 3 (mutations other than alanine) and Supplementary Table S7 (mutations to alanine). For mutations other than alanine, the maximum MCC values were 0.29 (consensus) and 0.47 (unanimous), while for mutations to alanine they were 0.19 (consensus) and 0.27 (unanimous). Notably, most combinations of the selected predictors increased the classification performance compared to any method alone (Fig. 2, Supplementary Fig. S2, Table 3 and Supplementary Table S7).
Two combinations achieved the highest classification performance for mutations other than alanine: (UEP, EvoEF1 and pyDock-EvoEF1) and (UEP, FoldX and pyDock-FoldX). In this case, the combination using EvoEF1 may be preferable since it is faster than the one using FoldX (70’ versus 185’) (Table 3). For mutations to alanine (Supplementary Table S7), the combination (UEP, EvoEF1 and pyDock-EvoEF1) achieved the highest performance for the consensus selection (with a minimal improvement over the other combinations), while the best combinations for the unanimous selection were (UEP, FoldX and pyDock-FoldX) and (FoldX, pyDock-FoldX and pyDock-EvoEF1). Overall, our results show that consensus and unanimous decisions increase the classification performance compared to the individual methods, which can be exploited depending on the needs of the user. However, the consensus selection may be more applicable than the unanimous one in a real scenario, since the strict selection procedure of the latter prevents the evaluation of all candidates.
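The two decision criteria can be illustrated with a short Python sketch. The function names and the boolean encoding (`True` meaning predicted to improve binding) are assumptions made for this example; it shows the voting logic only, not the code used to produce Table 3.

```python
def consensus(votes):
    """Majority call from an odd number of binary votes (at least 2 of 3)."""
    return sum(votes) > len(votes) / 2


def unanimous(votes):
    """Shared call if every predictor agrees, otherwise None.

    None means the mutation is left unclassified, which is why the
    unanimous selection covers fewer mutations than the consensus one.
    """
    first = votes[0]
    return first if all(v == first for v in votes) else None


# Toy votes from three hypothetical predictors for four mutations.
toy_votes = [
    (True, True, True),
    (True, True, False),
    (False, True, False),
    (False, False, False),
]
for votes in toy_votes:
    print(votes, consensus(votes), unanimous(votes))
```

In this toy example the consensus criterion classifies all four mutations, whereas the unanimous criterion classifies only the first and the last.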

Table 3.

Correlations and time for consensus and unanimous selections

Predictors                            Consensus     Unanimous    Time
UEP, pyDock-EvoEF1, EvoEF1            0.29 (1251)   0.47 (534)   70’
UEP, pyDock-FoldX, FoldX              0.29 (1251)   0.47 (523)   185’
UEP, EvoEF1, FoldX                    0.26 (1251)   0.42 (612)   220’
FoldX, pyDock-FoldX, pyDock-EvoEF1    0.25 (1251)   0.47 (653)   235’
EvoEF1, pyDock-EvoEF1, pyDock-FoldX   0.29 (1251)   0.42 (671)   235’

Note: MCC values are given for the consensus and unanimous selections for each combination of three predictors. The number of structures evaluated for each group of predictors is shown in parentheses. Approximate running times are given (using 4 computing cores as reference).

4 Discussion

Predicting the impact of mutations in protein–protein complexes is of great interest to the biotechnology industry, since it would facilitate the a la carte design of protein variants, which is of crucial importance in research areas such as protein–protein design and the development of biosensors. Moreover, it also has potential in biomedical applications, such as helping in the molecular interpretation of pathological mutations for personalized diagnosis and novel therapeutics.

Most binding affinity predictors are based on, and weighted against, measurements from experimental determinations. The lack of publicly available experimental data on binding affinity changes upon mutation is a major limiting factor for the design of efficient algorithms devoted to this task. In fact, machine learning-based techniques (as we observed for mCSM) may suffer from overtraining due to the small amount of accessible data. Hence, algorithms that do not rely on experimental data on mutations may offer alternative prediction solutions. In this work, we present UEP, a fast classifier for predicting the impact of mutations in protein–protein complexes. UEP is a contact-based method that follows a three-body contact scheme derived from the interactions observed in structural interactomics data (experimental structures, homology models and domain–domain structural templates). Importantly, all Interactome3D complexes containing a protein with more than 30% sequence identity to any protein in SKEMPI 2.0 were discarded when building the UEP contact matrix. Hence, UEP results are based purely on amino acid contact frequencies from protein–protein complexes unrelated to those evaluated in this work. We observed that our three-body scheme increases the specificity of the method compared to canonical pairwise contacts. Moreover, we designed the UEP algorithm so that it only predicts highly packed positions in protein–protein complexes (i.e. those with two or more contacts with the other protein within a heavy atom cut-off distance of 5 Å). Hence, positions far from the protein–protein interface, or those lacking contacts, cannot be assessed by our method. However, we found that mutations at highly packed interface positions produce larger changes in the experimental ΔΔG than those at non-highly packed positions.
Moreover, we also reported that non-highly packed positions tend to be more difficult to predict correctly for every method evaluated in our benchmarks. Hence, for a protein–protein design effort, the highly packed positions identified by UEP might be more relevant and statistically more significant than others: (i) because they exert a larger impact on ΔΔG, and (ii) because their prediction is more accurate. Regarding the nature of the mutation, we separated the benchmark into mutations to alanine and mutations other than alanine. We observed that mutations to alanine have smaller effects on the final ΔΔG than mutations other than alanine and are therefore generally more challenging to classify correctly as improving or decreasing the binding affinity of the complex.
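The highly packed criterion mentioned above (two or more contacts with the other protein within a heavy atom cut-off of 5 Å) can be sketched as follows. This is an illustration under stated assumptions rather than the UEP implementation: here a contact is taken to mean a partner residue with at least one heavy atom within the cut-off, "within" is read as less than or equal to, and the function names, data layout and toy coordinates are hypothetical. The sketch requires Python 3.8 or later for `math.dist`.

```python
import math

CUTOFF = 5.0      # heavy-atom distance cut-off in Å (from the text)
MIN_CONTACTS = 2  # "two or more contacts" (from the text)


def in_contact(atoms_a, atoms_b, cutoff=CUTOFF):
    """True if any pair of heavy atoms lies within the cut-off distance."""
    return any(math.dist(a, b) <= cutoff for a in atoms_a for b in atoms_b)


def count_partner_contacts(residue_atoms, partner_residues, cutoff=CUTOFF):
    """Number of partner-chain residues in contact with the given residue.

    partner_residues: dict mapping residue id -> list of (x, y, z) tuples.
    """
    return sum(
        in_contact(residue_atoms, atoms, cutoff)
        for atoms in partner_residues.values()
    )


def is_highly_packed(residue_atoms, partner_residues):
    """Assumed reading of the criterion: at least MIN_CONTACTS contacts."""
    return count_partner_contacts(residue_atoms, partner_residues) >= MIN_CONTACTS


# Toy coordinates in Å (hypothetical, not from a real structure).
residue = [(0.0, 0.0, 0.0)]
partner = {
    "B1": [(3.0, 0.0, 0.0)],
    "B2": [(0.0, 4.0, 0.0)],
    "B3": [(10.0, 0.0, 0.0)],
}
print(count_partner_contacts(residue, partner), is_highly_packed(residue, partner))
```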

From the classification point of view, for mutations other than alanine, our benchmark indicates that the best classifiers are pyDock, EvoEF1, FoldX and UEP, with pyDock being the best classifier of all evaluated methods. For the generation of mutations, EvoEF1 was ∼5 times faster than FoldX while reaching similar overall classification accuracy. Moreover, pyDock and PRODIGY achieved slightly better performance on models generated by EvoEF1 than on those generated by FoldX. The same trend was observed for the correlation with experimental data, which highlights the importance of correctly generating mutations for ΔΔG predictions. Of all methods, UEP was the fastest, predicting the entire benchmark in ∼1 min, while the second fastest (EvoEF1/2) needed ∼50 min. This feature makes UEP especially relevant for large screening purposes, since its overall classification and correlation performances are competitive with those of the other predictors. We also evaluated the performance of all methods as a function of the physicochemical properties of the mutations, grouping them according to changes in amino acid size and hydrophobicity. Our results indicate that some predictors work better for some groups than for others, which could be exploited in design campaigns aiming to improve the binding affinity of a complex.

We further evaluated whether aggregating the top classifiers improved the predictions, using consensus and unanimous criteria. Both strategies boosted the classification performance compared to any method alone. The combination achieving the highest classification performance involved UEP, pyDock and EvoEF1/FoldX (EvoEF1 is faster than FoldX, while FoldX showed higher performance than EvoEF1 in the unanimous selection). Since each method has its own peculiarities (such as different prediction performance depending on the type of mutation), agreement between them (via consensus or unanimous selection) reduces the impact of their individual limitations and increases the classification performance.

Overall, UEP simplifies the prediction of the effects of mutations. Despite being much simpler than the other methods in the field, UEP shows performance competitive with the best methods evaluated in this work, with the advantages of being purely based on structural bioinformatics (not relying on experimental determinations upon mutation) and of allowing faster and computationally cheaper predictions, which are especially relevant for large screening efforts.

Acknowledgements

P.A.-R. thanks Yves Dehouck for the technical assistance provided.

Funding

This work was supported by a predoctoral fellowship from the Government of Catalonia (2018FI_B_00873 to P.A.-R.) and grants from the Spanish government ‘Programa Estatal I+D+i’ (BIO2016-79930-R), (PID2019-110167RB-I00) and (CTQ2016-79138-R), and by grant PIREPRED from the EU European Regional Development Fund program Interreg V-A Spain-France-Andorra (POCTEFA). This work also received funding from the IBM-BSC Deep Learning Center (2016).

Conflict of Interest: none declared.

References

Cheng T.M.-K. et al. (2007) pyDock: electrostatics and desolvation for effective scoring of rigid-body protein–protein docking. Proteins Struct. Funct. Bioinf., 68, 503–515.

David A. et al. (2015) The contribution of missense mutations in core and rim residues of protein–protein interfaces to human disease. J. Mol. Biol., 427, 2886–2898.

Dehouck Y. et al. (2013) BeAtMuSiC: prediction of changes in protein–protein binding affinity on mutations. Nucleic Acids Res., 41, W333–W339.

Diskin R. et al. (2011) Increasing the potency and breadth of an HIV antibody by using structure-based rational design. Science, 334, 1289–1293.

Eisenberg D. et al. (1984) Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J. Mol. Biol., 179, 125–142.

Geng C. et al. (2019a) Finding the ΔΔG spot: are predictors of binding affinity changes upon mutations in protein–protein interactions ready for it? Wiley Interdiscip. Rev. Comput. Mol. Sci., 9, e1410.

Geng C. et al. (2019b) iSEE: interface structure, evolution, and energy-based machine learning predictor of binding affinity changes upon mutations. Proteins, 87, 110–119.

Guerois R. et al. (2002) Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J. Mol. Biol., 320, 369–387.

Huang X. et al. (2020) EvoEF2: accurate and fast energy function for computational protein design. Bioinformatics, 36, 1135–1142.

Jankauskaitė J. et al. (2019) SKEMPI 2.0: an updated benchmark of changes in protein–protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics, 35, 462–469.

Jiménez-García B. et al. (2013) pyDockWEB: a web server for rigid-body protein–protein docking using electrostatics and desolvation scoring. Bioinformatics, 29, 1698–1699.

Lensink M.F. et al. (2019) Blind prediction of homo- and hetero-protein complexes: the CASP13-CAPRI experiment. Proteins, 87, 1200–1221.

Lin Z.-H. et al. (2008) New descriptors of amino acids and their application to peptide QSAR study. Peptides, 29, 1798–1805.

Moal I.H. et al. (2013) Intermolecular contact potentials for protein–protein interactions extracted from binding free energy changes upon mutation. J. Chem. Theory Comput., 9, 3715–3727.

Mosca R. et al. (2013) Interactome3D: adding structural details to protein networks. Nat. Methods, 10, 47–53.

Navío D. et al. (2019) Structural and computational characterization of disease-related mutations involved in protein–protein interfaces. Int. J. Mol. Sci., 20, 1583.

Pearce R. et al. (2019) EvoDesign: designing protein–protein binding interactions using evolutionary interface profiles in conjunction with an optimized physical energy function. J. Mol. Biol., 431, 2467–2476.

Pires D.E.V. et al. (2014) mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics, 30, 335–342.

Rudicell R.S. et al. (2014) Enhanced potency of a broadly neutralizing HIV-1 antibody in vitro improves protection against lentiviral infection in vivo. J. Virol., 88, 12669–12682.

Sahni N. et al. (2015) Widespread macromolecular interaction perturbations in human genetic disorders. Cell, 161, 647–660.

Schymkowitz J. et al. (2005) The FoldX web server: an online force field. Nucleic Acids Res., 33, W382–W388.

Vangone A. et al. (2015) Contacts-based prediction of binding affinity in protein–protein complexes. eLife, 4, e07454.

Warszawski S. et al. (2019) Optimizing antibody affinity and stability by the automated design of the variable light-heavy chain interfaces. PLoS Comput. Biol., 15, e1007207.

Xue L.C. et al. (2016) PRODIGY: a web server for predicting the binding affinity of protein–protein complexes. Bioinformatics, 32, 3676–3678.

Yates C.M., Sternberg M.J.E. (2013) The effects of non-synonymous single nucleotide polymorphisms (nsSNPs) on protein–protein interactions. J. Mol. Biol., 425, 3949–3963.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Associate Editor: Arne Elofsson

Supplementary data