Abstract

The sheer volume of non-synonymous single nucleotide polymorphisms generated in recent years by projects such as the Human Genome Project, the HapMap Project and Genome-Wide Association Studies means that it is not possible to characterize experimentally the effects of all mutations on the gene products, i.e. to elucidate the effects of mutations on protein structure and function. However, automatic methods that can predict the effects of mutations will allow a reduced set of mutations to be studied. Site Directed Mutator (SDM) is a statistical potential energy function that uses environment-specific amino-acid substitution frequencies within homologous protein families to calculate a stability score, which is analogous to the free energy difference between the wild-type and mutant protein. Here, we present a web server for SDM ( http://www-cryst.bioc.cam.ac.uk/~sdm/sdm.php ), which has received more than 10 000 submissions since going online in April 2008. To run SDM, users must supply a wild-type structure together with the position and amino acid type of the mutation. The results returned include information about the local structural environment of the wild-type and mutant residues, a stability score prediction and prediction of disease association. Additionally, the wild-type and mutant structures are displayed in a Jmol applet with the relevant residues highlighted.

INTRODUCTION

The folded state of a protein is stabilized primarily by hydrophobic interactions and a network of hydrogen bonds. However, a correctly folded protein is only marginally more stable than its unfolded form, and mutations that affect a stabilizing interaction within a folded protein may lead to protein instability and malfunction. Where protein malfunction does occur and cannot be remediated by an alternative molecular pathway, this may result in disease. For example, destabilizing mutations in phenylalanine hydroxylase lead to the metabolic disease phenylketonuria ( 1 ). In fact, up to 80% of Mendelian disease-associated single mutations in protein coding regions are estimated to be caused by protein destabilization effects ( 2 ). Meanwhile, a huge volume of single nucleotide polymorphisms (SNPs) has been generated in recent years from projects such as the Human Genome Project ( 3 ) and the HapMap Project ( 4 ), largely due to the availability of high-throughput array-based genotyping methods ( 5 ) and next generation sequencing platforms ( 6 , 7 ). Automatic methods that can predict the effect of mutations accurately will allow a reduced set of mutations to be characterized experimentally, saving time and money.

Various methods of predicting protein stability changes caused by mutation have been described and can be grouped into four main categories based on the strategy used in the calculation: (i) physical potential energy functions; (ii) empirical potential energy functions; (iii) machine learning methods; and (iv) statistical potential energy functions.

Physical potential energy functions (such as molecular mechanics approaches or Monte Carlo simulations) are probably the most accurate methods for predicting the effects of mutations on protein stability; however, they are currently only useful for testing small sets of mutants due to the large amount of time required to compute ΔΔG values ( 8–12 ). The reliability of predictions is also complicated by the difficulties in sampling the folded and unfolded states ( 12 ). Empirical potential approaches are fitted to experimental data using a set of weighted terms incorporating physical and statistical energy terms and structural descriptors ( 13 , 14 ). Machine learning methods include neural networks and support vector machines (SVMs) and use information about mutations, protein sequence and structure to fit a non-linear function to experimental data ( 15–17 ). They are similar to empirical potential approaches in their use of experimental data to fit their function and, in both cases, care must be taken that the function is not over-fitted to the training data set. Statistical potential energy approaches are derived from the statistical analysis of protein data such as substitution frequencies, distance potentials and amino acid environmental propensities ( 18–21 ). Other methods use a combination of the above strategies ( 22–24 ).

Site Directed Mutator (SDM) is a statistical potential energy function developed by Topham et al. ( 20 ) to predict the effect that SNPs will have on the stability of proteins. SDM uses environment-specific amino acid substitution frequencies within homologous protein families to calculate a stability score, which is analogous to the free energy difference between a wild-type and mutant protein. Blind testing on a set of 83 staphylococcal nuclease and 63 barnase mutants showed a correlation of 0.80 between the predicted stability changes and experimental data ( 20 ). The method performs comparably to or better than other published methods in the task of classifying mutations as stabilizing or destabilizing ( 25 ). Additionally, SDM has much improved sensitivity in predicting stabilizing mutations compared to other published methods (five of the seven methods tested incorrectly classify >68% of the stabilizing mutations). When applied to the task of predicting disease-associated mutations, SDM had an accuracy of 61% ( 26 ). Therefore, SDM is a useful tool for guiding the design of site-directed mutagenesis experiments or for predicting whether a mutation will impact protein structure and have a role in disease. Here, we present a web server for SDM ( http://www-cryst.bioc.cam.ac.uk/~sdm/sdm.php ), which has not previously been published.

MATERIALS AND METHODS

Environment-specific substitution tables

SDM uses a set of conformationally constrained environment-specific substitution tables (ESSTs), the general methodology of which is described in ( 27 , 28 ). The tables were derived from 371 protein family sequence alignments from the HOMSTRAD database ( 29 ), consisting of 1357 structures, and were built using a modified version of the program Makesub, which is able to handle sidechain hydrogen bond satisfaction (C. Topham, unpublished data). When the local structural environment of amino acid residues is defined (secondary structure, solvent accessibility and formation of hydrogen bonds), distinct patterns of substitutions are observed ( 30 , 31 ). ESSTs store these substitution data quantitatively in the form of probabilities and therefore provide information about the existence of each amino acid in a particular environment and the probability of it being substituted by any other amino acid. Functional residues [as defined by Uniprot ( 32 ), the Catalytic Site Atlas ( 33 ) and Interpare ( 34 )] were masked from substitution counts.
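The idea of storing substitution data as probabilities per environment can be illustrated with a minimal Python sketch. It is not the Makesub program: the function name `build_esst`, the tuple layout of the observations, the environment label and the toy counts are all assumptions made for illustration, and no pseudocounts or family weighting are applied.

```python
from collections import defaultdict


def build_esst(observations, masked_positions=frozenset()):
    """Convert observed substitutions into per-environment probabilities.

    observations: iterable of (position, environment, from_residue, to_residue)
    masked_positions: positions treated as functional and left out of the counts
    Returns {environment: {from_residue: {to_residue: probability}}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for position, environment, src, dst in observations:
        if position in masked_positions:
            continue  # masked functional residues contribute no counts
        counts[environment][src][dst] += 1

    table = {}
    for environment, by_src in counts.items():
        table[environment] = {}
        for src, by_dst in by_src.items():
            total = sum(by_dst.values())
            table[environment][src] = {dst: n / total for dst, n in by_dst.items()}
    return table


# Toy, made-up observations for a single hypothetical environment.
ENV = "alpha-helix / inaccessible / satisfied"
observations = [
    (10, ENV, "L", "L"),
    (10, ENV, "L", "I"),
    (11, ENV, "L", "L"),
    (12, ENV, "L", "V"),
    (99, ENV, "L", "D"),  # position 99 is masked below
]
esst = build_esst(observations, masked_positions={99})
print(esst[ENV]["L"])
```

Each row of the resulting table sums to one, so an entry can be read as the probability that a residue type in a given environment is replaced by (or conserved as) another residue type.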

Definition of structural environment

The structural parameters that were used to define the local environment of amino acid residues are mainchain conformation, solvent accessibility and hydrogen-bonding class.

  • Mainchain conformation and secondary structure: Nine classes of mainchain conformation were defined: residues were identified as belonging to either α-helix or β-sheet first and the remaining residues were classified as being a, b, p, t, l, g or e according to their mainchain φ-ψ torsion angles. The torsion angles and secondary structure assignments were calculated using the sstruc program (D. Smith, unpublished data).

  • Relative sidechain solvent accessibility: Three classes of relative sidechain solvent accessibility were defined based on the method of Lee and Richards ( 35 ). The accessibility of each residue in a structure was calculated using the program psa (A. Sali, unpublished data). Residues with sidechain relative accessibilities of:

    • <17% were defined as inaccessible

    • 17–43% were defined as partially accessible

    • >43% were defined as accessible

    These cut-offs were chosen based on an assessment of relative sidechain solvent accessibility values ( 36 ).

  • Hydrogen bonding: Two classes of hydrogen bonding were defined: residues were classed as either being satisfied in terms of their sidechain hydrogen bonding or not, based on the criteria described by Worth and Blundell ( 37 ). Proteins were first protonated and the charge state of ionisable residues determined using the program PROPKA ( 38 ). The program hbond (J. Overington, unpublished data) was used to identify hydrogen bonds defined by the criterion that the distance between donor and acceptor was <3.5 Å, except for interactions involving sulphur atoms where 4.0 Å was used. Hydrogen bonds were then further filtered using the methodology described by Worth and Blundell ( 37 ).

These structural parameters gave a total of 54 local environments (nine mainchain × three solvent accessibility × two hydrogen bonding terms).
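The count of 54 environments follows directly from combining the three parameters, as the short Python sketch below shows. The class labels and the function name `accessibility_class` are illustrative assumptions, as is the treatment of values lying exactly on a cut-off (17% and 43% are placed in the partially accessible class here).

```python
from itertools import product

# Nine mainchain classes: helix and sheet first, then seven phi-psi classes.
MAINCHAIN = ["alpha-helix", "beta-sheet", "a", "b", "p", "t", "l", "g", "e"]
ACCESSIBILITY = ["inaccessible", "partially accessible", "accessible"]
HBOND = ["satisfied", "unsatisfied"]


def accessibility_class(relative_accessibility_percent):
    """Map a relative sidechain accessibility (%) onto one of three classes."""
    if relative_accessibility_percent < 17:
        return "inaccessible"
    if relative_accessibility_percent <= 43:
        return "partially accessible"
    return "accessible"


# Every combination of the three parameters is one local environment.
ENVIRONMENTS = list(product(MAINCHAIN, ACCESSIBILITY, HBOND))

print(len(ENVIRONMENTS))        # 9 x 3 x 2 = 54
print(accessibility_class(5))   # inaccessible
print(accessibility_class(60))  # accessible
```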

Prediction of protein stability changes caused by mutation

The algorithm underlying SDM was first described by Topham et al. ( 20 ). In this original work, two stability difference scores were calculated using either amino acid environmental substitution data (method I) or amino acid propensities (method II). Our subsequent analysis showed that updating the substitution and propensity data using additional protein families resulted in a better performance when the environment substitution data were used (data not shown). Therefore, SDM uses only method I to calculate protein stability changes caused by mutation. In addition, SDM now uses a far more comprehensive set of substitution data (ESSTs) compared to the original publication (371 families compared to 131) and known functional sites are excluded from the substitution counts. Furthermore, the local structural environment parameter ‘sidechain hydrogen bond (yes/no)’ was modified to ‘sidechain hydrogen-bonding satisfaction (satisfied/unsatisfied)’ and this was shown to improve the stability score calculations ( 36 ).

By analogy to the folding-unfolding cycle in Figure 1, the algorithm uses ESSTs to calculate the difference in the stability scores of the folded and unfolded states for the wild-type and mutant protein structures:

[formula] (1)

The substitution data used for calculating the stability score are from families of homologous proteins, which have accepted multiple mutations during the course of their evolution. However, the effects of single substitutions (e.g. cavity mutants) are not often observed over the timescale of evolution. To compensate for this, a disruption term is introduced for buried mutated residues. It is defined as the logarithmic function of the absolute value of the net change over the mutated position in the sidechain surface accessible area in an extended peptide Gly-X-Gly, relative to that for glycine. Therefore, Equation (1) becomes:

[formula] (2)

ESSTs take into account the environment of only one of the two residues (wild-type or mutant). It is therefore necessary to consider not only the probability of replacement of the wild-type residue (Rj) in the wild-type environment (εwt) by a mutant residue type (rk) in an undefined environment [P(rk/Rj, εwt)] but also the probability of replacement of the mutant residue type (Rk) in the mutant environment (εmut) by the wild-type residue (rj) in an undefined environment [P(rj/Rk, εmut)].

Figure 1.

The thermodynamic cycle can be used to calculate protein stability changes between wild-type and mutant proteins.

In order to normalise the probabilities that are combined from different substitution tables, it is necessary to introduce a reference state. For the wild-type residue (Rj) in the wild-type environment, a suitable reference state is the probability of it being conserved in that environment [P(rj/Rj, εwt)]. In an analogous way, for the mutant residue type (Rk) in the mutant environment, a suitable reference state is the probability of it being conserved in that environment [P(rk/Rk, εmut)].

The difference in stability scores for a mutation in the folded state is therefore calculated by:

[formula] (3)

The difference in stability scores in the unfolded state is also calculated using Equation (3), but uses an environmental substitution table derived from non-hydrogen bonded, surface exposed amino acid residues falling outside regions of regular secondary structure. The stability difference scores for the folded and unfolded state for the wild-type and mutant protein structures are then calculated using Equation (1).
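The following Python sketch shows one way the four probabilities described above could be combined. It is one possible reading of the description, not the published Equations (1)–(3): it assumes that each substitution probability is normalised by its reference state as a log ratio, that the forward and reverse terms are combined by subtraction, and that the overall score is the folded-state term minus the unfolded-state term. The disruption term of Equation (2) is omitted, and the function names and probability values are invented for illustration.

```python
import math


def state_score(p_wt_to_mut, p_wt_conserved, p_mut_to_wt, p_mut_conserved):
    """Score for one state (folded or unfolded), under an assumed log-odds form.

    p_wt_to_mut     ~ P(rk/Rj, e_wt):  wild-type replaced by mutant type
    p_wt_conserved  ~ P(rj/Rj, e_wt):  reference state for the wild-type
    p_mut_to_wt     ~ P(rj/Rk, e_mut): mutant type replaced by wild-type
    p_mut_conserved ~ P(rk/Rk, e_mut): reference state for the mutant
    """
    forward = math.log(p_wt_to_mut / p_wt_conserved)
    reverse = math.log(p_mut_to_wt / p_mut_conserved)
    return forward - reverse


def stability_difference(folded, unfolded):
    """Assumed combination: folded-state score minus unfolded-state score."""
    return state_score(*folded) - state_score(*unfolded)


# Made-up probabilities; real values would be looked up in the ESSTs.
folded = (0.02, 0.60, 0.10, 0.40)    # buried, structured environment
unfolded = (0.05, 0.30, 0.05, 0.30)  # exposed, non-hydrogen-bonded table

print(stability_difference(folded, unfolded))
```

With these toy numbers the unfolded-state term is zero because the forward and reverse ratios are identical, so the result reflects only the folded-state environment.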

Prediction of disease-association

From studying missense mutations for which the phenotypes are known, it is estimated that the stability margin that can be accommodated without any immediate effect on protein fitness is 1–3 kcal mol⁻¹ ( 39–41 ). Studies of Ig-like proteins have shown that mutations that decrease the stability of these proteins by >2 kcal mol⁻¹ result in severe disease phenotypes ( 42 , 43 ).

It may appear counter-intuitive that increased protein stability can lead to protein malfunction; however, protein flexibility is essential for enzyme catalysis. For instance, the increased stability of many thermophilic proteins is accompanied by loss of protein flexibility and reduced enzymatic activity at low temperatures ( 44–48 ). Furthermore, stabilizing mutations at catalytic site residues typically decrease activity, which suggests that function often comes with a substantial penalty to stability ( 44 , 49–52 ). In addition, highly stable proteins are protease-resistant and therefore difficult to regulate; this is important to consider in systems such as cell signalling, where removing a signal is as important as its activation ( 53 ). A recent study showed that β-catenin accumulation is the most common aberration in parathyroid tumours of primary origin and that the S37A stabilizing mutation of CTNNB1 was found in 5.8% of the tumours ( 54 ). Another example of a stabilizing and damaging mutation is the Parkinson disease-associated A30P mutation, which stabilizes α-synuclein against proteasomal degradation triggered by haem oxygenase-1 over-expression in human neuroblastoma cells ( 55 ). There is therefore biological evidence that increased protein stability can lead to protein malfunction and hence disease.

In light of the studies mentioned in the previous two paragraphs, we have used a cut-off of 2 kcal mol⁻¹ (stabilizing or destabilizing) for classifying mutations as leading to protein malfunction and possibly disease.
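The two-sided nature of this cut-off can be expressed in a few lines of Python. This is a minimal sketch: the function name is hypothetical, and treating a score of exactly 2 as below the cut-off is an assumption, since the text does not say how the boundary is handled.

```python
CUTOFF_KCAL_PER_MOL = 2.0  # cut-off stated in the text


def is_predicted_disease_associated(stability_change, cutoff=CUTOFF_KCAL_PER_MOL):
    """Flag a mutation whose predicted stability change exceeds the cut-off.

    The absolute value is used because both strongly stabilizing and
    strongly destabilizing changes are treated as potentially damaging.
    """
    return abs(stability_change) > cutoff


for score in (-2.5, 0.7, 3.1):
    print(score, is_predicted_disease_associated(score))
```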

Mutant thermodynamic data sets

A subset of the data set used by Capriotti et al. ( 16 ) was used for initial benchmarking. This mutant data set was taken from the ProTherm database, which stores thermodynamic data for proteins and mutants ( 56 ). Our method requires knowledge of the local structural environment of wild-type and mutant residues in order to predict the effect of mutation on the stability of a protein. If the local environment is incorrectly defined, e.g. the protein functions as a trimer but is defined in the crystallographic asymmetric unit as the protomer, this may affect our calculation. To remove the effect of such errors, we used the Protein Interfaces, Surfaces and Assemblies (PISA) service to predict the oligomeric state of each of the proteins in the data set ( 57 ). Only those proteins predicted to be monomers were used. This data set is hereafter referred to as the monomeric set.

The validation data set used by Dehouck et al. ( 22 ) for benchmarking their method PoPMuSiC-2.0 was used for comparison of SDM’s performance to other published stability change prediction algorithms. This data set comprises 350 mutations, none of which was included in any of the databases used to devise or test the seven methods tested by Dehouck et al. ( 22 ).

A set of 388 mutants ( S388 ) with thermodynamic measurements conducted under physiological conditions was also used to test our method. The S388 data set has been used to test other published methods and therefore allows us to perform a direct comparison of our method to them.

WEBSERVER

Input

SDM requires the 3D co-ordinates of the wild-type protein (in PDB format), the PDB chain identifier, the mutation position and the amino acid type of the mutation in one-letter code in order to calculate a stability score for mutant proteins. Users who have not already obtained a structure of their protein of interest may use the search boxes on the home page to do so. These search boxes allow a user to query the RCSB Protein Data Bank ( www.pdb.org ) ( 58 ) for their protein of interest, using protein name, description or amino acid sequence.

The wild-type structure may be submitted using one of two methods: the user can either upload the PDB file or enter the four-letter PDB code. NMR structures are accepted by SDM for input; however, users should note that only the first model in the PDB file is used for subsequent analysis.

SDM also requires a 3D structure of the mutant protein to perform its calculations. In this case, the user has the option of either uploading a mutant structure or using the program ANDANTE to build a model structure of the mutant ( 59 ). A requirement of SDM is that the wild-type and mutant structures span the same part of the polypeptide chain; users who upload a mutant PDB structure must therefore ensure that it fulfils this requirement.
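Before submitting a job, a user may wish to confirm that the chosen chain and position in the PDB file carry the expected wild-type residue, and that only the first model of a multi-model (e.g. NMR) file is consulted. The Python sketch below is a local convenience check and not part of the SDM server; the helper names `residue_at` and `atom_line` are hypothetical, the column positions follow the fixed-width PDB ATOM record layout, and the two-model sample text is made up.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def residue_at(pdb_text, chain_id, position):
    """Return the one-letter residue at chain/position in the first model."""
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ENDMDL":
            break  # stop after the first model
        if record != "ATOM" or len(line) < 26:
            continue
        # PDB columns: residue name 18-20, chain 22, residue number 23-26
        if line[21] == chain_id and int(line[22:26]) == position:
            return THREE_TO_ONE.get(line[17:20].strip())
    return None


def atom_line(serial, atom_name, res_name, chain_id, res_seq):
    """Build the leading fields of an ATOM record (coordinates omitted)."""
    return f"ATOM  {serial:>5} {atom_name:<4} {res_name:>3} {chain_id}{res_seq:>4}"


# Made-up two-model file: only the first model should be consulted.
sample = "\n".join([
    "MODEL        1",
    atom_line(1, " CA", "TYR", "A", 231),
    "ENDMDL",
    "MODEL        2",
    atom_line(1, " CA", "ASN", "A", 231),
    "ENDMDL",
])

print(residue_at(sample, "A", 231))
```

Checking the residue against the wild-type letter of the intended mutation (for example the Y in Y231N) catches numbering or chain mismatches before a job is submitted.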

The home page also provides a link to example output so that users may view the type of output produced before running their job. Additionally, tutorials on usage are available via the link provided on the navigation bar.

Output

The results page is split into three sections. On the left-hand side the mutant information is displayed (wild-type and mutant amino acid types plus the position). Where ANDANTE was used to build a mutant structure, the PDB file is made available for download. The results returned include information about the local structural environment of the wild-type and mutant residues (the secondary structure, solvent accessibility and sidechain hydrogen bond satisfaction), a stability score prediction and prediction of disease association. As mentioned in the methods section, a cut-off of 2 kcal mol⁻¹ is used to indicate whether a mutation is likely to be disease-associated or not. However, mutations that do not reach this cut-off may still lead to protein malfunction and disease if they affect binding sites. A statement indicating this issue is therefore displayed and the links page lists resources that can be used to assess whether a residue is involved in binding.

In the middle portion of the results page, the wild-type and mutant structures are displayed using the Jmol structure viewer (Jmol: an open-source Java viewer for chemical structures in 3D http://www.jmol.org/ ) with the relevant residues highlighted. The user may control the display of these structures using the menu buttons on the right-hand side.

An example of the type of output produced by SDM is shown in Figure 2. A particular advantage of the predictions provided by SDM over other published methods is the indication of the local structural environment of wild-type and mutant residues and the fact that the user may view the 3D structural context of the residues. This allows users to identify possible molecular mechanisms that underlie predicted stability changes, for example loss of hydrogen bonds to the protein backbone.

Figure 2.

Screenshot of SDM analysis results for the example of mutation Y231N in Dystrophin (PDB code 1DXX, chain A). On the left-hand side, information about the wild-type and mutant residue is displayed, such as the secondary structure, solvent accessibility and hydrogen bonds formed by the sidechain. Underneath this information is the predicted effect on protein stability. In this case, SDM predicts that the mutation is highly destabilizing and disease-associated. In fact, this mutation is associated with muscular dystrophy and has been shown to decrease protein stability ( 73 ). In the middle, the structural context of the wild-type and mutant amino acids is shown in the Jmol applet with the residues coloured according to their chemical properties (key displayed on the right-hand side). Using the menus on the right-hand side, the user can manipulate the Jmol applet and control what is shown.

VALIDATION

SDM has previously been validated using a set of ∼230 mutants and was shown to have an accuracy of 74% in predicting the sign of stability change and a linear correlation coefficient of 0.60 between predicted and observed ΔΔG values ( 25 ). Removal of one outlying data point increased the linear correlation coefficient to 0.66. Analysis of the performance of SDM in predicting the sign of stability change in comparison to eight other published methods demonstrated that SDM performs comparably to or better than the other methods.

Since the benchmarking detailed above was carried out, SDM has been modified so that the definition of sidechain hydrogen bonding has been changed from yes or no to satisfied or unsatisfied. Furthermore, functional residues have been masked from the substitution counts used to generate the ESSTs. We tested the improvement that these changes made to SDM’s predictions using the 855 mutants in the monomeric data set. The additional families used to generate the ESSTs, masking functional residues and incorporation of the hydrogen bond satisfaction term improved the correlation coefficient between predicted stability changes and experimental measurements from 0.51 to 0.58 ( Table 1 ).
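The two headline measures used in this section, the Pearson correlation coefficient and the accuracy of predicting the sign of the stability change, can be computed as in the Python sketch below. The function names and the toy values are illustrative assumptions and are unrelated to the benchmark data; counting a value of exactly zero as non-negative is also an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def sign_accuracy(predicted, measured):
    """Fraction of mutants whose predicted sign matches the measured sign."""
    hits = sum((p >= 0) == (m >= 0) for p, m in zip(predicted, measured))
    return hits / len(predicted)


# Made-up predicted and measured stability changes for four mutants.
predicted = [1.0, -2.0, 0.5, -1.5]
measured = [0.8, -1.0, -0.2, -2.0]

print(pearson_r(predicted, measured))
print(sign_accuracy(predicted, measured))  # 3 of 4 signs agree: 0.75
```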

Table 1.

Comparison of the performance of SDM using different sets of ESSTs and the monomeric data set

Parameters used to generate ESSTs
Protein families   Hydrogen bonding term   Masking of functional residues   Accuracy (%)   Rᵃ     σ (kcal/mol)
113                Original                No                               73             0.51   1.82
371                Original                Yes                              73             0.56   1.61
371                Satisfied               No                               73             0.56   1.73
371                Satisfied               Yes                              71             0.58   1.74

ᵃ Pearson product-moment correlation coefficient.

The statistical potential-based method PoPMuSiC-2.0 was recently reported and achieved a correlation of 0.63 between measured and predicted stability changes ( 22 ). The predictive power of the method was shown to be significantly higher than that of other programs described in the literature. In order to compare the predictive power of SDM to PoPMuSiC-2.0 and the other tested methods, we used the same data set of 350 mutants. After the PoPMuSiC algorithms, SDM has the highest linear correlation between predicted and measured ΔΔG values ( Table 2 ). It also has the benefit of making predictions for the entire data set of 350 mutants. It is encouraging that the performance of SDM is improved when considering only highly stabilizing or destabilizing mutations: the correlation coefficient increases from 0.52 to 0.63 ( Table 2 ).

The vast majority of published methods for predicting the effects of mutations on protein stability are based on machine learning (ML). These are first trained on a data set of mutations. Many of these ML methods report high correlations with experimental data sets [e.g. CUPSAT R = 0.87 ( 21 ) and IMutant-2.0 R = 0.71 ( 60 )]. However, when tested later in blind tests, these correlations drop drastically [e.g. CUPSAT R = 0.37 and IMutant-2.0 R = 0.29 ( 22 )]. This reduction in prediction performance may be due to over-fitting to available data sets. The problem of decreasing performance of ML methods on blind data sets was also observed in two independent assessments of the performance of protein stability predictors ( 61 , 62 ). SDM is not an ML method, but rather a statistical method based on observed amino acid substitutions that have occurred during divergent protein evolution. Therefore, it does not suffer from the problem of over-fitting, as demonstrated by the similar correlation coefficients obtained using the monomeric data set and the PoPMuSiC-2.0 validation data set. The problem of over-fitting is an important point to consider if methods are to be used to help successfully design mutagenesis experiments.

Table 3 shows the results of testing the S388 data set. These results show the performance of methods in predicting the sign of stability change, i.e. whether a mutation is stabilizing or destabilizing. Many of the methods have accuracies of over 80%, which is impressive. However, if we examine the ability of the methods to predict stabilizing and destabilizing mutations separately, another picture emerges: they tend to be very good at predicting destabilizing mutations but much worse at predicting stabilizing mutations. SDM, however, has a more balanced sensitivity in predicting both types of mutations, although the specificity of predicting destabilizing mutations is far better than that of predicting stabilizing mutations. Most mutations are destabilizing and this is reflected in the mutant thermodynamic data sets used for developing and testing such methods. Methods that assign all of the samples to the majority class (destabilizing mutations) will have high accuracy even though the performance is poor for the minority class (stabilizing mutations). This trend is observed for most of the methods reported in Table 3. It is possible that some of the results in Table 3 are biased by some over-fitting to the training data sets used in developing the methods.
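The effect of class imbalance on accuracy can be made concrete with a short Python sketch. It uses a made-up set of 90 destabilizing and 10 stabilizing mutations and a trivial predictor that always chooses the majority class; the function name `binary_metrics`, the class labels and the convention of reporting an MCC of 0 when its denominator is zero are assumptions for illustration. Only accuracy, per-class sensitivity and the Matthews correlation coefficient (MCC) are computed here.

```python
import math


def binary_metrics(true_labels, predicted_labels, positive="stabilizing"):
    """Accuracy, per-class sensitivity and MCC for a two-class prediction."""
    tp = fp = tn = fn = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if prediction == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity_positive": tp / (tp + fn) if tp + fn else 0.0,
        "sensitivity_negative": tn / (tn + fp) if tn + fp else 0.0,
        # MCC is reported as 0 when undefined (an assumed convention).
        "mcc": (tp * tn - fp * fn) / denominator if denominator else 0.0,
    }


# Made-up, imbalanced data set and a predictor that always says "destabilizing".
truth = ["destabilizing"] * 90 + ["stabilizing"] * 10
always_majority = ["destabilizing"] * 100

print(binary_metrics(truth, always_majority))
```

The majority-class predictor reaches 90% accuracy while identifying none of the stabilizing mutations, and its MCC of 0 shows that it carries no predictive information.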

Table 2.

Comparison of the performance of different prediction methods

Complete set (350 / 309 / 87 mutants)ᵃ
Method           No. of predictionsᵇ   R                    σ (kcal/mol)
Automuteᶜ        315                   0.46 / 0.45 / 0.45   1.43 / 1.46 / 1.99
CUPSATᶜ          346                   0.37 / 0.35 / 0.50   1.91 / 1.96 / 2.14
Dmutantᶜ         350                   0.48 / 0.47 / 0.57   1.81 / 1.87 / 2.31
Erisᶜ            334                   0.35 / 0.34 / 0.49   4.12 / 4.28 / 3.91
I-mutant-2.0ᶜ    346                   0.29 / 0.27 / 0.27   1.65 / 1.69 / 2.39
PoPMuSiC-1.0ᶜ    350                   0.62 / 0.63 / 0.70   1.24 / 1.25 / 1.66
PoPMuSiC-2.0ᶜ    350                   0.67 / 0.67 / 0.71   1.16 / 1.19 / 1.67
SDM              350                   0.52 / 0.53 / 0.63   1.80 / 1.81 / 2.11

ᵃ Three values are given per column. The first corresponds to the whole validation set of 350 mutants with the unavailable ΔΔG predictions set to 0.0 kcal/mol. The second corresponds to the 309 mutants for which a ΔΔG prediction is available for all predictors. The third corresponds to 87 mutants for which the experimental ΔΔG value causes >2 kcal mol⁻¹ change and for which a ΔΔG prediction is available for all predictors.

ᵇ 350 mutations were tested with each method. However, some servers failed to compute the ΔΔG prediction for all mutants, resulting in predictions for less than the full number.

ᶜ Data taken from ( 22 ).

Table 3.

Comparison of the performance of different prediction methods

Method MCC Accuracy Sens. (+) Spec. (+) Sens. (−) Spec. (−) 
Automute S1227 a 0.31 0.87 0.36 0.42 0.94 0.92 
FOLDX b 0.25 0.75 0.56 0.26 0.78 0.93 
DFIRE b 0.11 0.68 0.44 0.18 0.71 0.90 
PoPMuSiC-1.0 b 0.20 0.85 0.25 0.33 0.93 0.90 
PoPMuSiC-2.0 0.32 0.86 0.35 0.44 0.94 0.91 
NeuralNet b 0.25 0.87 0.21 0.44 0.96 0.90 
MuPro SO c 0.26 0.86 0.30 0.40 0.94 0.90 
MuPro TO c 0.28 0.86 0.31 0.42 0.94 0.91 
MuPro ST c 0.27 0.86 0.31 0.40 0.93 0.91 
MuX-S d 0.39 0.88 0.29 0.67 0.94 0.91 
MuX-48 c 0.39 0.89 0.29 0.67 0.98 0.91 
SDM 0.28 0.71 0.70 0.24 0.71 0.94 

a Data taken from Masso and Vaisman ( 24 ).

b Data taken from Capriotti et al . ( 16 ).

c Data taken from Cheng et al . ( 17 ).

d Data taken from Kang et al. ( 74 ).
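
For readers unfamiliar with the column headings of Table 3, the sketch below shows how such scores can be derived from a two-class confusion matrix. It is an illustration under stated assumptions rather than the evaluation code behind the table: the function name and the counts are hypothetical, and "Spec." is taken here to mean the fraction of predictions for a class that are correct (often called precision). That reading is an inference from the table itself: a conventional specificity for the (+) class would equal Sens. (−), and the two columns differ in every row.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fn, tn, fp):
    """Scores for a two-class prediction task.

    tp/fn: '+' class mutants predicted correctly / incorrectly.
    tn/fp: '-' class mutants predicted correctly / incorrectly.
    """
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # Matthews correlation coefficient.
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        # Fraction of all mutants classified correctly.
        "accuracy": _ratio(tp + tn, tp + fn + tn + fp),
        # Sensitivity: fraction of each true class that is recovered.
        "sens_pos": _ratio(tp, tp + fn),
        "sens_neg": _ratio(tn, tn + fp),
        # "Specificity" in the assumed sense: fraction of each
        # predicted class that is correct.
        "spec_pos": _ratio(tp, tp + fp),
        "spec_neg": _ratio(tn, tn + fn),
    }


# Invented counts for 1000 hypothetical mutants, not data from the paper.
metrics = binary_metrics(tp=81, fn=35, tn=628, fp=256)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With strongly imbalanced classes, as in this toy example, accuracy is dominated by the larger class, which is why the MCC and the per-class columns are more informative when comparing methods.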

When applied to the task of predicting disease-associated mutations, SDM had an accuracy of 61% ( 26 ), only 3% less than the accuracy achieved by the program Sorting Intolerant from Tolerant (SIFT) ( 63 ). It is unsurprising that SIFT is the more accurate of the two: SDM can identify disease associations only for mutations that perturb protein structure, not for those that directly affect catalytic residues, binding sites and other functional features. Mutations that cause protein malfunction by affecting the functional residues of a protein (active sites or protein–protein interaction sites) or by altering post-translational modifications will not be identified as damaging by SDM. Therefore, to predict more accurately whether an nsSNP is associated with disease, these other effects should also be taken into account. We previously demonstrated that combining SDM’s predictions with functional-site predictions from Crescendo ( 64 ) and with known functional sites gives an accuracy comparable to that of the other methods tested, with the benefit of a much lower false-positive rate, and therefore provides a high-quality set of predictions ( 26 ).

SUMMARY

The SDM server provides users with a fast and accurate means of assessing the impact that a mutation will have on protein structure and stability. It provides a 3D view of the wild-type and mutant residues, allowing users to inspect the structural context of the sidechains. SDM is a useful tool for identifying possible disease associations. It has been applied to predicting deleterious nsSNPs at the genome scale ( 25 , 26 , 65 ) and to generating new hypotheses regarding: (i) the molecular aetiology of renal cell carcinoma and pheochromocytoma in the cancer syndrome, von Hippel-Lindau disease ( 66 ); (ii) the structural effects of mutations in thyroid stimulating hormone receptor that are associated with congenital non-goitrous hypothyroidism ( 67 ); and (iii) tumour risk associated with mutations in succinate dehydrogenase D ( 68 ). It has also been used in the analysis of mutations in the autoimmune regulator protein ( 69 ), mixed lineage kinase 3 ( 70 ), the adaptor protein MyD88 adaptor-like ( 71 ) and breast cancer susceptibility gene 1 ( 72 ).

FUNDING

This work was supported by the Biotechnology and Biological Sciences Research Council (research studentship to C.L.W.) and a Wellcome Trust Programme Grant (to T.L.B.). Funding for open access charge: Wellcome Trust Programme Grant.

Conflict of interest statement. None declared.

REFERENCES

1. Bjorgo E, Knappskog PM, Martinez A, Stevens RC, Flatmark T. Partial characterization and three-dimensional-structural localization of eight mutations in exon 7 of the human phenylalanine hydroxylase gene associated with phenylketonuria. Eur. J. Biochem., 1998, vol. 257 (pg. 1-10).
2. Wang Z, Moult J. SNPs, protein structure, and disease. Hum. Mutat., 2001, vol. 17 (pg. 263-270).
3. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science, 2001, vol. 291 (pg. 1304-1351).
4. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature, 2007, vol. 449 (pg. 851-861).
5. Gunderson KL, Steemers FJ, Ren H, Ng P, Zhou L, Tsan C, Chang W, Bullis D, Musmacker J, King C, et al. Whole-genome genotyping. Methods Enzymol., 2006, vol. 410 (pg. 359-376).
6. Metzker ML. Sequencing technologies - the next generation. Nat. Rev. Genet., 2010, vol. 11 (pg. 31-46).
7. Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, He W, Chen YJ, Makhijani V, Roth GT, et al. The complete genome of an individual by massively parallel DNA sequencing. Nature, 2008, vol. 452 (pg. 872-876).
8. Bash PA, Singh UC, Langridge R, Kollman PA. Free energy calculations by computer simulation. Science, 1987, vol. 236 (pg. 564-568).
9. Funahashi J, Sugita Y, Kitao A, Yutani K. How can free energy component analysis explain the difference in protein stability caused by amino acid substitutions? Effect of three hydrophobic mutations at the 56th residue on the stability of human lysozyme. Protein Eng., 2003, vol. 16 (pg. 665-671).
10. Kollman PA, Massova I, Reyes C, Kuhn B, Huo S, Chong L, Lee M, Lee T, Duan Y, Wang W, et al. Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models. Acc. Chem. Res., 2000, vol. 33 (pg. 889-897).
11. Park H, Lee S. Prediction of the mutation-induced change in thermodynamic stabilities of membrane proteins from free energy simulations. Biophys. Chem., 2005, vol. 114 (pg. 191-197).
12. Shi YY, Mark AE, Wang CX, Huang F, Berendsen HJ, van Gunsteren WF. Can the stability of protein mutants be predicted by free energy calculations? Protein Eng., 1993, vol. 6 (pg. 289-295).
13. Bordner AJ, Abagyan RA. Large-scale prediction of protein geometry and stability changes for arbitrary single point mutations. Proteins, 2004, vol. 57 (pg. 400-413).
14. Guerois R, Nielsen JE, Serrano L. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J. Mol. Biol., 2002, vol. 320 (pg. 369-387).
15. Capriotti E, Fariselli P, Calabrese R, Casadio R. Predicting protein stability changes from sequences using support vector machines. Bioinformatics, 2005, vol. 21 Suppl. 2 (pg. ii54-ii58).
16. Capriotti E, Fariselli P, Casadio R. A neural-network-based method for predicting protein stability changes upon single point mutations. Bioinformatics, 2004, vol. 20 Suppl. 1 (pg. i63-i68).
17. Cheng J, Randall A, Baldi P. Prediction of protein stability changes for single-site mutations using support vector machines. Proteins, 2006, vol. 62 (pg. 1125-1132).
18. Gilis D, Rooman M. Predicting protein stability changes upon mutation using database-derived potentials: solvent accessibility determines the importance of local versus non-local interactions along the sequence. J. Mol. Biol., 1997, vol. 272 (pg. 276-290).
19. Saraboji K, Gromiha MM, Ponnuswamy MN. Average assignment method for predicting the stability of protein mutants. Biopolymers, 2006, vol. 82 (pg. 80-92).
20. Topham CM, Srinivasan N, Blundell TL. Prediction of the stability of protein mutants based on structural environment-dependent amino acid substitution and propensity tables. Protein Eng., 1997, vol. 10 (pg. 7-21).
21. Parthiban V, Gromiha MM, Schomburg D. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res., 2006, vol. 34 (pg. W239-W242).
22. Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts P, Rooman M. Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics, 2009, vol. 25 (pg. 2537-2543).
23. Yin S, Ding F, Dokholyan NV. Modeling backbone flexibility improves protein stability estimation. Structure, 2007, vol. 15 (pg. 1567-1576).
24. Masso M, Vaisman II. Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis. Bioinformatics, 2008, vol. 24 (pg. 2002-2009).
25. Worth CL, Burke DF, Blundell TL. Estimating the effects of single nucleotide polymorphisms on protein structure: how good are we at identifying likely disease associated mutations? Proceedings of “Molecular Interactions - Bringing Chemistry to Life”, 2007 (pg. 11-26).
26. Worth CL, Bickerton GR, Schreyer A, Forman JR, Cheng TM, Lee S, Gong S, Burke DF, Blundell TL. A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J. Bioinform. Comput. Biol., 2007, vol. 5 (pg. 1297-1318).
27. Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci., 1992, vol. 1 (pg. 216-226).
28. Topham CM, McLeod A, Eisenmenger F, Overington JP, Johnson MS, Blundell TL. Fragment ranking in modelling of protein structure. Conformationally constrained environmental amino acid substitution tables. J. Mol. Biol., 1993, vol. 229 (pg. 194-220).
29. Mizuguchi K, Deane CM, Blundell TL, Overington JP. HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci., 1998, vol. 7 (pg. 2469-2471).
30. Blundell TL, Cooper J, Donnelly D, Driessen H, Edwards Y, Eisenmenger F, Frazao C, Johnson M, Niefind K, Newman M, et al. Patterns of sequence variation in families of homologous proteins. In: Jornvall, Hoog, Gustavsson (eds), Methods in Protein Sequence Analysis, 1991, Basel, Birkhauser Verlag AG (pg. 373-385).
31. Overington J, Johnson MS, Sali A, Blundell TL. Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction. Proc. Biol. Sci., 1990, vol. 241 (pg. 132-145).
32. UniProt Consortium. Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res., 2011, vol. 39 (pg. D214-D219).
33. Porter CT, Bartlett GJ, Thornton JM. The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res., 2004, vol. 32 (pg. D129-D133).
34. Gong S, Park C, Choi H, Ko J, Jang I, Lee J, Bolser DM, Oh D, Kim DS, Bhak J. A protein domain interaction interface database: InterPare. BMC Bioinformatics, 2005, vol. 6 pg. 207.
35. Lee B, Richards FM. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol., 1971, vol. 55 (pg. 379-400).
36. Worth CL. The role of amino acid sidechains in protein stability. Ph.D Thesis, 2008, Cambridge, University of Cambridge.
37. Worth CL, Blundell TL. Satisfaction of hydrogen-bonding potential influences the conservation of polar sidechains. Proteins, 2009, vol. 75 (pg. 413-429).
38. Li H, Robertson AD, Jensen JH. Very fast empirical prediction and rationalization of protein pKa values. Proteins, 2005, vol. 61 (pg. 704-721).
39. Calloni G, Zoffoli S, Stefani M, Dobson CM, Chiti F. Investigating the effects of mutations on protein aggregation in the cell. J. Biol. Chem., 2005, vol. 280 (pg. 10607-10613).
40. Mayer S, Rudiger S, Ang HC, Joerger AC, Fersht AR. Correlation of levels of folded recombinant p53 in Escherichia coli with thermodynamic stability in vitro. J. Mol. Biol., 2007, vol. 372 (pg. 268-276).
41. Tokuriki N, Tawfik DS. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol., 2009, vol. 19 (pg. 596-604).
42. Lindberg MJ, Bystrom R, Boknas N, Andersen PM, Oliveberg M. Systematically perturbed folding patterns of amyotrophic lateral sclerosis (ALS)-associated SOD1 mutants. Proc. Natl Acad. Sci. USA, 2005, vol. 102 (pg. 9754-9759).
43. Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J. Biol. Chem., 2006, vol. 281 (pg. 24216-24226).
44. Counago R, Wilson CJ, Pena MI, Wittung-Stafshede P, Shamoo Y. An adaptive mutation in adenylate kinase that increases organismal fitness is linked to stability-activity trade-offs. Protein Eng. Des. Sel., 2008, vol. 21 (pg. 19-27).
45. Jaenicke R. Protein stability and molecular adaptation to extreme conditions. Eur. J. Biochem., 1991, vol. 202 (pg. 715-728).
46. Somero GN. Proteins and temperature. Annu. Rev. Physiol., 1995, vol. 57 (pg. 43-68).
47. Wolf-Watz M, Thai V, Henzler-Wildman K, Hadjipavlou G, Eisenmesser EZ, Kern D. Linkage between dynamics and catalysis in a thermophilic-mesophilic enzyme pair. Nat. Struct. Mol. Biol., 2004, vol. 11 (pg. 945-949).
48. Zavodszky P, Kardos J, Svingor, Petsko GA. Adjustment of conformational flexibility is a key event in the thermal adaptation of proteins. Proc. Natl Acad. Sci. USA, 1998, vol. 95 (pg. 7406-7411).
49. Beadle BM, Shoichet BK. Structural bases of stability-function tradeoffs in enzymes. J. Mol. Biol., 2002, vol. 321 (pg. 285-296).
50. Meiering EM, Serrano L, Fersht AR. Effect of active site residues in barnase on activity and stability. J. Mol. Biol., 1992, vol. 225 (pg. 585-589).
51. Mukaiyama A, Haruki M, Ota M, Koga Y, Takano K, Kanaya S. A hyperthermophilic protein acquires function at the cost of stability. Biochemistry, 2006, vol. 45 (pg. 12673-12679).
52. Yutani K, Ogasahara K, Tsujita T, Sugino Y. Dependence of conformational stability on hydrophobicity of the amino acid residue in a series of variant proteins substituted at a unique position of tryptophan synthase alpha subunit. Proc. Natl Acad. Sci. USA, 1987, vol. 84 (pg. 4441-4444).
53. DePristo MA, Weinreich DM, Hartl DL. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat. Rev. Genet., 2005, vol. 6 (pg. 678-687).
54. Bjorklund P, Lindberg D, Akerstrom G, Westin G. Stabilizing mutation of CTNNB1/beta-catenin and protein accumulation analyzed in a large series of parathyroid tumors of Swedish patients. Mol. Cancer, 2008, vol. 7 pg. 53.
55. Song W, Patel A, Qureshi HY, Han D, Schipper HM, Paudel HK. The Parkinson disease-associated A30P mutation stabilizes alpha-synuclein against proteasomal degradation triggered by heme oxygenase-1 over-expression in human neuroblastoma cells. J. Neurochem., 2009, vol. 110 (pg. 719-733).
56. Gromiha MM, Sarai A. Thermodynamic database for proteins: features and applications. Methods Mol. Biol., 2010, vol. 609 (pg. 97-112).
57. Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol., 2007, vol. 372 (pg. 774-797).
58. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The protein data bank. Nucleic Acids Res., 2000, vol. 28 (pg. 235-242).
59. Smith RE, Lovell SC, Burke DF, Montalvao RW, Blundell TL. Andante: reducing side-chain rotamer search space during comparative modeling using environment-specific substitution probabilities. Bioinformatics, 2007, vol. 23 (pg. 1099-1105).
60. Capriotti E, Fariselli P, Casadio R. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res., 2005, vol. 33 (pg. W306-W310).
61. Khan S, Vihinen M. Performance of protein stability predictors. Hum. Mutat., 2010, vol. 31 (pg. 675-684).
62. Potapov V, Cohen M, Schreiber G. Assessing computational methods for predicting protein stability upon mutation: good on average but not in the details. Protein Eng. Des. Sel., 2009, vol. 22 (pg. 553-560).
63. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res., 2003, vol. 31 (pg. 3812-3814).
64. Chelliah V, Chen L, Blundell TL, Lovell SC. Distinguishing structural and functional restraints in evolution in order to identify interaction sites. J. Mol. Biol., 2004, vol. 342 (pg. 1487-1504).
65. Burke DF, Worth CL, Priego EM, Cheng T, Smink LJ, Todd JA, Blundell TL. Genome bioinformatic analysis of nonsynonymous SNPs. BMC Bioinformatics, 2007, vol. 8 pg. 301.
66. Forman JR, Worth CL, Bickerton GR, Eisen TG, Blundell TL. Structural bioinformatics mutation analysis reveals genotype-phenotype correlations in von Hippel-Lindau disease and suggests molecular mechanisms of tumorigenesis. Proteins, 2009, vol. 77 (pg. 84-96).
67. Cangul H, Morgan NV, Forman JR, Saglam H, Aycan Z, Yakut T, Gulten T, Tarim O, Bober E, Cesur Y, et al. Novel TSHR mutations in consanguineous families with congenital nongoitrous hypothyroidism. Clin. Endocrinol., 2010, vol. 73 (pg. 671-677).
68. Ricketts CJ, Forman JR, Rattenberry E, Bradshaw N, Lalloo F, Izatt L, Cole TR, Armstrong R, Kumar VK, Morrison PJ, et al. Tumor risks and genotype-phenotype-proteotype analysis in 358 patients with germline mutations in SDHB and SDHD. Hum. Mutat., 2010, vol. 31 (pg. 41-51).
69. Ferguson BJ, Alexander C, Rossi SW, Liiv I, Rebane A, Worth CL, Wong J, Laan M, Peterson P, Jenkinson EJ, et al. AIRE's CARD revealed, a new structure for central tolerance provokes transcriptional plasticity. J. Biol. Chem., 2008, vol. 283 (pg. 1723-1731).
70. Velho S, Oliveira C, Paredes J, Sousa S, Leite M, Matos P, Milanezi F, Ribeiro AS, Mendes N, Licastro D, et al. Mixed lineage kinase 3 gene mutations in mismatch repair deficient gastrointestinal tumours. Hum. Mol. Genet., 2010, vol. 19 (pg. 697-706).
71. Nagpal K, Plantinga TS, Wong J, Monks BG, Gay NJ, Netea MG, Fitzgerald KA, Golenbock DT. A TIR domain variant of MyD88 adapter-like (Mal)/TIRAP results in loss of MyD88 binding and reduced TLR2/TLR4 signaling. J. Biol. Chem., 2009, vol. 284 (pg. 25742-25748).
72. Rowling PJ, Cook R, Itzhaki LS. Toward classification of BRCA1 missense variants using a biophysical approach. J. Biol. Chem., 2010, vol. 285 (pg. 20080-20087).
73. Singh SM, Kongari N, Cabello-Villegas J, Mallela KM. Missense mutations in dystrophin that trigger muscular dystrophy decrease protein stability and lead to cross-beta aggregates. Proc. Natl Acad. Sci. USA, 2010, vol. 107 (pg. 15069-15074).
74. Kang S, Chen G, Xiao G. Robust prediction of mutation-induced protein stability change by property encoding of amino acids. Protein Eng. Des. Sel., 2009, vol. 22 (pg. 75-83).

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
