Surfaces: a software to quantify and visualize interactions within and between proteins and ligands

Abstract Summary Computational methods for the quantification and visualization of the relative contribution of molecular interactions to the stability of biomolecular structures and complexes are fundamental to understand, modulate and engineer biological processes. Here, we present Surfaces, an easy to use, fast and customizable software for quantification and visualization of molecular interactions based on the calculation of surface areas in contact. Surfaces calculations shows equivalent or better correlations with experimental data as computationally expensive methods based on molecular dynamics. Availability and implementation All scripts are available at https://github.com/NRGLab/Surfaces. Surface’s documentation is available at https://surfaces-tutorial.readthedocs.io/en/latest/index.html.

we have the quantitative correlations (Pearson's correlation coefficient, Pearson's R) and the area under the ROC curve (AUC) presented for the entire dataset in blue, medium confidence subset (|ΔΔG| > 0.5 kcal/mol) in gray, and high confidence subset (|ΔΔG| > 1 kcal/mol) in red.Results are from predictions performed using the scoring functions bASA, dDFIRE, DFIRE and FoldX (using mutants generated by FoldX), Rosetta (with mutants generated by Rosetta), Discovery Studio (using mutants generated by Discovery Studio) and STATIUM; (B) Surfaces results for the same dataset with quantitative correlations (Pearson's correlation coefficient, Pearson's R) and the area under the ROC curve (AUC), with binding predictions performed to mutant structures generated using FoldX and Rosetta.
The second method validation was based on the work by Sergeeva et al. (Sergeeva et al., 2023), that compares binding prediction results using free-energy pertubations (FEP) against a broad range of different methods, specifically methods based on machine-learning: Mutabind2 (Zhang et al., 2020), mCSM-PPI (Rodrigues et al., 2019), SAAMBE-3D (Pahari et al., 2020); based on statistical potentials: BeAtMusic (Dehouck et al., 2013); and force field related scoring functions: FoldX (Schymkowitz et al., 2005), Rosetta flex ddG (Barlow et al., 2018)Click or tap here to enter text.; and molecular mechanics: MM/PBSA and MM/GBSA (Singh and Warshel, 2010;Genheden and Ryde, 2015).The authors compare the performance of these methods on a set of SARS-CoV-2 Spike mutations for which they have experimentally determined binding affinity to the receptor ACE2 using surface plasmon resonance (Pattnaik, 2005).Their data allows us to compare the performance of Surfaces to that of the various methods against experimental results for the evaluation of this particular interaction.
To do so, we performed all the calculations on mutants modeled to the same crystal structure of ACE2/RBD (PDB 6M0J) (Lan et al., 2020), following the same methodology used by Sergeeva et al. (Sergeeva et al., 2023).In order to do the RMSE calculation, as well as to follow particular numerical thresholds used for the analysis of the data, we also produced Surfaces results subjected to a linear regression to the experimental values.
The statistical metrics originally used to compare the results using the different methods were the Pearson's correlation coefficient (Pearson's R), root mean square error (RMSE) and Pearson's phi for stabilizing mutations, performing a binary classification of the data, considering as stabilizing the mutations with ΔΔG ≤ -0.4 (Pearson's Φ (stabilizing ≤ -0.4)), a threshold defined by Sergeeva et al. based on the experimental accuracy (Sergeeva et al., 2023).We utilized the same statistical metrics to Surface results.The results are shown in Table S1.In Figure S1A we plot the experimental vs. calculated ΔΔG values for all the methods presented by Sergeeva et al. and in Figure S1B for Surfaces.According to all metrics, Surfaces' performance is equivalent to that of FEP, with a superior Pearson's R to all other methods at a fraction of the computational cost.
It is worth noting that Surfaces provides a simplified estimation of changes in enthalpy differences (DDH), not of the free energy.In the past we have utilized our ENCoM normal mode analysis method to estimate DDG of mutations based solely on vibrational entropy differences (Frappier and R. J. Najmanovich, 2014;Frappier and R. Najmanovich, 2014).Such vibrational entropy differences, as calculated for a large number of Spike mutants (Teruel et al., 2021), can be performed in a straightforward manner with NRGTEN package (Mailhot and Najmanovich, 2021) and combined with Surfaces DDH predictions.
According to the different metrics used to compare predictive performance, FEP calculations based on a 100ns Molecular Dynamics (MD) trajectories are shown to represent a good alternative (Table S1).As noted by the authors (Sergeeva et al., 2023), being MD-based, FEP calculations require a complex computational infrastructure and are time consuming.
Surfaces, presenting a close Pearson's correlation, similar error compared to the experimental values, as well as a comparable classification performance, measured by the Pearson's phi for the threshold value used for classifying a mutation as "stabilizing" (Table S1, Fig S1), required a very small fraction of the computational power to evaluate all mutants (185 CPU-seconds).The results from Table S1 and Fig S2 were generated with the goal to understand binding affinities of mutations and variants already selected during viral evolution.However, some of these same methods, with significant computational cost, would not be suitable for exploratory objectives while Surfaces can be used in high-throughput computational mutational scans.
Table S1.Comparative measures of binding affinity of mutants of the SARS-CoV-2 RBD and the receptor ACE2.From (Sergeeva et al., 2023) we have the experimental values (ΔΔG experiment SPR), as well as the predictions using different binding affinity calculation methods (left table) and the analysis of Pearson's correlation coefficient (Pearson's R), root mean square error (RMSE) and Pearson's phi for stabilizing mutations (Pearson's Φ (stabilizing ≤ -0.4)) (lines in gray).Added to this table, ΔCF Surface results calculated for the same mutants, the linear regression of these values to the experimental reference (right table) and the statistical analysis for Surfaces results (light blue lines).Blue values correspond to stabilizing mutations with ΔΔG ≤ -0.4.Red values correspond to destabilizing mutations with ΔΔG ³ 0.4.This table is also available in word format for easy of reuse as a supplementary Table S1 file.

Fig S2. Graphic representation of the calculated binding affinities for each of the mutants against the experimental values, as well as their tendency lines.
Values of ΔΔG predictions using different binding affinity calculation methods compared to experimental data (Sergeeva et al., 2023).Linear regression of Surfaces' ΔCF prediction values for binding affinity compared to experimental data.
While the measure of full binding of mutants is often used to evaluate the impact of mutations on protein interactions, it does not always reflect the changes in interactions of the mutated residue alone, and may be associated with a disruption of adjacent regions in the interface.This point is also discussed when observing the non-cumulative effects seen for single mutants on full interfacesthe sum of the effects of individual mutations may not equal the effect of multiple mutations combinedas pointed out in the first evaluations of the Omicron Spike (Cameroni et al., 2022;Dejnirattisai et al., 2022).
Due to these limitations, for exploratory objectives and rational design, per-residue energetic decomposition methods offer very important information for purposes of protein engineering.
All data presented above, including individual predictions for each mutant are available in the Github repository for Surfaces: https://github.com/NRGlab/Surfaces.

Per-residue decomposition
The most common methods for analyzing protein-protein interactions and per-residue energetic decomposition are based on molecular dynamics (MD) simulations (Homeyer and Gohlke, 2012;Kollman et al., 2000;Serçinoglu and Ozbek, 2018).MD simulations can provide detailed information on the structural changes and energetics associated with protein-protein interactions, including the binding free energy and per-residue energetic contributions.However, MD simulations are computationally intensive and therefore can take a long time to complete (Ciccotti et al., 2022;Bopp et al., 2008), making them less suitable for large-scale studies such as in computational protein design.
A widely used method for residue-based energy decomposition is gRINN (Serçinoglu and Ozbek, 2018), which uses MD trajectories to calculate binding energy means and distributions.
From trajectories of the SARS-CoV-2 Spike Delta variant in complex with the receptor ACE2 (Cheng et al., 2022), available at the COVID-19 Molecular Structure and Therapeutics Hub (MolSSI), we calculated interface interactions using gRINN and compare with Surfaces.We generated the results for gRINN considering only residues that were at a maximum distance of 6.8Å from the interface, a filtering distance of 20Å between residues and interactions present in 100% of the trajectory frames -these parameters reduced gRINN analysis to interactions in closer proximity, inter-chain interacting residues within the interface and filtered off transient interactions, to more closely match the interaction that are analyzed with Surfaces.Considering the net value of interactions for each residue, the Pearson's correlation coefficient between the results of the two methods is 0.634 (p= 6.07E-90) (Fig S2).
Whereas Surfaces does not detect transient interactions, it is possible to generate protein ensembles with the NRGTEN package (Mailhot and Najmanovich, 2021).Such an approach would make possible to generate a distribution of energies for each interaction.Furthermore, the possibility of calculating transition probabilities and occupancies for different configurations opens the possibility to apply statistical mechanics techniques.

Ligand binding evaluation
The same rationale and scoring function can also be used to evaluate binding between a protein and a ligand, as well as the binding energy decomposition per residue of the protein and per atom of the ligand.Surfaces employs a scoring function first introduced by FlexAID (Gaudreault and Najmanovich, 2015), offering as default a matrix of pairwise interactions based on 40 SYBYL atom types that was optimized from the evaluation of ligand interactions of the PDBbind dataset (Wang et al., 2005).
The concept of "frustration" in statistical physics refers to a system where a global energy minima cannot be reached by minimizing the energy of each interaction present in the system, some interactions remain at a state with higher energy than the lowest possible and are said to be frustrated.Biomolecules have been shown to harbor frustrated interactions (Bryngelson and Wolynes, 1987;Ferreiro et al., 2011Ferreiro et al., , 2013)).In that context, Surfaces can be used to identify potential frustrated interactions not only between interacting amino acids as shown above, but also between ligands and proteins.The quantification and visualization of the energetic contribution of individual interactions within a specific ligand-protein or protein-protein complex with Surfaces helps to understand and exploit frustration in drug design and protein engineering.
As an example, we utilize Surfaces to visualize the interactions between ligand protein complexes from the BioLiP dataset (Yang et al., 2013).First, two distinct proteins (DMSP lyase and Sigma-54 dependent transcriptional regulator) bound to the ATP (Fig S4 A,B and Table S2 A,B), and second, two similar molecules (ATP and GTP) bound to the same binding site (Sigma-54 dependent transcriptional regulator) (Fig S4 B,C and Table S2 B,C).Surfaces identifies frustrated interactions between ATP and DMSP lyase residues (residue ASP 721, figure S4A) as well as different frustrated interactions for Sigma-54 dependent transcriptional regulator (residue ASP 241, figure S4B), illustrating probable different convergent evolutionary paths.The comparison between interactions of ATP and GTP with the same Sigma-54 dependent transcriptional regulator binding site reveals visible changes in some interactions, namely in the strength of the favorable interaction with residue ARG 357 and the existence of interactions with ASP 241 and ILE 199, motivated by the small difference between the molecules and also by the different conformations of the ligands.B, C). (A) ATP bound to DMSP lyase (PDB 7CM9), with the interacting residue ASP 721 highlighted for its unfavorable interaction; (B) ATP bound to Sigma-54 dependent transcriptional regulator (PDB 7V3W), with the interacting residue ASP 241 highlighted for its unfavorable interaction and residue ARG 357 highlighted for its particularly favorable interaction; and (C) GTP bound to Sigma-54 dependent transcriptional regulator (PDB 7V2B), with the interacting residue ARG 357 highlighted for its less favorable interaction, and residue ILE 199, highlighted for its new favorable interaction, both compared to those seen for ATP.

Fig S1 .
Fig S1.Comparative evaluation of different binding affinity prediction methods applied to the AB-Bind dataset.(A) From Sarin et al. (Sirin et al., 2016) we have the quantitative correlations (Pearson's correlation coefficient, Pearson's R) and the area under the ROC curve (AUC) presented for the entire dataset in blue, medium confidence subset (|ΔΔG| > 0.5 kcal/mol) in gray, and high confidence subset (|ΔΔG| > 1 kcal/mol) in red.Results are from predictions performed using the scoring functions bASA, dDFIRE, DFIRE and FoldX (using mutants generated by FoldX), Rosetta (with mutants generated by Rosetta), Discovery Studio (using mutants generated by Discovery Studio) and STATIUM; (B) Surfaces results for the same dataset with quantitative correlations (Pearson's correlation coefficient, Pearson's R)

Fig
Fig S3.Per-residue result comparison between gRINN and Surfaces calculations.(A) Graphic representation of the calculated binding affinities for each residue of the complex using gRINN and Surfaces, as well as the tendency line of the correlation between the values.(B) Visual representation of the per-residue interactions according to gRINN results in a scale of dark blue (favorable) to white (neutral) generated with Surfaces visual output scripts.(C) Visual representation of the per-residue interactions according to Surfaces results in a scale of dark blue (favorable) to white (neutral) generated with Surfaces visual output scripts.

Fig S4 .
Fig S4.Illustration of the use of Surfaces for the assessment and visualization of protein-ligand interactions with the same molecule in different binding sites (A, B), and slightly different molecules in the same site (B, C). (A)ATP bound to DMSP lyase (PDB 7CM9), with the interacting residue ASP 721 highlighted for its unfavorable interaction; (B) ATP bound to Sigma-54 dependent transcriptional regulator (PDB 7V3W), with the interacting residue ASP 241 highlighted for its unfavorable interaction and residue ARG 357 highlighted for its particularly favorable interaction; and (C) GTP bound to Sigma-54 dependent transcriptional regulator (PDB 7V2B), with the interacting residue ARG 357 highlighted for its less favorable interaction, and residue ILE 199, highlighted for its new favorable interaction, both compared to those seen for ATP.