- Split View
-
Views
-
Cite
Cite
Miha Skalic, Alejandro Varela-Rial, José Jiménez, Gerard Martínez-Rosell, Gianni De Fabritiis, LigVoxel: inpainting binding pockets using 3D-convolutional neural networks, Bioinformatics, Volume 35, Issue 2, January 2019, Pages 243–250, https://doi.org/10.1093/bioinformatics/bty583
- Share Icon Share
Abstract
Structure-based drug discovery methods exploit protein structural information to design small molecules binding to given protein pockets. This work proposes a purely data driven, structure-based approach for imaging ligands as spatial fields in target protein pockets. We use an end-to-end deep learning framework trained on experimental protein–ligand complexes with the intention of mimicking a chemist’s intuition at manually placing atoms when designing a new compound. We show that these models can generate spatial images of ligand chemical properties like occupancy, aromaticity and donor–acceptor matching the protein pocket.
The predicted fields considerably overlap with those of unseen ligands bound to the target pocket. Maximization of the overlap between the predicted fields and a given ligand on the Astex diverse set recovers the original ligand crystal poses in 70 out of 85 cases within a threshold of 2 Å RMSD. We expect that these models can be used for guiding structure-based drug discovery approaches.
LigVoxel is available as part of the PlayMolecule.org molecular web application suite.
Supplementary data are available at Bioinformatics online.
1 Introduction
Many methods in structure-based drug discovery focus on exploiting structural information of protein pockets in order to select and optimize possible ligand binders. A three dimensional spatial featurization of proteins and small molecules and a metric on such space is usually utilized to characterize the pocket or ligand. Shape similarity, for instance, is commonly used in several virtual screening pipelines as is the case of the volumetric aligned molecular shapes (VAMS) algorithm (Koes and Camacho, 2014) and the ROCS software (Hawkins et al., 2007). Docking procedures such as PharmDock (Hu and Lill, 2014), use this type of information to refine poses. Another common application where three dimensional features are used is de novo drug-design, in order to build a binding site model with the intent of designing a novel ligand taking into account some constraints (Yuan et al., 2011). Grid generation is also central in the elucidation of pharmacophores, such is the case of FLAPpharm (Cross et al., 2012) and VolSite (Desaphy et al., 2012), and their search (Ebalunode et al., 2008).
Whether optimal features can be automatically learned from the data rather than imposed remains an open question. In the last few years, deep learning based approaches (Goodfellow et al., 2016), have demonstrated state of the art performance on tasks such as 2-D image classification (Krizhevsky et al., 2012), natural language processing (Goldberg, 2015), speech recognition (Graves et al., 2013) and more. The promise of deep learning in computational biology, however, is still yet to be fully developed (Angermueller et al., 2016), but is increasingly common. Proteins, for instance, can be seen from a computer vision point of view, as if they were three-dimensional images after featurization. Variants of this approach have already been applied in protein binding site prediction (Jiménez et al., 2017), protein–ligand binding affinity prediction (Jiménez et al., 2018), large-scale active-inactive classification of drug-like compounds (Ragoza et al., 2017) and quality assessment of protein folds (Derevyanko et al., 2018).
This work shows how end-to-end deep convolutional neural network (DCNNs) models (LeCun et al., 1998) can generate ligand fields given the structure of a protein binding site. Unlike previous knowledge-based and hand-crafted rules approaches, the proposed model is purely data-driven with the intention to mimic a chemist’s intuition at manually placing atoms when designing a new compound. Furthermore, one can adjust the properties of these generated fields by tuning several input parameters. We treat proteins and ligand structures from a computer vision perspective, by voxelizing them according to different physio-chemical properties, and show that the learned and predicted representation can be useful in several related applications by evaluating it against unseen co-crystal structures.
The proposed methods can be considered complementary to molecular interaction fields (MIF) methods such as GRID (Carosati et al., 2004), where an interaction energy is calculated between a target molecule and a probe, which may be an atom or a group. However, there are key differences: MIFs create volumetric maps by iteratively placing the probe at each point and calculating the interaction energy with the macromolecule. In contrast, the method proposed here takes as input the whole area at once and produces all outputs simultaneously. This is done in an end-to-end fashion. Values representing protein shape are multiplied by model’s weights generating latent feature maps. The operation is repeated sequentially, until an output feature map is generated. While MIFs predict interaction energies, the method described here gives probabilities of seeing atoms with a property at each point.
2 Approach
Property . | Rule . |
---|---|
Hydrophobic | Aliphatic or aromatic C |
Aromatic | Aromatic C |
Hydrogen bond acceptor | Acceptor 1 H-bond or S Spherical N Acceptor 2 H-bonds or S Spherical O Acceptor 2 H-bonds S |
Hydrogen bond donor | Donor 1 H-bond or Donor S Spherical H with either O or N partner |
Positive ionizable | Gasteiger positive charge |
Negative ionizable | Gasteiger negative charge |
Metallic | Mg, Zn, Mn, Ca or Fe |
Occupancy (Excluded volume) | All heavy atoms |
Property . | Rule . |
---|---|
Hydrophobic | Aliphatic or aromatic C |
Aromatic | Aromatic C |
Hydrogen bond acceptor | Acceptor 1 H-bond or S Spherical N Acceptor 2 H-bonds or S Spherical O Acceptor 2 H-bonds S |
Hydrogen bond donor | Donor 1 H-bond or Donor S Spherical H with either O or N partner |
Positive ionizable | Gasteiger positive charge |
Negative ionizable | Gasteiger negative charge |
Metallic | Mg, Zn, Mn, Ca or Fe |
Occupancy (Excluded volume) | All heavy atoms |
Property . | Rule . |
---|---|
Hydrophobic | Aliphatic or aromatic C |
Aromatic | Aromatic C |
Hydrogen bond acceptor | Acceptor 1 H-bond or S Spherical N Acceptor 2 H-bonds or S Spherical O Acceptor 2 H-bonds S |
Hydrogen bond donor | Donor 1 H-bond or Donor S Spherical H with either O or N partner |
Positive ionizable | Gasteiger positive charge |
Negative ionizable | Gasteiger negative charge |
Metallic | Mg, Zn, Mn, Ca or Fe |
Occupancy (Excluded volume) | All heavy atoms |
Property . | Rule . |
---|---|
Hydrophobic | Aliphatic or aromatic C |
Aromatic | Aromatic C |
Hydrogen bond acceptor | Acceptor 1 H-bond or S Spherical N Acceptor 2 H-bonds or S Spherical O Acceptor 2 H-bonds S |
Hydrogen bond donor | Donor 1 H-bond or Donor S Spherical H with either O or N partner |
Positive ionizable | Gasteiger positive charge |
Negative ionizable | Gasteiger negative charge |
Metallic | Mg, Zn, Mn, Ca or Fe |
Occupancy (Excluded volume) | All heavy atoms |
3 Materials, methods and evaluation procedure
3.1 Datasets
Our model was both trained and evaluated on the 2013 release of the sc-PDB database (Desaphy et al., 2015), which provides 9190 high-quality protein–ligand co-crystals extracted from the PDB (Berman et al., 2000). For further evaluation, we also use the Astex diverse set, which includes 85 protein–ligand complexes considered to be of pharmaceutical interest (Hartshorn et al., 2007) and has been previously used in several studies for docking software benchmarking (Davis and Baker, 2009; Ravindranath et al., 2015).
In order to provide a fair benchmarking of the method, a cross-validation scheme based on protein sequence similarity was used. The goal was to test how well the proposed method generalizes to proteins that differ from the ones used in training. We extracted the latest available sequence similarity clusters from the PDB (Berman et al., 2000) website using a 90% similarity cutoff and assigned each of the proteins available in the scPDB database to a particular cluster only if all of its chains belong to the same one, while proteins with chains belonging to different clusters are discarded. This procedure yielded a total of 8808 protein–ligand complexes belonging to 3305 different sequence similarity clusters that were then randomly grouped together into k = 10 splits with sizes ranging between 525 and 1230 complexes that we evaluate using k-fold cross-validation. In practice this procedure implies that k models are trained using all splits minus a single one, used for evaluation. All PDB IDs used in each split are available as Supplementary Material. In the case of the Astex diverse set, we make the sequence cluster difference between proteins in the scPDB database and the former for training, yielding a training set of size 7147.
3.2 Neural network architecture and learning
Two different types of model were trained in this work, named conditional and unconditional. The rationale behind the distinction is that in conditional models an additional four atom counts input was used (hydrogen donor and acceptor, aromatic and total number of atoms), while in the unconditional one we choose not to provide this information. The former uses both terms of the loss, with the second (corresponding to the KL-divergence) forcing the model and its additional input to place a fixed number of ligand fields in its predicted output. By adjusting or providing different additional channels to the same pocket channel inputs the conditional model can therefore be fine-tuned. In contrast, the unconditional model only uses the first part of the loss (corresponding to binary cross-entropy). Our proposed model architecture is presented in Figure 2, and code snippet is available in Supplementary Material. Model consists of seven sequential and independent layers of convolution, activated by a rectified linear unit (ReLU) non-linearity except the last one, which instead uses a sigmoid function, outputting values ranging between 0 and 1. The output of this last layer represents the predicted representation. All convolutions in the model use a filter size of 3 units, as commonly used in image processing tasks, except for the one in the fifth layer with uses an extensive 16 units one, so as to capture long-distance relationships in the input. Zero-padding is used in all feature maps and the number of latent feature maps in each layer has been chosen so that the model could be trained on a single GPU with 8GB of VRAM. It is common in the literature to use the back-propagation algorithm and some form of stochastic gradient descent to train deep neural networks. For this study, we chose Adam (Kingma and Ba, 2014) with default parameters for momentum scheduling ( as model optimizer. The neural network was then trained during 300 epochs using randomized batch size of 128 samples, introducing molecule rotations and a shifting of 0.5 Å from the ligand center during training so as to prevent overfitting and set λ = 10 after visual inspection of the trained models. Model learning and predictions were done in the Keras (Chollet, 2015) deep learning library (keras.io) and the Theano (Theano Development Team, 2016) tensor backend.
4 Results
4.1 Ligand design
The predictive model is capable of generating useful property fields that reasonably fill pocket space, while avoiding protein clashes. One can further control the proposed conditional model by providing different input parameters, in practice affecting the properties of the generated fields. Figure 3 shows how these changes modify the behavior of the algorithm by generating bigger grids as input values increase. Additional examples of varying property counts are depicted in Supplementary Figure S3. Although the model itself learns how to place ligand hydrogen-bond acceptors next to protein hydrogen-bond donors and viceversa, one common shortcoming we observed is that for properties where variability can be high, the model is not efficient at generating distinguishable clouds, typically predicting grids around the hydrogen-bond partner. This phenomenon can be partially explained by the fact that donors and acceptors are located close in space, but can be positioned in a wide variety of angles.
4.2 Evaluation procedure
To evaluate the quality of the predicted property fields, we analyzed the volumetric overlaps between crystal ligand features and predicted features. To that end, three different and complimentary tests were carried out, where we either measure or maximize the overlap between the predicted fields and the actual ligand representations. The overlap scores were used to evaluate: (i) sensibility of predictions, comparing it to random ones and values generated by MIFs, (ii) ability to discriminate between random and crystal poses, taking into the account rotation and translation and (iii), ability to discriminate crystal ligand conformers from generated ones. The goal of the proposed tests is not to prove the method’s superiority compared to existing approaches, but rather to show and motivate that the architecture can flexibly be used alongside similar representations in the same or related tasks.
In our second test, we measure the model’s ability to recover the crystal pose by rotating the crystallized ligand in the pocket around its own geometrical center with a Δ and measure the overlap between each rotated state and the predicted fields. All rotated states were generated starting from a random orientation of the ligand. Ideally, the intention is to see that rotated states close to the crystal pose are ranked significantly higher than those far apart when considering volumetric overlaps. Analysis were further expanded by adding translations. In this last experiment the featurization window was slided ±3 Å in all three dimensions with a stride of 1 Å around the binding site.
We pick the best poses according to the detailed metric, taking into account possible ligand redundancies by discarding those poses with an RMSD below 2 Å with respect to any better scored pose. A pose was considered correct if the RMSD from its corresponding crystal was less than 2 Å. Results for test (ii) are presented in Table 2. After evaluation, an enrichment of scored poses below RMSD of 2 Å with respect to the crystallized ligand is observed: for the 85 protein–ligand pairs, and the 1728 total generated poses evaluated for each considering only rotations, those ranked first by the algorithm were under the 2 Å threshold in 70 pairs, while if we consider the 5 top ranked poses, the number of correct poses increases to 75. If translations are included, 64 and 81 are under the defined threshold for top 1 and top 5, respectively. This enrichment suggests that the proposed model and scoring function can effectively discriminate crystal poses. However, some ligand topologies represent a harder challenge; Figure 5 shows one case (PDB Id. 1S3V, ligand TQD) of unsuccessful pose recovery. In this example, the shape of the molecule and its almost symmetrical distribution of aromatic and h-bond groups along the ligand allows it to accommodate inside the predicted excluded volume with the two orientations shown, while fulfilling the aromatic and h-bond prediction restraints. The placement of two consecutive aromatic rings near the bigger aromatic patch prediction suggests why this pose was scored higher than those closer to the crystal. Similar scenarios were found during our testing, suggesting that molecules displaying functional groups symmetry are harder to recover - but it remains unclear if these alternative poses are simply less likely, but still viable.
. | Top 1 . | Top 5 . |
---|---|---|
Rotations | 70 (82%) | 75 (88%) |
Rotations and translations | 64 (75%) | 80 (94%) |
Conformer enrichment | 74 (93%) | 76 (95%) |
. | Top 1 . | Top 5 . |
---|---|---|
Rotations | 70 (82%) | 75 (88%) |
Rotations and translations | 64 (75%) | 80 (94%) |
Conformer enrichment | 74 (93%) | 76 (95%) |
Note: For test ii we first only took rotations into account and then translations.
. | Top 1 . | Top 5 . |
---|---|---|
Rotations | 70 (82%) | 75 (88%) |
Rotations and translations | 64 (75%) | 80 (94%) |
Conformer enrichment | 74 (93%) | 76 (95%) |
. | Top 1 . | Top 5 . |
---|---|---|
Rotations | 70 (82%) | 75 (88%) |
Rotations and translations | 64 (75%) | 80 (94%) |
Conformer enrichment | 74 (93%) | 76 (95%) |
Note: For test ii we first only took rotations into account and then translations.
For our third test, we design a conformer enrichment procedure. To that end, 10 ligand conformers were generated using RDKit (Landrum, 2012). Only poses with at least 2 Å difference between them were kept. For each generated conformer including the crystal conformer the best scoring Δ pose was selected. Ranks of these poses were compared. Results for tests (iii) can also be found in Table 2, where 5 ligands had to be discarded from the analyses as RDKit failed to successfully provide 10 conformers. The proposed model is able to recover the crystal ligand out of the pool of generated conformers in 74 and 76 out of 80 cases for top 1 and 5 ranks, respectively. When considering only 10 generated poses, however, we did not observe successful pose prediction. Only in two cases the method was able to generate poses with an RMSD <2. This low performance could be attributed to non-exhaustive conformer generation, large Δ angle or translation.
5 Implementation and availability
An implementation of LigVoxel can be further explored by the scientific community through a web-application, part of the playmolecule.org suite. An example sesssion is shown in Figure 6. Users can submit their own protein binding site in.pdb format, choose either the conditional or unconditional model for field predictions and retrieve results, that can either be visualized using NGL (Rose and Hildebrand, 2015) or downloaded locally for further analysis.
6 Discussion
In this work, we have presented a novel approach for the generation of ligand images filling protein pockets based on deep neural networks. Results shown here suggest that these predictions are responsive to the number of atoms selected as input, they significantly overlap with ligand features of previously unseen ligands, and they can be used to select poses and conformers close to the native ligand orientation and geometry. We believe that approaches similar to the one proposed here will be important for structure based design of protein binders. For instance, in de novo drug-design, these generated fields could be used as reference for incremental atom placement. Furthermore, in computational docking, the proposed fields can assist classical scoring functions by overlap or proximity optimization. The extent to which these models can be applied in drug discovery and design pipelines has yet to be demonstrated fully, but we believe that the initial validation performed in this work is promising for future applications.
Funding
The authors thank Acellera Ltd. for funding. G.D.F. acknowledges support from MINECO (BIO2017-82628-P) and FEDER. This project has received funding from the European Union’s Horizon 2020 Research and innovation programme under Grant Agreement No. 675451 (CompBioMed project).
Conflict of Interest: none declared.
References