Abstract

Motivation

The large-scale kinome-wide virtual profiling for small molecules is a daunting task by experimental and traditional in silico drug design approaches. Recent advances in deep learning algorithms have brought about new opportunities in promoting this process.

Results

KinomeX is an online platform to predict kinome-wide polypharmacology effect of small molecules based solely on their chemical structures. The prediction is made by a multi-task deep neural network model trained with over 140 000 bioactivity data points for 391 kinases. Extensive computational and experimental validations have been performed. Overall, KinomeX enables users to create a comprehensive kinome interaction network for designing novel chemical modulators, and is of practical value on exploring the previously less studied or untargeted kinases.

Availability and implementation

KinomeX is available at: https://kinome.dddc.ac.cn.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

By December 2018, 48 small kinase inhibitors have been approved by US FDA, about half of which were approved over the last 5 years. However, the target space or bioactivities of the approved and clinical inhibitors are still surprisingly poorly characterized (Klaeger et al., 2017). Since large-scale biochemical assay to determine kinase targets of compounds are usually costly and time-intensive, the discovery of potent inhibitors with designated selectivity and polypharmacological profile is still a challenging direction in novel drug development process.

Many in silico modelling approaches have been developed to predict kinase inhibitory activity for large-scale compound libraries (Zhong et al., 2018). Compared to traditional drug design methods (e.g. docking and standard QSAR), machine learning-based models such as naïve Bayesian (NB) (Niijima et al., 2012), k-nearest neighbours (KNN) (Stephan and Steven, 2013), random forest (RF) (Bora et al., 2016; Merget et al., 2017), support vector machine (SVM) (Cheng et al., 2012; Niijima et al., 2012) and deep neural network (DNN) (Manallack et al., 2002) have been established to predict a wider range of biological activities for a compound by employing high-dimensional datasets. The separate training process with individual datasets for different tasks of these methods is bound to affect their capability to predict kinome-wide selectivity and polypharmacological profiles. On the other hand, network-aided approaches exploit topological information of drug-target interaction network (Cheng, 2019; Cheng et al., 2019). Assembling targets to be predicted into a network architecture reveals notable advantages of prediction of polypharmacological profiles and drug repositioning, but extending the range of predictive tasks to include tasks with insufficient data could be intractable for network-aided approaches.

Here we present KinomeX, an online platform to predict kinome-wide polypharmacology effect of small molecules based on an established multi-task deep neural network (MTDNN) model. KinomeX implements comprehensive profiling on kinome-wide activity for small molecule, and it can depict the overall selectivity and the selectivity towards a sub-family of kinases based on the predicted kinase profile. We envisage that KinomeX could enable the prediction of kinome-wide polypharmacology effect of small molecules for designing novel chemical modulators or drug repositioning.

2 Main features

2.1 Kinome-wide activity prediction for small molecules based on MTDNN model

The main function of KinomeX is to predict the kinome-wide activity for the small molecules user submitted based on a pre-trained MTDNN classification model. MTDNNs are trained to learn a joint representation of input data points shared for all learning tasks. For the classification model, the representation is fed into N softmax classifiers, one for each task. Previous studies of MTDNN applied to drug discovery suggested that MTDNN can obtain predictive accuracies significantly better than single-task methods for the problems with multiple and related tasks (Ramsundar et al., 2017). Here, over 170 000 bioactivity data points composed of 391 kinases and ∼32 000 compounds were collected from Kinase SARfari database version 6.00 and the Metz dataset (Metz et al., 2011). Eighty percentage of the dataset (∼140 000 data points) were used for training the MTDNN model where one task corresponds to the bioactivity prediction against a specific type of kinase, and rest 20% used for testing (see details in Supplementary Material). ECPF4 was adopted to featurize each molecule and fed into input layer, meaning that only 2D chemical structural information is needed for making prediction based on the resulted model. The outputs of the model are the probabilities whether a small molecule is active against a panel of kinases. A higher probability corresponds to a higher confidence for the predicted interaction. Generally, a predicted interaction with a probability value higher than the cut-off of 0.5 could be considered active. The results can be further narrowed down by setting higher probability cut-offs (0.7 and 0.9) in KinomeX. Extensive computational and experimental validations have been performed. The model shows excellent prediction capability with an auROC of 0.90 on the internal test dataset and consistently outperforms conventional single-task models (Merget et al., 2017) on multiple independent external datasets, especially for the kinases with insufficient activity data (Supplementary Fig. S2 and Table S2).

2.2 Selectivity depiction

Based on the predicted kinome-wide activity, KinomeX also provides a depiction of the overall selectivity and the selectivity towards a sub-family of kinases for the submitted small molecule. Two accepted quantitative evaluation methods, standard score (Karaman et al., 2008) and Gini coefficient (Graczyk, 2007), were adopted to calculate the overall selectivity. To measure of sub-family selectivity, odds ratio (OR) was adopted (Bland and Altman, 2000), which can be considered as the strength of the association between an inhibitor and a sub-family. If the OR is significantly >1.0, it indicates that the inhibitor is selective to the sub-family.

3 Case study

A rigorous experimental validation was performed to investigate the practical benefits of the model for drug discovery or repurposing applications. NVP-BHG712 has been reported as an EphB4 receptor inhibitor showing activity against c-Raf, c-Src and c-Abl (Martiny-Baron et al., 2010). NPV-BHG712 is not included in our modelling dataset, and complete kinase spectrum of it is unavailable. The prediction for activity and selectivity of NVP-BHG712 was performed by KinomeX, and it shows that its reported primary targets can be accurately identified. Besides, NVP-BHG712 was predicted to inhibit other important therapeutic targets, including Src family kinases (Lyn, Hck, Lck and Yes), DDR1 and TrkC, which may implicate the anti-cancer potential of NVP-BHG712. The commercial KinaseProfiler Service (Eurofins Scientific, Inc.) was utilized to evaluate the experimental activity against a panel of 405 kinases. Overall, KinomeX produced high-quality predictions generally in agreement with experiment data with an auROC of 0.86, and the novel ‘off-target’ activities have been confirmed by subsequent experiments at lower compound concentrations (Supplementary Fig. S3 and Table S3).

4 Implementation

KinomeX is freely accessible for non-commercial users in a webserver (https://kinome.dddc.ac.cn). A typical kinome-wide virtual profiling workflow of KinomeX is as following (Fig. 1): (i) A molecule is submitted from user-side (browser) by drawing online (Bienfait and Ertl, 2013) or by uploading (Molecule Submission); (ii) The kinome-wide activities are computed by server-side based on a pre-trained MTDNN model (Activity Computation); (iii) Figures are generated based on the profiling, including a chemical depiction, a kinase phylogenetic tree (Chartier et al., 2013) and selectivity bar charts (Figure Generation); (iv) esults download and/or display online (Display & Download).

Fig. 1.

The workflow of KinomeX

MTDNN model was developed with Tensorflow (Version 1.6.0) and DeepChem (Version 2.1.0). For the backend of KinomeX, it uses a MySQL database for data storage and an nginx software as the server engine. For the frontend, it deploys several JS/React modules which support the functionalities such as molecule submission, online display, etc. The main frame of KinomeX is developed within a Django framework, with some backend services written in Python scripts for tasks such as job management, log management, etc. The key features of KinomeX have been fully modularized for better usage and construction. More secure connections to KinomeX have been provided through HTTPS protocols for many popular web browsers. In addition to these, KinomeX is deployed on an AliYun cloud server, which enables better stability and scalability for managing computing resources.

Funding

We gratefully acknowledge financial support from the National Natural Science Foundation of China [81773634 to M.Z.], National Science & Technology Major Project ‘Key New Drug Creation and Manufacturing Programme’, China [2018ZX09711002 to H.J.] and the ‘Personalized Medicines—Molecular Signature-based Drug Discovery and Development’, Strategic Priority Research Programme of the Chinese Academy of Sciences [XDA12050201 to M.Z.].

Conflict of Interest: none declared.

References

Bienfait
 
B.
,
Ertl
P.
(
2013
)
JSME: a free molecule editor in JavaScript
.
J. Cheminform
.,
5
,
24.

Bland
 
J.M.
,
Altman
D.G.
(
2000
)
Statistics notes. The odds ratio
.
BMJ
,
320
,
1468.

Bora
 
A.
 et al.  (
2016
)
Predictive models for fast and effective profiling of kinase inhibitors
.
J. Chem. Inf. Model
.,
56
,
895
905
.

Chartier
 
M.
 et al.  (
2013
)
Kinome Render: a stand-alone and web-accessible tool to annotate the human protein kinome tree
.
PeerJ
,
1
,
e126.

Cheng
 
F.
(
2019
) In silico oncology drug repositioning and polypharmacology. In: Krasnitz,A. (ed.)
Cancer Bioinformatics Methods in Molecular Biology
, Vol. 1878.
Humana Press, New York, NY
, pp.
243
261
.

Cheng
 
F.
 et al.  (
2012
)
Prediction of chemical–protein interactions: multitarget-QSAR versus computational chemogenomic methods
.
Mol. BioSyst
.,
8
,
2373
2384
.

Cheng
 
F.
 et al.  (
2019
)
Network-based prediction of drug combinations
.
Nat. Commun
.,
10
,
1197.

Graczyk
 
P.P.
(
2007
)
Gini coefficient: a new way to express selectivity of kinase inhibitors against a family of kinases
.
J. Med. Chem
.,
50
,
5773
5779
.

Karaman
 
M.W.
 et al.  (
2008
)
A quantitative analysis of kinase inhibitor selectivity
.
Nat. Biotechnol
.,
26
,
127
132
.

Klaeger
 
S.
 et al.  (
2017
)
The target landscape of clinical kinase drugs
.
Science
,
358
, eaan4368.

Manallack
 
D.T.
 et al.  (
2002
)
Selecting screening candidates for kinase and G protein-coupled receptor targets using neural networks
.
J. Chem. Inf. Comput. Sci
.,
42
,
1256
1262
.

Martiny-Baron
 
G.
 et al.  (
2010
)
The small molecule specific EphB4 kinase inhibitor NVP-BHG712 inhibits VEGF driven angiogenesis
.
Angiogenesis
,
13
,
259
267
.

Merget
 
B.
 et al.  (
2017
)
Profiling prediction of kinase inhibitors: toward the virtual assay
.
J. Med. Chem
.,
60
,
474
485
.

Metz
 
J.T.
 et al.  (
2011
)
Navigating the kinome
.
Nat. Chem. Biol
.,
7
,
200
202
.

Niijima
 
S.
 et al.  (
2012
)
Dissecting kinase profiling data to predict activity and understand cross-reactivity of kinase inhibitors
.
J. Chem. Inf. Model
.,
52
,
901
912
.

Ramsundar
 
B.
 et al.  (
2017
)
Is multitask deep learning practical for pharma?
J. Chem. Inf. Model
.,
57
,
2068
2076
.

Stephan
 
C.S.
,
Steven
M.M.
(
2013
)
Kinome-wide activity modeling from diverse public high-quality data sets
.
J. Chem. Inf. Model
.,
53
,
27
38
.

Zhong
 
F.
 et al.  (
2018
)
Artificial intelligence in drug design
.
Sci. China Life Sci
.,
61
,
1191
1204
.

Author notes

The authors wish it to be known that, in their opinion, Zhaojun Li and Xutong Li authors should be regarded as Joint First Authors.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Associate Editor: Jonathan Wren
Jonathan Wren
Associate Editor
Search for other works by this author on:

Supplementary data