Abstract

Motivation

Given an unknown compound, is it possible to predict its Anatomical Therapeutic Chemical class/classes? This is a challenging yet important problem since such a prediction could be used to deduce not only a compound’s possible active ingredients but also its therapeutic, pharmacological and chemical properties, thereby substantially expediting the pace of drug development. The problem is challenging because some drugs and compounds belong to two or more ATC classes, making machine learning extremely difficult.

Results

In this article a multi-label classifier system is proposed that incorporates information about a compound’s chemical–chemical interaction and its structural and fingerprint similarities to other compounds belonging to the different ATC classes. The proposed system reshapes a 1D feature vector to obtain a 2D matrix representation of the compound. This matrix is then described by a histogram of gradients that is fed into a Multi-Label Learning with Label-Specific Features classifier. Rigorous cross-validations demonstrate the superior prediction quality of this method compared with other state-of-the-art approaches developed for this problem, a superiority that is reflected particularly in the absolute true rate, the most important and harshest metric for assessing multi-label systems.

Availability and implementation

The MATLAB code for replicating the experiments presented in this article is available at https://www.dropbox.com/s/7v1mey48tl9bfgz/ToolPaperATC.rar?dl=0.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Being able to classify an unknown compound into its ATC (Anatomical Therapeutic Chemical) system is a problem that has significance for both drug development and basic research. Developed by the World Health Organization (WHO), the ATC system is a hierarchical classification system that categorizes compounds into 14 main groups: (i) alimentary tract and metabolism; (ii) blood and blood forming organs; (iii) cardiovascular system; (iv) dermatologicals; (v) genitourinary system and sex hormones; (vi) systemic hormonal preparations, excluding sex hormones and insulins; (vii) anti-infectives for systemic use; (viii) antineoplastic and immunomodulating agents; (ix) musculoskeletal system; (x) nervous system; (xi) antiparasitic products, insecticides and repellents; (xii) respiratory system; (xiii) sensory organs; and (xiv) various. In the last decade, many systems and webservers for predicting a compound’s ATC classification have been developed (Chen, 2012; Cheng et al., 2016, 2017; Dunkel et al., 2008). Some of these systems, such as those of Dunkel et al. (2008) and Wu et al. (2013), can only map a compound to one ATC label, even though there is mounting evidence indicating that many compounds in systems biology and medicine can belong to more than one category. Therefore, mapping a compound to its ATC system is a multi-label problem, and predictive systems, as pointed out by Chen (2012), should address it as such.

In this work, we propose an ensemble composed of Multi-Label Learning with Label-Specific Features (LIFT) (Zhang and Wu, 2015) classifiers that are fed with Histograms of Gradients (HoG) descriptors extracted from 2D representations of a compound. The original 1D feature vector used to describe the compound is randomly sorted 50 times, generating 50 different 2D representations that are then used to train 50 LIFTs that are combined by sum rule. Finally, this ensemble is combined with LIFT trained on the original 1D feature vectors. Rigorous comparisons with other state-of-the-art approaches demonstrate the superiority of the proposed system.

The main strength of this approach lies in the adoption of 2D representations of patterns. Direct manipulation of matrices offers many advantages, with perhaps the most important being the possibility of extracting powerful state-of-the-art texture descriptors (Nanni et al., 2012). In this work, HoG was chosen as the descriptor so that the correlation among sets of features within a given neighborhood of a 2D representation could be investigated; this is different from simply coupling feature selection and classification.

As demonstrated by a series of recent publications (see, e.g. Chen et al., 2016a,b, 2017; Cheng et al., 2017; Jia et al., 2016a,b; Liu et al., 2017; Qiu et al., 2016) and in compliance with Chou’s five-step rule (Chou, 2011), establishing a truly useful statistical predictor for a biological or biomedical system requires following five guidelines: (i) construct or select a valid benchmark dataset to train and test the predictor; (ii) formulate the statistical samples with an effective mathematical expression that can truly reflect their intrinsic correlation with the target to be predicted; (iii) introduce or develop a powerful algorithm (or engine) to operate the prediction; (iv) properly perform cross-validation tests to objectively evaluate the anticipated accuracy of the predictor; and (v) establish a user-friendly web-server for the predictor that is accessible to the public. Below, we describe how we deal with these steps one by one.

2 Materials and methods

2.1 Benchmark dataset and original representation

To facilitate comparison, we use the benchmark dataset provided in (Chen, 2012), which contains a total of 3883 drugs that are divided nonexclusively into 14 ATC classes, with 3295 belonging to only one class, 370 belonging to two classes, 110 belonging to three classes, 37 belonging to four classes, 27 belonging to five classes and 44 belonging to six classes. None of the drugs belong to seven or more classes. Because the ATC classification problem is multi-label, each drug is described by a 14-dimensional binary label vector, where a 1 indicates that the drug belongs to the corresponding class and a 0 indicates that it does not.

The dataset can be formulated in set notation as the union of the 14 classes, S = S1 ∪ S2 ∪ … ∪ S14, and a sample D in S can be represented by three mathematical expressions reflecting its intrinsic correlation with the target to be predicted. First, via a 14D vector D_Int = [Φ1 Φ2 Φ3 … Φ14]^T that contains its maximum interaction score Φi with the drugs in each of the 14 subsets Si (for details, see Kanehisa et al., 2004; Kotera et al., 2012); these scores can be downloaded from Supplementary Material S4 of (Chen, 2012). Second, via a 14D vector D_StrSim = [Ψ1 Ψ2 Ψ3 … Ψ14]^T that contains the maximum structural similarity score Ψi in the 14 subsets (for details, see Kotera et al., 2012); these scores can be downloaded from Supplementary Material S5 of (Chen, 2012). Third, via a 14D vector D_FigSim = [T1 T2 T3 … T14]^T that contains its molecular fingerprint similarity score Ti in the 14 subsets (for details, see Xiao et al., 2013); these scores can be downloaded from Supplementary Material S6 of (Chen, 2012).

A given drug sample is formulated by:
D = D_Int ⊕ D_StrSim ⊕ D_FigSim = [∂1 ∂2 … ∂42]^T
(1)
where ⊕ is the orthogonal sum (i.e. vector concatenation) and where
∂u = Φu for 1 ≤ u ≤ 14; ∂u = Ψ(u−14) for 15 ≤ u ≤ 28; ∂u = T(u−28) for 29 ≤ u ≤ 42
(2)

2.2 Vector-to-matrix operation

Reshaping the original vector D into a matrix, so that the correlation among sets of features in a given neighborhood can be investigated, is accomplished as follows: starting with the original input vector D and letting M ∈ R^(d1×d2) be the output matrix, where d1 = d2 = ⌈u^0.5⌉ (u = 42 here, see Section 2.1), M is a random rearrangement of the original vector into a square matrix; each entry of M is an element D(a_i), where a is a random permutation of [1…u].
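The vector-to-matrix step can be sketched as follows (a minimal Python illustration rather than the authors' MATLAB code; zero-padding the 49 − 42 = 7 unused entries of the 7 × 7 matrix is our assumption, since ⌈√42⌉ = 7):

```python
import numpy as np

def vector_to_matrix(d, rng):
    """Randomly permute a 1D feature vector and reshape it into a square
    matrix. Zero-padding of leftover cells is an assumption of this sketch."""
    u = d.size                       # u = 42 in the paper
    side = int(np.ceil(np.sqrt(u)))  # d1 = d2 = ceil(sqrt(42)) = 7
    a = rng.permutation(u)           # random rearrangement of [1..u]
    m = np.zeros(side * side)
    m[:u] = d[a]                     # place the permuted elements of D
    return m.reshape(side, side)

rng = np.random.default_rng(0)
d = np.arange(42, dtype=float)       # stand-in for the 42-D descriptor D
M = vector_to_matrix(d, rng)
print(M.shape)                       # (7, 7)
```

Each call with a fresh permutation yields a different 2D layout of the same 42 features, which is what makes the ensemble of Section 2.2 possible.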

A simple approach for improving performance is to perform the reshaping n times (n = 50 here) by randomly sorting D. For each extracted descriptor, a different LIFT (see Section 2.4) is trained, with the results combined by sum rule.
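The sum-rule combination of the n trained classifiers amounts to summing their real-valued label scores before thresholding; a toy Python illustration (the score values are made up, not taken from the paper):

```python
import numpy as np

# Each of the n classifiers returns a (samples x labels) score matrix.
n = 3  # 50 in the paper
scores = [np.array([[0.2, 0.9], [0.7, 0.1]]) for _ in range(n)]

# Sum rule: add the score matrices element-wise, then apply threshold theta.
fused = np.sum(scores, axis=0)
labels = fused > 1.5   # theta = 1.5 here, purely illustrative
print(fused)
print(labels)
```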

2.3 Histograms of gradients

The matrix M is described by HoG (Dalal and Triggs, 2005), which is implemented by dividing a matrix into small spatial regions, or cells (5 × 6 here), where each cell accumulates a local 1D histogram of gradient directions over the values contained in it. Simple 1D [−1, 0, 1] masks with a smoothing scale of σ = 0 are used in this step, and each matrix element casts a weighted vote for an edge orientation based on the orientation of the gradient element centered on it; votes are accumulated into orientation bins (nine here) evenly spaced over 0°–180°. These values are then normalized by accumulating a measure of local histogram energy over larger spatial regions, or blocks. This measure in turn is used to normalize all the cells within the block. The concatenated histogram entries of these normalized descriptor blocks are the descriptors used to represent M. In this work, normalization was performed on the whole descriptor using the L2 norm.
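A minimal, hand-rolled sketch of the gradient-histogram computation described above (Python; for brevity a single cell covering the whole matrix is used, so the cell/block layout of the full descriptor is omitted):

```python
import numpy as np

def hog_cells(m, n_bins=9):
    """Sketch of HoG over one cell: [-1, 0, 1] gradient masks, unsigned
    orientations binned over 0-180 degrees, votes weighted by gradient
    magnitude, and L2 normalization of the resulting histogram."""
    gx = np.zeros_like(m)
    gy = np.zeros_like(m)
    gx[:, 1:-1] = m[:, 2:] - m[:, :-2]   # horizontal [-1, 0, 1] mask
    gy[1:-1, :] = m[2:, :] - m[:-2, :]   # vertical [-1, 0, 1] mask
    mag = np.hypot(gx, gy)                        # gradient magnitude
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)                   # L2 norm, as in the paper
    return hist / norm if norm > 0 else hist

h = hog_cells(np.arange(49, dtype=float).reshape(7, 7))
print(h.shape)   # (9,)
```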

2.4 Multi-label learning with label-specific features

The HoG descriptors are fed into the LIFT (Zhang and Wu, 2015) classifier, a two-step multi-label learning method. In step one, LIFT constructs features specific to each label by performing clustering analysis on the positive and negative instances of that label and re-expressing the training and testing instances in terms of the clustering results. In step two, it induces a family of f classifiers, one per label, each trained on label-specific features that are different from the original ones.

Given a set of n multi-label training samples S = {(x_i, Y_i) | 1 ≤ i ≤ n}, where x_i ∈ X is a d-dimensional feature vector and Y_i ⊆ Y is the set of labels associated with x_i, step one constructs features specific to each label by investigating the underlying properties of the training instances with respect to each class label. For a class l_k ∈ Y, the set of positive training instances P_k and the set of negative training instances N_k correspond to:
P_k = {x_i | (x_i, Y_i) ∈ S, l_k ∈ Y_i}
(3)
N_k = {x_i | (x_i, Y_i) ∈ S, l_k ∉ Y_i}
(4)
A k-means clustering is performed on P_k and on N_k, each with n_k clusters defined as follows:
n_k = ⌈r · min(|P_k|, |N_k|)⌉
(5)
The operator |·| returns the cardinality of a set, and r ∈ [0, 1] is a ratio parameter controlling the number of clusters retained. A mapping φ_k from the original d-dimensional input space X to the 2n_k-dimensional label-specific space is then created.
In step two a family of f classifiers {g_1, g_2, …, g_f} is induced with the label-specific features. As the base learner we use the Support Vector Machine. For each class label l_k ∈ Y, a new binary training set B_k with n samples is created from the original multi-label set S, and the classifier g_k is induced by invoking a binary learner on B_k. Given an unseen sample s ∈ X, its associated label set is predicted as:
Y* = {l_k | g_k(φ_k(s)) > 0, 1 ≤ k ≤ f}
(6)
If the similarity of a given pattern to a given class is higher than a cutoff value θ, then the pattern is assigned to that class. For more details, see Zhang and Wu (2015).
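Step one of LIFT, the construction of label-specific features for one label, can be sketched as follows (a simplified Python illustration with a tiny hand-rolled k-means; the function names and toy data are ours, not from the paper):

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Tiny k-means sketch returning k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(x[:, None] - centers[None], axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):          # keep old center if cluster empty
                centers[j] = x[assign == j].mean(axis=0)
    return centers

def lift_features(x, pos_mask, r=0.1):
    """LIFT step one for one label: cluster positives and negatives
    separately (n_k = ceil(r * min(|P_k|, |N_k|)) clusters each), then map
    every sample to its distances from the 2*n_k centers."""
    pos, neg = x[pos_mask], x[~pos_mask]
    nk = max(1, int(np.ceil(r * min(len(pos), len(neg)))))
    centers = np.vstack([kmeans(pos, nk), kmeans(neg, nk, seed=1)])
    return np.linalg.norm(x[:, None] - centers[None], axis=2)  # (n, 2*nk)

rng = np.random.default_rng(42)
x = rng.normal(size=(30, 5))       # toy instances
y = rng.random(30) > 0.5           # toy membership in one label l_k
phi = lift_features(x, y)
print(phi.shape)
```

In the full method, one binary SVM per label is then trained on these distance features rather than on the original descriptor.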

3 Results and discussion

3.1 Set of five metrics for multi-label systems

Metrics for multi-label systems are defined according to five formulations established in (Chou, 2013) and used extensively in the literature:
Aiming = (1/N) Σ_{k=1}^{N} |L_k ∩ L_k*| / |L_k*|
(7)
Coverage = (1/N) Σ_{k=1}^{N} |L_k ∩ L_k*| / |L_k|
(8)
Accuracy = (1/N) Σ_{k=1}^{N} |L_k ∩ L_k*| / |L_k ∪ L_k*|
(9)
Absolute True = (1/N) Σ_{k=1}^{N} Δ(L_k, L_k*)
(10)
Absolute False = (1/N) Σ_{k=1}^{N} (|L_k ∪ L_k*| − |L_k ∩ L_k*|) / M
(11)
where N is the total number of samples under consideration, M is the total number of labels in the problem, |·| is the operator counting the number of elements in a set, ∪ is the set union operator, ∩ is the set intersection operator, L_k is the subset containing all the labels observed by experiments for the kth sample, L_k* is the subset containing all the labels predicted for the kth sample, and
Δ(L_k, L_k*) = 1 if all labels in L_k* are identical to those in L_k; 0 otherwise
(12)
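The five metrics above translate directly into code; a sketch over Python label sets (toy data, with empty-set edge cases handled conservatively):

```python
def multilabel_metrics(true_sets, pred_sets, m_labels):
    """Aiming, Coverage, Accuracy, Absolute True and Absolute False
    averaged over N samples, following Chou (2013)."""
    n = len(true_sets)
    aim = cov = acc = atrue = afalse = 0.0
    for L, Lp in zip(true_sets, pred_sets):
        inter, union = len(L & Lp), len(L | Lp)
        aim += inter / len(Lp) if Lp else 0.0     # correct / predicted
        cov += inter / len(L) if L else 0.0       # correct / observed
        acc += inter / union if union else 1.0    # Jaccard per sample
        atrue += 1.0 if L == Lp else 0.0          # exact-match indicator
        afalse += (union - inter) / m_labels      # wrongly set labels / M
    return tuple(v / n for v in (aim, cov, acc, atrue, afalse))

true_sets = [{1, 2}, {3}]   # observed labels for two samples
pred_sets = [{1}, {3}]      # predicted labels
print(multilabel_metrics(true_sets, pred_sets, 14))
```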

3.2 Cross-validation

Many cross-validation methods are used for statistical prediction, the most common being (i) independent test set, (ii) subsampling (K-fold cross-validation) and (iii) the jackknife test, which is considered the least arbitrary method and one that yields a unique outcome for a benchmark dataset (Chou, 2011). Accordingly, the jackknife test is used in this study.
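The jackknife test amounts to a leave-one-out loop: each sample is held out in turn, the predictor is built on the remaining n − 1 samples, and the held-out sample is predicted. A sketch with a trivial 1-NN predictor (illustrative only, not the pipeline of this paper):

```python
def jackknife(xs, ys, predict):
    """Leave-one-out accuracy of predict(train_pairs, x)."""
    correct = 0
    for i in range(len(xs)):
        train = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j != i]
        correct += predict(train, xs[i]) == ys[i]
    return correct / len(xs)

def nearest_neighbour(train, x):
    # Toy predictor: label of the closest 1D training point.
    return min(train, key=lambda t: abs(t[0] - x))[1]

xs = [0.1, 0.2, 0.9, 1.0]
ys = ['a', 'a', 'b', 'b']
print(jackknife(xs, ys, nearest_neighbour))  # 1.0
```

Because every sample is tested exactly once against a fixed training set, the jackknife yields a unique outcome for a given benchmark dataset, unlike random subsampling.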

3.3 Performance of the proposed method and comparisons with the literature

Our ensemble and the LIFT classifier contain the parameter θ for assigning a pattern to a given class, and the predicted results depend on this parameter's value: if the similarity of a given pattern to a given class is higher than θ, then the pattern is assigned to that class. In Figure 1 we report the absolute true performance indicator obtained by varying θ for the following three methods:

Fig. 1.

Comparison among LIFT-based predictors

  1. LIFT trained using the original 1D feature vector (dashed line): θ ∈ {0.2, 0.35, 0.50, 0.65, 0.80, 0.95}.

  2. Ensemble of the 50 LIFTs (combined by sum rule) trained using HoG (dotted line): θ ∈ {10, 12, 14, 16, 18, 20}.

  3. Fusion by sum rule of 1 above, weighted by 50, and 2 above (solid line): θ ∈ {20, 24, 28, 32, 36, 40}. We call this approach EnsLIFT.

Clearly, EnsLIFT is the most stable and best-performing method, a superiority confirmed by the other performance indicators (see Table 1).

Table 1.

Comparison with other state-of-the-art multi-label predictors

Method             Aiming   Coverage  Accuracy  Absolute true  Absolute false
EnsLIFT (θ = 30)   78.18%   75.77%    71.21%    63.30%         2.85%
LIFT               83.93%   68.18%    67.78%    61.11%         3.13%
iATC-mISF          67.83%   67.10%    66.41%    60.98%         5.85%
Chen (2012)        50.76%   75.79%    49.38%    13.83%         8.83%
ML-KNN             79.96%   56.70%    59.10%    54.16%         —
RankSVM            60.26%   52.89%    45.49%    35.77%         —

The final cross-validation results (using the jackknife test) are shown in Table 1.

To facilitate comparisons, the corresponding results obtained by iATC-mISF (Cheng et al., 2016) and by the prediction method in (Chen, 2012) are also provided. To demonstrate the power of the EnsLIFT predictor, we extend the comparison to cover two state-of-the-art predictors: Multi-label K-Nearest Neighbor (ML-KNN) (Li et al., 2012) and Support Vector Machine (RankSVM) (Lee and Lin, 2014). Although not developed specifically for the ATC system prediction problem, both these methods were designed to handle multi-label systems. The absolute-true and absolute-false metrics are the two most important and harshest metrics for multi-label systems. For the absolute-true metric, the higher the percentage the better the multi-label predictor's performance; for the absolute-false metric, the reverse is the case: the lower the percentage the better the performance. The results in Table 1 clearly demonstrate that the proposed predictor is a powerful method for identifying ATC classes for unknown compounds.

In Table 2 we report the performance obtained using 5-fold cross-validation. As expected, this results in a slight decrease in performance. Nonetheless, EnsLIFT still outperforms LIFT.

Table 2.

Performance obtained using 5-fold cross-validation

Method             Aiming   Coverage  Accuracy  Absolute true  Absolute false
EnsLIFT (θ = 30)   79.52%   73.22%    69.74%    62.46%         2.93%
LIFT               83.38%   66.51%    66.09%    59.46%         3.30%

In Figure 2 we report the absolute true performance of EnsLIFT obtained by varying the value of n from 10 to 75. Clearly, performance increases with higher values of n. For this reason, we compare in Table 4 below the performance obtained by a fusion of LIFT trained using the original 1D feature vectors and an ensemble of 75 LIFTs (combined by sum rule) trained using HoG. This new fusion is labeled EnsLIFTl.

Table 3.

Performance obtained varying the parameters of HoG

Number of cells   Number of bins   Absolute true
5 × 6                              61.73
5 × 6                              53.59
5 × 6             11               57.30
4 × 5                              61.50
6 × 7                              62.22
7 × 8                              62.30
8 × 9                              62.07
Table 4.

Performance obtained using n = 75 versus n = 50 and a fine tuning of the HoG parameters

Method                Aiming   Coverage  Accuracy  Absolute true  Absolute false
EnsLIFThog (θ = 45)   78.15%   75.81%    71.14%    63.25%         2.80%
EnsLIFTl (θ = 45)     78.19%   75.98%    71.31%    63.40%         2.80%
EnsLIFT (θ = 30)      78.18%   75.77%    71.21%    63.30%         2.85%

Fig. 2.

Absolute true performance obtained varying the value of n

In Table 3, we report the performance obtained by varying the parameters of HoG (with n = 50) and note that higher performance is obtained by increasing the number of cells.

The results in Table 3 motivated us to run a test using n = 75 coupled with the best parameter settings of HoG, which we label EnsLIFThog. As reported in Table 4, EnsLIFThog obtains an absolute true performance of 63.25%, similar to that of EnsLIFTl. The improvement of EnsLIFTl and EnsLIFThog with respect to EnsLIFT is negligible.

We validate the results by checking error independence using the well-known Yule’s Q-statistic (Kuncheva and Whitaker, 2003), whose values are bounded by [−1, 1]. Classifiers that tend to recognize the same patterns correctly have a value of Q greater than zero, whereas classifiers that commit errors on different patterns have a Q value less than zero. The Q-statistic between LIFT trained using the original 1D feature vectors and the ensemble of 50 LIFTs (combined by sum rule) trained using HoG is 0.9374, showing that the different descriptors train a set of partially uncorrelated classifiers. This partial decorrelation explains why the fusion EnsLIFT outperforms LIFT trained using the original 1D feature vectors alone.
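Yule’s Q for a pair of classifiers can be computed from their per-sample correctness as follows (a small illustration with made-up correctness vectors):

```python
def yule_q(correct1, correct2):
    """Yule's Q-statistic: Q = (ad - bc) / (ad + bc), where a = both
    correct, d = both wrong, and b, c = exactly one of the two correct."""
    a = sum(c1 and c2 for c1, c2 in zip(correct1, correct2))
    d = sum(not c1 and not c2 for c1, c2 in zip(correct1, correct2))
    b = sum(c1 and not c2 for c1, c2 in zip(correct1, correct2))
    c = sum(not c1 and c2 for c1, c2 in zip(correct1, correct2))
    return (a * d - b * c) / (a * d + b * c)

c1 = [True, True, True, False, True, False]   # classifier 1 correct per sample
c2 = [True, True, False, False, True, True]   # classifier 2 correct per sample
print(yule_q(c1, c2))  # a=3, d=1, b=1, c=1 -> (3-1)/(3+1) = 0.5
```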

4 Conclusion

The focus of this paper was to develop a good method for predicting a compound’s ATC class/classes. In the future we plan to test this method on multiple benchmark datasets. Moreover, as pointed out in (Chou and Shen, 2009) and emphasized and demonstrated in a series of recent publications (see, e.g. Chen et al., 2016a,b), user-friendly and publicly accessible web-servers represent the future direction for practically developing a more useful predictor. As such, we shall make efforts in our future work to provide a web-server for the prediction method presented in this article.

Conflict of Interest: none declared.

References

Chen, J. et al. (2016a) DRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation. Sci. Rep., 6, 32333.

Chen, L. (2012) Predicting anatomical therapeutic chemical (ATC) classification of drugs by integrating chemical-chemical interactions and similarities. PLoS One, 7, e35254.

Chen, W. et al. (2017) iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences. Oncotarget, 8, 4208–4217.

Chen, W. et al. (2016b) iRNA-PseU: identifying RNA pseudouridine sites. Mol. Ther. Nucleic Acids, 5, e332.

Cheng, X. et al. (2016) iATC-mISF: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals. Bioinformatics, 33, 341–346.

Cheng, X. et al. (2017) iATC-mHyb: a hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals. Oncotarget, doi: 10.18632/oncotarget.17028.

Chou, K.-C. (2011) Some remarks on protein attribute prediction and pseudo amino acid composition. J. Theor. Biol., 273, 236–247.

Chou, K.-C. and Shen, H.B. (2009) Review: recent advances in developing web-servers for predicting protein attributes. Nat. Sci., 2, 63–92.

Chou, K.-C. (2013) Some remarks on predicting multi-label attributes in molecular biosystems. Mol. Biosyst., 9, 1092–1100.

Dalal, N. and Triggs, B. (2005) Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA.

Dunkel, M. et al. (2008) SuperPred: drug classification and target prediction. Nucleic Acids Res., 36, W55–W59.

Jia, J. et al. (2016a) iSuc-PseOpt: identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset. Anal. Biochem., 497, 48–56.

Jia, J. et al. (2016b) pSuc-Lys: predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach. J. Theor. Biol., 394, 223–230.

Kanehisa, M. et al. (2004) The KEGG resources for deciphering the genome. Nucleic Acids Res., 32, D277–D280.

Kotera, M. et al. (2012) The KEGG databases and tools facilitating omics analysis: latest developments involving human diseases and pharmaceuticals. Methods Mol. Biol., 802, 19–39.

Kuncheva, L.I. and Whitaker, C.J. (2003) Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach. Learn., 51, 181–207.

Lee, C.P. and Lin, C.J. (2014) Large-scale linear rankSVM. Neural Comput., 26, 781–817.

Li, G.-Z. et al. (2012) Intelligent ZHENG classification of hypertension depending on ML-kNN and information fusion. Evid. Based Complement. Alternat. Med., 2012, 837245.

Liu, B. et al. (2017) iRSpot-EL: identify recombination spots with an ensemble learning approach. Bioinformatics, 33, 35–41.

Liu, Z. et al. (2016) pRNAm-PC: predicting N6-methyladenosine sites in RNA sequences via physical-chemical properties. Anal. Biochem., 497, 60–67.

Meher, P.K. et al. (2017) Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC. Sci. Rep., 7, 42362.

Nanni, L. et al. (2012) Matrix representation in pattern classification. Expert Syst. Appl., 39, 3031–3036.

Qiu, W.R. et al. (2016) iPTM-mLys: identifying multiple lysine PTM sites and their different types. Bioinformatics, 32, 3116–3123.

Wu, L. et al. (2013) Relating anatomical therapeutic indications by the ensemble similarity of drug sets. J. Chem. Inf. Model., 53, 2154–2160.

Xiao, X. et al. (2013) iCDI-PseFpt: identify the channel-drug interaction in cellular networking with PseAAC and molecular fingerprints. J. Theor. Biol., 337, 71–79.

Zhang, M.-L. and Wu, L. (2015) LIFT: multi-label learning with label-specific features. IEEE Trans. Pattern Anal. Mach. Intell., 37, 107–120.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)
Associate Editor: John Hancock