Na Le Dang, Tyler B. Hughes, Varun Krishnamurthy, S. Joshua Swamidass, A simple model predicts UGT-mediated metabolism, Bioinformatics, Volume 32, Issue 20, October 2016, Pages 3183–3189, https://doi.org/10.1093/bioinformatics/btw350
Abstract
Motivation: Uridine diphosphate glucuronosyltransferases (UGTs) metabolize 15% of FDA-approved drugs. Lead optimization efforts benefit from knowing how candidate drugs are metabolized by UGTs. This paper describes a computational method for predicting sites of UGT-mediated metabolism on drug-like molecules.
Results: XenoSite correctly predicts test molecules' sites of glucuronidation in the Top-1 or Top-2 predictions at a rate of 86 and 97%, respectively. In addition to predicting common sites of UGT conjugation, like hydroxyl groups, it can also accurately predict the glucuronidation of atypical sites, such as carbons. We also describe a simple heuristic model for predicting UGT-mediated sites of metabolism that performs nearly as well (with, respectively, 80 and 91% Top-1 and Top-2 accuracy), and that can identify the most challenging molecules to predict, on which more complex models can then be assessed. Compared with prior studies, this model is more generally applicable, more accurate and simpler (not requiring expensive quantum modeling).
Availability and implementation: The UGT metabolism predictor developed in this study is available at http://swami.wustl.edu/xenosite/p/ugt.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
1 Introduction
Uridine diphosphate glucuronosyltransferases (UGTs) are an important family of proteins that metabolize 15% of FDA-approved drugs (Williams et al., 2004). UGTs conjugate glucuronic acid to a diverse range of molecules, rendering them more hydrophilic and more easily eliminated. Specifically, the glucuronic acid can conjugate to oxygens, nitrogens, sulfurs or carbons, in order of decreasing likelihood (Fig. 1).

Four types of UGT-catalyzed reactions. UGTs attach glucuronides to molecules to detoxify them and make them easier to excrete. Glucuronides can be attached to several atom types in a molecule, for example (in order of decreasing likelihood) oxygens, nitrogens, sulfurs and carbons. Specific examples, from the database, of each of these conjugation reactions are displayed. The site of conjugation is circled in the parent molecules
Understanding and modeling UGT metabolism is important because conjugation can affect the safety and efficacy of drugs. For example, a genetic polymorphism that inactivates a specific UGT significantly increases the bone marrow toxicity of irinotecan, by preventing its primary route of elimination (Carlini et al., 2005). Similarly, genetic polymorphisms that increase expression of a specific UGT reduce the efficacy of atorvastatin, a commonly used HMG CoA reductase inhibitor (Prueksaritanont et al., 2002; Riedmaier et al., 2010). More commonly, drug candidates are optimized to ensure that they are metabolically stable and, therefore, not too rapidly eliminated (Kumar and Surapaneni, 2001).
Identifying the specific atoms of a candidate drug that are glucuronidated during UGT-mediated metabolism, its sites of metabolism (SOMs), is valuable to lead optimization efforts. Knowing the SOMs of a candidate drug allows medicinal chemists to design new compounds with improved bioavailability. Computational methods that predict UGT-mediated SOMs for drug-like molecules are just now beginning to appear. So far, the literature reports two approaches to predicting UGT SOMs (Peng et al., 2014; Rudik et al., 2015). Both methods use machine learning to encode the chemical and biological rules represented in a set of reactions extracted from the Accelrys Metabolite Database (AMD).
Here, we present two models to predict UGT metabolism. First, we propose a simple heuristic model (based on global statistics) to predict UGT metabolism. This simple model correctly predicts the site of UGT conjugation in >80% of molecules. This heuristic model provides both a baseline of performance against which other methods can be compared, and a method for identifying specific molecules in the dataset that are difficult to predict without modeling. Second, we introduce XenoSite UGT, an adaptation of an algorithm previously developed by our group to predict drug metabolism by cytochrome P450s (Zaretzki et al., 2013). The XenoSite UGT model uses an approach similar to that of existing methods, learning rules from a training dataset derived from the AMD. We built the XenoSite model using 2898 unique substrates containing 4557 glucuronidation reactions, 3.2 times more reactions than used in previous models.
XenoSite improves on existing approaches, including the simple heuristic model, in several ways. First, it improves on the SOM-UGT model described by Peng et al. (2014), who used support vector machines to build four independent classification models to differentiate between observed and non-observed SOMs for common substructures vulnerable to UGT-mediated metabolism: aromatic and aliphatic hydroxyls, carboxylic acids and nitrogen-containing groups. Unlike XenoSite, SOM-UGT cannot predict glucuronidation of less common or atypical substructures, such as ketones, thiols or amides. Moreover, Peng et al. (2014) do not describe or test any strategies for combining the predictions of their four independent models into a single set of predictions for a given molecule. This is a critical shortcoming, because in practice individual molecules are evaluated for metabolic soft spots, and there is no way to know how well SOM-UGT identifies these soft spots on a per-molecule basis. In contrast, XenoSite is a single model that produces predictions for all atoms in a molecule, and can even identify atypical sites of conjugation accurately. Second, XenoSite improves on the Site of Metabolism Prediction (SOMP) model described by Rudik et al. (2015), which uses fingerprint descriptors in a naïve Bayes classifier to rank all atoms in a molecule by their likelihood of being conjugated by UGTs. However, SOMP was only tested on small, uncharged molecules, and was not evaluated on difficult-to-predict molecules. On these challenging molecules, XenoSite was significantly more accurate than SOMP, SOM-UGT and the heuristic model.
2 Materials and methods
2.1 Training data
The AMD was used to build a dataset of UGT substrates with their annotated sites of glucuronidation. Each reaction in the AMD contains reactant and product molecular structures, the catalyzing enzyme, and the species involved in the reaction. We extracted 4325 human reactions labeled as glucuronidation by the AMD. To confirm the AMD's classification, we used SMARTS string matching to check that each reaction product included an added glucuronide attached to an oxygen, nitrogen, carbon or sulfur atom. UGT-mediated SOMs were identified by comparing reactant and product structures to determine which reactant atom(s) are glucuronidated. In total, 2839 unique substrates containing 3340 SOMs were curated from the AMD. All atoms in each molecule were classified into one of four substructure groups, matching those in Peng et al. (2014). These groups were aliphatic hydroxyls (AlOH), aromatic hydroxyls (ArOH), carboxylic acids (COOH) and nitrogen-containing sites (Nitrogen). The Nitrogen group contained aromatic and aliphatic amino nitrogens, and aromatic and aliphatic heterocyclic nitrogens. All remaining oxygens, nitrogens, sulfurs and carbons were added to the Atypical group. For example, nitrogens in sulfonamides, urea and carboxamides were included in the Atypical group because they are only rarely conjugated. The exact definitions of these groups are included in the Supplementary Materials. As expected, commonly glucuronidated substructures (Fig. 2) were much more common than atypical substructures (Fig. 3), but there were still enough examples to model even the atypical sites.

The propensity of commonly glucuronidated chemical groups to undergo UGT-mediated metabolism in the data used in this study. UGT metabolism prediction models are trained on 4557 glucuronidation reactions from 2839 unique substrates extracted from the AMD. Example substrates shown above contain commonly glucuronidated chemical groups (AlOH, ArOH, COOH and nitrogen-containing groups, labeled Nitrogen) in small dashed circles. Experimentally observed SOMs are circled in black. In our data, the AlOH, ArOH, COOH, Nitrogen and Atypical groups are conjugated, respectively, 49.6, 76.3, 80.0, 8.5 and 0.15% of the time

Atypical chemical groups undergoing UGT-mediated glucuronidation. In rare instances, ethers, ketones, ureas, carboxamides, sulfonamides, thiols and tertiary carbons are glucuronidated. These chemical groups are collectively referred to as the Atypical group in the text. Across 54965 atypical sites within thousands of molecules, only 85 positives were identified, where oxygen, nitrogen, sulfur and carbon atoms were conjugated, respectively, 29, 51, 2 and 3 times. Examples of these atypical sites are shown with sites of UGT metabolism circled
2.2 External testing data
Three external datasets were used to assess the generalizability of models built on the training data. The first testing set contains 141 unique UGT substrates recently added to the AMD (January 2015 version) that are not in our training dataset. The second and third test sets, composed of 20 and 54 substrates, respectively, were used as validation sets by Rudik et al. (2015) and Peng et al. (2014).
2.3 Heuristic model
We constructed a simple heuristic model based on overall database statistics. This model was useful for two reasons. First, it provided a baseline of performance that more complex methods should outperform. Second, the molecules it cannot predict were good test cases for more complicated algorithms. In this heuristic model, all the potential sites of UGT metabolism in a test molecule were identified. Each potential site was labeled by its overall probability of being metabolized in the database. Matching these probabilities, the AlOH, ArOH, COOH, Nitrogen and Atypical groups were assigned the initial scores of, respectively, 49.6, 76.3, 80.0, 8.5 and 0.14%. Next, these scores were summed across the whole molecule to compute a normalization term. Finally, the initial scores were divided by this normalization term to yield the final score. This score sums to one in molecules that have at least one potential site. The atoms of a molecule with no potential sites all receive a score of zero. A Python implementation of this model is included in the Supplementary Materials to facilitate future studies.
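The scoring procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the Supplementary Materials: the function name `heuristic_scores` and the input format (one group label per atom, `None` for atoms that are not potential sites) are assumptions, and assigning atoms to groups is taken as already done.

```python
# Initial scores (percent) for each substructure group, as quoted in the text above.
GROUP_SCORES = {
    "AlOH": 49.6,
    "ArOH": 76.3,
    "COOH": 80.0,
    "Nitrogen": 8.5,
    "Atypical": 0.14,
}


def heuristic_scores(site_groups):
    """Return one normalized score per atom.

    site_groups: list with one entry per atom, either a key of GROUP_SCORES
    or None for atoms that are not potential sites (hypothetical input format).
    """
    # Step 1: label each potential site with its group's initial score.
    raw = [GROUP_SCORES[g] if g is not None else 0.0 for g in site_groups]
    # Step 2: sum the scores across the molecule to get the normalization term.
    total = sum(raw)
    # Molecules with no potential sites: every atom scores zero.
    if total == 0.0:
        return [0.0] * len(raw)
    # Step 3: divide by the normalization term so the scores sum to one.
    return [r / total for r in raw]


# A molecule with one aromatic hydroxyl, one aliphatic hydroxyl and one
# atom that is not a potential site.
scores = heuristic_scores(["ArOH", "AlOH", None])
```

In this example the aromatic hydroxyl receives 76.3 / (76.3 + 49.6) of the total score, so it is ranked first.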
2.4 Descriptors
A vector of numerical descriptors represented each atom in a test molecule; 98 descriptors in total, 62, 20 and 16 of which encode topological-, molecule- and quantum chemical-derived chemical information, respectively. These descriptors were chosen because they are effective for modeling P450 metabolism in our prior work (Zaretzki et al., 2013). Descriptors were computed using in-house software applied to SDF files with 3D coordinates (generated using Open Babel) and explicit hydrogens (O'Boyle et al., 2011). We added an additional set of 7 statistical descriptors based on the heuristic model. Five of these descriptors encoded the number of certain substructural groups (AlOH, ArOH, COOH, Nitrogen (NR3, NHR2, NH2R) and thiols) that are contained in a given substrate. All atoms of the same molecule had the same values for these descriptors. Two atom-level descriptors, the heuristic score (see Section 2.3) and the number of topologically equivalent atoms, were also added. A full description of all descriptors used in this work is available in the Supplementary Materials.
2.5 Machine learning models
A matrix of descriptor-encoded atoms was presented to a neural network with 10 hidden nodes. For comparison, we also trained a logistic regressor on the same data. During training, the model learned a mapping between the descriptor values of each atom and the binary experimental response of that atom, metabolized or not metabolized. Atoms in the dataset were weighted so that the less common elements were as important to the model's error function as the most common atoms. The weights of the model were calibrated during training by performing gradient descent on the cross-entropy error between the predicted and actual response values of each atom. Predictions were obtained using a leave-one-molecule-out cross-validation procedure. Each molecule was predicted by an independent model trained using the remaining molecules of the training set.
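Three ingredients of this training setup (balanced atom weights, a weighted cross-entropy error and leave-one-molecule-out splits) can be illustrated with a short standard-library sketch. The function names are hypothetical, and the exact weighting scheme is an assumption: here each atom is weighted by the inverse frequency of its label, so that every label contributes the same total weight. The network itself and its gradient descent are not shown.

```python
import math
from collections import Counter


def balanced_weights(labels):
    """Weight each atom by 1 / (count of its label).

    Assumption: 'labels' holds whatever category is being balanced
    (for example the element of each atom); every category then
    contributes a total weight of 1 to the error function.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def weighted_cross_entropy(y_true, y_pred, weights, eps=1e-12):
    """Weighted mean cross-entropy between binary targets and predicted probabilities."""
    total = 0.0
    for y, p, w in zip(y_true, y_pred, weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


def leave_one_molecule_out(molecule_ids):
    """Yield (held_out_id, train_indices, test_indices), one split per molecule.

    molecule_ids has one entry per atom, so all atoms of the held-out
    molecule are removed from training together.
    """
    for held_out in sorted(set(molecule_ids)):
        train = [i for i, m in enumerate(molecule_ids) if m != held_out]
        test = [i for i, m in enumerate(molecule_ids) if m == held_out]
        yield held_out, train, test
```

Splitting by molecule rather than by atom keeps atoms of the test molecule out of the training data, which is what makes the cross-validated accuracy a per-molecule estimate.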
2.6 Performance metrics
Two different metrics were used to evaluate the prediction accuracy. The first was the Top-N metric, which considered a substrate correctly predicted if any of its experimentally observed SOMs were predicted in the top N rank-positions out of all possible SOMs of the substrate. Tied scores were handled by averaging Top-N scores over all permutations of the tied sites within each molecule. The Top-2 metric is the standard metric for evaluating CYP site of metabolism models and is sensible in this context as well (Rydberg et al., 2010; Singh et al., 2003). We used a paired t-test to compute the statistical significance of differences between Top-N accuracies. The second metric was the area under the ROC curve (AUC) (Swamidass et al., 2010) of the predictions of metabolized and not metabolized atoms that all belong to the same chemical group: AlOH, ArOH, COOH, Nitrogen or Atypical. This is a standard metric employed in machine learning, as well as by Peng et al. (2014), to determine how well a method is able to distinguish between positive and negative test cases. This metric does not measure the within-molecule accuracy, which is much more important for this specific application. At the same time, it exposes which types of sites are best predicted. We used Fisher's exact test to compute the significance of ROC differences, choosing the score cutoff at the point closest to the upper left corner of the ROC plot. All significance tests used a threshold of 0.05.
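The tie handling in the Top-N metric can be made concrete with a small sketch. Averaging over all permutations of tied sites is equivalent to taking the expected outcome under uniformly random tie-breaking, which has a closed form: if a group of g tied sites containing p observed SOMs competes for the last k of the N slots, the chance that at least one SOM lands in those slots is 1 - C(g-p, k) / C(g, k). The function names below are hypothetical, and this is an illustration of the metric as described rather than the authors' code.

```python
from itertools import groupby
from math import comb


def top_n_score(scores, is_som, n):
    """Expected Top-N outcome for one molecule under random tie-breaking.

    scores: predicted score per candidate site.
    is_som: True/False per site, whether it is an observed SOM.
    Returns a value in [0, 1]; it is exactly 0 or 1 when no tie straddles rank n.
    """
    ranked = sorted(zip(scores, is_som), key=lambda pair: -pair[0])
    slots = n  # rank positions still available within the top N
    for _, group in groupby(ranked, key=lambda pair: pair[0]):
        flags = [flag for _, flag in group]
        g, p = len(flags), sum(flags)
        if g <= slots:
            # The whole tied group fits inside the top N.
            if p > 0:
                return 1.0
            slots -= g
            if slots == 0:
                return 0.0
        else:
            # The tied group straddles the cutoff: only 'slots' of its g
            # members make the top N, chosen uniformly at random.
            return 1.0 - comb(g - p, slots) / comb(g, slots)
    return 0.0


def top_n_accuracy(molecules, n):
    """Mean Top-N score over a list of (scores, is_som) pairs."""
    return sum(top_n_score(s, y, n) for s, y in molecules) / len(molecules)
```

For example, a molecule whose two top-scoring sites are tied and only one of them is an observed SOM contributes 0.5 to Top-1 accuracy and 1.0 to Top-2 accuracy.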
3 Results and discussion
3.1 Accuracy in identifying SOMs
The XenoSite model predicted UGT metabolism very accurately, and the heuristic model was nearly as accurate. The XenoSite model, using a neural network, had cross-validated Top-1, Top-2 and Top-3 accuracies of 86.1, 97.1 and 98.9% on the training set (Table 1 and Fig. 4). The AUC accuracies of XenoSite for AlOH, ArOH, COOH, Nitrogen and Atypical chemical groups were, respectively, 87.1, 83.1, 87.5, 93.3 and 98.4%. It appears that modeling UGT metabolism is easy for most molecules; according to the Top-2 metric, the XenoSite models were 10% more accurate than CYP models, and the heuristic model alone was 91% accurate. According to the Top-1, Top-2 and Top-3 metrics, the performance difference between XenoSite and the baseline heuristic was, respectively, 6, 5 and 4%. Likewise, the Top-N accuracies of the neural network and logistic regressor models differed by less than one percent, but this improvement was consistent, so we decided to continue using the neural network.

Comparison between neural network, logistic regression and heuristic models. Leave-one-out cross-validated prediction accuracies on the training dataset are shown. Across nearly all performance metrics, the neural network performs best, followed closely by the logistic regressor, and finally by the heuristic method
Leave-one-out cross-validated prediction accuracies on the training dataset
| Metric | Heuristic | Logistic | Neural net |
|---|---|---|---|
| Top-1 | 79.9 | 85.5 | 86.1 |
| Top-2 | 91.6 | 96.3 | 97.1 |
| Top-3 | 94.7 | 98.4 | 98.9 |
| AlOH-AUC | 84.6 | 86.4 | 87.1 |
| ArOH-AUC | 80.0 | 81.3 | 83.1 |
| COOH-AUC | 85.4 | 86.8 | 87.5 |
| Nitrogen-AUC | 87.8 | 90.8 | 93.3 |
| Atypical-AUC | 86.7 | 98.6 | 98.4 |
For each metric, the highest performance is bold, along with any scores not statistically different from the best performance (using a P-value cutoff of 0.05). XenoSite is the best performing model across all metrics. The numbers of metabolized to nonmetabolized AlOH, ArOH, COOH, Nitrogen and Atypical sites are, respectively, 925/937, 1551/480, 507/127, 272/2940 and 85/54880.
At the same time, the neural network identified Atypical and Nitrogen sites much better than the heuristic model. About 80% of the training set molecules were very easy, and predicted correctly by the heuristic model. The remaining molecules were difficult, but often predicted correctly by XenoSite (Fig. 5).

XenoSite predictions. Example molecules are shown with their experimentally observed UGT SOM circled in black. Four commonly glucuronidated chemical groups (AlOH, ArOH, COOH and Nitrogen) are in small dashed circles. Possible atypical sites are not circled, even though our model can predict them as sites. The shading on each atom plots the cross-validated prediction score. The top panel shows simple molecules with SOMs that can be readily identified by the heuristic model. The bottom panel shows complex molecules with SOMs that are accurately predicted by XenoSite but not by the heuristic model. The complex molecules are real drug candidates and drugs: GSK101892 (Griffini et al., 2010), Imatinib mesylate (Gschwind et al., 2005), Voriconazole (Roffey et al., 2003), 10074-G5 (Clausen et al., 2010), Oxycodone (Baldacci and Thormann, 2005), Methylprednisone (Vree et al., 1999), Ticagrelor (Teng et al., 2010), SN-38 (Tallman et al., 2005) and Dasatinib (Christopher et al., 2008) (Color version of this figure is available at Bioinformatics online.)
3.2 Descriptors driving accuracy
A permutation sensitivity analysis identified the descriptors driving model accuracy (Hunter et al., 2000). We started with the trained model. Next, we selected the 508 molecules that were poorly predicted by the heuristic model (with no validated SOMs within the molecules’ first ranked heuristic scores). Next, we randomly permuted each descriptor column (or group of closely related descriptors) in the input data for these molecules. The trained model was applied to the permuted data, and the performance drop across all the molecules was recorded. The higher the performance drop, the more important the descriptor to the model’s performance.
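A generic version of this permutation analysis can be sketched as follows. The function name, the `predict` and `metric` callables, the `columns` mapping and the `n_repeats` averaging are all assumptions made for illustration; the source does not state how many permutations were drawn or how the performance drop was aggregated. Columns listed in the same group are shuffled with the same row order, so closely related descriptors are permuted together.

```python
import random


def permutation_importance(predict, metric, X, y, columns, n_repeats=10, seed=0):
    """Mean performance drop after permuting each descriptor group.

    predict: callable taking a list of rows and returning predictions
             (stands in for the already-trained model).
    metric:  callable metric(y, predictions) where higher is better.
    X:       list of rows (one list of descriptor values per atom).
    columns: dict mapping a descriptor-group name to its column indices.
    """
    rng = random.Random(seed)
    baseline = metric(y, predict(X))
    drops = {}
    for name, cols in columns.items():
        total_drop = 0.0
        for _ in range(n_repeats):
            order = list(range(len(X)))
            rng.shuffle(order)
            permuted = [list(row) for row in X]  # copy; the model is not retrained
            for i, source in enumerate(order):
                for c in cols:
                    permuted[i][c] = X[source][c]
            total_drop += baseline - metric(y, predict(permuted))
        drops[name] = total_drop / n_repeats
    return drops


# Toy usage: the "model" reads only column 0, so permuting column 1 costs nothing.
X = [[float(i % 2), float(i)] for i in range(8)]
y = [i % 2 for i in range(8)]
model = lambda rows: [row[0] for row in rows]
accuracy = lambda truth, pred: sum(int(round(p) == t) for t, p in zip(truth, pred)) / len(truth)
drops = permutation_importance(model, accuracy, X, y, {"used": [0], "unused": [1]})
```

A descriptor the model ignores shows a drop of exactly zero, while a descriptor the model depends on shows a positive drop.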
We saw similar results using all performance metrics (Fig. 6). Using the aggregate Top-2 performance as a guide, this analysis identified two sets of descriptors as the most important for differentiating metabolized sites from non-metabolized sites: topological descriptors (the identities of the atom and of its neighbors one, two and three bonds away, the number of bound hydrogens and heavy atoms, and the size of the ring containing the atom) and heuristic descriptors (the heuristic score and the number of substructures). This result shows that XenoSite relies heavily on local topology to calculate SOM scores, and does not need descriptors from the quantum simulation. Similar results were seen in the substructure-specific sensitivities. Here, once again, topological and heuristic descriptors were the most important, and no quantum chemical descriptors were necessary. At the same time, a few molecule-level descriptors (like logP) were also important. Notably, the two heuristic descriptors were consistently among the most important.

The importance of specific descriptors to predicting 852 difficult molecules. A permutation sensitivity analysis quantified the importance of descriptors for the human UGT metabolism model to differentiate metabolized from non-metabolized sites across the whole dataset (Top-2 accuracy) and within all potential sites of the same substructure groups (AUC accuracy). The 10 most important descriptors are plotted. The heuristic and topological descriptors were most important. Surprisingly, none of the quantum-level descriptors were important. Consequently, the final model we publish on the website does not run a quantum simulation to make predictions
3.3 Performance on external dataset
We settled on using a neural network trained on heuristic and topological descriptors for our final model, but also saw value in the heuristic model's simplicity. Both the neural network and the heuristic model performed well on the external testing dataset of 141 molecules (Fig. 7). Just as in the training set, the neural network performed better than the heuristic model (83.7 versus 75.5% Top-1 accuracy). Accuracies in substructure AUCs, however, were essentially identical except for atypical sites. Here, the neural network achieved an AUC of 99.5%, compared with 55.4% for the heuristic model. Overall, the performance of either model on the testing dataset was comparable to its cross-validated performance on the training dataset.

Performance on external test sets: a test set of 141 unique molecules that were not in our training set was collected by filtering the recently added glucuronidation reactions in the AMD. The neural network performed better than the heuristic model across all metrics
3.4 Comparison to prior studies
Performance comparisons to prior methods using previously published external datasets were inconclusive. We first compared XenoSite to SOMP (Rudik et al., 2015). SOMP's test set included 20 molecules. The prediction accuracies of XenoSite and SOMP on this test set (Top-1, -2 and -3 scores of 90, 95 and 95%, respectively, for SOMP versus 95, 95 and 100% for XenoSite) were not statistically significantly different. Similarly, the SOM-UGT model in Peng et al. (2014) was validated using an external dataset of 54 molecules, reporting only sensitivity and specificity for each chemical group. The difference in accuracies was not statistically significant (Fig. 8), probably because this dataset is small and is mainly composed of very easy-to-predict molecules.

The XenoSite predictions on an external dataset of 54 molecules. The external dataset in the Peng et al. (2014) study is used to test our model. The published performance of SOM-UGT is depicted as stars. XenoSite ROC plots are depicted as solid black lines. Heuristic ROC plots are depicted as dashed lines. The points closest to the upper left corner of the XenoSite ROC plots are chosen for comparison to SOM-UGT. The performance difference between our model and SOM-UGT on this test set is not statistically significant (Fisher's exact test (2 × 4) two-tailed P-values: AlOH: 0.636, ArOH: 0.340, COOH: 0.058, Nitrogen: 0.229)
A test set of 49 molecules that the heuristic model does not predict accurately according to the Top-1 metric showed that XenoSite is significantly more accurate than the heuristic model, SOM-UGT and SOMP. This subset of 49 difficult molecules was identified from the 141 external test set molecules; none of them was correctly predicted by the heuristic model. The sites of UGT metabolism for these 49 molecules were predicted using SOMP's website and the SOM-UGT software. Top-N and substructure AUC metrics were used to assess the performance of the two models. The true positive, true negative, false positive and false negative values were calculated using the point closest to the upper left corner on the ROC curve. SOMP was unable to make predictions on two molecules with positive charges in this testing set, while all other models could. As shown in Table 2, XenoSite outperforms the other models on most accuracy metrics. Specifically, XenoSite performs statistically significantly better than SOMP, SOM-UGT and the heuristic model according to the Top-2, -3 and Atypical AUC metrics.
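Choosing the ROC point closest to the upper left corner, and reading off the confusion counts there, can be sketched as below. The function name is hypothetical and the squared Euclidean distance to the corner (false positive rate 0, true positive rate 1) is an assumption about how "closest" was measured; ties in distance are resolved in favor of the higher cutoff.

```python
def closest_to_corner(scores, labels):
    """Return (threshold, tp, fp, tn, fn) at the ROC point nearest (FPR=0, TPR=1).

    scores: predicted score per site; labels: 1 for metabolized, 0 otherwise.
    A site is called positive when its score is >= threshold.
    Expects at least one site.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best = None
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and not l)
        tpr = tp / pos if pos else 0.0
        fpr = fp / neg if neg else 0.0
        dist = fpr ** 2 + (1.0 - tpr) ** 2  # squared distance to the corner
        if best is None or dist < best[0]:
            best = (dist, t, tp, fp, neg - fp, pos - tp)
    return best[1:]


cutoff = closest_to_corner([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
```

The four counts returned for each model form the contingency table on which a significance test such as Fisher's exact test can then be run.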
XenoSite is more accurate than all other methods on 49 difficult-to-predict molecules
| Metric | Heuristic | XenoSite | SOM-UGT | SOMP |
|---|---|---|---|---|
| Top-1 | 0.00 | 53.06 | 33.70 | 49.88 |
| Top-2 | 82.86 | 91.84 | 56.13 | 71.43 |
| Top-3 | 94.29 | 97.96 | 63.70 | 79.59 |
| AlOH-AUC | 75.61 | 77.53 | 63.32 | 74.97 |
| ArOH-AUC | 60.33 | 68.03 | 48.59 | 64.05 |
| COOH-AUC | 60.71 | 56.67 | 43.33 | 48.21 |
| Nitrogen-AUC | 86.50 | 91.91 | 49.26 | 92.13 |
| Atypical-AUC | 89.82 | 99.43 | 50.00 | 93.67 |
To compute the Top-N performances, a global SOM-UGT model was constructed that predicts a site as positive if any of the four published substructure-specific models predicted it as positive. For each metric, the highest performance is bold, along with any scores not statistically different from the best performance (using a P-value cutoff of 0.05). XenoSite is the best performing model. The numbers of metabolized to nonmetabolized AlOH, ArOH, COOH, Nitrogen and Atypical sites are, respectively, 26/44, 17/23, 2/15, 8/36 and 2/1062.
XenoSite and the heuristic model run very quickly, taking less than a second per molecule. We expect SOMP, which is based on fingerprints, is similarly fast. However, SOM-UGT requires 59 minutes to predict the 49 molecules in the test set. In this regard, SOM-UGT is substantially slower than other approaches, and therefore less useful for screening large numbers of molecules.
These comparisons not only show that our modeling approach outperformed existing methods in tests, but also highlight a key deficiency in the literature. Accuracy in this problem is strongly driven by the number of easy molecules that are trivially predicted by a heuristic method. We recommend that future modeling efforts be directed towards the molecules that the heuristic model predicts poorly, and that future studies report the performance of the heuristic model on their test sets.
4 Conclusion
This study introduces two approaches to predicting UGT SOMs: (i) a statistics-based heuristic model and (ii) XenoSite, a neural network trained on a large database of UGT metabolism. XenoSite accurately predicts observed SOMs of known substrates 86% of the time, and outperforms existing methods, including the heuristic model, on difficult molecules. XenoSite might be most useful in contexts where atypical sites of UGT metabolism are important and, therefore, the heuristic model is less accurate.
5 Supporting information
The supporting information includes a full list of descriptors used in this study, all the test sets, predictions for our methods on the test molecules, and a Python implementation of the heuristic model. XenoSite is publicly available at http://swami.wustl.edu/xenosite.
Acknowledgements
The authors thank Jed Zaretzki for helpful discussions and Matthew Matlock for developing the XenoSite code, which is extended in this work. We also thank the developers of the open-source cheminformatics tools Open Babel and RDKit. Molecule rendering in Figure 2 utilized OEDepict.
Funding
Research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under Award Number R01LM012222. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We also appreciate the generous support of both the Department of Immunology and Pathology at the Washington University School of Medicine and the Washington University Center for Biological Systems Engineering. Computations were performed using the facilities of the Washington University Center for High Performance Computing, which were partially funded by NIH grants (1S10RR022984-01A1 and 1S10OD018091-01).
Conflict of Interest: none declared.
References
Author notes
Associate Editor: Jonathan Wren