ADMETlab 2.0: an integrated online platform for accurate and comprehensive predictions of ADMET properties

Abstract Because undesirable pharmacokinetics and toxicity of candidate compounds are the main reasons for the failure of drug development, it has been widely recognized that absorption, distribution, metabolism, excretion and toxicity (ADMET) should be evaluated as early as possible. In silico ADMET evaluation models have been developed as an additional tool to assist medicinal chemists in the design and optimization of leads. Here, we announce the release of ADMETlab 2.0, a completely redesigned version of the widely used ADMETlab web server for predicting the pharmacokinetic and toxicity properties of chemicals. It supports approximately twice as many ADMET-related endpoints as the previous version, including 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules (751 substructures). A multi-task graph attention framework was employed to develop the robust and accurate models in ADMETlab 2.0. A batch computation module was added in response to numerous requests from users, and the presentation of the results was further optimized. The ADMETlab 2.0 server is freely available, without registration, at https://admetmesh.scbdd.com/.


• Results interpretation: The predicted logP of a compound is given as the logarithm of the molar concentration (log mol/L). Compounds in the range from 0 to 3 log mol/L will be considered proper.

logD7.4
• The logarithm of the n-octanol/water distribution coefficient at pH = 7.4. To exert a therapeutic effect, a drug must enter the blood circulation and then reach the site of action. An eligible drug therefore usually needs to balance lipophilicity and hydrophilicity so that it dissolves in body fluid and penetrates biomembranes effectively. For this reason, it is important to estimate the n-octanol/water distribution coefficient at physiological pH (logD7.4) for candidate compounds in the early stage of drug discovery.
• Results interpretation: The predicted logD7.4 of a compound is given as the logarithm of the molar concentration (log mol/L). Compounds in the range from 1 to 3 log mol/L will be considered proper.

Medicinal Chemistry
QED [1]
• A measure of drug-likeness based on the concept of desirability. QED is calculated by integrating the outputs of the desirability functions based on eight drug-likeness related properties, including MW, log P, NHBA, NHBD, PSA, Nrotb, the number of aromatic rings (NAr), and the number of alerts for undesirable functional groups. Here, average descriptor weights were used in the calculation of QED. The QED score is calculated by taking the geometric mean of the individual desirability functions, given by $\mathrm{QED} = \exp\left(\frac{1}{n}\sum_{i=1}^{n}\ln d_i\right)$, where $d_i$ indicates the ith desirability function and n = 8 is the number of drug-likeness related properties.
• Results interpretation: The mean QED is 0.67 for the attractive compounds, 0.49 for the unattractive compounds and 0.34 for the unattractive compounds considered too complex.
• Empirical decision: > 0.67: excellent (green); ≤ 0.67: poor (red)

SAscore [2]
• The synthetic accessibility score (SAscore) is designed to estimate the ease of synthesis of drug-like molecules, based on a combination of fragment contributions and a complexity penalty. The score is between 1 (easy to make) and 10 (very difficult to make). It is calculated as a combination of two components: $\mathrm{SAscore} = \mathrm{fragmentScore} - \mathrm{complexityPenalty}$.
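Returning to the QED definition given earlier in this section, the geometric mean and the empirical 0.67 cut-off can be sketched as follows. This is a minimal illustration, not the server's implementation: the function names and the eight example desirability values are hypothetical, and computing the individual desirability functions themselves is out of scope here.

```python
import math


def qed_from_desirabilities(desirabilities):
    """Geometric mean of desirability values, each expected in (0, 1]."""
    if not desirabilities:
        raise ValueError("at least one desirability value is required")
    if any(d <= 0 for d in desirabilities):
        raise ValueError("desirability values must be positive")
    # exp(mean(ln d_i)) is the geometric mean, written in a numerically stable form.
    log_sum = sum(math.log(d) for d in desirabilities)
    return math.exp(log_sum / len(desirabilities))


def qed_decision(qed):
    """Empirical decision from the text: > 0.67 is excellent, otherwise poor."""
    return "excellent (green)" if qed > 0.67 else "poor (red)"


# Hypothetical desirability values for the eight properties (illustrative only).
example = [0.9, 0.8, 0.7, 0.95, 0.85, 0.6, 0.75, 0.9]
score = qed_from_desirabilities(example)
print(round(score, 3), qed_decision(score))
```

Because the geometric mean multiplies the terms together, a single very low desirability pulls the overall score down more strongly than it would in an arithmetic mean.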

Fsp3 [3]
• Fsp3, the number of sp3-hybridized carbons divided by the total carbon count, is used to determine the carbon saturation of molecules and to characterize the complexity of their spatial structure. It has been demonstrated that increased saturation, as measured by Fsp3, and a larger number of chiral centers in the molecule increase the clinical success rate, which might be related to increased solubility or to the fact that enhanced 3D features allow small molecules to occupy more target space.
• Results interpretation: Fsp3 ≥ 0.42 is considered a suitable value.
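The Fsp3 ratio and its suggested threshold are simple enough to sketch directly. The snippet below is only an illustration under stated assumptions: the carbon counts are supplied by the caller (in practice they would come from a cheminformatics toolkit), and the function names are hypothetical.

```python
def fsp3(n_sp3_carbons, n_total_carbons):
    """Fraction of sp3-hybridized carbons among all carbons."""
    if n_total_carbons <= 0:
        raise ValueError("the molecule must contain at least one carbon")
    if not 0 <= n_sp3_carbons <= n_total_carbons:
        raise ValueError("sp3 carbon count must lie between 0 and the total")
    return n_sp3_carbons / n_total_carbons


def fsp3_is_suitable(value, threshold=0.42):
    """Threshold from the text: Fsp3 >= 0.42 is considered suitable."""
    return value >= threshold


# Hypothetical molecule with 10 carbons, 5 of which are sp3-hybridized.
value = fsp3(5, 10)
print(value, fsp3_is_suitable(value))
```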

MCE-18 [4]
• MCE-18 stands for medicinal chemistry evolution in 2018. This measure scores molecules by their novelty and current lead potential in terms of cumulative sp3 complexity, in contrast to the simple and in many cases false-positive sp3 index. It is given by the following equation: $\mathrm{MCE\text{-}18} = \left(\mathrm{AR} + \mathrm{NAR} + \mathrm{CHIRAL} + \mathrm{SPIRO} + \frac{\mathrm{sp3} + \mathrm{Cyc} - \mathrm{Acyc}}{1 + \mathrm{sp3}}\right) \times \mathrm{Q1}$, where AR is the presence of an aromatic or heteroaromatic ring (0 or 1), NAR is the presence of an aliphatic or a heteroaliphatic ring (0 or 1), CHIRAL is the presence of a chiral center (0 or 1), SPIRO is the presence of a spiro point (0 or 1), sp3 is the portion of sp3-hybridized carbon atoms (from 0 to 1), Cyc is the portion of cyclic carbons that are sp3 hybridized (from 0 to 1), Acyc is the portion of acyclic carbon atoms that are sp3 hybridized (from 0 to 1), and Q1 is the normalized quadratic index.
• Results interpretation: < 45: uninteresting, trivial, old scaffolds with a low degree of 3D complexity and novelty; 45~63: sufficient novelty, basically following the trends currently observed in medicinal chemistry; 63~78: high structural similarity to the compounds disclosed in patent records; > 78: need to be inspected visually to assess their target profile and drug-likeness.
• Empirical decision: ≥ 45: excellent (green); < 45: poor (red)

NPscore [5]
• The Natural Product-likeness score is a useful measure that can help to guide the design of new molecules toward interesting regions of chemical space which have been identified as "bioactive regions" by natural evolution. The calculation consists of molecule fragmentation, table lookup, and summation of fragment contributions.
• Results interpretation: The calculated score is typically in the range from −5 to 5. The higher the score is, the higher the probability is that the molecule is a NP.
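For the MCE-18 bands listed earlier in this section, a small lookup makes the interpretation explicit. This is a sketch only: the text does not say which band the boundary values 63 and 78 belong to, so the choice below (63 goes to the upper band, 78 stays in the 63~78 band) is an assumption, and the function names are hypothetical.

```python
def mce18_band(score):
    """Map an MCE-18 score to the interpretation bands given in the text."""
    if score < 45:
        return "low 3D complexity and novelty"
    if score < 63:  # assumption: 63 itself falls in the next band
        return "sufficient novelty"
    if score <= 78:  # assumption: 78 itself stays in this band, since the text says '> 78'
        return "high similarity to patented compounds"
    return "inspect visually"


def mce18_decision(score):
    """Empirical decision from the text: >= 45 is excellent, otherwise poor."""
    return "excellent (green)" if score >= 45 else "poor (red)"


# Hypothetical scores, for illustration only.
for s in (30.0, 45.0, 70.0, 90.0):
    print(s, mce18_band(s), mce18_decision(s))
```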
• Empirical decision: 0 violations: excellent (green); otherwise: poor (red)

PAINS [10]
• Pan Assay Interference Compounds (PAINS) is one of the best-known frequent-hitter filters. It comprises 480 substructures derived from the analysis of frequent hitters (FHs) identified in six target-based HTS assays. Applying these filters makes it easier to screen out false-positive hits and to flag suspicious compounds in screening databases. The Journal of Medicinal Chemistry, one of the most authoritative journals in the field, even requires authors to report the PAINS screening results for active compounds when submitting manuscripts.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

ALARM NMR Rule [11]
• Thiol reactive compounds. There are 75 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

Caco-2 Permeability
• Before an oral drug reaches the systemic circulation, it must pass through intestinal cell membranes via passive diffusion, carrier-mediated uptake or active transport processes. The human colon adenocarcinoma cell line (Caco-2), used as a surrogate for the human intestinal epithelium, is commonly employed to estimate in vivo drug permeability because of its morphological and functional similarities. Caco-2 cell permeability has therefore become an important index for an eligible candidate drug compound.
• Results interpretation: The predicted Caco-2 permeability of a given compound is given in log cm/s. A compound is considered to have a proper Caco-2 permeability if its predicted value is > -5.15 log cm/s.

MDCK Permeability
• Madin−Darby Canine Kidney cells (MDCK) have been developed as an in vitro model for permeability screening. Its apparent permeability coefficient, Papp, is widely considered to be the in vitro gold standard for assessing the uptake efficiency of chemicals into the body. Papp values of MDCK cell lines are also used to estimate the effect of the blood-brain barrier (BBB).
• Results interpretation: The unit of predicted MDCK permeability is cm/s. A compound is considered to have high passive MDCK permeability for Papp > 20 × 10⁻⁶ cm/s, medium permeability for 2-20 × 10⁻⁶ cm/s, and low permeability for Papp < 2 × 10⁻⁶ cm/s.
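The three MDCK permeability classes can be expressed as a small helper. This is an illustrative sketch rather than the server's code: the function name is hypothetical, and treating the boundary values 2 × 10⁻⁶ and 20 × 10⁻⁶ cm/s as "medium" is an assumption that follows from the strict inequalities in the text.

```python
HIGH_CUTOFF = 20e-6  # cm/s, from the text
LOW_CUTOFF = 2e-6    # cm/s, from the text


def mdck_class(papp_cm_per_s):
    """Classify a predicted MDCK Papp value (cm/s) as high, medium or low."""
    if papp_cm_per_s > HIGH_CUTOFF:
        return "high"
    if papp_cm_per_s < LOW_CUTOFF:
        return "low"
    # Assumption: both boundary values fall into the medium class.
    return "medium"


# Hypothetical predictions, for illustration only.
for papp in (2.5e-5, 8.0e-6, 1.0e-6):
    print(papp, mdck_class(papp))
```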

Pgp-inhibitor
• The inhibitor of P-glycoprotein. P-glycoprotein, also known as MDR1 or ABCB1, is a membrane protein belonging to the ATP-binding cassette (ABC) transporter superfamily. It is probably the most promiscuous efflux transporter, since it recognizes a number of structurally different and apparently unrelated xenobiotics; notably, many of them are also CYP3A4 substrates.
• Results interpretation: Category 0: Non-inhibitor; Category 1: Inhibitor. The output value is the probability of being Pgp-inhibitor, within the range of 0 to 1.

Pgp-substrate
• As described in the Pgp-inhibitor section, modulation of P-glycoprotein mediated transport has significant pharmacokinetic implications for Pgp substrates, which may either be exploited for specific therapeutic advantages or result in contraindications.
• Results interpretation: Category 0: Non-substrate; Category 1: substrate. The output value is the probability of being Pgp-substrate, within the range of 0 to 1.

HIA
• Human intestinal absorption. As described above, the human intestinal absorption of an oral drug is an essential prerequisite for its apparent efficacy. Moreover, the close relationship between oral bioavailability and intestinal absorption has been demonstrated, so HIA can be seen as an alternative indicator of oral bioavailability to some extent.
• Result interpretation: A molecule with an absorption of less than 30% is considered to be poorly absorbed. Accordingly, molecules with a HIA > 30% were classified as HIA- (Category 0), while molecules with a HIA < 30% were classified as HIA+ (Category 1). The output value is the probability of being HIA+, within the range of 0 to 1.
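Because the positive class here denotes poor absorption, the labelling convention is easy to misread. The sketch below spells it out. It is illustrative only: the function names are hypothetical, the text does not say how a value of exactly 30% is labelled (it is put in Category 0 here), and the 0.5 probability cut-off is an assumption, not something stated in the source.

```python
def hia_label(hia_percent):
    """Label from measured absorption: 1 (HIA+) means poorly absorbed (< 30%)."""
    # Assumption: exactly 30% is treated as Category 0.
    return 1 if hia_percent < 30 else 0


def interpret_hia_probability(p_hia_plus, cutoff=0.5):
    """Turn the predicted probability of HIA+ into a plain-language reading."""
    if not 0.0 <= p_hia_plus <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # Assumption: 0.5 is used as the decision cut-off.
    if p_hia_plus >= cutoff:
        return "Category 1 (HIA+): likely poorly absorbed"
    return "Category 0 (HIA-): likely well absorbed"


print(hia_label(12.0), hia_label(85.0))
print(interpret_hia_probability(0.08))
```

The same reading applies to the F20% and F30% endpoints, where Category 1 likewise marks compounds below the bioavailability threshold.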

F20%
• The human oral bioavailability 20%. For any drug administered by the oral route, oral bioavailability is undoubtedly one of the most important pharmacokinetic parameters because it indicates the efficiency of drug delivery to the systemic circulation.
• Result interpretation: Molecules with a bioavailability ≥ 20% were classified as F20%-(Category 0), while molecules with a bioavailability < 20% were classified as F20%+ (Category 1). The output value is the probability of being F20%+, within the range of 0 to 1.

F30%
• The human oral bioavailability 30%. For any drug administered by the oral route, oral bioavailability is undoubtedly one of the most important pharmacokinetic parameters because it indicates the efficiency of drug delivery to the systemic circulation.
• Result interpretation: Molecules with a bioavailability ≥ 30% were classified as F30%-(Category 0), while molecules with a bioavailability < 30% were classified as F30%+ (Category 1). The output value is the probability of being F30%+, within the range of 0 to 1.

PPB
• Plasma protein binding. One of the major mechanisms of drug uptake and distribution is through PPB, thus the binding of a drug to proteins in plasma has a strong influence on its pharmacodynamic behavior. PPB can directly influence the oral bioavailability because the free concentration of the drug is at stake when a drug binds to serum proteins in this process.
• Result interpretation: A compound is considered to have a proper PPB if its predicted value is < 90%. Drugs with high protein binding may have a low therapeutic index.

VD
• Volume of distribution. The VD is a theoretical concept that connects the administered dose with the actual initial concentration present in the circulation, and it is an important parameter for describing the in vivo distribution of drugs. In practice, the VD value of an unknown compound can be used to infer its distribution characteristics, such as its binding to plasma protein, the amount distributed in body fluid and the amount taken up by tissues.
• Result interpretation: The unit of predicted VD is L/kg. A compound is considered to have a proper VD if its predicted VD is in the range of 0.04-20 L/kg.
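The suggested ranges for the regression-type endpoints described so far (logP, logD7.4, Caco-2 permeability, PPB and VD) can be collected into one table-driven check. This is a minimal sketch under stated assumptions: the dictionary keys and function name are hypothetical, and treating the stated ranges as inclusive at both ends is an assumption, since the text does not specify this.

```python
# Suggested ranges as stated in this section; inclusive range ends are an assumption.
CHECKS = {
    "logP": lambda v: 0.0 <= v <= 3.0,     # log mol/L
    "logD7.4": lambda v: 1.0 <= v <= 3.0,  # log mol/L
    "Caco-2": lambda v: v > -5.15,         # log cm/s
    "PPB": lambda v: v < 90.0,             # percent
    "VD": lambda v: 0.04 <= v <= 20.0,     # L/kg
}


def flag_properties(predictions):
    """Return {endpoint: True/False} for every endpoint with a known range."""
    return {
        name: CHECKS[name](value)
        for name, value in predictions.items()
        if name in CHECKS
    }


# Hypothetical predictions for one compound, for illustration only.
example = {"logP": 2.1, "logD7.4": 0.4, "Caco-2": -4.8, "PPB": 95.0, "VD": 0.5}
for name, ok in flag_properties(example).items():
    print(f"{name}: {'proper' if ok else 'outside the suggested range'}")
```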

BBB Penetration
• Drugs that act in the CNS need to cross the blood-brain barrier (BBB) to reach their molecular target. By contrast, for drugs with a peripheral target, little or no BBB penetration might be required in order to avoid CNS side effects.
• Result interpretation: The unit of BBB penetration is cm/s. Molecules with logBB > -1 were classified as BBB+ (Category 1), while molecules with logBB ≤ -1 were classified as BBB-(Category 0). The output value is the probability of being BBB+, within the range of 0 to 1.

Fu
• The fraction unbound in plasma. Most drugs in plasma exist in equilibrium between an unbound state and a state bound to serum proteins. The efficacy of a given drug may be affected by the degree to which it binds to proteins within blood: the more drug is bound, the less efficiently it can traverse cellular membranes or diffuse.

CYP 1A2 / 2C19 / 2C9 / 2D6 / 3A4 inhibitor
CYP 1A2 / 2C19 / 2C9 / 2D6 / 3A4 substrate
• Based on the chemical nature of biotransformation, drug metabolism reactions can be divided into two broad categories: phase I (oxidative reactions) and phase II (conjugative reactions). The human cytochrome P450 family (phase I enzymes) contains 57 isozymes, which metabolize approximately two-thirds of known drugs in humans; about 80% of this metabolism is attributed to five isozymes: 1A2, 3A4, 2C9, 2C19 and 2D6. Most of the CYPs responsible for phase I reactions are concentrated in the liver.
• Result interpretation: Category 0: Non-substrate / Non-inhibitor; Category 1: substrate / inhibitor. The output value is the probability of being substrate / inhibitor, within the range of 0 to 1.

CL
• The clearance of a drug. Clearance is an important pharmacokinetic parameter that defines, together with the volume of distribution, the half-life, and thus the frequency of dosing of a drug.

T1/2
• The half-life of a drug is a hybrid concept that involves clearance and volume of distribution, and it is arguably more appropriate to have reliable estimates of these two properties instead.
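The dependence of half-life on clearance and volume of distribution can be made concrete with the standard one-compartment relation t1/2 = ln(2) × VD / CL. The sketch below is illustrative only: the one-compartment model is an assumption, the function name is hypothetical, VD is taken in L/kg as stated for the VD endpoint, and the CL unit of mL/min/kg is an assumption because this section does not state it.

```python
import math


def half_life_hours(vd_l_per_kg, cl_ml_per_min_per_kg):
    """Half-life in hours from VD (L/kg) and CL (mL/min/kg), one-compartment model."""
    if vd_l_per_kg <= 0 or cl_ml_per_min_per_kg <= 0:
        raise ValueError("VD and CL must be positive")
    # Convert clearance from mL/min/kg to L/h/kg so the units cancel to hours.
    cl_l_per_h_per_kg = cl_ml_per_min_per_kg * 60 / 1000
    return math.log(2) * vd_l_per_kg / cl_l_per_h_per_kg


# Hypothetical compound: VD = 1 L/kg, CL = 10 mL/min/kg.
print(round(half_life_hours(1.0, 10.0), 2), "h")
```

Doubling VD at fixed CL doubles the half-life, while doubling CL at fixed VD halves it, which is why the text suggests that reliable estimates of the two underlying properties are the more fundamental quantities.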

Toxicology

hERG Blockers
• The human ether-a-go-go related gene. During cardiac depolarization and repolarization, a voltage-gated potassium channel encoded by hERG plays a major role in the regulation of the exchange of cardiac action potential and resting potential. hERG blockade may cause long QT syndrome (LQTS), arrhythmia, and Torsade de Pointes (TdP), which can lead to palpitations, fainting, or even sudden death.
• Result interpretation: Molecules with IC50 more than 10 μM or less than 50% inhibition at 10 μM were classified as hERG -(Category 0), while molecules with IC50 less than 10 μM or more than 50% inhibition at 10 μM were classified as hERG+ (Category 1). The output value is the probability of being hERG+, within the range of 0 to 1.
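The labelling rule above combines two kinds of measurement, which a short function can make explicit. This is a sketch under stated assumptions, not the authors' data-curation code: the function name is hypothetical, preferring IC50 when both measurements are available is an assumption, and so is assigning the unspecified boundary cases (IC50 of exactly 10 μM, inhibition of exactly 50%) to Category 0.

```python
def herg_label(ic50_um=None, inhibition_pct_at_10um=None):
    """Return 1 (hERG+) or 0 (hERG-) from an IC50 (μM) or % inhibition at 10 μM."""
    if ic50_um is not None:
        # Assumption: IC50 takes precedence when both values are supplied.
        return 1 if ic50_um < 10 else 0
    if inhibition_pct_at_10um is not None:
        return 1 if inhibition_pct_at_10um > 50 else 0
    raise ValueError("provide an IC50 or a percent inhibition at 10 μM")


# Hypothetical measurements, for illustration only.
print(herg_label(ic50_um=0.8))                  # potent blocker
print(herg_label(inhibition_pct_at_10um=12.0))  # weak inhibition
```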

H-HT
• The human hepatotoxicity. Drug induced liver injury is of great concern for patient safety and a major cause for drug withdrawal from the market. Adverse hepatic effects in clinical trials often lead to a late and costly termination of drug development programs.

AMES Toxicity
• The Ames test for mutagenicity. Mutagenicity is closely related to carcinogenicity, and the Ames test is the most widely used assay for testing the mutagenicity of compounds.
• Result interpretation: Category 0: AMES negative(-); Category 1: AMES positive(+). The output value is the probability of being toxic, within the range of 0 to 1.

Rat Oral Acute Toxicity
• Determination of acute toxicity in mammals (e.g. rats or mice) is one of the most important tasks for the safety evaluation of drug candidates.

Skin Sensitization
• Skin sensitization is a potential adverse effect for dermally applied products. The evaluation of whether a compound, that may encounter the skin, can induce allergic contact dermatitis is an important safety concern.
• Result interpretation: Category 1: Sensitizer; Category 0: Non-sensitizer. The output value is the probability of being toxic, within the range of 0 to 1.

Carcinogenicity
• Among various toxicological endpoints of chemical substances, carcinogenicity is of great concern because of its serious effects on human health. The carcinogenic mechanism of chemicals may be due to their ability to damage the genome or disrupt cellular metabolic processes. Many approved drugs have been identified as carcinogens in humans or animals and have been withdrawn from the market.
• Result interpretation: Category 1: carcinogens; Category 0: non-carcinogens. Chemicals are labelled as active (carcinogens) or inactive (non-carcinogens) according to their TD50 values. The output value is the probability of being toxic, within the range of 0 to 1.

Eye Corrosion / Irritation
• Assessing the eye irritation/corrosion (EI/EC) potential of a chemical is a necessary component of risk assessment. Cornea and conjunctiva tissues comprise the anterior surface of the eye, and hence cornea and conjunctiva tissues are directly exposed to the air and easily suffer injury by chemicals. There are several substances, such as chemicals used in manufacturing, agriculture and warfare, ocular pharmaceuticals, cosmetic products, and household products, that can cause EI or EC.
• Result interpretation: Category 1: corrosive / irritant chemicals; Category 0: non-corrosive / non-irritant chemicals. The output value is the probability of being toxic, within the range of 0 to 1.

Respiratory Toxicity
• Among drug safety issues, respiratory toxicity has become a main cause of drug withdrawal. Drug-induced respiratory toxicity is usually underdiagnosed because it may not have distinct early signs or symptoms in common medications and can occur with significant morbidity and mortality. Therefore, careful surveillance and treatment of respiratory toxicity is of great importance.
• Result interpretation: Category 1: respiratory toxicants; Category 0: non-respiratory toxicants. The output value is the probability of being toxic, within the range of 0 to 1.

Bioconcentration Factor
• The bioconcentration factor BCF is defined as the ratio of the chemical concentration in biota as a result of absorption via the respiratory surface to that in water at steady state. It is used for considering secondary poisoning potential and assessing risks to human health via the food chain. The unit of BCF is log10(L/kg).
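The BCF definition above translates directly into a ratio followed by a base-10 logarithm. The following sketch is illustrative only: the function name and the example concentrations are hypothetical, and the input units (mg/kg in biota, mg/L in water) are an assumption chosen so that the ratio comes out in L/kg.

```python
import math


def log_bcf(c_biota_mg_per_kg, c_water_mg_per_l):
    """log10 of the steady-state biota/water concentration ratio (L/kg)."""
    if c_biota_mg_per_kg <= 0 or c_water_mg_per_l <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(c_biota_mg_per_kg / c_water_mg_per_l)


# Hypothetical steady-state concentrations, for illustration only.
print(log_bcf(100.0, 1.0))
```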

NR-AR
• Androgen receptor (AR), a nuclear hormone receptor, plays a critical role in AR-dependent prostate cancer and other androgen related diseases. Endocrine disrupting chemicals (EDCs) and their interactions with steroid hormone receptors like AR may cause disruption of normal endocrine function as well as interfere with metabolic homeostasis, reproduction, developmental and behavioral functions.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being AR agonists, within the range of 0 to 1.

Computational Biology & Drug Design Group

NR-AR-LBD
• Androgen receptor (AR), a nuclear hormone receptor, plays a critical role in AR-dependent prostate cancer and other androgen related diseases. Endocrine disrupting chemicals (EDCs) and their interactions with steroid hormone receptors like AR may cause disruption of normal endocrine function as well as interfere with metabolic homeostasis, reproduction, developmental and behavioral functions.
• Result interpretation: Category 1: actives; Category 0: inactives. Molecules labeled 1 in this bioassay may bind to the LBD of the androgen receptor. The output value is the probability of being active, within the range of 0 to 1.

NR-AhR
• The Aryl hydrocarbon Receptor (AhR), a member of the family of basic helix-loop-helix transcription factors, is crucial to adaptive responses to environmental changes. AhR mediates cellular responses to environmental pollutants such as aromatic hydrocarbons through induction of phase I and II enzymes but also interacts with other nuclear receptor signaling pathways.
• Result interpretation: Category 1: actives; Category 0: inactives. Molecules labeled 1 may activate the aryl hydrocarbon receptor signaling pathway. The output value is the probability of being active, within the range of 0 to 1.

NR-Aromatase
• Endocrine disrupting chemicals (EDCs) interfere with the biosynthesis and normal functions of steroid hormones including estrogen and androgen in the body. Aromatase catalyzes the conversion of androgen to estrogen and plays a key role in maintaining the androgen and estrogen balance in many of the EDCsensitive organs.
• Result interpretation: Category 1: actives; Category 0: inactives. Molecules labeled 1 are regarded as aromatase inhibitors that could affect the balance between androgen and estrogen. The output value is the probability of being active, within the range of 0 to 1.

NR-ER
• Estrogen receptor (ER), a nuclear hormone receptor, plays an important role in development, metabolic homeostasis and reproduction. Endocrine disrupting chemicals (EDCs) and their interactions with steroid hormone receptors like ER cause disruption of normal endocrine function. Therefore, it is important to understand the effect of environmental chemicals on the ER signaling pathway.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

NR-ER-LBD
• Estrogen receptor (ER), a nuclear hormone receptor, plays an important role in development, metabolic homeostasis and reproduction. Two subtypes of ER, ER-alpha and ER-beta, have similar expression patterns with some uniqueness in both types. Endocrine disrupting chemicals (EDCs) and their interactions with steroid hormone receptors like ER cause disruption of normal endocrine function.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

NR-PPAR-gamma
• The peroxisome proliferator-activated receptors (PPARs) are lipid-activated transcription factors of the nuclear receptor superfamily with three distinct subtypes namely PPAR alpha, PPAR delta (also called PPAR beta) and PPAR gamma (PPARg). All these subtypes heterodimerize with Retinoid X receptor (RXR) and these heterodimers regulate transcription of various genes. PPAR-gamma receptor (glitazone receptor) is involved in the regulation of glucose and lipid metabolism.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

SR-ARE
• Oxidative stress has been implicated in the pathogenesis of a variety of diseases ranging from cancer to neurodegeneration. The antioxidant response element (ARE) signaling pathway plays an important role in the amelioration of oxidative stress. The CellSensor ARE-bla HepG2 cell line (Invitrogen) can be used for analyzing the Nrf2/antioxidant response signaling pathway. Nrf2 (NF-E2-related factor 2) and Nrf1 are transcription factors that bind to AREs and activate these genes.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

SR-ATAD5
• ATPase family AAA domain-containing protein 5. Cancer cells divide rapidly, and during every cell division they need to duplicate their genome by DNA replication. Failure to do so results in cancer cell death. Many chemotherapeutic agents were developed based on this concept, but they have limitations such as low efficacy and severe side effects. Enhanced Level of Genome Instability Gene 1 (ELG1; human ATAD5) protein levels increase in response to various types of DNA damage.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

SR-HSE
• Heat shock factor response element. Various chemicals, environmental and physiological stress conditions may lead to the activation of heat shock response/ unfolded protein response (HSR/UPR). There are three heat shock transcription factors (HSFs) (HSF-1, -2, and -4) mediating transcriptional regulation of the human HSR.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

SR-MMP
• Mitochondrial membrane potential (MMP), one of the parameters for mitochondrial function, is generated by the mitochondrial electron transport chain, which creates an electrochemical gradient through a series of redox reactions. This gradient drives the synthesis of ATP, a crucial molecule for various cellular processes. Measuring MMP in living cells is commonly used to assess the effect of chemicals on mitochondrial function; decreases in MMP can be detected using lipophilic cationic fluorescent dyes.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

SR-p53
• p53, a tumor suppressor protein, is activated following cellular insult, including DNA damage and other cellular stresses. The activation of p53 regulates cell fate by inducing DNA repair, cell cycle arrest, apoptosis, or cellular senescence. The activation of p53, therefore, is a good indicator of DNA damage and other cellular stresses.
• Result interpretation: Category 1: actives ; Category 0: inactives. The output value is the probability of being actives within the range of 0 to 1.

Acute Toxicity Rule
• Molecules containing these substructures may cause acute toxicity during oral administration. There are 20 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

Genotoxic Carcinogenicity Rule
• Molecules containing these substructures may cause carcinogenicity or mutagenicity through genotoxic mechanisms. There are 117 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

NonGenotoxic Carcinogenicity Rule
• Molecules containing these substructures may cause carcinogenicity through nongenotoxic mechanisms. There are 23 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

Skin Sensitization Rule
• Molecules containing these substructures may cause skin irritation. There are 155 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

Aquatic Toxicity Rule
• Molecules containing these substructures may be toxic to aquatic organisms. There are 99 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

NonBiodegradable Rule
• Molecules containing these substructures may be non-biodegradable. There are 19 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

SureChEMBL Rule
• Molecules matching one or more structural alerts are considered to have MedChem unfriendly status. There are 164 substructures in this endpoint.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.

FAF-Drugs4 Rule
• Molecules containing these substructures may be toxic. There are 154 substructures in this endpoint, collected from the FAF-Drugs4 web server.
• Results interpretation: If the number of alerts is not zero, users can check the substructures with the DETAIL button.