There are at least two good reasons for the ongoing interest in drug–target interactions: first, drug effects can only be fully understood by considering a complex network of interactions with multiple targets (so-called off-target effects), including metabolic and signaling pathways; second, it is crucial to consider drug–target–pathway relations for the identification of novel targets for drug development. To address this ongoing need, we have developed a web-based data warehouse named SuperTarget, which integrates drug-related information associated with medical indications, adverse drug effects, drug metabolism, pathways and Gene Ontology (GO) terms for target proteins. At present, the updated database contains >6000 target proteins, which are annotated with >330 000 relations to 196 000 compounds (including approved drugs); the vast majority of interactions include binding affinities and pointers to the respective literature sources. The user interface provides tools for drug screening and target similarity inclusion. A query interface enables the user to pose complex queries, for example, to find drugs that target a certain pathway, interacting drugs that are metabolized by the same cytochrome P450 or drugs that target proteins within a certain affinity range. SuperTarget is available at http://bioinformatics.charite.de/supertarget.
In the last decade, non-commercial drug- or target-related databases have been established. Millions of compounds can be found in databases like ChEMBL (1) or PubChem (2) and their availability can be checked via databases like ZINC (3).
Several databases collect binding data on small molecules, in particular, drugs. A comprehensive and manually curated resource is DrugBank (4), which contains 4300 targets related to about 7000 compounds, including 1500 FDA-approved drugs. Another notable database is the Therapeutic Target Database (TTD) (5), which holds target information on approximately 2000 classified targets linked to 5000 compounds, including 1500 approved drugs. Surprisingly, the overlap between these two databases is small (data not shown). KEGG DRUG is a database for approved drugs, which comprises drug–target interactions, drug classifications as well as information about drug structure development (6). Other databases collect drug–target data with a special focus on medical indications [e.g. cancer (7) and infection (8)], technical aspects [e.g. pharmacophores (9) or scaffold hoppers (10)], side effects (11) or special metabolic pathways (12). The database STITCH is focused on the relation of >70 000 chemicals to targets from hundreds of different organisms (13). To understand the complex effects of drugs, the relations of their targets within signaling and metabolic pathways are important and are reflected in a number of databases, e.g. KEGG (6) or Reactome (14).
In 2008, the first version of SuperTarget was developed with the intention to accentuate drug–target interactions themselves and to provide references to other resources for more elaborate analysis (15). Adverse drug reactions are a common reason for the rejection of drug candidates during clinical trials or for withdrawal after approval. For example, the cyclo-oxygenase inhibitor rofecoxib (a nonsteroidal anti-inflammatory drug) was withdrawn worldwide because of severe cardiovascular side effects, which may be caused by unanticipated interactions with potassium and calcium channels (16).
The analysis of drug–target interactions can play a crucial role in improving the process of drug design and admission. SuperTarget provides a variety of drug–target interactions and affected biological pathways in a user-friendly manner. This second release of SuperTarget contains a core dataset of ∼330 000 drug–target interactions, of which about 310 000 interactions have binding affinity data. We consider a drug–target relation to be a specific interaction of a small chemical compound that could be used to treat or diagnose a disease. Thus, SuperTarget now enables scientists to carry out not only qualitative but also quantitative analysis of drug–target interactions.
SuperTarget at present contains an updated version of the original dataset. As of 2011, the core dataset consists of 6219 targets and 195 770 drugs and putative drugs, of which about 2500 are approved drugs classified by the World Health Organization (WHO), resulting in 332 828 drug–target interactions. The list of targets was selected using the PROMISCUOUS database (17). New drug interactions were added from in-house text mining, supplemented by manual curation, SuperSite (18), CaRe (7), SuperCyp (12) and DrugBank (4) using the target list mentioned above. Target synonyms and external database identifiers were updated as defined in the UniProtKB database (19). The information content of drug entities was enlarged by general properties such as molecular weight, lipophilicity (logP) and known side effects as defined by the Sider database (11). Binding affinities of drug–target interactions were added from BindingDB entries (20). SuperTarget offers references to various resources that provide more detailed information, i.e. specific links to PubChem, UniProtKB, DrugBank, the RCSB Protein Data Bank (PDB), PubMed and BindingDB.
Relations from other databases, namely DrugBank (4), KEGG (6), PDB (21), SuperDrug (22) and TTD (5), were checked for drug–target interactions not identified using the preceding steps. If those interactions could be confirmed by literature listed in PubMed, the references were included in SuperTarget; otherwise, the database describing the interaction is referenced. To provide users with further information on drug–target interactions, SuperTarget provides links to physicochemical properties and further structural information of drugs. Proven or potential target proteins are represented as stored in UniProtKB (19), by functional annotations extracted from GO (23), and by related pathway information provided by KEGG (6) (compare Figure 1).
SuperTarget enables users to link drugs and targets to biomolecular pathways. Pathways are given as defined in the KEGG database. Furthermore, targets can be searched using Gene Ontology (GO) terms. The Anatomical Therapeutic Chemical (ATC) classification of drugs (24) is useful for searching drugs in distinct indication areas and for analyzing the co-occurrence of drugs/targets in different diseases, which can be combined with side effect searches. A special section is dedicated to drug interactions with enzymes of the cytochrome P450 family (CYP). CYPs are mono-oxygenases whose functions in humans include the detoxification of foreign substances via chemical modification. Hence, they play a crucial role in drug metabolism. The features described above were part of the first release, but have now been updated extensively. Among other new features, SuperTarget now integrates protein–protein interaction data from the ConsensusPathDB (25). In addition, target sequence and drug similarities are incorporated in SuperTarget. Drug similarities are computed using Tanimoto coefficients of pre-calculated fingerprints as implemented in SuperPred (26).
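As a brief illustration of the similarity measure named above, the Tanimoto coefficient of two binary fingerprints is the number of bits set in both divided by the number of bits set in either. The following is a minimal sketch only: the fingerprints are toy sets of bit positions invented for this example, and the function name `tanimoto` is hypothetical; it does not reproduce the fingerprint type or the implementation used in SuperPred.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    shared = len(fp_a & fp_b)  # bits set in both fingerprints
    return shared / (len(fp_a) + len(fp_b) - shared)  # shared / bits set in either


# Toy fingerprints (hypothetical bit positions, not SuperTarget data).
drug_x = {1, 4, 7, 9, 12}
drug_y = {1, 4, 8, 9, 15}

similarity = tanimoto(drug_x, drug_y)  # 3 shared bits out of 7 distinct bits
print(f"Tanimoto similarity: {similarity:.3f}")
```

A value of 1 means identical fingerprints and a value of 0 means no bits in common; any similarity threshold applied on top of this score would be a separate design choice.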
The accessibility and presentation of data are as important as their accumulation and integration. The SuperTarget web interface offers a variety of ways to obtain and view information about the drugs and targets:
To provide quick access to the data, a simple full-text search called ‘Targle’ was implemented, which returns hits separately for drugs, targets and pathways.
There is also a dedicated search section for each type of entity: i.e. drugs, targets, pathways, gene ontologies and CYPs. The user is either able to select predefined identifiers or to type in a variety of search terms. For instance, targets can be searched by synonyms, UniProtKB identifiers, PDB identifiers, KEGG target identifiers or EC numbers. The results section for each entity provides detailed information about each instance, which includes SMILES and InChI strings and a list of putative targets for drug results as well as a list of protein–protein interactions and similar targets for target results. All different entities are cross-linked, thus details of putative targets and affected pathways are easily accessible from drug search results.
For more sophisticated searches, an advanced search option is available, which includes general properties, e.g. a desired number of H-bond donors, or characteristics associated with particular drugs or targets such as affinity values. This option allows an arbitrary combination of different search criteria. Hence, the user is able to perform a variety of complex searches.
CASE STUDY: VASCULAR ENDOTHELIAL GROWTH FACTOR RECEPTORS
The following case study illustrates a useful application of the advanced search capability. Pathways that include vascular endothelial growth factors (VEGF) play an important role in angiogenesis and provide targets for anti-angiogenic cancer therapy (27). It has been shown that normalization of tumor vasculature via inhibition of VEGFR-2 reduces hypoxia and cell proliferation and increases accessibility to other chemotherapeutic drugs (28). Therefore, it may be desirable to find specific inhibitors of VEGFR-2 that do not affect the other receptor subtype VEGFR-1. A general idea about the potency of an inhibitor toward a given target is provided by its IC50 value. This refers to the compound concentration that is needed for half-maximal inhibition of the corresponding biological process. Thus, potent inhibitors show low IC50 values whereas weak inhibitors exhibit relatively high ones. For an inhibitor to be considered specific, its IC50 values for the two VEGFR subtypes should differ by at least an order of magnitude (e.g. nM versus µM). Compounds with the desired properties can be searched in SuperTarget as follows:
First, both receptor subtypes are searched by their UniProt name in the target section and are added to the basket. Second, the advanced search section is used to identify potential drugs that inhibit VEGFR-2 but not VEGFR-1. In more detail, the VEGFR-1 receptor subtype is added from the basket to the query as an ‘is ligand of (“VGFR1_HUMAN”)’ search criterion. A range of IC50 values is defined as a second search criterion (1001–100 000 nM for VEGFR-1). This criterion is automatically associated with the first one. The steps mentioned above are then repeated for the second receptor subtype; this time, the desired range for the IC50 value is 0–100 nM. This query returns nine candidates. Since different experiments may suggest different IC50 values for the same biological process, the results have to be manually curated. The corresponding information is shown in the interaction detail sections of the compounds and VEGFR-1 or VEGFR-2, which are accessible from the drug details page of each compound. The results of our analysis suggest that at least six putative drugs with this feature may be worth further investigation, i.e. ChEMBL205413, ChEMBL205610, ChEMBL209919, ChEMBL213507, ChEMBL380397 and ChEMBL398610 (Figure 2).
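The logic of this two-criterion query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurement records and compound names (`cmpd_A`, `cmpd_B`, `cmpd_C`) are invented, the helper names are hypothetical, and the rule that a single measurement within the range satisfies a criterion is an assumption of this sketch, not a description of how SuperTarget evaluates queries internally. Only the UniProt names and the two IC50 ranges are taken from the text.

```python
from collections import defaultdict

# Hypothetical measurements: (compound, target UniProt name, IC50 in nM).
measurements = [
    ("cmpd_A", "VGFR2_HUMAN", 12.0),
    ("cmpd_A", "VGFR1_HUMAN", 4500.0),
    ("cmpd_B", "VGFR2_HUMAN", 40.0),
    ("cmpd_B", "VGFR1_HUMAN", 90.0),
    ("cmpd_C", "VGFR2_HUMAN", 80.0),
    ("cmpd_C", "VGFR2_HUMAN", 2500.0),  # conflicting experiment
    ("cmpd_C", "VGFR1_HUMAN", 20000.0),
]


def any_in_range(values, low, high):
    """True if at least one measurement lies within [low, high] (assumed rule)."""
    return any(low <= value <= high for value in values)


def selective_candidates(rows,
                         strong=("VGFR2_HUMAN", 0, 100),
                         weak=("VGFR1_HUMAN", 1001, 100_000)):
    """Compounds with a strong IC50 on one target and a weak IC50 on the other."""
    ic50 = defaultdict(lambda: defaultdict(list))
    for compound, target, value in rows:
        ic50[compound][target].append(value)
    hits = []
    for compound, by_target in ic50.items():
        if (any_in_range(by_target.get(strong[0], []), strong[1], strong[2])
                and any_in_range(by_target.get(weak[0], []), weak[1], weak[2])):
            hits.append(compound)
    return sorted(hits)


def needs_manual_review(rows, compounds, target="VGFR2_HUMAN", low=0, high=100):
    """Hits that also have a measurement on `target` outside the desired range."""
    flagged = {c for c, t, v in rows
               if c in compounds and t == target and not (low <= v <= high)}
    return sorted(flagged)


candidates = selective_candidates(measurements)
review = needs_manual_review(measurements, candidates)
print("candidates:", candidates)
print("conflicting VEGFR-2 data, check manually:", review)
```

In this toy data set, `cmpd_C` passes the filter but carries a second, much weaker VEGFR-2 measurement, which is the kind of conflict that makes manual curation of the query results necessary.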
In two subsequent steps, the broader cell-biological context of these six drugs can be analyzed. The six putative drugs are added to the basket. In the next step, additional targets of the six putative drugs can be identified. Similarly to the steps mentioned above, for each of the six putative drugs the advanced search option is used to search for targets that are strongly inhibited by the compound, i.e. have IC50 values between 0 and 100 nM. Each additional target is added to the basket. Three of the putative drugs, ChEMBL209919, ChEMBL213507 and ChEMBL380397, also show a strong inhibitory effect on the mast/stem cell growth factor receptor (UniProt name: KIT_HUMAN). ChEMBL398610 exhibits low IC50 values toward the FL cytokine receptor (UniProt name: FLT3_HUMAN), the high-affinity nerve growth factor receptor (UniProt name: NTRK1_HUMAN) and VEGFR-3 (UniProt name: VGFR3_HUMAN).
In a final step, the list of additional targets can be used to identify pathways that are affected by the six putative drugs mentioned in the first paragraph of this case study. Either the advanced search option or the pathway link in the list of results can be used to identify pathways containing VEGFR-2. VEGFR-2 is associated with the human cytokine–cytokine receptor interaction, endocytosis, focal adhesion and VEGF signaling pathways. All of the six putative drugs are likely to have an impact on these pathways. In a similar fashion, additional pathways comprising the mast/stem cell growth factor receptor, which is a target of ChEMBL209919, ChEMBL213507 and ChEMBL380397, are the human hematopoietic cell lineage, melanogenesis and acute myeloid leukemia pathways and pathways in cancer. For the compound ChEMBL398610, the union of pathways that contain at least one of its targets appears to be interesting, i.e. pathways comprising the FL cytokine receptor or the high-affinity nerve growth factor receptor or VEGFR-3. Hence, these three targets are added as pathway search criteria in the advanced search section and combined by OR conjunctions. In addition to the pathways associated with VEGFR-2, the results include the human MAPK signaling, neurotrophin signaling, hematopoietic cell lineage, apoptosis, acute myeloid leukemia and thyroid cancer pathways and pathways in cancer.
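Combining pathway criteria by OR conjunctions amounts to taking the union of pathway sets. The following minimal sketch restates the case-study results as Python sets; the pathway names are those listed above, whereas the variable and function names are hypothetical and the code is not part of SuperTarget.

```python
# Pathways associated with VEGFR-2 in the case study.
VEGFR2_PATHWAYS = {
    "cytokine-cytokine receptor interaction",
    "endocytosis",
    "focal adhesion",
    "VEGF signaling",
}

# Additional pathways reported for the further targets of the compounds.
ADDITIONAL_PATHWAYS = {
    # ChEMBL209919, ChEMBL213507 and ChEMBL380397 (via KIT_HUMAN)
    "KIT inhibitors": {
        "hematopoietic cell lineage",
        "melanogenesis",
        "acute myeloid leukemia",
        "pathways in cancer",
    },
    # Via FLT3_HUMAN or NTRK1_HUMAN or VGFR3_HUMAN
    "ChEMBL398610": {
        "MAPK signaling",
        "neurotrophin signaling",
        "hematopoietic cell lineage",
        "apoptosis",
        "acute myeloid leukemia",
        "thyroid cancer",
        "pathways in cancer",
    },
}


def or_combine(*pathway_sets):
    """OR conjunction of search criteria: the union of all pathway sets."""
    combined = set()
    for pathways in pathway_sets:
        combined |= pathways
    return combined


# All pathways potentially affected by ChEMBL398610.
affected_398610 = or_combine(VEGFR2_PATHWAYS, ADDITIONAL_PATHWAYS["ChEMBL398610"])

# Additional pathways shared by both groups of compounds.
shared_additional = (ADDITIONAL_PATHWAYS["KIT inhibitors"]
                     & ADDITIONAL_PATHWAYS["ChEMBL398610"])

print(len(affected_398610), "pathways potentially affected by ChEMBL398610")
print("shared additional pathways:", sorted(shared_additional))
```

The intersection shows that, beyond the VEGFR-2 pathways, both groups of compounds touch the hematopoietic cell lineage, acute myeloid leukemia and cancer pathways.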
SuperTarget is one of the largest resources of validated drug–target interactions including quantitative data. We carefully assessed the most relevant data sections for each entity and provide links to many other resources. The updated search engine allows users to obtain information starting from a variety of entry points and to perform complex queries.
SuperTarget is available via the web site without registration: http://bioinformatics.charite.de/supertarget. It can be used under the terms of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.
BMBF (MedSys 0315450A), DFG (RTG ‘Computational Systems Biology’ GRK1772 and IRTG ‘Systems Biology of Molecular Networks’ GRK1360), EU (SynSys) and NIH (GM070064). Funding for open access charge: DFG GRK1772.
Conflict of interest statement. None declared.
The authors wish to thank Helen Taubman for proofreading.