Abstract

Developing a drug de novo is a laborious and costly endeavor. Thus, the repositioning of already approved drugs for the treatment of new diseases is promising and valuable. One computational approach to repositioning exploits the structural similarity of binding sites of known and new targets. Here, we review computational methods to represent and align binding sites. We review available tools, present success stories and discuss limits of the approach.

‘The most fruitful basis for the discovery of a new drug is to start with an old drug.’ (Nobel laureate James Black) [1]

INTRODUCTION

Drug repositioning reduces costs and improves efficiency

Since drug development is a very costly and failure-prone venture, the pharmaceutical industry invests in the repositioning of withdrawn or already approved drugs. The costs for bringing such a drug to market are ∼60% lower [2] than those for developing a novel drug, which amount to roughly one billion US dollars [3]. Drug repositioning is the process of finding new indications for existing drugs and is also known as drug repurposing or reprofiling. The benefits for the pharmaceutical industry are manifold [4]. Besides the lower drug development costs and the reduced time for approval, a new drug application can help to extend the patent life and increase the return on the investment in the development of the drug. Moreover, the success rate in the repurposing process is considerably higher. Drug repositioning is also promising for the roughly 200 compounds [5] that are shelved by the pharmaceutical industry every year because they failed in clinical trials and were not investigated further. These drugs could be quickly marketed for new indications [5], thus reducing attrition rates [6]. For small companies and public research in particular, repositioning off-patent drugs [7] offers the opportunity to participate in a process they could otherwise hardly afford. Drug repositioning is also of high value for patients [8]. Medications for previously untreatable diseases can be found while knowledge about possible side effects, pharmacokinetics and interactions with other drugs already exists. For a comparison of de novo drug discovery and drug repositioning see [9].

For rare diseases, drugs are often not available. Such orphan drugs need to be developed (or discovered) [10]. For this purpose, the US Food and Drug Administration (FDA) launched a database [11] of approved drugs that are promising candidates for repositioning to orphan diseases.

Exploiting binding site similarity for drug repositioning

There have been a number of approaches to computational drug repositioning, including similarity of side effects mined from literature [12, 13], similarity of gene expression profiles of different diseases [14] and structural similarity of binding sites [15]. In this review, we focus on the latter, since it provides insight into the mode of action of the drug and since there are already some success stories. Although there are drugs binding to other biological targets, we will concentrate on proteins here. To complement the space in the protein–drug interactome, which is not covered by this approach, we briefly introduce some other methods at the end of this review.

To reduce attrition rates, and to take advantage of the ever-growing amount of available data, computational methods for finding alternative targets (or similar binding sites) are becoming important early in the drug discovery process. The examples of staurosporine binding to synapsin I [16] and the high-affinity binding of celecoxib to carbonic anhydrases [17], as well as many others, show that similar binding sites exist among different proteins. It is reasonable to conclude that similar binding sites most likely bind the same ligands. This observation can be exploited by using binding site comparison methods to find new targets for known drugs, eventually leading to drug repositioning (Figure 1).

Figure 1:

The drug repositioning process using binding site similarity. Two proteins A and B have similar binding sites and thus can be aligned (C). The found binding site similarity suggests that the ligand D may also bind to the protein B. This gives a candidate for drug repositioning.

Shedding light on the protein–drug interaction space helps to better understand drug modes of action and can help to reduce drug doses. The identification of off-targets gives the opportunity to optimize drugs to gain a higher selectivity and thus reduce side effects.

Promiscuity of drugs

The prerequisite for drug repositioning is polypharmacology or drug promiscuity, meaning that one drug binds to multiple distinct targets. Recent experiments have shown that promiscuity of proteins in ligand binding as well as in function is not as rare as previously thought [18, 19]. Moreover, degeneracy (partial redundancy) was found to be a key design principle in biological systems [20], since it increases the adaptability of an organism to environmental changes. Promiscuous drugs are common [21] and thus there is an enormous potential to find novel targets for already known drugs based on the approved targets. The definition of a drug target is often not clear. If one defines a drug target as a protein to which a drug molecule binds physically, there are about 320 known targets for approved drugs [22]. Applying a more loose definition and including experimental drugs can easily lead to 6000 or more ‘drug targets’ [23].

A study of a data set of 276 122 bioactive compounds revealed that 35% of them are known to bind to more than one target [24]. A quarter of these were found to bind to proteins from different gene families. In a recent analysis of 189 807 bioactive compounds in PubChem, 62% were found to bind to more than one target [25]. Among the remaining 38%, about half were highly selective. The permissive binding of a drug to off-targets can be the cause of adverse side effects but may, in contrast, also increase its efficacy, as reported for anticancer and antipsychotic drugs [21, 26, 27]. Moreover, the old ‘one drug, one target’ paradigm is eroding and so-called ‘dirty drugs’ are gaining popularity [26, 28, 29]. Similarities in the key pharmacophores of structurally different proteins can lead to high-affinity binding of the same drug [17]. Obviously, proteins within the same family are more likely to show this phenomenon. One way to identify promiscuous proteins is chemocentric and exploits chemical similarity among the ligands [30, 31]. This approach is weakened by the finding that chemical similarity seems to contribute much less to similarity in the biological activity profile than the established dogma suggests [32]. The other prospect is the binding site centric chemical space, which is likely to become the new paradigm in drug discovery [15] and is the focus of this review.

Examples of repositioned drugs

Drug repositioning is particularly valuable for neglected tropical diseases, because the development of new drugs for such diseases is not profitable. In particular, the issue of emerging resistances can be addressed by repositioning. For example, eflornithine was developed as an anticancer compound and is now used against excessive facial hair in women [33] and, more importantly, African trypanosomiasis (sleeping sickness) [34]. Eflornithine inhibits the ornithine decarboxylase in humans as well as in the protozoa. These proteins are very similar in sequence and binding site. The drugs pentamidine, amphotericin B (originally an antifungal drug) and miltefosine were all repositioned from other indications for the chemotherapy of leishmaniasis [35]. In 2007, the anti-estrogen drug tamoxifen, used in breast cancer therapy, was shown to be very effective against leishmaniasis [36]. These examples show that repositioned drugs play a central role in the treatment of these diseases. By screening a library of 2687 mostly approved drugs in vitro, the antihistamine astemizole was identified as an antimalarial agent [37]. For the neglected tropical disease onchocerciasis (river blindness), the veterinary anthelmintic closantel was proven to be active against the nematode causing this disease [38]. In vitro screening against the parasite’s chitinase identified closantel as being highly active against it, which was previously not known. Thus, drug repositioning approaches can also help to shed light on the mechanism of action of approved drugs. Combinations of approved drugs (multidrug cocktails) can yield treatments for diseases that are otherwise untreatable. The identification of such cocktails can also be supported by in silico methods [39] but is not the subject of this review. For example, a combination of the β-lactamase inhibitor clavulanate with a β-lactam antibiotic is effective against extensively drug-resistant Mycobacterium tuberculosis [40].

HIV protease inhibitors and other anti-HIV drugs were found to be promising cancer therapeutics [41, 42]. Beneficial effects in cancer treatment were also observed for several cyclooxygenase inhibitors. For one of these, namely sulindac, Lee et al. [43] showed recently that it binds to the protein dishevelled, a key part of the Wnt-signaling pathway. Unregulated Wnt signaling is found in many cancers. Thus, sulindac forms a new lead in the development of new cancer drugs [44]. Earlier, Weber et al. [17] showed that the binding site of celecoxib in cyclooxygenase-2 has a close analog in carbonic anhydrase. In experiments, they showed nanomolar binding of celecoxib to the carbonic anhydrase isoenzymes. This offers a new drug for the treatment of glaucoma and possibly also for cancer.

There are many more examples of (potential) drug repositionings described in the literature [2, 9, 45–49]. As for the prominent example of sildenafil (Viagra) [50], many of the known repurposings are serendipitous findings.

Identification of new drug targets

To systematically find new drug targets, and thus candidates for repositioning, one can choose between various techniques. In the best case, a combination of the techniques introduced in the following would be applied, thus allowing to find hits that were not found by one particular method.

Non-in silico approaches involve noninvasive imaging, in vivo disease animal models, improved bioanalytics and in vitro screening [4, 5]. Noninvasive imaging is a bioluminescence technique in which mice are genetically engineered to express luciferase when the test compound hits a particular protein. Improved bioanalytics is based on administering the drug to rats and examining their biological fluids. Liquid chromatography and mass spectrometry are employed to quantify interesting metabolites. For example, a drop in the glucose level makes a drug a possible antidiabetic agent [5]. In vitro target-based screening with already known drugs can yield compounds that could be directly used in clinical trials [48]. These assays are normally employed stepwise, starting with sentinel assays followed by specific assays, for example, of proteins in the hit pathway. Particular problems with this approach arise when applying it on a large scale. Building a chemical library of all approved or clinical trial drugs is very challenging. It is estimated that there are around 9000 such drugs [2]. Such a comprehensive chemical library is not yet available, and building one will be expensive. In particular, obtaining compounds under patent will be problematic and costly [2], but there are efforts in this direction [2].

In silico approaches can help not only to avoid animal testing but also to screen compounds and to significantly decrease drug development costs. They make accessible and processable a huge amount of data that eludes large-scale experimental investigation, e.g. due to patents. The retrieval of data is no longer an issue, since most drug structures, mapped to a wealth of additional information, are freely available online [25, 51–54] together with a fast-growing amount of structural data of biological macromolecules in the Protein Data Bank (PDB) [55, 56] and complementing databases [57, 58]. With the huge number of software tools to support the drug repositioning process, there is nowadays the opportunity to complement and partly replace the aforementioned methods by in silico approaches.

By discussing binding site similarity in support of drug repositioning in depth and outlining several other in silico approaches, this review aims at supporting systematic drug repositioning.

EXPLOITING BINDING SITE SIMILARITY FOR DRUG REPOSITIONING

One of the reasons for polypharmacology is the presence of similar binding sites among proteins. These proteins either belong to the same family, thus showing a similar tertiary structure or are distinct in their overall structure. The promiscuous binding of drugs is a major cause of (adverse) side effects, but at the same time, opens the door for the repositioning of (withdrawn) drugs.

The binding site centric chemical space seems to become the new paradigm in structure-based drug discovery [15]. The wealth of tools available (as downloadable executable or webserver) for binding site comparison [15, 59–68] together with the ever growing structural data in the PDB gives the opportunity to extensively use protein structural data to find new drug targets and thus, identify candidate drugs for repositioning. The aim of all these methods is to reveal structural similarity in protein binding site pairs, although not necessarily similar in sequence or fold. Searching the whole PDB for a specific binding site can identify unexpected potential off-targets right at the beginning of the drug discovery process.

Binding site representation

Current binding site comparison tools allow the quantification of local similarity between protein structures. The algorithms underlying the available tools are mostly based on geometrical comparisons. One thing is common to all local structural comparison methods: they abstract from the binding site, converting it into a discrete, simplified representation. How they do so is what distinguishes them.

Binding sites can be represented in many ways: examples are pharmacophoric features, atoms or whole residues. These are represented by one or more points and are labeled according to their physicochemical or geometric properties. There are also approaches, which abstain from such a labeling and use, e.g. substitution matrices to decide whether an amino acid alignment is favorable [69, 70]. Only surface residues, or at least such residues that are close to the surface, are considered. The binding pockets of a protein are either identified by the analysis of the surface, by cocrystallized ligands or by user input. Some tools also analyze the complete protein surface. The points can be assigned grid-based or irregularly, represent parts of the residues (Cα- or pseudo atoms-like centers of functional groups), individual atoms or neighborhood properties. Such points are called features. The resulting feature patterns of two binding sites are then structurally aligned such that the number of matched features is optimized. The binding site features are either represented as geometric patterns or as numerical fingerprints [59].

Alignment algorithms

The local structural alignment programs typically first create the reduced geometrical representation of the binding site. This is to reduce the complexity of the pair-wise comparison. Second, properties are assigned to the points. Finally, the binding sites are aligned. There are three key approaches for binding site superimposition.

The simplest method for the alignment is iterative search for the best translation/rotation. The largest alignment is normally considered best. Because this method is slow, other approaches are preferred [59].

Grouping close points into triplets and testing all possible triplets reduces the pattern complexity. Geometric matching of such triplets gives the transformations for the superimposition. A more efficient method is geometric hashing [71], which maps all triangles similar in shape, regardless of their orientation, to identical or very close hash slots. Searching for all triplets of the query cavity in the hash keys of the target hash table and aligning these triplets gives a transformation for each pair of triplets. The most frequent transformation corresponds to the largest alignment [59].
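The lookup step of geometric hashing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any tool cited here: the function names, the use of sorted and binned side lengths as the hash key, and the bin width of 1 Å are all hypothetical choices. The sketch ignores feature labels, cannot distinguish mirror-image triangles, and stops before the step in which a transformation is computed for each triplet pair and the most frequent one is selected.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def triangle_key(p, q, r, bin_width=1.0):
    """Orientation-independent key: the three side lengths, sorted and binned."""
    sides = sorted((dist(p, q), dist(q, r), dist(p, r)))
    return tuple(int(s // bin_width) for s in sides)


def build_hash_table(points, bin_width=1.0):
    """Map each triangle key to the index triplets of the target site that produce it."""
    table = defaultdict(list)
    for triplet in combinations(range(len(points)), 3):
        p, q, r = (points[i] for i in triplet)
        table[triangle_key(p, q, r, bin_width)].append(triplet)
    return table


def matching_triplets(query_points, target_table, bin_width=1.0):
    """Look up every query triplet in the target table; return (query, target) triplet pairs."""
    hits = []
    for triplet in combinations(range(len(query_points)), 3):
        p, q, r = (query_points[i] for i in triplet)
        key = triangle_key(p, q, r, bin_width)
        for target_triplet in target_table.get(key, []):
            hits.append((triplet, target_triplet))
    return hits


# Toy target site with four feature points (coordinates in Å, made up for illustration).
target = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (1, 1, 5)]
# Query: the first three target points rotated by 90° about z and shifted by (10, 10, 10).
query = [(10, 10, 10), (10, 13, 10), (6, 10, 10)]

table = build_hash_table(target)
hits = matching_triplets(query, table)
```

With hard bin edges, two nearly identical triangles can fall into neighboring slots, so a practical variant would also probe adjacent bins; that refinement is left out here for brevity.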

Clique detection is another frequently used method. A graph is constructed for each of the query (Q) and the target (T) binding sites. The points are represented as nodes colored according to the points’ labels (often physicochemical properties like hydrogen bond donor/acceptor). Edges connect all nodes within a defined distance range (e.g. ≤12 Å) and are labeled with the distance between the points. Nodes q and t from Q and T are paired if they are colored identically. These node pairs [q, t] are used to construct the product graph P. An edge in P is formed between [q, t] and [r, u] if and only if (q, r) ∈ E_Q and (t, u) ∈ E_T and the distance labels of the edges (q, r) and (t, u) agree within a tolerance (e.g. ±1.5 Å). The largest clique in P then corresponds to the largest alignment [59]. Clique detection as well as matching of triplets favors the detection of local motifs. This graph-based method is used, for example, in the ProBiS algorithm, which uses pseudo atoms representing chemical functional groups [61], and in the Sequence-Order Independent Profile–Profile Alignment (SOIPPA) algorithm, which uses a Cα representation of the amino acid residues to achieve a better applicability to low-resolution structures and homology models [72, 73].
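The product graph construction and the search for its largest clique can be sketched in a few lines. The following is an illustrative toy under stated assumptions, not the code of ProBiS or SOIPPA: the function names, the feature labels and the coordinates are hypothetical, the thresholds are the example values from the text, and the clique search is a plain Bron–Kerbosch enumeration with pivoting, which is exponential in the worst case.

```python
from itertools import combinations
from math import dist


def product_graph(query, target, max_dist=12.0, tol=1.5):
    """Build the product graph of two sites given as lists of (label, (x, y, z)).

    Nodes are pairs (q, t) of identically labeled points. Two nodes are joined
    if both underlying point pairs lie within max_dist and their distances
    agree within tol.
    """
    nodes = [(q, t)
             for q, (q_label, _) in enumerate(query)
             for t, (t_label, _) in enumerate(target)
             if q_label == t_label]
    adj = {node: set() for node in nodes}
    for (q, t), (r, u) in combinations(nodes, 2):
        if q == r or t == u:
            continue  # a point cannot be matched twice
        d_query = dist(query[q][1], query[r][1])
        d_target = dist(target[t][1], target[u][1])
        if d_query <= max_dist and d_target <= max_dist and abs(d_query - d_target) <= tol:
            adj[(q, t)].add((r, u))
            adj[(r, u)].add((q, t))
    return adj


def largest_clique(adj):
    """Return one largest clique (Bron-Kerbosch with pivoting)."""
    best = []

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = list(clique)
            return
        pivot = max(candidates | excluded, key=lambda v: len(adj[v] & candidates))
        for v in list(candidates - adj[pivot]):
            expand(clique + [v], candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand([], set(adj), set())
    return best


# Toy sites (labels and coordinates in Å are made up for illustration).
query = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobic", (0, 4, 0))]
target = [("donor", (10, 10, 10)), ("acceptor", (10, 13, 10)),
          ("hydrophobic", (6, 10, 10)), ("donor", (30, 30, 30))]

alignment = sorted(largest_clique(product_graph(query, target)))
```

In this toy case the distant second donor in the target is excluded by the distance range, and the clique pairs each query point with its rotated and shifted counterpart.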

Another method, based on numerical fingerprints of the binding sites, is used in SiteAlign [64]. The fingerprints are derived by projecting Cβ atoms and their side chains’ physicochemical properties onto a discretized sphere. It employs a fuzzier matching of binding sites, allowing a variation in structure of ∼3 Å [root mean square deviation (RMSD)]. Hoffmann et al. [63] recently introduced an approach that groups binding sites into ‘clouds’ of labeled atoms, which they then align. Their rationale was to provide a method allowing the identification of similar pockets, independently of individual atoms.

Alignment-free binding site comparison methods [66, 68] have emerged in the last 2 years and are reported to be as accurate as alignment-dependent tools. They are also suitable to detect local similarity in the binding sites [74].

Scoring functions

To judge the quality of a binding site alignment, a scoring function has to be applied. The scoring can be directly incorporated in the alignment process, excluding bad scoring alignments. A key influence to the score is the number of aligned points. Basic scores are Tanimoto-index like (number of aligned features divided by the total number of features). The matched points can also be weighted (e.g. by RMSD) to account for geometry or other (physicochemical) properties. Mismatches are normally not penalized, which enables the detection of partial similarities (local motifs) and the comparison of cavities of different size [59]. Amino acid substitution matrices have also been successfully applied to identify similar binding sites [69]. Normalized scores permit the ranking of different alignments of one query to different target binding sites. A possible normalization is dividing the target–query score by the query–query score. Scores have to be adjusted with the application in mind: residues interacting with the ligand, e.g. through hydrogen bonds or π-stacking, could be ranked higher when aligned.
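The two basic scores mentioned above can be written down directly. The sketch below is only one possible reading: it assumes that ‘the total number of features’ means the number of distinct features in both sites, which gives the usual Tanimoto form, and the function names are hypothetical.

```python
def tanimoto_like(n_aligned, n_query, n_target):
    """Aligned features divided by the number of distinct features in both sites."""
    return n_aligned / (n_query + n_target - n_aligned)


def normalized_score(target_query_score, query_query_score):
    """Normalize a raw alignment score by the score of the query aligned to itself."""
    if query_query_score <= 0:
        raise ValueError("query-query score must be positive")
    return target_query_score / query_query_score


# Toy numbers: 6 aligned features between a 10-feature query and an 8-feature target.
similarity = tanimoto_like(6, 10, 8)
# Raw score of 6 against the target, 10 for the query aligned to itself.
relative = normalized_score(6.0, 10.0)
```

Because the normalized score is relative to the query's self-alignment, hits from targets of different sizes can be ranked on a common scale for one query, but scores from different queries are not directly comparable.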

Successful alignment

Apart from the method used for the binding site alignment and the scoring function, there are three critical parameters that influence the success of the site alignment: (i) the resolution of the binding site, (ii) the distance tolerance used and (iii) the complexity of the feature descriptors. Methods that use a comprehensive representation (i.e. one point for each atom), such as grid-based methods, are therefore more prone than Cα atom-based descriptions to false negatives caused by slight changes in the binding sites. The sensitivity to atomic coordinates should be low for the comparison of unrelated proteins. Nevertheless, side chain orientation is important to consider because otherwise, residues buried in the protein core could be aligned to a binding site residue [59]. On the other hand, a residue that is buried in the crystal structure could become an important residue for ligand interaction due to a conformational change upon binding. Thus, a fuzzier binding site representation allowing a higher flexibility of the side chains should be preferred, even if this leads to more false positives.

Case studies

Before we discuss the limits of these approaches, we review successful applications (Tables 1 and 2).

Table 1:

Examples of the usage of binding site comparison tools to identify new drug targets

Name | Purpose | Targets | Validated | URL
SiteAlign [64] | New target of staurosporine [16] | 134 | | bioinfo-pharma.u-strasbg.fr
SOIPPA [72] | Targets of entacapone and tolcapone [76] | – | | funsite.sdsc.edu
 | Off-targets of cholesteryl ester transfer protein inhibitors [77] | 204 | – |
 | Alternative targets [73] | 87 | |
Minai et al. [78] | Alternative targets [78] | 540 | |

The table lists the name of the tools together with the number of proposed new targets and the number of experimentally validated targets.

Table 2:

Examples of drugs, which could be shown to bind to targets unknown so far and have similar binding sites in the already known and the new found target

Drug name | Known target | New target | Indication (possible)
Eflornithine | Ornithine decarboxylase (Homo sapiens) | Ornithine decarboxylase (Trypanosoma brucei) [34] | Sleeping sickness
Celecoxib | Cyclooxygenase-2 | Carbonic anhydrase [17] | Glaucoma/cancer
Staurosporine | Pim-1 kinase | Synapsin I [16] | –
Entacapone | Catechol-O-methyltransferase | Enoyl–acyl carrier protein reductase [76] | Tuberculosis
4,5-dihydroxy-3-(1-naphthyldiazenyl)-2,7-naphthalenedisulfonic acid | RNA editing ligase 1 (T. brucei) | Mitochondrial 2-enoyl thioester reductase (H. sapiens) [73] | –
 | | UDP-galactose 4′ epimerase (T. brucei) [73] | Sleeping sickness
Ibuprofen | Phospholipase A2 | Porcine pancreatic elastase [78] | –

A possible disease application is given if known.

An issue when designing kinase inhibitors is the high binding site similarity among the 518 kinases in the human genome. Milletti and Vulpetti [75] use binding affinity data of 17 kinase inhibitors bound to 189 kinases in the PDB to provide a predictive model for kinase inhibitor binding (they term it ‘inhibition map’). In the employed binding site comparison algorithm, the pocket is represented by points in space lying on spheres of radii between 2 and 16.8 Å, assigning the physicochemical properties of the atoms to the points. For each binding site atom, such spheres constitute its fingerprint. To address side chain flexibility, points representing atoms close to the Cα carry the physicochemical properties of the side chains. Using this method to screen the whole PDB, they identified and confirmed kinase targets which were not in their original data sets. Their prediction of kinase inhibitor promiscuity resulted in many false positives with respect to the predicted targets, compared to the data sets of binding affinity values.

De Franchi et al. [16] compared the staurosporine binding pocket (using SiteAlign [64], described earlier) in the Pim-1 kinase to over 6000 protein ligand binding sites and found synapsin I (involved in the regulation of neurotransmitter release) as a potential target for staurosporine. They experimentally validated the binding of staurosporine (Kd = 0.3 μM) and other kinase inhibitors to synapsin I.

Kinnings et al. [76] used an approach involving binding site comparisons to identify alternative targets for a marketed drug. Their pipeline (Figure 2) consists of: (i) the extraction of the known drug binding site from a 3D structure; (ii) the identification of similar binding sites (using the SOIPPA algorithm [72], which employs clique detection and a Cα representation); and (iii) docking to the putative target to assess atomic interactions. With this approach, they were able to identify and experimentally verify a new target of entacapone. This drug is used in the treatment of Parkinson's disease, where its confirmed target is the catechol-O-methyltransferase. Applying binding site comparison, they identified a similar pocket in the enoyl–acyl carrier protein reductase. Using entacapone to inhibit this protein, which is essential for the fatty acid biosynthesis in M. tuberculosis, offers a way to treat infections with such multiple drug resistant strains. This is because here the resistance-mediating prodrug state is avoided [76]. The two proteins lack similarity in sequence, but show a similar tertiary structure. Xie et al. [77] used this approach to find off-targets of cholesteryl ester transfer protein inhibitors.

Figure 2:

The workflow used in many studies to find new targets based on already known ones. Information about known drug–target pairs are used to identify the binding site of the query protein(s). Structures from the PDB and/or theoretical models are obtained. To reduce the set of structures, which have to be compared, e.g. sequence clustering can be applied. Subsequently, the structures are aligned. Additionally, methods, like docking the ligand of the query binding site to the target binding site, can be used to get more confident hits and to reduce the result data set.

The aforementioned binding site comparison algorithm [72] was subsequently used in another multidimensional approach [73] to identify potential alternative targets, i.e. binding sites similar to the known binding site (query site) of a given drug; an inhibitor of the Trypanosoma brucei RNA editing ligase 1 (TbREL1) serves as the example in that article. The approach of Durrant et al. [73] (see also Figure 2) consists of: (i) clustering 110 000 protein chains in the PDB (as of 2007) and picking a representative for each cluster, resulting in 12 646 clusters with 30% sequence identity; (ii) selecting, with the SOIPPA algorithm, 218 cluster representatives that potentially have a binding site similar to the query site; (iii) expanding the set of protein chains by adding all cluster members of step (ii) and restricting the set to human proteins, which leads to 654 protein chains; and (iv) docking the drug to each of the 654 chains and subsequent selection, which gives the final dataset of 87 probable protein targets (including 12 human targets) for the query drug. Most of the human proteins are involved in metabolism and DNA synthesis, repair and replication. The four targets with the highest docking scores were tested experimentally. Two of the predicted targets showed an inhibition in the micromolar range (IC50 of 33.5 μM and ∼0.7 μM), while the other two remained uninhibited. Moreover, the first two did not show global similarity in sequence or structure.

Minai et al. [78] demonstrated the applicability and usefulness of binding site comparison to find unknown protein–drug interactions (Figure 2). For each solvent-accessible binding site atom i, they computed a feature vector, which encodes the molecular neighborhood of i in terms of physicochemical properties depending on the distance to the atom i (maximum distance is 3.2 Å). Triplets of atoms from the query and the target binding site are used to compute an optimal alignment. The applied similarity score is based on Euclidean distance and Tanimoto coefficient calculations. They used a non-redundant representation of the PDB (9708 chains, as of November 2004) and compared the 48 347 binding sites therein with each other, resulting in 10 403 ligand binding site pairs with similar binding site structures but dissimilar folds. For 281 of these, they found similar ligands with a similar binding mode in the PDB structures. In addition, they could show that the pairs found correlate with biological activity data in the PubChem BioAssay database. One novel protein–drug interaction (ibuprofen binding to porcine pancreatic elastase; the original target is phospholipase A2) was validated in a nuclear magnetic resonance (NMR) experiment. In total, they predicted 540 proteins (with 209 different binding sites) binding 105 drugs by docking these ligands to the appropriate pockets.

Park and Kim [69] studied the relationship between ligands, in terms of their protein binding sites, by constructing and analyzing a network with ligands as nodes, which were connected by an edge if they were bound to similar binding sites. They used a clique detection algorithm to perform an all-against-all comparison of about 15 000 binding sites, each binding one of 4208 ligands. In the graphs compared by clique detection, nodes are the amino acid residues represented by their Cα atoms. Physicochemical properties of the amino acids are ignored. Instead, they used the BLOSUM62 matrix to score amino acid substitutions according to their empirical frequency. The score from the substitution matrix is applied if two residues match spatially, and a score of −10 is applied for a mismatch. Additionally, binding sites have to be of similar size to be compared (the larger at most 1.4-fold the size of the smaller). Such a simple representation adds tolerance to deviations in the binding site structures of two proteins (low-resolution structures). Using EC numbers (enzyme classification according to the catalyzed chemical reaction), they demonstrated that there is a strong relation between the ligand binding site and protein function.
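The scoring scheme described for this study can be illustrated with a short sketch. It is a loose reconstruction under stated assumptions rather than the authors' code: the function names are hypothetical, the dictionary holds only a small excerpt of BLOSUM62 entries, a ‘mismatch’ is assumed here to mean a residue left without a spatial partner, and binding site size is assumed to be counted in residues.

```python
# Small excerpt of the BLOSUM62 matrix; keys are alphabetically sorted residue pairs.
BLOSUM62_EXCERPT = {
    ("D", "D"): 6, ("D", "E"): 2, ("E", "E"): 5,
    ("K", "K"): 5, ("K", "R"): 2, ("R", "R"): 5,
    ("I", "I"): 4, ("I", "L"): 2, ("L", "L"): 4,
}
MISMATCH_SCORE = -10   # penalty per mismatch, as stated in the text
MAX_SIZE_RATIO = 1.4   # larger site at most 1.4-fold the smaller one


def substitution_score(res_a, res_b):
    """Look up the symmetric substitution score for two one-letter residue codes."""
    return BLOSUM62_EXCERPT[tuple(sorted((res_a, res_b)))]


def comparable(size_a, size_b):
    """Check the size criterion before two sites are compared at all."""
    small, large = sorted((size_a, size_b))
    return large <= MAX_SIZE_RATIO * small


def site_score(matched_pairs, n_mismatches):
    """Sum substitution scores over spatially matched residues, then add penalties.

    Treating n_mismatches as the count of residues without a spatial partner
    is an assumption of this sketch.
    """
    matched = sum(substitution_score(a, b) for a, b in matched_pairs)
    return matched + MISMATCH_SCORE * n_mismatches


# Toy alignment: three spatially matched residue pairs and one mismatch.
score = site_score([("D", "E"), ("K", "K"), ("L", "I")], n_mismatches=1)
```

The heavy mismatch penalty relative to typical substitution scores means that, in this reading, a single unmatched residue can outweigh several conservative matches, which favors near-complete correspondences between sites of similar size.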

Gold and Jackson [79] used geometric hashing with binding sites represented on atom level to build a database of binding sites (SitesBase).

DISCUSSION

The aforementioned examples show that protein ligand binding sites can be similar across very different proteins. Indeed, similar binding sites may bind the same ligand. Binding site comparison tools offer a fast way to exploit these similarities to identify new drug targets in silico, saving money and time. Their advantage over chemocentric approaches like virtual ligand screening (reverse docking) is that the sampling of the ligand conformational space is avoided. This avoids hits caused by biologically inactive ligand conformations [74].

In the following, we will discuss problems and limitations, arising when binding site comparisons are employed. These involve the selection of an appropriate tool, computation time, the lack of structures and the clustering of these, as well as the flexibility of protein binding sites and the binding modes of ligands.

Choice of the algorithm

The decision which algorithm or tool to use should always be driven by the kind of investigation and its specific requirements. The resolution of the structure has a major impact on the identification of analogous binding sites. If low-resolution structures and/or homology models are to be employed, then a fuzzier representation of the binding site (e.g. side chains represented by Cα atoms), which allows for side chain flexibility, should be the method of choice. A tolerance (of e.g. 3 Å) can be employed in the alignment to achieve more fuzziness. For finding new drug targets, the detection of local similarities in the binding sites (as in clique-based or geometric hashing algorithms) should be favored over whole binding site alignments. See Table 1 for examples of applied tools.
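
To make the idea of a fuzzy, clique-based local comparison more concrete, the following Python sketch matches two binding sites given as lists of residue type and Cα coordinate. It is a minimal illustration under stated assumptions and not the algorithm of any particular tool: only identical residue types are paired, two pairings are compatible if the corresponding intra-site distances differ by at most the tolerance, and the largest compatible set is found with a plain Bron–Kerbosch search, which is only practical for small sites. All names and coordinates are hypothetical.

```python
import math
from itertools import combinations

def correspondence_graph(site_a, site_b, tol=3.0):
    """Nodes pair residues of equal type; edges join distance-compatible pairs."""
    nodes = [(i, j)
             for i, (res_a, _) in enumerate(site_a)
             for j, (res_b, _) in enumerate(site_b)
             if res_a == res_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each residue may be used only once
        d_a = math.dist(site_a[i][1], site_a[k][1])
        d_b = math.dist(site_b[j][1], site_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj

def max_clique(adj):
    """Largest clique found by Bron-Kerbosch enumeration without pivoting."""
    best = []

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = clique
            return
        for v in list(candidates):
            expand(clique + [v], candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand([], set(adj), set())
    return best

# Hypothetical sites: (residue type, C-alpha coordinates in Angstrom).
site_a = [("D", (0.0, 0.0, 0.0)), ("H", (4.0, 0.0, 0.0)),
          ("S", (0.0, 5.0, 0.0)), ("G", (9.0, 9.0, 9.0))]
site_b = [("S", (1.0, 6.0, 1.0)), ("D", (1.0, 1.0, 1.0)),
          ("H", (5.0, 1.0, 1.0)), ("D", (30.0, 30.0, 30.0))]

print(sorted(max_clique(correspondence_graph(site_a, site_b))))
```

The matched subset (here the D, H and S residues) is a local similarity: the remaining residues of both sites are simply ignored, which is the behavior favored above for finding new drug targets. A larger tolerance admits more edges and therefore larger, but less specific, matches.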

Computation time

The computation time needed for a single alignment should also be considered. Tools with lower computation time requirements are more suitable for large-scale comparisons. In general, geometric hashing is faster than clique detection. The time needed for a single binding site comparison lies between 50 ms and 5 min [59]. For large-scale binding site comparisons, tools running on local computers are more suitable, whereas for single case studies, webserver implementations offer a fast and straightforward way to compare binding sites.
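
A back-of-the-envelope calculation shows how quickly these per-comparison times add up in an all-against-all setting. The Python sketch below assumes a strictly serial run on a single processor and a data set of 15 000 binding sites; both are assumptions chosen for illustration, and only the two per-comparison times are taken from the range quoted above.

```python
SECONDS_PER_DAY = 86_400

def n_pairs(n_sites):
    """Number of unordered pairs in an all-against-all comparison."""
    return n_sites * (n_sites - 1) // 2

def serial_runtime_days(n_sites, seconds_per_comparison):
    """Total serial runtime in days for an all-against-all comparison."""
    return n_pairs(n_sites) * seconds_per_comparison / SECONDS_PER_DAY

n_sites = 15_000  # assumed size of a large-scale data set
for label, seconds in (("50 ms", 0.05), ("5 min", 300.0)):
    days = serial_runtime_days(n_sites, seconds)
    print(label, round(days, 1), "days")
```

Under these assumptions the fast end of the range corresponds to roughly two months of serial computing time, while the slow end corresponds to more than a thousand years, so that only the faster tools (or massive parallelization) are realistic at this scale.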

Protein structures

Despite its many opportunities, this approach suffers from a lack of structural data. Swiss-Prot currently contains 522 019 sequences [80], while the PDB currently holds 64 346 protein structures containing 39 895 unique sequences [81]. Furthermore, of the 1504 drug targets (with assigned UniProt IDs) in the Therapeutic Target Database [51], only 753 (about half) are present in the PDB (internal analysis, V. Joachim Haupt, unpublished data). The gap between the structure and sequence space can at the moment only be filled by theoretical models [82–84]. Clearly, this gain in structural data comes at the cost of accuracy. Currently, homology modeling gives the highest overall accuracy. The quality of these models mainly depends on the existence of appropriate templates (homologs with a high sequence identity). For close homologs, state-of-the-art tools in most cases calculate models with an RMSD of about 2 Å from the experimental structure. To achieve this, a sequence identity above 35% is mostly sufficient (according to data from CASP7 and CASP8) [85]. Nevertheless, if proper templates are absent, theoretical models are still far from reliable.

Clustering

Sequence-based clustering can be applied to notably reduce the set of binding sites to compare. This strategy is used in several studies aiming at finding new drug targets, but it has a major problem: only a cluster representative is used in the subsequent binding site comparisons. With a decreasing degree of sequence identity used for the clustering, the probability of different conformations (and thus different binding sites) within the clusters grows. Again, a single amino acid substitution is sufficient to prevent ligand binding. Thus, in our opinion, clustering should be employed, if at all, only with a high sequence identity of ≥95%.

Binding site and flexibility

Since most of the algorithms require the definition of a binding pocket as input, the result is also influenced by this definition. A binding site can be picked with automated methods [15, 67, 86]. In recent years, big advances have been made in this area. Nevertheless, the analysis of the protein surface is still a challenge and the identification of pockets is not always successful. Besides the difficulties in extracting continuous surface patches, the binding site might be ‘hidden’: due to the flexibility of the binding site, the binding pocket may not form before ligand binding [15]. This makes its identification nearly impossible if a structure with a co-crystallized ligand is absent.

The inherent flexibility of the binding pocket is a general problem and a known limitation in structure-based drug discovery. Only a small portion of the protein–ligand complexes comply with the old ‘lock and key’ hypothesis of Emil Fischer. The vast majority of the protein–ligand complexes are likely to be explained by induced fit and conformational selection theory [15]. Thus, the query binding site should be taken from a structure with a co-crystallized (drug) ligand. Even better would be the use of an ensemble of query binding sites (populated and unpopulated conformers) for subsequent comparisons. This helps to partly overcome the aforementioned limitations. Some strategies were developed [87] to select suitable conformations. However, modeling pocket flexibility is still a challenge that is only partly addressed by molecular dynamics or by the use of multiple solved structures. Finally, scoring functions, which assess the similarity of two binding sites, are problematic, as they are in virtual ligand docking [88]. Not only may the binding site adopt an alternative conformation upon ligand binding; the ligand may also show tremendously different conformers in different pockets. A principal issue of binding site comparison tools is their sensitivity to atomic coordinates, which is probably the main reason for the discrepancy between the number of available tools and the number of reported candidates for repositioning.

A general drawback of the new targets proposed by binding site comparisons is the occurrence of false positives. To find similar binding sites, a balance has to be struck between accuracy (a single residue substitution can prevent ligand binding) and fuzziness (small differences in the binding sites should be ignored, since the ligand binds anyway). This balancing act between the two contradictory rationales is hard, if not impossible, to perform. Taking ligand information into account (e.g. protein–ligand fingerprints) may be an option here [74]. Ligand docking following the alignment can also be employed to reduce the number of false positives.

Ligand binding modes

If a ligand binds to binding sites that differ in structure and/or physicochemical properties, the other sites will most likely not be identified by the local structural alignment if only one query site is used. Instead, using an ensemble of query structures can, at least partly, help to overcome this problem. Such a ligand-specific set of binding sites could be constructed by collecting all structures of known targets and identifying the binding sites of the ligand (either from co-crystallized ligands, by binding site detection or by docking). With a growing number of protein structures, the set of pockets for a ligand will become more and more complete.
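
The ensemble idea can be sketched as follows: every candidate site is compared with all query sites of the ligand, and the best score decides whether the candidate is reported. The Python code below is a minimal illustration under stated assumptions. The similarity function is a deliberately crude placeholder (overlap of residue-type sets) standing in for a real binding site comparison tool, and all names, sites and the threshold are hypothetical.

```python
def best_ensemble_score(query_ensemble, target_site, compare):
    """Best similarity between a target site and any query site of the ligand."""
    return max(compare(query, target_site) for query in query_ensemble)

def screen_targets(query_ensemble, targets, compare, threshold):
    """Return the targets whose best ensemble score reaches the threshold."""
    hits = {}
    for name, site in targets.items():
        score = best_ensemble_score(query_ensemble, site, compare)
        if score >= threshold:
            hits[name] = score
    return hits

def toy_compare(site_a, site_b):
    """Placeholder similarity: Jaccard overlap of residue-type sets."""
    return len(site_a & site_b) / len(site_a | site_b)

# Hypothetical query sites of one ligand and hypothetical candidate targets.
ensemble = [frozenset("DHS"), frozenset("DHSG"), frozenset("KRE")]
targets = {"target_1": frozenset("DHSG"),
           "target_2": frozenset("KRE"),
           "target_3": frozenset("WFY")}

print(screen_targets(ensemble[:1], targets, toy_compare, threshold=0.7))
print(screen_targets(ensemble, targets, toy_compare, threshold=0.7))
```

With a single query site only one of the two related targets is recovered in this toy example, whereas the full ensemble recovers both. The price is a runtime that grows linearly with the size of the ensemble and, potentially, a higher number of false positives.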

As known from protein–protein binding sites [89], different binding partners recognize the same protein in different ways. This is also true for small molecule ligands (e.g. the anticancer drugs epothilone B and taxol binding differently to the same binding site in β-tubulin) [90]. There are, however, two sides to this coin. A promiscuous ligand can show different conformers [75], thus being able to bind to multiple proteins with completely different binding pockets. On the other hand, similar conformers (e.g. more rigid ligands) can bind to different binding pockets. In this case, different ligand binding sites ‘evolved’ to bind the same ligand, however, in different ways [91]. This is clearly where binding site comparisons reach their limits. For protein–protein interaction sites, it has been shown that interfaces binding the same interaction partner can be very different except for a few key residues [89]. Examples like subtilisin and trypsin show that, despite quite different binding sites, no obvious similarity in sequence or structure is needed to bind the same inhibitors in a similar way [70]. If e.g. only two key pharmacophoric features are needed for ligand binding, it is hard to detect a similar binding site in another protein. Such an alternative target is very unlikely to be found with binding site comparison. Instead, one of the other approaches, briefly introduced in the next section, can be employed to reveal such interactions.

Other computational approaches

Systems biology

Clearly, systems biology has an impact on drug development and discovery right from the beginning. This is particularly true for drug repositioning [14, 92]. Hopkins [29] expects the integrated system-wide view of drug interactions in combination with phenotype data (‘network pharmacology’) to become the new paradigm in drug discovery. One of the concluding statements of his 2008 review is:

‘Network pharmacology re-introduces the old idea that understanding the biological and kinetic profile of the drug is more important than individual validation of targets or combinations of targets’.

What the pharmaceutical industry is currently struggling with is the integration and subsequent visualization of the available data, e.g. from various omics approaches, text mining, various experiments and clinical trials, to facilitate decision making and catalyze the drug development process [93]. In particular, network-based approaches are likely to make key contributions to drug repurposing [94] and have already been shown to be successful [95]. Power Graphs offer a possibility to generate edge-reduced representations of networks, which dramatically facilitate visual inspection [96].

Among others, drug–target networks (possibly including information on diseases and phenotypes) can be used in the development of ‘dirty drugs’ and for drug repositioning or off-target identification [97].

The aforementioned approaches are only applicable to well-characterized molecules (for which, e.g. structural or side effect data are present). Gene expression profile-based methods can be employed if these data are absent. The expression profiles derived from microarray experiments are then used to generate signatures of the drug action [14]. For example, Iorio et al. [98] provide such an approach (available as a webserver), which combines transcriptional profiles of drug responses across multiple cell lines and dosages. Using similarity in the gene expression profiles, they can find previously unknown drug applications. They compared the consensus profile of a known autophagy (degradation of cell components; similar to phagocytosis) enhancer to the profiles of other drugs. Thus, they found and experimentally validated that the experimental drug fasudil promotes cellular autophagy. With that effect, this drug can be applied in disorders caused by protein misfolding (e.g. neurodegenerative disorders like Alzheimer's disease) [98].

Text mining and other approaches

The biomedical literature is a valuable source of biological activity data. Text mining, together with ontologies, can be used to identify targets [99] and to extract drug–disease relationships [100], drug–target relationships [13] and activity information of drugs. In particular, text mining is an alternative if comprehensive data on targets (e.g. structures of membrane proteins) are absent. The generation of ontological profiles for drugs and targets and their matching gives the opportunity to predict new drug–target relations [13].

There are indications that drugs with a similar therapeutic effect share similar or identical off-targets. This observation could be exploited to identify novel drug targets for already known drugs.

Yamanishi et al. [101] recently showed that drug–target relationships correlate more strongly with pharmacological effect similarity than with chemical structure similarity. This strongly underlines the applicability of approaches based on side effect (or other pharmacological effect) profiles. In their study, they used pharmacological effect similarity of drugs (together with chemical structure similarity and genomic sequences) to predict new protein targets.

Campillos et al. [102] were apparently the first to use phenotypic side effect data to predict new drug targets, 2 years ago. They applied their approach to 1018 drugs related by their side effects. Thus, they predicted 261 unexpected drug pairs similar in their side effect profile. Of these, 20 were tested experimentally, validating 13 unknown drug–target relationships (11 of these with Ki ≤ 10 μM). They used the Unified Medical Language System (UMLS) ontology to classify the side effects, and the relations between the terms in the ontology to reveal relations between drugs annotated with closely related terms [102]. Since this year, a database (SIDER) of side effects for 888 drugs has been available [12]. Similarly, therapeutic uses of approved drugs can be employed to find alternative drug applications. Chiang et al. [103] hypothesize that, if two drugs share similar therapies, one of the drugs is likely to be also applicable to the indications of the other drug. To validate their suggested drug repositionings, they checked clinical trial data for their candidates. Indeed, they found that one of their suggestions is 12× more likely to appear in the clinical trial data than arbitrary disease–drug relations. Also, statistical tools are successfully employed to detect meaningful adverse drug effects in electronic medical records [104].
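
Both ideas, similarity of side effect profiles and the transfer of indications between drugs with shared therapies, can be illustrated with simple set operations. The Python sketch below is a strong simplification and does not reproduce the published methods: it uses a plain Jaccard index instead of an ontology-aware similarity, and a naive rule that transfers every non-shared indication between two drugs as soon as they share at least one. All drug, disease and side effect labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def suggest_indications(indications, min_shared=1):
    """Suggest (drug, indication) pairs for drugs that share indications."""
    suggestions = set()
    drugs = sorted(indications)
    for pos, drug_1 in enumerate(drugs):
        for drug_2 in drugs[pos + 1:]:
            shared = indications[drug_1] & indications[drug_2]
            if len(shared) < min_shared:
                continue
            # Transfer the indications that only one of the two drugs has.
            for disease in indications[drug_2] - indications[drug_1]:
                suggestions.add((drug_1, disease))
            for disease in indications[drug_1] - indications[drug_2]:
                suggestions.add((drug_2, disease))
    return suggestions

# Hypothetical side effect profiles and indications.
side_effects = {"drug_A": {"nausea", "headache"},
                "drug_B": {"nausea", "rash"}}
indications = {"drug_A": {"disease_1", "disease_2"},
               "drug_B": {"disease_2", "disease_3"},
               "drug_C": {"disease_4"}}

print(jaccard(side_effects["drug_A"], side_effects["drug_B"]))
print(sorted(suggest_indications(indications)))
```

Such suggestions are hypotheses only. As described above for the study of Chiang et al., they need to be checked against independent evidence such as clinical trial data.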

Yang et al. [105] used a Chemical Protein Interactome (CPI), which they constructed from a limited set of 10 drugs and 46 targets, to predict novel targets for these drugs. The CPI is constructed by docking all 10 drugs and a further 34 decoy drugs to the protein binding sites, using a corrected docking score. They found that drugs with the same therapeutic area show similar CPI profiles. An inverse docking approach was also used by Bernard et al. [106], who found phosphodiesterase 4 to be a new target for the drug tofisopam, which is used in central nervous system disorders. Inhibitors of this enzyme are under development for the treatment of respiratory diseases including asthma [107]. Virtual screening or reverse docking has been extensively employed in drug discovery but has rarely led to key contributions [108]. Again, protein flexibility is a major issue.

Using chemical similarity among ligands, Keiser et al. [30] identified thousands of new drug–target pairs among 1400 proteins and experimentally validated 23 of them. An approach consisting of a pharmacophore representation of 7302 PDB binding sites with co-crystallized ligands is proposed by Liu et al. [109]. They provide a web server, which performs the alignment of a submitted ligand structure to all pocket models in the database. The performance in their test case of tamoxifen leaves room for improvement, since only 10 of the 14 confirmed targets are among the top 300 hits.

Schneider et al. [110] studied the application of Self Organizing Maps (SOMs), a type of artificial neural network, for the classification of ligand binding sites in proteins [111]. In one application, sets of ligands binding to a target (each ligand represented by its pharmacophoric features) were used to train a SOM for this target. This results in a fingerprint (feature vector) for each target. Similar fingerprints indicate similar targets. This approach can easily be used to identify alternative drug targets.

This section provided a brief introduction to computational methods other than binding site comparison. It is an open question which of these methods will be the main contributors to future drug repurposings. In our opinion, approaches on a systems level will be among them. Text mining has great potential to become a key player in drug repositioning due to the ever-growing amount of available literature [13]. Campillos et al. [102] impressively demonstrated how side effect profiles of drugs can be used to predict new drug targets. To retrieve such information, text mining will be the technique of choice, since such information is contained in text documents rather than stored in databases. Docking approaches still suffer from insufficient scoring functions and thus hardly provide success stories [108]. Although chemical similarity of drugs does not in general lead to the binding of the same target [32], Keiser et al. [30] identified 23 new drug–target pairs using this approach. It remains to be shown what the contribution of pharmacophore mapping approaches and machine learning approaches like artificial neural networks will be.

CONCLUSION

Although the generalization that similar binding sites bind the same ligands is often true, the converse does not hold: a ligand can bind to very different binding sites. Thus, binding site comparison can only cover a part of the possible drug repositionings, but with a high reliability: if the binding site is similar, the binding of the same ligands is quite likely, as the presented studies show.

Current algorithms have been shown to be applicable to the identification of such similar binding sites. Specifically with improvements in the detection of target sites that deviate from the query in conformation, alternative drug targets can be quickly identified. This will help the big pharmaceutical companies to reduce their drug development costs and at the same time help patients through reduced drug side effects. Furthermore, drug repositioning in a systematic fashion has the potential to lead to a number of new medications, also for rare diseases. On the other hand, there are still challenges, and the current approaches still have much room for improvement (e.g. solving the problem of modeling protein flexibility in reasonable computation time).

As mentioned before, binding site comparison cannot discover all alternative drug targets. A step in that direction is clearly the combination with the other approaches briefly introduced in this review. Such a combined strategy can also provide accumulated evidence if a drug–target interaction is predicted by various approaches, improving the quality of the prediction.
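
One simple way to accumulate evidence across approaches is to count how many independent methods predict the same drug–target pair and to prioritize pairs supported by several of them. The following Python sketch illustrates this with a plain vote count. The method names, the drug and target labels and the threshold are hypothetical, and a real combination scheme would probably weight the methods by their reliability.

```python
from collections import Counter

def consensus_predictions(predictions_by_method, min_methods=2):
    """Keep drug-target pairs predicted by at least min_methods approaches."""
    counts = Counter()
    for pairs in predictions_by_method.values():
        counts.update(set(pairs))  # each method counts once per pair
    return {pair: n for pair, n in counts.items() if n >= min_methods}

# Hypothetical predictions of three different approaches.
predictions = {
    "binding_site_comparison": [("drug_A", "target_1"), ("drug_B", "target_2")],
    "side_effect_similarity": [("drug_A", "target_1"), ("drug_C", "target_3")],
    "text_mining": [("drug_A", "target_1"), ("drug_B", "target_2"),
                    ("drug_B", "target_2")],
}

print(consensus_predictions(predictions))
print(consensus_predictions(predictions, min_methods=3))
```

Raising the required number of supporting methods shortens the candidate list, which directly addresses the validation bottleneck, at the cost of missing interactions that only a single approach is able to detect.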

Clearly, the prediction of new targets only gives a guess, which has to be substantiated by further experimental analysis like affinity studies or structure elucidation. The assessment of the biological significance and falsifiability of the prediction is exactly what forms the bottleneck. Thousands of interactions are predicted while experimental validation is far behind. Going further in that direction carries the danger of burying practitioners in result data and thus of being ignored by them. Nevertheless, if wisely employed, these methods form a shortcut in the drug discovery pipelines and give the opportunity to perform purposeful experiments.

It is surprising that such binding site-centric studies are so rarely applied (in comparison to virtual screening/docking approaches), although the presented studies draw a clear picture of success and these approaches offer great potential to shape the future of drug design.

Key Points

  • Finding new uses for already known drugs has often led to new treatment options.

  • Similar binding sites bind the same ligands and exist among different proteins.

  • The development of tools to align such sites in silico allowed the identification of drug targets that were previously unknown.

  • This approach is complemented by many other computational methods.

  • Together, they facilitate a systematic identification of unknown drug targets and make suggestions for drug repositioning.

FUNDING

This work was supported by the BMBF, the German Federal Ministry of Education and Research, (CLSD project) and the European Union (Ponte and PPI-Marker).

References

1
Raju
TN
The Nobel chronicles. 1988: James Whyte Black, (b 1924), Gertrude Elion (1918-99), and George H Hitchings (1905-98)
Lancet
 , 
2000
, vol. 
355
 pg. 
1022
 
2
Chong
CR
Sullivan
DJ
New uses for old drugs
Nature
 , 
2007
, vol. 
448
 (pg. 
645
-
6
)
3
DiMasi
JA
Hansen
RW
Grabowski
HG
The price of innovation: new estimates of drug development costs
J Health Econ
 , 
2003
, vol. 
22
 (pg. 
151
-
85
)
4
Insa
R
Drug repositioning: Filling the gap
European Biopharmaceutical Rev Summer
 , 
2010
 
http://www.samedanltd.com/magazine/12/issue/132/article/2698
5
Tartaglia
LA
Complementary new approaches enable repositioning of failed drug candidates
Expert Opin Investig Drugs
 , 
2006
, vol. 
15
 (pg. 
1295
-
8
)
6
Nielsch
U
Schäfer
S
Wild
H
, et al.  . 
One target-multiple indications: a call for an integrated common mechanisms strategy
Drug Discov Today
 , 
2007
, vol. 
12
 (pg. 
1025
-
31
)
7
Aronson
JK
Old drugs–new uses
Br J Clin Pharmacol
 , 
2007
, vol. 
64
 (pg. 
563
-
5
)
8
Tobinick
EL
The value of drug repositioning in the current pharmaceutical market
Drug News Perspect
 , 
2009
, vol. 
22
 (pg. 
119
-
25
)
9
Ashburn
TT
Thor
KB
Drug repositioning: identifying and developing new uses for existing drugs
Nat Rev Drug Discov
 , 
2004
, vol. 
3
 (pg. 
673
-
83
)
10
Tambuyzer
E
Rare diseases, orphan drugs and their regulation: questions and misconceptions
Nat Rev Drug Discov
 , 
2010
, vol. 
9
 (pg. 
921
-
9
)
11
The US Food and Drug Administration
New resources for drug developers: the rare disease repurposing database
  
12
Kuhn
M
Campillos
M
Letunic
I
, et al.  . 
A side effect resource to capture phenotypic effects of drugs
Mol Syst Biol
 , 
2010
, vol. 
6
 pg. 
343
 
13
Plake
C
Schroeder
M
Computational polypharmacology with text mining and ontologies
Curr Pharm Biotechnol
 , 
2011
, vol. 
3
 (pg. 
449
-
57
)
14
Dudley
JT
Schadt
E
Sirota
M
, et al.  . 
Drug discovery in a multidimensional world: Systems, patterns and networks
J Cardiovasc Transl Res
 , 
2010
, vol. 
3
 (pg. 
438
-
47
)
15
Pérot
S
Sperandio
O
Miteva
M
, et al.  . 
Druggable pockets and binding site centric chemical space: a paradigm shift in drug discovery
Drug Discovery Today
 , 
2010
, vol. 
15
 (pg. 
656
-
67
)
16
De Franchi
E
Schalon
C
Messa
M
, et al.  . 
Binding of protein kinase inhibitors to synapsin I inferred from pair-wise binding site similarity measurements
PLoS One
 , 
2010
, vol. 
5
 pg. 
e12214
 
17
Weber
A
Casini
A
Heine
A
, et al.  . 
Unexpected nanomolar inhibition of carbonic anhydrase by COX-2-selective celecoxib: new pharmacological opportunities due to related binding site recognition
J Med Chem
 , 
2004
, vol. 
47
 (pg. 
550
-
7
)
18
Nobeli
I
Favia
AD
Thornton
JM
Protein promiscuity and its implications for biotechnology
Nat Biotechnol
 , 
2009
, vol. 
27
 (pg. 
157
-
67
)
19
Kim
WK
Henschel
A
Winter
C
, et al.  . 
The many faces of protein-protein interactions: A compendium of interface geometry
PLoS Comput Biol
 , 
2006
, vol. 
2
 pg. 
e124
 
20
Whitacre
J
Bender
A
Degeneracy: a design principle for achieving robustness and evolvability
J Theor Biol
 , 
2010
, vol. 
263
 (pg. 
143
-
53
)
21
Imming
P
Sinning
C
Meyer
A
Drugs, their targets and the nature and number of drug targets
Nat Rev Drug Discov
 , 
2006
, vol. 
5
 (pg. 
821
-
34
)
22
Overington
JP
Al-Lazikani
B
Hopkins
AL
How many drug targets are there?
Nat Rev Drug Discov
 , 
2006
, vol. 
5
 (pg. 
993
-
6
)
23
Wishart
DS
Knox
C
Guo
AC
, et al.  . 
DrugBank: a comprehensive resource for in silico drug discovery and exploration
Nucleic Acids Res
 , 
2006
, vol. 
34
 (pg. 
D668
-
72
)
24
Paolini
GV
Shapland
RHB
van Hoorn
WP
, et al.  . 
Global mapping of pharmacological space
Nat Biotechnol
 , 
2006
, vol. 
24
 (pg. 
805
-
15
)
25
Li
Q
Cheng
T
Wang
Y
, et al.  . 
PubChem as a public resource for drug discovery
Drug Discov Today
 , 
2010
, vol. 
15
 (pg. 
1052
-
7
)
26
Frantz
S
Drug discovery: playing dirty
Nature
 , 
2005
, vol. 
437
 (pg. 
942
-
3
)
27
Mencher
SK
Wang
LG
Promiscuous drugs compared to selective drugs (promiscuity can be a virtue)
BMC Clin Pharmacol
 , 
2005
, vol. 
5
 pg. 
3
 
28
Hopkins
AL
Mason
JS
Overington
JP
Can we rationally design promiscuous drugs?
Curr Opin Struct Biol
 , 
2006
, vol. 
16
 (pg. 
127
-
36
)
29
Hopkins
AL
Network pharmacology: the next paradigm in drug discovery
Nat Chem Biol
 , 
2008
, vol. 
4
 (pg. 
682
-
90
)
30
Keiser
MJ
Setola
V
Irwin
JJ
, et al.  . 
Predicting new molecular targets for known drugs
Nature
 , 
2009
, vol. 
462
 (pg. 
175
-
81
)
31
Wass
MN
Sternberg
MJE
Prediction of ligand binding sites using homologous structures and conservation at CASP8
Proteins
 , 
2009
, vol. 
77
 (pg. 
147
-
51
)
32
Martin
YC
Kofron
JL
Traphagen
LM
Do structurally similar molecules have similar biological activity?
J Med Chem
 , 
2002
, vol. 
45
 (pg. 
4350
-
8
)
33
Wolf
JE
Shander
D
Huber
F
, et al.  . 
Randomized, double-blind clinical evaluation of the efficacy and safety of topical eflornithine HCl 13.9% cream in the treatment of women with facial hair
Int J Dermatol
 , 
2007
, vol. 
46
 (pg. 
94
-
8
)
34
Pepin
J
Milord
F
Guern
C
, et al.  . 
Difluoromethylornithine for arseno-resistant Trypanosoma brucei gambiense sleeping sickness
Lancet
 , 
1987
, vol. 
2
 (pg. 
1431
-
3
)
35
Ouellette
M
Drummelsmith
J
Papadopoulou
B
Leishmaniasis: drugs in the clinic, resistance and new developments
Drug Resist Updat
 , 
2004
, vol. 
7
 (pg. 
257
-
66
)
36
Miguel
DC
Yokoyama-Yasunaka
JKU
Andreoli
WK
, et al.  . 
Tamoxifen is effective against leishmania and induces a rapid alkalinization of parasitophorous vacuoles harbouring leishmania (leishmania) amazonensis amastigotes
J Antimicrob Chemother
 , 
2007
, vol. 
60
 (pg. 
526
-
34
)
37
Chong
CR
Chen
X
Shi
L
, et al.  . 
A clinical drug library screen identifies astemizole as an antimalarial agent
Nat Chem Biol
 , 
2006
, vol. 
2
 (pg. 
415
-
6
)
38
Gloeckner
C
Garner
AL
Mersha
F
, et al.  . 
Repositioning of an existing drug for the neglected tropical disease onchocerciasis
Proc Natl Acad Sci USA
 , 
2010
, vol. 
107
 (pg. 
3424
-
9
)
39
Yao
L
Evans
JA
Rzhetsky
A
Novel opportunities for computational biology and sociology in drug discovery
Trends Biotechnol
 , 
2009
, vol. 
27
 (pg. 
531
-
40
)
40
Hugonnet
JE
Tremblay
LW
Boshoff
HI
, et al.  . 
Meropenem-clavulanate is effective against extensively drug-resistant Mycobacterium tuberculosis
Science
 , 
2009
, vol. 
323
 (pg. 
1215
-
8
)
41
Chow
WA
Jiang
C
Guan
M
Anti-HIV drugs for cancer therapeutics: back to the future?
Lancet Oncol
 , 
2009
, vol. 
10
 (pg. 
61
-
71
)
42
Bernstein
WB
Dennis
PA
Repositioning HIV protease inhibitors as cancer therapeutics
Curr Opin HIV AIDS
 , 
2008
, vol. 
3
 (pg. 
666
-
75
)
43
Lee
HJ
Wang
NX
Shi
DL
, et al.  . 
Sulindac inhibits canonical Wnt signaling by blocking the PDZ domain of the protein dishevelled
Angew Chem Int Ed
 , 
2009
, vol. 
48
 (pg. 
6448
-
52
)
44
Simard
JR
Rauh
D
Chemical and structural biology to direct the repurposing of sulindac
Chem Med Chem
 , 
2009
, vol. 
4
 (pg. 
1793
-
5
)
45
Chong
CR
Xu
J
Lu
J
, et al.  . 
Inhibition of angiogenesis by the antifungal drug itraconazole
ACS Chem Biol
 , 
2007
, vol. 
2
 (pg. 
263
-
70
)
46
Giambelli
C
Fei
DL
Wang
H
, et al.  . 
Repurposing an old anti-fungal drug as a hedgehog inhibitor
Protein & Cell
 , 
2010
, vol. 
1
 (pg. 
417
-
8
)
47
de Boer
TP
Nalos
L
Stary
A
, et al.  . 
The anti-protozoal drug pentamidine blocks KIR2.x-mediated inward rectifier current by entering the cytoplasmic pore region of the channel
Br J Pharmacol
 , 
2010
, vol. 
159
 (pg. 
1532
-
41
)
48
O’Connor
KA
Roth
BL
Finding new tricks for old drugs: an efficient route for public-sector drug discovery
Nat Rev Drug Discov
 , 
2005
, vol. 
4
 (pg. 
1005
-
14
)
49
Singhal
S
Mehta
J
Desikan
R
, et al.  . 
Antitumor activity of thalidomide in refractory multiple myeloma
N Engl J Med
 , 
1999
, vol. 
341
 (pg. 
1565
-
71
)
50
Goldstein
I
Lue
TF
Padma-Nathan
H
, et al.  . 
Oral sildenafil in the treatment of erectile dysfunction sildenafil study group
N Engl J Med
 , 
1998
, vol. 
338
 (pg. 
1397
-
1404
)
51
Zhu
F
Han
B
Kumar
P
, et al.  . 
Update of TTD: Therapeutic target database
Nucleic Acids Res
 , 
2010
, vol. 
38
 (pg. 
D787
-
91
)
52
Wishart
DS
Knox
C
Guo
AC
, et al.  . 
DrugBank: a knowledgebase for drugs, drug actions and drug targets
Nucleic Acids Res
 , 
2008
, vol. 
36
 (pg. 
D901
-
6
)
53
Günther
S
Kuhn
M
Dunkel
M
, et al.  . 
SuperTarget and Matador: resources for exploring drug-target relationships
Nucleic Acids Res
 , 
2008
, vol. 
36
 (pg. 
D919
-
22
)
54
Liu
T
Lin
Y
Wen
X
, et al.  . 
BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities
Nucleic Acids Res
 , 
2007
, vol. 
35
 (pg. 
D198
-
201
)
55
Berman
HM
Westbrook
J
Feng
Z
, et al.  . 
The Protein Data Bank
Nucleic Acids Res
 , 
2000
, vol. 
28
 (pg. 
235
-
42
)
56
Rose
PW
Beran
B
Bi
C
, et al.  . 
The RCSB protein data bank: redesigned web site and web services
Nucleic Acids Res
 , 
2011
, vol. 
39
 (pg. 
D392
-
401
)
57
Kellenberger
E
Muller
P
Schalon
C
, et al.  . 
sc-PDB: an annotated database of druggable binding sites from the protein data bank
J Chem Inf Model
 , 
2006
, vol. 
46
 (pg. 
717
-
27
)
58
Joosten
RP
te Beek
TAH
Krieger
E
, et al.  . 
A series of PDB related databases for everyday needs
Nucleic Acids Res
 , 
2011
, vol. 
39
 (pg. 
D411
-
9
)
59
Kellenberger
E
Schalon
C
Rognan
D
How to measure the similarity between protein ligand-binding sites?
Current Computer-Aided Drug Design
 , 
2008
, vol. 
4
 (pg. 
209
-
20
)
60
Das
S
Krein
MP
Breneman
CM
Pesdserv: a server for high-throughput comparison of protein binding site surfaces
Bioinformatics
 , 
2010
, vol. 
26
 (pg. 
1913
-
4
)
61
Konc
J
Janežič
D
ProBiS algorithm for detection of structurally similar protein binding sites by local structural alignment
Bioinformatics
 , 
2010
, vol. 
26
 (pg. 
1160
-
8
)
62
Gherardini
PF
Ausiello
G
Helmer-Citterich
M
Superpose3D: A local structural comparison program that allows for user-defined structure representations
PLoS One
 , 
2010
, vol. 
5
 pg. 
e11988
 
63
Hoffmann
B
Zaslavskiy
M
Vert
JP
, et al.  . 
A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction
BMC Bioinformatics
 , 
2010
, vol. 
11
 pg. 
99
 
64
Schalon
C
Surgand
JS
Kellenberger
E
, et al.  . 
A simple and fuzzy method to align and compare druggable ligand-binding sites
Proteins
 , 
2008
, vol. 
71
 (pg. 
1755
-
78
)
65
Ren
J
Xie
L
Li
WW
, et al.  . 
SMAP-WS: a parallel web service for structural proteome-wide ligand-binding site comparison
Nucleic Acids Res
 , 
2010
, vol. 
38
 (pg. 
W441
-
4
)
66
Yeturu
K
Chandra
N
PocketMatch: a new algorithm to compare binding sites in protein structures
BMC Bioinformatics
 , 
2008
, vol. 
9
 pg. 
543
 
67
Henrich
S
Salo-Ahen
OMH
Huang
B
, et al.  . 
Computational approaches to identifying and characterizing protein binding sites for ligand design
J Mol Recognit
 , 
2010
, vol. 
23
 (pg. 
209
-
19
)
68
Weill
N
Rognan
D
Alignment-free ultra-high-throughput comparison of druggable protein-ligand binding sites
J Chem Inf Model
 , 
2010
, vol. 
50
 (pg. 
123
-
35
)
69
Park
K
Kim
D
Binding similarity network of ligand
Proteins
 , 
2008
, vol. 
71
 (pg. 
960
-
71
)
70
Henschel
A
Kim
WK
Schroeder
M
Equivalent binding sites reveal convergently evolved interaction motifs
Bioinformatics
 , 
2006
, vol. 
22
 (pg. 
550
-
5
)
71
Brakoulias
A
Jackson
RM
Towards a structural classification of phosphate binding sites in protein-nucleotide complexes: an automated all-against-all structural comparison using geometric matching
Proteins
 , 
2004
, vol. 
56
 (pg. 
250
-
60
)
72
Xie
L
Bourne
PE
Detecting evolutionary relationships across existing fold space, using sequence order-independent profile-profile alignments
Proc Natl Acad Sci USA
 , 
2008
, vol. 
105
 (pg. 
5441
-
6
)
73
Durrant
JD
Amaro
RE
Xie
L
, et al.  . 
A multidimensional strategy to detect polypharmacological targets in the absence of structural and sequence homology
PLoS Comput Biol
 , 
2010
, vol. 
6
 pg. 
e1000648
 
74. Rognan D. Structure-based approaches to target fishing and ligand profiling. Mol Inform, 2010, vol. 29 (pg. 176-87).
75. Milletti F, Vulpetti A. Predicting polypharmacology by binding site similarity: from kinases to the protein universe. J Chem Inf Model, 2010, vol. 50 (pg. 1418-31).
76. Kinnings SL, Liu N, Buchmeier N, et al. Drug discovery using chemical systems biology: repositioning the safe medicine Comtan to treat multi-drug and extensively drug resistant tuberculosis. PLoS Comput Biol, 2009, vol. 5, pg. e1000423.
77. Xie L, Li J, Xie L, et al. Drug discovery using chemical systems biology: identification of the protein-ligand binding network to explain the side effects of CETP inhibitors. PLoS Comput Biol, 2009, vol. 5, pg. e1000387.
78. Minai R, Matsuo Y, Onuki H, et al. Method for comparing the structures of protein ligand-binding sites and application for predicting protein-drug interactions. Proteins, 2008, vol. 72 (pg. 367-81).
79. Gold ND, Jackson RM. Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships. J Mol Biol, 2006, vol. 355 (pg. 1112-24).
80. The Swiss-Prot group of the Swiss Institute of Bioinformatics. UniprotKB/Swiss-Prot protein knowledgebase release 2010_11 statistics. http://www.expasy.org/sprot/relnotes/relstat.html (26 November 2010, date last accessed).
81. The RCSB Protein Data Bank. PDB statistics. http://www.pdb.org/pdb/static.do?p=general_information/pdb_statistics/index.html (26 November 2010, date last accessed).
82. Kiefer F, Arnold K, Künzli M, et al. The SWISS-MODEL repository and associated resources. Nucleic Acids Res, 2009, vol. 37 (pg. D387-92).
83. Pieper U, Webb BM, Barkan DT, et al. ModBase, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Res, 2010, vol. 39 (pg. D465-74).
84. Arnold K, Kiefer F, Kopp J, et al. The protein model portal. J Struct Funct Genomics, 2009, vol. 10 (pg. 1-8).
85. Zhang Y. Protein structure prediction: when is it useful? Curr Opin Struct Biol, 2009, vol. 19 (pg. 145-55).
86. Huang B, Schroeder M. LIGSITEcsc: predicting ligand binding sites using the Connolly surface and degree of conservation. BMC Struct Biol, 2006, vol. 6, pg. 19.
87. Rueda M, Bottegoni G, Abagyan R. Recipes for the selection of experimental protein conformations for virtual screening. J Chem Inf Model, 2010, vol. 50 (pg. 186-93).
88. Plewczynski D, Laźniewski M, Augustyniak R, et al. Can we trust docking results? Evaluation of seven commonly used programs on PDBbind database. J Comput Chem, 2011, vol. 32 (pg. 742-55).
89. Martin J. Beauty is in the eye of the beholder: proteins can recognize binding sites of homologous proteins in more than one way. PLoS Comput Biol, 2010, vol. 6, pg. e1000821.
90. Nettles JH, Li H, Cornett B, et al. The binding mode of epothilone A on α,β-tubulin by electron crystallography. Science, 2004, vol. 305 (pg. 866-9).
91. Gherardini PF, Ausiello G, Russell RB, et al. Modular architecture of nucleotide-binding pockets. Nucleic Acids Res, 2010, vol. 38 (pg. 3809-16).
92. Schrattenholz A, Groebe K, Soskic V. Systems biology approaches and tools for analysis of interactomes and multi-target drugs. Methods Mol Biol, 2010, vol. 662 (pg. 29-58).
93. Campbell SJ, Gaulton A, Marshall J, et al. Visualizing the drug target landscape. Drug Discov Today, 2010, vol. 15 (pg. 3-15).
94. Pujol A, Mosca R, Farrés J, et al. Unveiling the role of network and systems biology in drug discovery. Trends Pharmacol Sci, 2010, vol. 31 (pg. 115-23).
95. Cockell SJ, Weile J, Lord P, et al. An integrated dataset for in silico drug discovery. J Integr Bioinform, 2010, vol. 7, pg. 116.
96. Royer L, Reimann M, Andreopoulos B, et al. Unraveling protein networks with Power Graph Analysis. PLoS Comput Biol, 2008, vol. 4, pg. e1000108.
97. Lee S, Park K, Kim D. Building a drug–target network and its applications. Expert Opin Drug Dis, 2009, vol. 4 (pg. 1177-89).
98. Iorio F, Bosotti R, Scacheri E, et al. Discovery of drug mode of action and drug repositioning from transcriptional responses. Proc Natl Acad Sci USA, 2010, vol. 107 (pg. 14621-6).
99. Agarwal P, Searls DB. Literature mining in support of drug discovery. Brief Bioinform, 2008, vol. 9 (pg. 479-92).
100. Baker NC, Hemminger BM. Mining connections between chemicals, proteins, and diseases extracted from Medline annotations. J Biomed Inform, 2010, vol. 43 (pg. 510-9).
101. Yamanishi Y, Kotera M, Kanehisa M, et al. Drug-target interaction prediction from chemical, genomic and pharmacological data in an integrated framework. Bioinformatics, 2010, vol. 26 (pg. i246-54).
102. Campillos M, Kuhn M, Gavin AC, et al. Drug target identification using side-effect similarity. Science, 2008, vol. 321 (pg. 263-6).
103. Chiang AP, Butte AJ. Systematic evaluation of drug-disease relationships to identify leads for novel drug uses. Clin Pharmacol Ther, 2009, vol. 86 (pg. 507-10).
104. Almenoff JS, Pattishall EN, Gibbs TG, et al. Novel statistical tools for monitoring the safety of marketed drugs. Clin Pharmacol Ther, 2007, vol. 82 (pg. 157-66).
105. Yang L, Chen J, Shi L, et al. Identifying unexpected therapeutic targets via chemical-protein interactome. PLoS One, 2010, vol. 5, pg. e9568.
106. Bernard P, Dufresne-Favetta C, Favetta P, et al. Application of drug repositioning strategy to tofisopam. Curr Med Chem, 2008, vol. 15 (pg. 3196-203).
107. Spina D. PDE4 inhibitors: current status. Br J Pharmacol, 2008, vol. 155 (pg. 308-15).
108. Schneider G. Virtual screening: an endless staircase? Nat Rev Drug Discov, 2010, vol. 9 (pg. 273-6).
109. Liu X, Ouyang S, Yu B, et al. PharmMapper server: a web server for potential drug target identification using pharmacophore mapping approach. Nucleic Acids Res, 2010, vol. 38 (pg. W609-14).
110. Schneider P, Tanrikulu Y, Schneider G. Self-organizing maps in drug discovery: compound library design, scaffold-hopping, repurposing. Curr Med Chem, 2009, vol. 16 (pg. 258-66).
111. Schneider G, Schneider P. “Promiscuous” ligands and targets provide opportunities for drug design. Proceedings of “Systems Chemistry”, 2008, Bozen, Italy.