Summary: Progress in structural biology depends on several key technologies. In particular, tools for the alignment and superposition of protein structures are indispensable. Here we describe the use of the TopMatch web service, an effective computational tool for protein structure alignment, for the visualization of structural similarities, and for highlighting relationships found in protein classifications. We provide several instructive examples.
Availability: TopMatch is available as a public web service at http://services.came.sbg.ac.at
Today we face an explosion of newly determined protein structures, fueled in part by the various protein structure initiatives. As a result, the public repository (PDB) will soon surpass 50 000 entries (Berman et al., 2000). This database represents our knowledge of protein molecules, but the amount of information is overwhelming. To make progress, the structures need to be organized, classified and quantified in various ways. For this task, and for the subsequent retrieval, analysis and visualization of the often intricate relationships, structure comparison techniques are indispensable.
Michael Levitt and coworkers (Kolodny et al., 2005) recently presented a comprehensive analysis of the major structure alignment programs. They remark that comparing the various programs is a delicate task and, by highlighting the limitations of existing methods, conclude that there is a need for better structural alignment methods. It is indeed surprising that after half a century of protein structure research no generally accepted standards for protein structure alignment have emerged.
A particular difficulty is that as long as an existing structural similarity remains undetected, we cannot check whether any particular method is able to recognize it. According to Kolodny et al. (2005), such difficult examples may be found in existing protein structure classifications by searching for similarities among distinct SCOP (Andreeva et al., 2007) folds or distinct CATH (Greene et al., 2007) architectures or topologies. Here we take up this suggestion and provide a small selection of examples drawn from ongoing classification projects. In these projects we make extensive use of a suite of structure alignment techniques called TopMatch. TopMatch is the successor of ProSup, a program previously used in several large-scale structure comparison projects (e.g. Sippl et al., 2001).
We have now completed a web service that makes the TopMatch program accessible to the structural biology community. The quality of alignments is essential, but ease of use, speed and, in particular, proper visualization are also important ingredients in the interpretation and analysis of structure alignments. The chief goal of this communication is to demonstrate the use of this service through a set of instructive examples drawn from ongoing structure classification initiatives (Suhrer et al., 2007a, b).
In the description of alignments we call the first structure the query (q) and the second structure the target (t). In general, a query and a target can be aligned in many different ways (Feng and Sippl, 1996). Hence, TopMatch reports a ranked list of alignments. Each alignment is characterized by a small set of parameters. The most significant of these is the length of the alignment, i.e. the number of residue pairs that are structurally equivalent. We call this the absolute similarity S(q, t). From the alignment we compute a sequence score using a structure-derived substitution matrix (Prlic et al., 2000). If this score is positive, it is added to S(q, t), and the combined score is used to rank the alignments. Additional useful parameters are the root-mean-square error of superposition (RMS), the percentage of sequence identity (Identity), the relative similarity s(q, t) = 100 × 2 S(q, t)/(Lq + Lt), and the relative query and target cover, defined as cq = 100 × S(q, t)/Lq and ct = 100 × S(q, t)/Lt, respectively (here Lq and Lt are the respective sequence lengths). Relative similarity and relative cover are simple and intuitive measures describing the extent of mutual similarity between two structures.
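The parameters defined above can be illustrated with a short Python sketch. It is not the TopMatch implementation: the class and function names are hypothetical, the numbers in the usage example are invented for illustration, and the ranking assumes that a positive sequence score is added to S(q, t) without any rescaling, which the text does not specify further.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alignment:
    """One query/target alignment (hypothetical container, not a TopMatch type)."""
    length: int            # S(q, t): number of structurally equivalent residue pairs
    sequence_score: float  # score from a structure-derived substitution matrix


def relative_similarity(s_abs: int, len_q: int, len_t: int) -> float:
    """s(q, t) = 100 * 2 * S(q, t) / (Lq + Lt)."""
    return 100.0 * 2.0 * s_abs / (len_q + len_t)


def query_cover(s_abs: int, len_q: int) -> float:
    """cq = 100 * S(q, t) / Lq."""
    return 100.0 * s_abs / len_q


def target_cover(s_abs: int, len_t: int) -> float:
    """ct = 100 * S(q, t) / Lt."""
    return 100.0 * s_abs / len_t


def ranking_score(alignment: Alignment) -> float:
    """Combined score: S(q, t) plus the sequence score, only if that score is positive.

    Assumption: plain addition with no weighting between the two terms.
    """
    return alignment.length + max(alignment.sequence_score, 0.0)


def rank_alignments(alignments: List[Alignment]) -> List[Alignment]:
    """Return the alignments sorted by combined score, best first."""
    return sorted(alignments, key=ranking_score, reverse=True)


if __name__ == "__main__":
    # Invented example: 80 equivalent residue pairs, Lq = 100, Lt = 160.
    print(round(relative_similarity(80, 100, 160), 1))  # 61.5
    print(query_cover(80, 100))                         # 80.0
    print(target_cover(80, 160))                        # 50.0
```

In this invented example the alignment covers 80% of the query but only 50% of the target, which shows how the two cover values distinguish a small structure embedded in a larger one from two structures of similar size that match over most of their length.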
Figure 1 illustrates the application of TopMatch using a small set of examples. We first demonstrate that, when investigating structural similarities, it is often necessary, and also convenient, to take into account the multiplicity of distinct alignments. We then present several examples that may be considered difficult in the sense of Kolodny et al. (2005), where the respective structures reside in distinct SCOP folds and CATH topologies although they share extensive structural similarity.
We note that the 2D projections shown in Figure 1 do not fully reveal the often complex, intricate, or obscure relationships. We therefore encourage the interested reader to examine these examples in 3D using the TopMatch service. We have spent considerable effort to make the use of this service as convenient as possible. For example, whereas the computation and visualization of structural alignments of SCOP and CATH domains generally requires that the domain definitions be supplied by the user, TopMatch recognizes the domain names automatically. Additional information on the efficient use of TopMatch and the proper interpretation of the results is provided by the web service.
Conflict of Interest: none declared.