Jean Marc Kwasigroch, Marianne Rooman, Prelude&Fugue, predicting local protein structure, early folding regions and structural weaknesses, Bioinformatics, Volume 22, Issue 14, July 2006, Pages 1800–1802, https://doi.org/10.1093/bioinformatics/btl176
Abstract
Summary: Prelude and Fugue are bioinformatics tools that predict the local 3D structure of a protein from its amino acid sequence in terms of seven backbone torsion angle domains, using database-derived potentials. Prelude computes all lowest free energy conformations of a protein or protein region, ranked by increasing energy and optionally subject to interresidue distance constraints specified by the user. Fugue detects sequence regions whose predicted structure is significantly preferred relative to other conformations in the absence of tertiary interactions. These programs can be used for predicting secondary structure, the tertiary structure of short peptides, flickering early folding sequences and peptides that adopt a preferred conformation in solution. They can also be used for detecting structural weaknesses, i.e. sequence regions that are not optimal with respect to the tertiary fold.
Availability: Author Webpage
Contact: [email protected]
The programs Prelude and Fugue predict the local 3D structure of a peptide, protein or protein region in the absence of tertiary interactions (Rooman et al., 1991, 1992). The input is the amino acid sequence and the output contains the predicted local 3D structures described in terms of seven letters, each representing a (ϕ, ψ, ω)-backbone torsion angle domain. The letter A represents the domain that includes the α-helices, C the 3₁₀-helices, B the β-type extended structures, P the poly-proline-type extended structures, G and E the positive-ϕ conformations mirror symmetrical to the A/C and B/P domains, respectively, and O the cis-conformations. Side-chain degrees of freedom are neglected. By assigning to each letter the central (or average) value of the corresponding (ϕ, ψ, ω) domain, and considering average bond lengths and angles, a sequence of letters uniquely defines a 3D structure. However, two structures represented by the same succession of (ϕ, ψ, ω) letters usually differ, as the actual and average (ϕ, ψ, ω) values are not identical. Only for short peptides does this representation yield a good approximation of the 3D backbone structure, whereas for longer peptides it describes the local 3D structure.
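As an illustration of the seven-letter representation, the following minimal sketch stores the letters with the descriptions given above and collapses a letter string into three states, following the "A or C", "B or P" and "G, E or O" groupings of Table 1. The function name and the three-state labels H, X and T are hypothetical and are not part of the programs; no torsion angle values are included because the text does not give the domain boundaries.

```python
# The seven (phi, psi, omega) domain letters, described as in the text.
DOMAINS = {
    "A": "alpha-helical",
    "C": "3-10-helical",
    "B": "beta-type extended",
    "P": "poly-proline-type extended",
    "G": "positive-phi, mirror symmetrical to A/C",
    "E": "positive-phi, mirror symmetrical to B/P",
    "O": "cis conformation",
}

# Three-state grouping taken from the rows of Table 1.
# The labels H, X and T are hypothetical names chosen for this sketch.
THREE_STATE = {
    "A": "H", "C": "H",
    "B": "X", "P": "X",
    "G": "T", "E": "T", "O": "T",
}


def to_three_states(letters: str) -> str:
    """Collapse a seven-letter domain string into the three-state grouping."""
    unknown = set(letters) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domain letters: {sorted(unknown)}")
    return "".join(THREE_STATE[letter] for letter in letters)
```

For instance, `to_three_states("AACBP")` groups the first three residues as helical and the last two as extended.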
Prelude and Fugue use the same potentials and protein representation and receive the same protein sequence as input, but their goals and predictions differ. Prelude predicts the N backbone conformations of lowest free energy of the input sequence or of a segment of it, where N and the segment limits are specified by the user. Moreover, the user can impose up to four constraints on interresidue distances between any of the heavy backbone atoms or Cβs. The predicted conformations are in this case the lowest free energy conformations satisfying the constraints. Note that for sufficiently loose constraints, the distances are simply monitored. The output contains the N lowest free energy conformations represented as (ϕ, ψ, ω) strings and ordered by increasing free energy, the values of the constrained interresidue distances (if any) and the backbone coordinate root mean square deviation (r.m.s.d.) relative to the lowest energy structure. The user can also ask for the lowest energy conformations up to a threshold value of the free energy or r.m.s.d., and can perform a steric hindrance test that keeps only the predicted structures whose Cα atoms do not come closer than 2.5 Å. For each predicted conformation, the main chain and Cβ coordinates are supplied in Protein Data Bank format (Berman et al., 2000).
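The steric hindrance test described above can be sketched as a simple pairwise distance check. This is only an illustration under stated assumptions: the function name is hypothetical, the Cα coordinates are assumed to be given as (x, y, z) tuples in Å, and all pairs of Cα atoms are compared, which may differ from what the program actually does.

```python
import math
from itertools import combinations


def passes_steric_test(ca_coords, min_distance=2.5):
    """Return True if no two C-alpha atoms are closer than min_distance (in Å).

    ca_coords is a sequence of (x, y, z) tuples; every pair of atoms is checked.
    """
    return all(
        math.dist(a, b) >= min_distance
        for a, b in combinations(ca_coords, 2)
    )
```

A list of predicted structures could then be filtered with `[s for s in structures if passes_steric_test(s)]`, where `structures` is a hypothetical list of Cα coordinate lists.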
When applied to full-protein sequences, Prelude provides a local 3D structure prediction, similar to a secondary structure prediction but with seven (ϕ, ψ, ω) assignments. When applied to short peptides, Prelude yields a genuine 3D structure prediction, as the (ϕ, ψ, ω) letters can represent essentially all secondary structures and turn motifs. This prediction is, however, only valid if tertiary interactions within the peptide can be neglected, given that they are not taken into account in the potentials. The pairwise r.m.s.d. values given in the output allow the user to assess the variability of the predicted lowest energy structures. In particular, regions where the predicted (ϕ, ψ, ω) assignments are conserved among lowest energy structures are likely to have a well-defined structure, whereas the others are likely to be more flexible. Moreover, the differences in free energy help to refine the assessment of the stability of the predicted conformations. For example, if the lowest energy structure displays a sizable free energy gap, of 0.5 kcal/mol or more, with respect to the next conformation in the ranking that is significantly different, as monitored by a large r.m.s.d., this structure can be considered preferred and marginally stable.
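The free energy gap criterion can be illustrated with a short sketch. It assumes the ranked output has already been parsed into (free energy, r.m.s.d. to the lowest energy structure) pairs. The function names and the `rmsd_cutoff` parameter are hypothetical, since the text does not state which r.m.s.d. value counts as significantly different.

```python
from typing import List, Optional, Tuple


def energy_gap(ranked: List[Tuple[float, float]], rmsd_cutoff: float) -> Optional[float]:
    """Free energy gap between the lowest energy conformation and the next
    conformation in the ranking whose r.m.s.d. to it is at least rmsd_cutoff.

    ranked holds (free_energy, rmsd_to_lowest) pairs sorted by increasing
    free energy; the first entry is the lowest energy conformation.
    Returns None when no sufficiently different conformation is listed.
    """
    lowest_energy = ranked[0][0]
    for energy, rmsd in ranked[1:]:
        if rmsd >= rmsd_cutoff:
            return energy - lowest_energy
    return None


def is_preferred(ranked: List[Tuple[float, float]], rmsd_cutoff: float,
                 min_gap: float = 0.5) -> bool:
    """Apply the 0.5 kcal/mol gap criterion quoted in the text."""
    gap = energy_gap(ranked, rmsd_cutoff)
    return gap is not None and gap >= min_gap
```

In this sketch a ranking without any sufficiently different alternative is treated as not preferred, which is a conservative choice rather than documented behaviour.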
In contrast to Prelude, Fugue is designed to be applied to full-protein sequences. It compiles the predictions of Prelude on a given protein and identifies strongly predicted segments. It proceeds by dividing the sequence into short overlapping segments of 5–15 residues and applying Prelude to each segment. A segment is retained if its lowest free energy conformation displays an energy gap of at least 0.5 kcal/mol relative to the next best structure that is sufficiently different in terms of r.m.s.d. The number of retained segments that map onto each sequence position is called the confidence. It measures the strength of the prediction: the higher the confidence, the higher the probability that the predicted and native structures coincide. Segments with high confidence values are likely to adopt a preferred conformation when excised from the rest of the chain or to be formed at the very beginning of the folding process (Rooman et al., 1991, 1992; Rooman and Wodak, 1992). This hierarchic view of folding is supported by experimental data and theoretical considerations (Baldwin and Rose, 1999). The predicted segments typically correspond to a helical or extended stretch of 5–10 residues or to a turn. Note that the user can choose whether to take into account or neglect the sequence environment of the segments in the predictions, i.e. the eight residues upstream and the eight residues downstream. The former option treats the predicted segment as covalently linked to the rest of the chain and thus predicts early folding events, whereas the latter is akin to excising the segment from the chain and considering it in isolation. The prediction scores of Prelude and Fugue are shown in Table 1. More details are given on the website.
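The confidence measure can be sketched as a count of retained segments per sequence position. The sketch below is an illustration only: it assumes that every window of every length from 5 to 15 residues is tried at every start position, which the text does not specify, and it takes the retention decision as a caller-supplied function `is_retained` (a hypothetical stand-in for running Prelude on the segment and applying the energy gap test). The sequence environment option is not modelled.

```python
from typing import Callable, Iterator, List, Tuple


def overlapping_segments(length: int, min_len: int = 5,
                         max_len: int = 15) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) half-open index pairs for all windows of
    min_len..max_len residues that fit in a sequence of the given length."""
    for seg_len in range(min_len, max_len + 1):
        for start in range(0, length - seg_len + 1):
            yield start, start + seg_len


def confidence_profile(sequence: str,
                       is_retained: Callable[[str], bool],
                       min_len: int = 5, max_len: int = 15) -> List[int]:
    """Count, for each position, the retained segments that cover it."""
    confidence = [0] * len(sequence)
    for start, end in overlapping_segments(len(sequence), min_len, max_len):
        if is_retained(sequence[start:end]):
            for position in range(start, end):
                confidence[position] += 1
    return confidence
```

Positions covered by many retained segments receive a high count, which mirrors the idea that strongly predicted regions are supported by several overlapping windows.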
Table 1. Prediction scores of Prelude and Fugue.

State | Prelude scores | Fugue scores |
---|---|---|
A | 65ᵃ–57ᵇ | 77ᵃ–85ᵇ |
C | 31–40 | 36–26 |
A or C | 68–69 | 83–85 |
B | 53–50 | 66–67 |
P | 39–32 | 50–29 |
B or P | 66–59 | 79–69 |
G | 35–44 | 36–52 |
E | 46–64 | 24–65 |
O | 7–13 | 9–13 |
G, E or O | 38–71 | 30–54 |
Average | | |
Seven states | 47–48 | 66–65 |
Three states | 64–65 | 77–78 |

ᵃ Specificity = N_correct/N_predicted.
ᵇ Sensitivity = N_correct/N_observed.
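The two scores defined in the table footnotes can be computed from a predicted and an observed letter string as in the following sketch. The function name and the convention that a grouped state such as "A or C" counts as correct when both letters fall in the group are assumptions made for illustration; the text does not spell out how the published scores were tallied.

```python
from typing import Optional, Tuple


def state_scores(predicted: str, observed: str,
                 states: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (specificity, sensitivity) for one state or group of states.

    specificity = N_correct / N_predicted
    sensitivity = N_correct / N_observed
    A score is None when its denominator is zero.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    n_predicted = sum(p in states for p in predicted)
    n_observed = sum(o in states for o in observed)
    n_correct = sum(p in states and o in states
                    for p, o in zip(predicted, observed))
    specificity = n_correct / n_predicted if n_predicted else None
    sensitivity = n_correct / n_observed if n_observed else None
    return specificity, sensitivity
```

Passing `states="AC"` scores the grouped "A or C" state, while `states="A"` scores the single state.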
Prelude and Fugue have proven quite successful in several applications, first of all in predicting the location of flickering early folding units (Rooman and Wodak, 1992) and of peptides that adopt a certain amount of structure in solution (Rooman et al., 1992). For example, Fugue has been used to identify four protein fragments in cytochrome c2 and calcium-binding protein that are predicted to preferentially form helical conformations in solution. Prelude has then been applied to these fragments to obtain a more precise estimate of their (marginal) stability. These four peptides have been synthesized and characterized by circular dichroism and nuclear magnetic resonance. A remarkable agreement between predictions and experiments has been observed, both in the relative stability of the peptides and in the limits of the structured regions (Pintar et al., 1994).
Although tertiary interactions are neglected, the conformations strongly predicted by Fugue, with high confidence values, generally coincide with the native structures (Rooman et al., 1992). This can be explained by the fact that these conformations are so strongly preferred by local interactions along the chain that tertiary forces are not sufficient to break them. In some sequence regions, however, predicted and native structures differ. We interpret these regions as structural weaknesses, defined as regions whose intrinsic structural preferences are in contradiction with the tertiary fold. Such regions might be expected to slow down folding and make the protein more subject to structural modifications or alternative folding. We often found such weaknesses in proteins related to conformational diseases, such as the prion protein (Gilis and Rooman, 2000), and in 3D domain swapping proteins (Dehouck et al., 2003). Other computational approaches aiming at understanding folding and misfolding are reviewed in Dokholyan (2006).
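One way to locate candidate structural weaknesses is to look for stretches where a high-confidence prediction disagrees with the native assignment. The sketch below is a hypothetical illustration, not the authors' procedure: it assumes per-position letter strings for the predicted and native structures, a per-position confidence list, and a user-chosen `min_confidence` threshold that the text does not specify.

```python
from typing import List, Sequence, Tuple


def structural_weaknesses(predicted: str, native: str,
                          confidence: Sequence[int],
                          min_confidence: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges of consecutive positions
    where the prediction has confidence >= min_confidence and its letter
    differs from the native one."""
    if not (len(predicted) == len(native) == len(confidence)):
        raise ValueError("all inputs must have the same length")
    regions = []
    start = None
    for i, (p, n, c) in enumerate(zip(predicted, native, confidence)):
        flagged = c >= min_confidence and p != n
        if flagged and start is None:
            start = i                      # a new region opens here
        elif not flagged and start is not None:
            regions.append((start, i))     # the open region ends before i
            start = None
    if start is not None:
        regions.append((start, len(predicted)))
    return regions
```

Low-confidence mismatches are ignored in this sketch, since only strongly predicted conformations are expected to coincide with the native structure.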
Prelude and Fugue present several advantages compared with more established local/secondary structure prediction methods. In summary, they make it possible to (1) predict alternative conformations ranked by their relative stabilities, (2) identify flexible or stable sequence regions, and (3) obtain structures compatible with interresidue distance constraints.
J.-P. Kocher is acknowledged for interesting suggestions. Funding to pay the Open Access publication charges for this article was provided by the European Community through the Concerted Action Quality of Life 2001-3-8.4. M.R. is research director at the Belgian Fund for Scientific Research (F.N.R.S.).
Conflict of Interest: none declared.
Associate Editor: Anna Tramontano