Lucas Chivot, Noé Mathieux, Anna Cosson, Antoine Bridier-Nahmias, Loïc Favennec, Jean-Christophe Gelly, Jérôme Clain, Romain Coppée, CONSTRUCT: an algorithmic tool for identifying functional or structurally important regions in protein tertiary structure, Bioinformatics, 2025, btaf166, https://doi.org/10.1093/bioinformatics/btaf166
Abstract
Evolutionary rates in protein-coding genes vary widely, reflecting functional and/or structural constraints. Essential or highly expressed proteins tend to evolve more slowly, and within a protein, different amino acid sites experience distinct selective pressures. Accurately modeling this variation is critical for identifying functionally and/or structurally important amino acid sites. Standard methods assume that substitution rates are independent across sites, so the most conserved sites they identify often appear scattered throughout the protein tertiary structure. This is biologically unrealistic, as functional sites tend to cluster in three-dimensional space.
Here, we developed CONSTRUCT, an improved strategy for detecting functionally and structurally important regions in protein tertiary structure. Given a set of orthologous sequences, CONSTRUCT first estimates site-specific substitution rates using the Rate4site model. Each rate is then weighted by the rates of neighboring amino acid sites within an optimally defined window size, chosen as the one that yields the strongest spatial correlation. To refine cluster detection, CONSTRUCT can analyze either the Cα atoms or the centers of mass of amino acid sites, the latter accounting for side chain orientation. Extensive simulations and validation on 14 functionally characterized proteins of diverse sizes, interspecies conservation levels, and taxonomic origins demonstrated the robustness of CONSTRUCT. The results highlight CONSTRUCT as a powerful tool for guiding site-directed mutagenesis experiments aimed at elucidating protein function.
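The neighbor-weighting idea can be illustrated with a minimal Python sketch. It is not the CONSTRUCT implementation: the function names, the use of the k nearest sites as the "window", the unweighted averaging, and the use of a Pearson correlation between each site's rate and its neighbors' mean rate as the "spatial correlation" are all assumptions made for illustration. Site-specific rates are taken as given (for example, from Rate4site output), and coordinates may be either Cα positions or per-residue centers of mass.

```python
import numpy as np

def center_of_mass(atom_coords, atom_masses):
    """Mass-weighted mean position of one residue's atoms (hypothetical helper)."""
    return np.average(np.asarray(atom_coords, dtype=float), axis=0,
                      weights=np.asarray(atom_masses, dtype=float))

def pairwise_distances(coords):
    """Euclidean distance matrix between per-site 3D coordinates (n x 3)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def neighbor_mean(rates, dist, k):
    """Mean rate of the k spatially nearest sites, excluding the site itself."""
    d = dist.copy()
    np.fill_diagonal(d, np.inf)          # a site is never its own neighbor
    nearest = np.argsort(d, axis=1, kind="stable")[:, :k]
    return rates[nearest].mean(axis=1)

def choose_window(rates, dist, candidates):
    """Assumed criterion: keep the k giving the highest Pearson correlation
    between each site's rate and the mean rate of its k nearest neighbors."""
    scores = {k: np.corrcoef(rates, neighbor_mean(rates, dist, k))[0, 1]
              for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores

def weighted_rates(rates, dist, k):
    """Average each site's rate with those of its k nearest neighbors
    (equal weights; the real weighting scheme may differ)."""
    return (rates + k * neighbor_mean(rates, dist, k)) / (k + 1)

# Toy input: 30 random site positions and rates, standing in for real data.
rng = np.random.default_rng(0)
toy_coords = rng.uniform(0.0, 30.0, size=(30, 3))
toy_rates = rng.gamma(shape=2.0, scale=0.5, size=30)

toy_dist = pairwise_distances(toy_coords)
toy_k, toy_scores = choose_window(toy_rates, toy_dist, candidates=range(2, 8))
toy_smoothed = weighted_rates(toy_rates, toy_dist, toy_k)
```

In this sketch, sites with the lowest smoothed rates would be the candidates for a spatially clustered conserved region. Passing per-residue centers of mass instead of Cα coordinates to `pairwise_distances` changes which sites count as neighbors, which is how side chain orientation would enter the calculation.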
The CONSTRUCT program and documentation are freely available at https://github.com/Rcoppee/CONSTRUCT.
Supplementary data are available at Bioinformatics online.