Abstract

Motivation

Evolutionary rates in protein-coding genes vary widely, reflecting functional and/or structural constraints. Essential or highly expressed proteins tend to evolve more slowly, and within a protein, different amino acid sites experience distinct selective pressures. Accurately modeling this variation is critical for identifying functionally and/or structurally important amino acid sites. Standard methods assume that substitution rates are independent across sites, so the most conserved sites they identify are often scattered across the protein tertiary structure. This is biologically unrealistic, as functional sites tend to cluster in three-dimensional space.

Results

Here, we developed CONSTRUCT, an improved strategy for detecting functionally and structurally important regions in protein tertiary structure. Given a set of orthologous sequences, CONSTRUCT first estimates site-specific substitution rates using the Rate4site model. These rates are then weighted by the rates of neighboring amino acid sites within an optimally defined window size, chosen as the one that yields the strongest spatial correlation. To refine cluster detection, CONSTRUCT can analyze either the Cα atoms or the centers of mass of amino acid sites, the latter accounting for side chain orientation. Extensive simulations and validation on 14 functionally characterized proteins of diverse sizes, interspecies conservation levels, and taxonomic origins demonstrated the robustness of CONSTRUCT. The results highlight CONSTRUCT as a powerful tool for guiding site-directed mutagenesis experiments aimed at elucidating protein function.
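
The spatial weighting step can be illustrated with a minimal sketch. It is not the CONSTRUCT implementation, and its specifics are assumptions made for illustration: the window is treated as a distance radius, neighbors are averaged without weights, and the radius is chosen by maximizing the Pearson correlation between each site's rate and the mean rate of its neighbors. The function names (`neighbor_means`, `best_radius`, `smoothed_rates`) are hypothetical. The coordinates passed in could be either Cα positions or side chain centers of mass.

```python
import math


def neighbor_means(coords, rates, radius, include_self):
    """Mean rate over the sites lying within `radius` of each site.

    Returns None for a site that has no neighbor in range.
    """
    means = []
    for i, ci in enumerate(coords):
        vals = [
            rates[j]
            for j, cj in enumerate(coords)
            if (include_self or j != i) and math.dist(ci, cj) <= radius
        ]
        means.append(sum(vals) / len(vals) if vals else None)
    return means


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_radius(coords, rates, candidates):
    """Assumed criterion: pick the radius with the strongest correlation
    between a site's own rate and the mean rate of its neighbors."""
    best = None
    for r in candidates:
        nb = neighbor_means(coords, rates, r, include_self=False)
        pairs = [(x, m) for x, m in zip(rates, nb) if m is not None]
        if len(pairs) < 3:
            continue  # too few sites with neighbors to correlate
        c = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if best is None or c > best[1]:
            best = (r, c)
    return best


def smoothed_rates(coords, rates, candidates):
    """Replace each rate by the mean over its spatial window (self included)."""
    best = best_radius(coords, rates, candidates)
    if best is None:
        raise ValueError("no candidate radius gave enough neighbor pairs")
    radius = best[0]
    return radius, neighbor_means(coords, rates, radius, include_self=True)


if __name__ == "__main__":
    # Toy data (made up): a slow-evolving cluster and a fast-evolving one.
    coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 0)]
    rates = [0.1, 0.2, 0.1, 2.0, 2.1, 1.9]
    print(smoothed_rates(coords, rates, [1.5, 20.0]))
```

In this sketch, a radius large enough to include every site is penalized because each site's neighborhood mean then varies inversely with its own rate, whereas a radius that matches the cluster size yields a strong positive correlation.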

Availability

The CONSTRUCT program and documentation are freely available at https://github.com/Rcoppee/CONSTRUCT.

Supplementary information

Supplementary data are available at Bioinformatics online.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Arne Elofsson