Joan Segura, Ruben Sanchez-Garcia, C O S Sorzano, J M Carazo, 3DBIONOTES v3.0: crossing molecular and structural biology data with genomic variations, Bioinformatics, Volume 35, Issue 18, September 2019, Pages 3512–3513, https://doi.org/10.1093/bioinformatics/btz118
Abstract
Many diseases are associated with single nucleotide polymorphisms that affect critical regions of proteins, such as binding sites or post-translational modifications. Therefore, analysing genomic variants together with structural and molecular biology data is a powerful framework for elucidating the potential causes of such diseases.
A new version of our web framework 3DBIONOTES is presented. This version offers new tools to analyse and visualize protein annotations and genomic variants, including a contingency analysis of variants and amino acid features by means of a Fisher exact test, the integration of a gene annotation viewer to highlight protein features on gene sequences and a protein–protein interaction viewer to display protein annotations at network level.
The web server is available at https://3dbionotes.cnb.csic.es
Supplementary data are available at Bioinformatics online.
Spanish National Institute for Bioinformatics (INB ELIXIR-ES) and Biocomputing Unit, National Centre of Biotechnology (CSIC)/Instruct Image Processing Centre, C/ Darwin nº 3, Campus of Cantoblanco, 28049 Madrid, Spain.
1 Introduction
Next-generation sequencing has flooded many databases with biomedical data where single-nucleotide variations are associated with phenotypes or diseases (Zerbino et al., 2018). This information comprises collections of variant–disease pairs that can be used to infer which genomic variations might be involved in a particular disease. However, changes in the biochemical or structural features of the affected amino acids (if applicable) can be more informative for understanding the causes of diseases. For that reason, some of the existing resources that compile variant–disease knowledge also annotate protein residues with biochemical features, showing which properties could be affected (Dingerdissen et al., 2018).
In this work, we present a new version of 3DBIONOTES (Segura et al., 2017; Tabas-Madrid et al., 2016) in which different analysis tools and viewers have been integrated to find how genomic variants may affect the different protein residues. 3DBIONOTES is a web framework that integrates biological annotations and structural information of proteins from multiple sources (see Supplementary Section S1). In this version, the application computes Fisher’s exact test to find which biochemical or structural features are significantly affected by the variants associated with a particular disease. A new panel displays the annotated regions where the co-occurrence between protein features and variants is statistically enriched. In addition, a gene annotation viewer has been fully integrated to display protein features at gene level, and a protein–protein interaction (PPI) viewer has been included so that the different annotations can be displayed at network level. As an additional tool, a new type of query, request by set of proteins, has been implemented to explore and analyse PPI networks. Finally, custom annotations, including variants, can be submitted and analysed together with the biological features integrated in the application.
2 New features
2.1 Gene annotation viewer
In this version of 3DBIONOTES, a gene annotation viewer has been fully integrated. This panel displays gene information from the ENSEMBL database (Zerbino et al., 2018); the collected information includes introns, exons, coding regions and genomic variants. Moreover, ENSEMBL gene sequences are aligned with UniProt (UniProt Consortium, 2018) and PDB (Burley et al., 2018) amino acids in such a way that annotated protein regions can be highlighted on gene sequences and vice versa (see Supplementary Section S2).
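The idea of highlighting a protein residue on the gene sequence can be illustrated with a simplified coordinate mapping. The sketch below is not the alignment procedure used by 3DBIONOTES; it assumes a forward-strand gene whose coding exons are given as sorted, 1-based inclusive genomic intervals, and that the protein sequence matches the translated coding sequence exactly. The function name `codon_genomic_positions` and the toy coordinates are hypothetical.

```python
def codon_genomic_positions(cds_exons, residue):
    """Return the three genomic positions of the codon encoding `residue`.

    cds_exons: sorted list of (start, end) coding intervals, 1-based and
               inclusive, on the forward strand (an assumption of this sketch).
    residue:   1-based position in the protein sequence.
    """
    positions = []
    # Offsets of the codon's three bases within the spliced coding sequence.
    for offset in range(3 * (residue - 1), 3 * residue):
        remaining = offset
        for start, end in cds_exons:
            length = end - start + 1
            if remaining < length:
                positions.append(start + remaining)
                break
            remaining -= length  # skip this exon and continue in the next one
        else:
            raise ValueError("residue lies beyond the coding sequence")
    return positions


# Hypothetical toy gene with two coding exons of 5 and 11 bases.
toy_exons = [(100, 104), (200, 210)]
first_codon = codon_genomic_positions(toy_exons, 1)
split_codon = codon_genomic_positions(toy_exons, 2)  # spans the exon boundary
```

A codon that straddles an intron, like the second one in the toy gene, maps to positions in two different exons, which is why a viewer has to highlight a set of positions rather than a single contiguous range.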
2.2 Genomic variants contingency analysis
To find which protein regions are significantly affected by the genomic variants associated with a particular disease, Fisher’s exact test is computed between each annotation and the variants associated with each disease. The main objective is to detect residues where structural or biochemical annotations co-occur with disease-associated variants more often than expected by chance. For example, most cancer-related variants of the human KRAS protein map to its nucleotide-binding region (see Section 3.1 and Supplementary Section S3).
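The contingency analysis can be sketched as a 2x2 table over the residues of one protein: annotated or not, and carrying a disease variant or not. The example below is a minimal illustration, assuming a one-sided (enrichment) test and one count per residue; the source does not state which alternative hypothesis or counting scheme 3DBIONOTES uses. The function names and the toy residue sets are hypothetical.

```python
from math import comb


def contingency_table(annotated, variant, length):
    """Build the 2x2 table (a, b, c, d) from two sets of residue positions.

    a: annotated residues with a variant    b: annotated, no variant
    c: non-annotated with a variant         d: non-annotated, no variant
    """
    annotated, variant = set(annotated), set(variant)
    a = len(annotated & variant)
    b = len(annotated - variant)
    c = len(variant - annotated)
    d = length - a - b - c
    return a, b, c, d


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test: probability of observing at least `a`
    annotated residues with a variant under the hypergeometric null."""
    n = a + b + c + d
    row1 = a + b  # annotated residues
    col1 = a + c  # residues carrying a variant
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Hypothetical toy protein of 20 residues (not real KRAS data).
toy_annotated = range(1, 6)   # residues 1-5 carry the annotation
toy_variants = [2, 3, 4, 10]  # residues with a disease variant
table = contingency_table(toy_annotated, toy_variants, 20)
p_value = fisher_exact_greater(*table)
```

With real data, one such test would be run per annotation type and disease, so some form of multiple-testing correction would normally be considered when interpreting the resulting p-values.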
2.3 Exploring PPI networks
A new type of query to request information for a set of proteins is now available. Moreover, a panel to visualize PPI networks using a graph-based representation has been integrated. This panel displays the physical binding between proteins when the information for a multimeric entry is requested, or the PPIs that have been experimentally observed when a given set of proteins is submitted. In the first case, the contacts are computed using a distance threshold of 6 Å between heavy atoms. In the second case, PPI data are collected from Interactome3D (Mosca et al., 2013). Moreover, the network panel can display annotations at network level using an approach similar to that of dSysMap (Mosca et al., 2015) (see Supplementary Section S4).
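The contact criterion for multimeric entries can be sketched as follows. This is only an illustration under stated assumptions: two residues are taken to be in contact when any pair of their non-hydrogen atoms lies within 6 Å, the threshold is treated as inclusive, and each chain is represented by a hypothetical dictionary mapping residue numbers to lists of `(element, (x, y, z))` tuples. A real implementation would read coordinates from a structure file and would likely use a spatial index instead of this all-against-all loop.

```python
from math import dist

CUTOFF = 6.0  # distance threshold in Å, as stated in the text


def residue_contacts(chain_a, chain_b, cutoff=CUTOFF):
    """Return the set of (residue_a, residue_b) pairs in contact.

    chain_a, chain_b: dict mapping residue number -> list of
                      (element, (x, y, z)) atoms (hypothetical layout).
    Hydrogen atoms are ignored so that only heavy atoms are compared.
    """
    contacts = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(
                dist(xyz_a, xyz_b) <= cutoff
                for el_a, xyz_a in atoms_a if el_a != "H"
                for el_b, xyz_b in atoms_b if el_b != "H"
            ):
                contacts.add((res_a, res_b))
    return contacts


# Hypothetical toy coordinates, not taken from any PDB entry.
toy_chain_a = {10: [("N", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0))],
               11: [("C", (20.0, 0.0, 0.0))]}
toy_chain_b = {55: [("O", (5.0, 0.0, 0.0))],
               56: [("H", (21.0, 0.0, 0.0)), ("C", (30.0, 0.0, 0.0))]}
toy_contacts = residue_contacts(toy_chain_a, toy_chain_b)
```

In the toy data, residues 11 and 56 are close only through a hydrogen atom, so they are not reported as a contact.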
2.4 Submitting custom annotations
The application supports the submission of custom data in such a way that users can analyse their own genomic variants or other annotations and compare them with 3DBIONOTES integrated data. The submitted information is fully integrated and the different visualization and analysis tools can be used to display and process the external data.
3 Use cases
3.1 Analysis on the human KRAS genomic variants
In this example, we analysed the genomic variants associated with the human KRAS protein (UniProt accession P01116). KRAS is a GTPase that acts as a signalling switch in many transduction pathways, including cell proliferation. The active state of KRAS occurs when the protein is bound to GTP. In this state, the protein recruits and activates other growth factors and cell signalling receptors. Upon GTP hydrolysis and conversion to GDP, KRAS is inactivated. KRAS mutations are known to be involved in different diseases such as multiple cancer types or neurofibromatosis (Simanshu et al., 2017). We used 3DBIONOTES to analyse the co-occurrence of disease-associated KRAS variants with the different biochemical annotations. The aim was to check whether those variants occur in particular regions of KRAS or are randomly distributed. Supplementary Figure S4 and Supplementary Tables S1 and S2 display the analysis panel of 3DBIONOTES and clearly show that many of those variants occur in the ‘Nucleotide Binding Site’ annotated regions. KRAS acts as an on/off switch in many processes, and its active or inactive form depends on the interaction with GTP or GDP, respectively. Consequently, mutations affecting the GTP/GDP-binding sites of KRAS may affect its activation and, therefore, many cell regulatory processes.
3.2 GNB1 neurodevelopmental disability
This example illustrates how 3DBIONOTES can be used to analyse external variants. We collected the variants of the G protein subunit beta (GNB1) associated with neurodevelopmental delay, hypotonia and seizures reported by Petrovski et al. (2016) (see Supplementary Table S3). The GNB1 protein modulates transmembrane signalling pathways controlled by G protein-coupled receptors. We requested the PPI network information for the GNB1 protein (UniProt accession P62873) and submitted the collected variants to 3DBIONOTES. When the variants are mapped to the PPI network, many of them fall on the binding sites between GNB1 and other G proteins. Moreover, the contingency analysis identified that the co-occurrence between many of the GNB1-binding sites and the submitted variants was statistically significant (see Supplementary Fig. S8). Consequently, mutations in the GNB1-binding sites may affect the interaction with other G proteins and, thus, some of the cell signalling pathways involving G proteins.
Funding
This work was supported by Ministerio de Economía, Industria y Competitividad, Gobierno de España [grant No. BIO2016-76400-R(AEI/FEDER, UE)]; Comunidad de Madrid [grant No. S2017/BMD-3817]; Instituto de Salud Carlos III [grant No. PT13/0001/0009; INB Grant PT17/0009/0010 - ISCIII-SGEFI/ERDF]; Horizon 2020 [grant No. Elixir – EXCELERATE INFRADEV-3-2015, Proposal 676559] and iNEXT [INFRAIA-1-2014-2015, Proposal 653706]; Ministerio de Ciencia, Innovación y Universidades, Gobierno de España [Juan de la Cierva-E-28-2018-0015407 to J.S.]; and Ministerio de Educación, Cultura y Deporte [FPU-2015/264 to R.S.-G.].
Conflict of Interest: none declared.
References
UniProt Consortium. (