Irina Balaur, Ludovic Roy, Alexander Mazein, S Gökberk Karaca, Ugur Dogrusoz, Emmanuel Barillot, Andrei Zinovyev, cd2sbgnml: bidirectional conversion between CellDesigner and SBGN formats, Bioinformatics, Volume 36, Issue 8, April 2020, Pages 2620–2622, https://doi.org/10.1093/bioinformatics/btz969
Abstract
CellDesigner is a well-established biological map editor used in many large-scale scientific efforts. However, interoperability between the Systems Biology Graphical Notation (SBGN) Markup Language (SBGN-ML) and CellDesigner’s proprietary Systems Biology Markup Language (SBML) extension format remains a challenge because CellDesigner files rely on these proprietary extensions.
We introduce a library named cd2sbgnml and an associated web service for bidirectional conversion between CellDesigner’s proprietary SBML extension and SBGN-ML formats. We discuss the functionality of the cd2sbgnml converter, which was successfully used for the translation of comprehensive large-scale diagrams such as the RECON Human Metabolic network and the complete Atlas of Cancer Signalling Network, from the CellDesigner file format into SBGN-ML.
The cd2sbgnml conversion library and the web service were developed in Java, and distributed under the GNU Lesser General Public License v3.0. The sources along with a set of examples are available on GitHub (https://github.com/sbgn/cd2sbgnml and https://github.com/sbgn/cd2sbgnml-webservice, respectively).
Supplementary data are available at Bioinformatics online.
1 Introduction
Systems Biology standard formats such as Systems Biology Markup Language (SBML) (Hucka et al., 2003), Systems Biology Graphical Notation (SBGN) (Le Novère et al., 2009) and Biological Pathways Exchange Language (BioPAX) (Demir et al., 2010) have been developed to allow accurate computational description of biological systems and to ensure interoperability of biological resources and portability of biomodels. The SBGN standard focuses on graphical representation. It includes three complementary languages for diagram representation, namely (i) the Process Description (PD) to describe biochemical interactions, (ii) the Activity Flow (AF) to represent information flow among biochemical entities and (iii) the Entity Relationship (ER) to illustrate the relationships in which given entities participate in biological networks.
CellDesigner is a well-established Systems Biology Workbench that facilitates biological network management, including diagram editing and mathematical exploration of biochemical interactions (Funahashi et al., 2007). Maps developed using CellDesigner are included in large-scale scientific efforts such as the PANTHER Pathway (Mi and Thomas, 2009), the BioModels (Le Novère et al., 2006), the Atlas of Cancer Signaling Network (ACSN) (Kuperstein et al., 2015) and the Virtual Metabolic Human (Noronha et al., 2019) databases, and within the Disease Maps Project (Mazein et al., 2018). While several solutions handle the SBGN-specific format (e.g. Gonçalves et al., 2013; van Iersel et al., 2012) and the CellDesigner-specific format (e.g. Mi et al., 2011) separately, interoperability between the SBGN standard format and the CellDesigner file format (SBML extended with layout information) remains challenging because CellDesigner uses a proprietary format. For example, in addition to the generic protein (corresponding to the macromolecule in SBGN PD), CellDesigner also has receptor, ion channel and truncated protein glyphs. In addition to the state transition (corresponding to the generic process glyph in SBGN), CellDesigner has specific types of processes including translation, transcription, transport and truncation. Therefore, there is a need for translation between the CellDesigner format and the SBGN standard. Here, we present the cd2sbgnml library for two-way conversion between the CellDesigner file format and SBGN-ML. We also introduce a web service that runs this conversion library for use by systems biology software tools.
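The vocabulary mismatch described above can be pictured as a pair of lookup tables. The following is a minimal sketch, not the cd2sbgnml API: the class and method names are hypothetical, the collapse of all specialised reaction types onto the generic process glyph is an assumption, and so are the subtype labels that a converter might keep alongside the macromolecule.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Illustrative lookup tables only; this is not the cd2sbgnml API. */
final class GlyphMappingSketch {

    // CellDesigner protein types mentioned in the text; all share one SBGN PD glyph class.
    static final Map<String, String> PROTEIN_TO_SBGN = Map.of(
            "GENERIC", "macromolecule",
            "RECEPTOR", "macromolecule",
            "ION_CHANNEL", "macromolecule",
            "TRUNCATED", "macromolecule");

    // Assumption: specialised reaction types fall back to the generic process glyph.
    static final Map<String, String> REACTION_TO_SBGN = Map.of(
            "STATE_TRANSITION", "process",
            "TRANSCRIPTION", "process",
            "TRANSLATION", "process",
            "TRANSPORT", "process",
            "TRUNCATION", "process");

    /** Subtype text a converter could keep next to the glyph (hypothetical labels). */
    static Optional<String> subtypeLabel(String proteinType) {
        if (!PROTEIN_TO_SBGN.containsKey(proteinType)) {
            throw new IllegalArgumentException("Unknown protein type: " + proteinType);
        }
        if ("GENERIC".equals(proteinType)) {
            return Optional.empty(); // nothing extra to record for the generic case
        }
        return Optional.of(proteinType.toLowerCase(Locale.ROOT).replace('_', ' '));
    }

    public static void main(String[] args) {
        for (String type : new String[] {"GENERIC", "RECEPTOR", "ION_CHANNEL", "TRUNCATED"}) {
            System.out.println(type + " -> " + PROTEIN_TO_SBGN.get(type)
                    + subtypeLabel(type).map(s -> " [" + s + "]").orElse(""));
        }
    }
}
```

Because several CellDesigner types share one SBGN class, the mapping is many-to-one; a round trip can only restore the original type if the subtype is recorded somewhere on the SBGN side.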
2 Methods
The cd2sbgnml converter is a standalone Java-based tool developed for translation between the CellDesigner (version 4.4) file format, which is an extension of SBML with layout information, and the PD and AF languages of the SBGN standard format. The cd2sbgnml tool uses (i) the libSBGN library (van Iersel et al., 2012), dedicated to reading and writing SBGN files, and (ii) the JAXB library together with a manually curated XSD file (created specifically for CellDesigner 4.4) for handling CellDesigner files. The converter offers both a command line utility (scripts) and a graphical user interface to facilitate access to its functionality. A log file is also created to record any exception and warning messages generated during the translation process.
Information on notes and annotations is preserved from CellDesigner to SBGN for the model, species, reactions and compartments. In addition, note elements are taken from the celldesigner:species, celldesigner:protein, celldesigner:gene, celldesigner:RNA and celldesigner:AntisenseRNA elements. For the SBGN to CellDesigner translation, information on notes and annotations is taken from maps and glyphs only, not from arcs. Specifically, process, map and glyph annotations are used for <reaction>, <model> and <species> elements, respectively. Given that CellDesigner does not allow annotations for the individual components of complexes (called included species), RDF annotations of such elements are ignored during the translation. Finally, information on the style of nodes and edges (including color and annotation) is also conserved in both directions.
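The routing of annotations in the SBGN to CellDesigner direction, as described above, can be summarised in a few lines. This is a minimal sketch under stated assumptions: the enum, class and method names are hypothetical and do not come from the cd2sbgnml sources; only the target element names and the two exclusions (arcs, included species) are taken from the text.

```java
import java.util.Optional;

/** Illustrative routing of SBGN annotations to CellDesigner elements; names are hypothetical. */
final class AnnotationRoutingSketch {

    enum Source { MAP, PROCESS_GLYPH, ENTITY_GLYPH, ARC }

    /**
     * Returns the SBML element that would receive the annotation,
     * or empty when the annotation is dropped.
     */
    static Optional<String> targetElement(Source source, boolean insideComplex) {
        switch (source) {
            case MAP:
                return Optional.of("model");
            case PROCESS_GLYPH:
                return Optional.of("reaction");
            case ENTITY_GLYPH:
                // Included species cannot carry annotations in CellDesigner.
                return insideComplex ? Optional.empty() : Optional.of("species");
            default:
                // Arcs are not a source of notes or annotations.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(targetElement(Source.PROCESS_GLYPH, false).orElse("(dropped)"));
        System.out.println(targetElement(Source.ENTITY_GLYPH, true).orElse("(dropped)"));
        System.out.println(targetElement(Source.ARC, false).orElse("(dropped)"));
    }
}
```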
2.1 Managing SBGN to CellDesigner translation
SBGN process glyph representation during the translation process: A biological process is represented in SBGN by the process glyph, which has incoming and outgoing links (arcs) for consumption and production, respectively, and, optionally, regulatory links (arcs) such as catalysis and inhibition. The connection between the process glyph and the corresponding incoming/outgoing arcs is made via SBGN ports; specifically, each process glyph has two ports, one for incoming and another for outgoing arcs. Because CellDesigner does not provide a specific representation for SBGN ports, the cd2sbgnml converter introduces two graphical points associated with each process shape for an accurate illustration of the SBGN process glyphs (as shown in Supplementary file S1). The SBGN ports corresponding to the SBGN logical operators are treated similarly by the converter.
SBGN submap and perturbing agent glyphs: Given that the SBGN submap and perturbing agent glyphs have no counterpart in the CellDesigner notation, both glyphs are converted to phenotypes. The terminals inside the submaps are not translated, but the arcs pointing to each terminal are set to point to the replacing phenotype glyph. Those equivalence arcs are translated to instances of POSITIVE_INFLUENCE.
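The three substitutions just described (glyph class, arc target and arc class) can be sketched as follows. This is an illustration only: the class and method names and the id-based data model are assumptions, and glyph or arc classes not covered by the rule are simply passed through unchanged here, which is not necessarily what the converter does.

```java
import java.util.Map;

/** Illustrative fallback for submaps and perturbing agents; not the cd2sbgnml API. */
final class SubmapFallbackSketch {

    /** Submaps and perturbing agents become phenotypes; other classes pass through in this sketch. */
    static String cellDesignerClass(String sbgnGlyphClass) {
        if ("submap".equals(sbgnGlyphClass) || "perturbing agent".equals(sbgnGlyphClass)) {
            return "PHENOTYPE";
        }
        return sbgnGlyphClass;
    }

    /** An arc pointing at a terminal is redirected to the glyph that replaces the owning submap. */
    static String resolveTarget(String targetId, Map<String, String> terminalToSubmap) {
        return terminalToSubmap.getOrDefault(targetId, targetId);
    }

    /** Equivalence arcs become POSITIVE_INFLUENCE; other arc classes pass through in this sketch. */
    static String cellDesignerArcType(String sbgnArcClass) {
        return "equivalence arc".equals(sbgnArcClass) ? "POSITIVE_INFLUENCE" : sbgnArcClass;
    }

    public static void main(String[] args) {
        // Hypothetical ids: terminal "t1" belongs to submap "sm1".
        Map<String, String> terminalToSubmap = Map.of("t1", "sm1");
        System.out.println(cellDesignerClass("submap"));
        System.out.println(resolveTarget("t1", terminalToSubmap));
        System.out.println(cellDesignerArcType("equivalence arc"));
    }
}
```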
2.2 Managing CellDesigner to SBGN-ML translation
CellDesigner-specific protein-related representations such as the active/inactive states (active entities have a dashed line on the outside of their shape), the hypothetical entities (entities with a dashed border) and the binding region box are not translated since they have no equivalent in SBGN. Also, a multimer unit count greater than 2 is illustrated graphically in CellDesigner, but only numerically in SBGN (e.g. N:unit count).
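The numeric form used on the SBGN side is a short text label of the shape N:unit count. A minimal sketch of building that label is shown below; the class and method names are hypothetical, and the lower bound of two units is an assumption about what counts as a multimer.

```java
/** Illustrative construction of the SBGN multimer cardinality label; names are hypothetical. */
final class MultimerLabelSketch {

    /** Builds the "N:<count>" text shown in the unit of information of a multimer. */
    static String cardinalityLabel(int unitCount) {
        if (unitCount < 2) {
            // Assumption: anything below two units is not a multimer.
            throw new IllegalArgumentException("A multimer needs at least two units");
        }
        return "N:" + unitCount;
    }

    public static void main(String[] args) {
        // A tetramer drawn as stacked shapes in CellDesigner carries only this label in SBGN.
        System.out.println(cardinalityLabel(4));
    }
}
```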
Ports: Given that the port length is fixed, automatic port generation can change the arc orientation and thereby degrade the visual appearance of the diagram, especially in compact networks, depending on how the SBGN is rendered. A more readable network can be obtained if enough space is left between nodes.
The CellDesigner unknown logic gates have no counterpart in SBGN; in this case, the cd2sbgnml converter directly links the input entities concerned to the output process.
Direct links to other links: In CellDesigner, arcs can point to other arcs. However, this representation is valid in SBGN ER only, which is not currently supported by the cd2sbgnml converter. Thus, such links are discarded by the translator.
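Discarding links that point to other links amounts to a filter on the link targets. The sketch below assumes a simplified, id-based data model (the `Link` class and method names are hypothetical and not part of cd2sbgnml): a link is kept only when its target is a known node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative filter that drops links whose target is another link; not the cd2sbgnml API. */
final class ArcFilterSketch {

    /** Minimal stand-in for a link: its own id and the id of whatever it points to. */
    static final class Link {
        final String id;
        final String targetId;

        Link(String id, String targetId) {
            this.id = id;
            this.targetId = targetId;
        }
    }

    /** Keeps only the links whose target is a node; links targeting other links are discarded. */
    static List<Link> keepLinksToNodes(List<Link> links, Set<String> nodeIds) {
        List<Link> kept = new ArrayList<>();
        for (Link link : links) {
            if (nodeIds.contains(link.targetId)) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> nodeIds = Set.of("p1", "s1");
        List<Link> links = List.of(
                new Link("a1", "p1"),  // points to a process node: kept
                new Link("a2", "a1")); // points to another link: discarded
        for (Link link : keepLinksToNodes(links, nodeIds)) {
            System.out.println(link.id + " -> " + link.targetId);
        }
    }
}
```

A real converter would also want to log each discarded link so that the loss is visible to the user.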
Representation of CellDesigner ‘mixed’ diagrams: While the reduced notation of CellDesigner is similar to the AF SBGN language, CellDesigner allows development of ‘mixed’ diagrams, where the main notation and the reduced notation are combined. However, given that the SBGN standard represents PD and AF as separate languages, losing reactions of one type or the other is unavoidable when converting such mixed diagrams.
2.3 cd2sbgnml as a web service
The cd2sbgnml conversion library can be easily turned into a web service through another library named cd2sbgnml-service. An example usage of this service can be found in Newt (release 1.1.0, available since June 21, 2018) under the File import and export menus. Newt (http://web.newteditor.org) is a free, web-based, open-source viewer and editor for pathways in the SBGN standard format.
3 Results
The cd2sbgnml converter was used to translate a set of biological maps from CellDesigner’s proprietary SBML extension into SBGN, including those of the RECON Human Metabolic network (Noronha et al., 2019) and of the ACSN (Kuperstein et al., 2015). These maps are among the most comprehensive biological diagrams currently available to the Systems Biology scientific community. For example, the Recon3D map contains 17 030 species (4140 unique metabolites and 12 890 proteins) and 13 543 metabolic reactions (Noronha et al., 2019) and the complete ACSN 2.0 map is composed of 13 interconnected signaling maps, containing around 9692 species (2997 proteins, 740 RNAs, 130 antisense RNAs, 808 genes, 665 simple molecules and 775 small molecules) and 8137 reactions (https://acsn.curie.fr/). Conversion from the CellDesigner file format to SBGN, using a MacBook Pro (3 GHz Intel Core i7, 16 GB), took ∼13 s for Recon3D and ∼18 s for the complete ACSN map. Details on map composition (number of species and reactions/processes involved) and conversion time for these map collections are given in Table S1 of Supplementary file S2. Figure 1 illustrates how a receptor is represented in CellDesigner (Figure 1A) and in SBGN, as rendered by the Newt editor (Figure 1B) and by VANTED (Rohn et al., 2012) (Figure 1C); the example is based on the ACSN Dendritic Cell map (https://acsn.curie.fr). Specifically, receptor information is represented graphically as a six-sided shape in CellDesigner and is given by the unit of information glyph attached to the corresponding protein in SBGN.

Fig. 1. Simple illustration of an IL6R receptor subnetwork in CellDesigner (A) and in SBGN within the Newt editor (B) and within the VANTED software (C). Source: the ACSN Dendritic Cell map: https://acsn.curie.fr
4 Conclusion
The cd2sbgnml tool is a fully functional bidirectional converter between the CellDesigner-specific SBML extension and SBGN-ML formats. The cd2sbgnml converter allows existing collections of maps developed in CellDesigner to be explored and analysed with a large set of well-established computational Systems Biology tools specialized in handling the SBGN standard, such as VANTED and Newt.
Integration of the cd2sbgnml functionality into the comprehensive Systems Biology Format Converter framework (Rodriguez et al., 2016), which manages translation between different Systems Biology standard formats including SBML and BioPAX, is planned.
Acknowledgements
We thank Marek Ostaszewski and Piotr Gawron for improving the CellDesigner version of ACSN.
Funding
This work was supported by the European Union Horizon 2020 research and innovation programme PRECISE [668858 to L.R. and E.B.], iPC [826121 to A.Z. and E.B.]; and the Scientific and Technological Research Council of Turkey [111E036 to U.D.]. This work was also supported by the Innovative Medicines Initiative Joint Undertaking under grant agreement no. IMI 115446 (eTRIKS) to Charles Auffray and Rudi Balling, resources of which are composed of financial contributions from the European Union’s Seventh Framework Programme (2007–2013) and EFPIA companies.
Conflict of Interest: none declared.
References
Author notes
Irina Balaur and Ludovic Roy wish it to be known that these authors contributed equally.