Motivation: There exist few simple and easily accessible methods to integrate ontologies programmatically in the R environment. We present
Availability: The package and sources are freely available in Bioconductor starting from version 2.8: http://bioconductor.org/help/bioc-views/release/bioc/html/ontoCAT.html or via the OntoCAT website http://www.ontocat.org/wiki/r.
The R package
Several hundred public ontologies and numerous private ontologies for describing biological data exist today. Using ontologies in R (Gentleman, 2008; R Development Core Team, 2006) is difficult because there is no uniform package support. At the same time, numerous Java-based ontology projects are available.
gives unified, format-independent access to ontology terms and the ontology hierarchy represented in OWL and OBO formats;
provides basic methods for ontology traversal, such as searching for terms, listing a specific term's relations, showing paths to the term from the root element of the ontology, showing flattened-tree representations of the ontology hierarchy; and
supports working with groups of ontologies and with major public ontology repositories: searching for terms across ontologies, listing available ontologies and loading ontologies for further analysis as necessary.
No other package with similar functionality exists at the moment in the R environment.
The integration of the above functionality into R allows combining and automating ontology-related tasks. Different examples of ontology-related tasks that can be accomplished with the help of the
There is a large research community already using R to work with Gene Ontology (GO). Working with other ontologies is not as well developed and
Single ontology traversal methods.
Methods to work across multiple ontologies.
2.1 Single ontology traversal methods
Reasoning over ontologies and extracting relationships is supported by the HermiT reasoner (Motik et al., 2009). OBO ontologies are translated by the OWL API (Horridge and Bechhofer, 2009) into valid OWL format that can be reasoned over.
Ontologies can also be loaded from ontology repositories. Two public repositories are supported: BioPortal, for accessing and sharing biomedical ontologies (Noy et al., 2009), currently hosting 241 ontologies; and the Ontology Lookup Service (OLS), for querying multiple ontologies (Cote et al., 2008), currently hosting 81 ontologies.
To load an ontology
The reference ontology supported by
When an ontology is loaded, other
No distinction is made between universals (classes) and particulars (instances): both are treated as ontology terms linked by a parent–child relationship, in which the class is the parent and its instances are the children.
The advantage of using a reasoner in
An example of relationships that can be retrieved by
Below there are several examples of
- searchTerm(Ontology, ‘myocardium’) returns a list of terms where ‘myocardium’ is mentioned;
- getTermParentsById(Ontology, ‘EFO_0003087’) lists parents of the term ‘EFO_0003087’ (atrial myocardium): ‘EFO_0000819’ (myocardium) (see Fig. 1);
- getTermById(Ontology, ‘EFO_0000819’) returns ontology term by its accession: ‘EFO_0000819’ (myocardium);
- getTermChildren(Ontology, term) lists children of the term ‘EFO_0000819’ (myocardium): ‘EFO_0003087’ (atrial myocardium) and ‘EFO_0003088’ (ventricular myocardium) (see Fig. 1);
- showHierarchyDownToTerm(Ontology, ‘EFO_0000819’) prints out a flattened-tree representation of the ontology from the root term down to ‘EFO_0000819’ (myocardium) by using parent–child relationships;
- getTermRelationsById(Ontology, ‘EFO_0000815’, ‘has_part’) returns terms in relation ‘has_part’ with ‘EFO_0000815’ (heart): ‘EFO_0000277’ (atrium), ‘EFO_0000819’ (myocardium) and ‘EFO_0000317’ (cardiac ventricle) (see Fig. 1);
- getTermSynonyms(Ontology, term) returns a list of synonyms for the term ‘EFO_0000819’ (myocardium): ‘muscle of heart’, ‘cardiac muscle’, ‘heart muscle’; and
- getRootTerms(Ontology) returns a list of terms without parents in ontology of interest.
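The parent–child lookups listed above can be illustrated with a small, self-contained R sketch. It does not call the package: the three myocardium terms and their accessions are taken from the examples above, while the edge table and the `toy_*` helper names are hypothetical and exist only for this illustration. Treating myocardium as a root is a simplification that holds only within this toy fragment.

```r
# Toy fragment of the hierarchy described in the text (NOT the package API).
# Each row is one child -> parent edge.
edges <- data.frame(
  child  = c("EFO_0003087", "EFO_0003088"),
  parent = c("EFO_0000819", "EFO_0000819"),
  stringsAsFactors = FALSE
)

# Accession -> label, as given in the examples above.
labels <- c(
  EFO_0000819 = "myocardium",
  EFO_0003087 = "atrial myocardium",
  EFO_0003088 = "ventricular myocardium"
)

# Hypothetical helpers: direct parents and direct children of a term.
toy_parents  <- function(id) edges$parent[edges$child == id]
toy_children <- function(id) edges$child[edges$parent == id]

# Terms that never appear as a child, i.e. roots of this fragment only.
toy_roots <- function() setdiff(names(labels), edges$child)

# Case-insensitive substring search over the labels.
toy_search <- function(text) {
  names(labels)[grepl(tolower(text), tolower(labels), fixed = TRUE)]
}

print(toy_parents("EFO_0003087"))
print(toy_children("EFO_0000819"))
print(toy_search("atrial"))
```

Storing the hierarchy as an edge table keeps parent and child lookups symmetric: both are a single filter on one column.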
A number of self-descriptive methods like
2.2 Operations on multiple ontologies
To create a local batch of ontologies the
By default, a call to
After a batch of ontologies is created, various methods become available, including:
- searchTerm(batch, ‘heart’) searches for the term in all ontologies in the batch;
- searchTermInOLS(‘heart’) searches for the term in OLS repository;
- searchTermInBioPortal(‘heart’) searches for the term in BioPortal repository; and
- searchTermInAll(batch, ‘heart’) searches for the term in all ontologies in the batch as well as in OLS and BioPortal repositories.
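The idea of searching one term across a batch of ontologies can be sketched in plain R as follows. This is an illustration under stated assumptions, not the package's implementation: the ontology names (`ontoA`, `ontoB`), the accessions and the helper `toy_search_batch` are all hypothetical, and no repository is contacted.

```r
# Hypothetical batch: a named list, one accession -> label vector per ontology.
batch <- list(
  ontoA = c("A:1" = "heart", "A:2" = "liver"),
  ontoB = c("B:7" = "heart valve", "B:8" = "kidney")
)

# Search every ontology in the batch and record where each hit came from.
toy_search_batch <- function(batch, text) {
  hits <- lapply(names(batch), function(onto) {
    terms <- batch[[onto]]
    found <- grepl(tolower(text), tolower(terms), fixed = TRUE)
    data.frame(
      ontology = rep(onto, sum(found)),
      id       = names(terms)[found],
      label    = unname(terms[found]),
      stringsAsFactors = FALSE
    )
  })
  # Stack the per-ontology results into one table.
  do.call(rbind, hits)
}

result <- toy_search_batch(batch, "heart")
print(result)
```

Keeping the source ontology as a column means that a hit can later be routed back to the ontology it belongs to for term-specific operations.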
When the sought terms are found and term-specific operations (parent/child/other relationships retrieval, etc.) are needed, the
3 TECHNICAL DETAILS
The package is based primarily on the Ontology Common API Tasks Java library, on the OWL API and depends on
We provide two versions of
The lightweight ontoCAT package version is available in Bioconductor (http://bioconductor.org) starting from release 2.7, and includes all single-ontology functionality but not the methods for working with multiple ontologies or searching in OLS and BioPortal.
The full version includes batch methods and, due to package size limitations, is available only from the project website.
The package sources and full documentation are available at http://www.ontocat.org/wiki/r.
The package provides basic operations on ontologies represented in standard formats and enables searches in online ontology repositories: OLS and BioPortal.
Ontology Common API Tasks development team and the EBI Gene Expression Atlas development team.
Funding: European Community's Seventh Framework Programme projects GEN2PHEN (grant number 200754); SYBARIS (grant number 242220); NWO/Rubicon (grant number 825.09.008).
Conflict of Interest: none declared.