IRNdb: the database of immunologically relevant non-coding RNAs

MicroRNAs (miRNAs), long non-coding RNAs (lncRNAs) and other functional non-coding RNAs (ncRNAs) have emerged as pivotal regulators involved in multiple biological processes. Recently, ncRNA control of gene expression has been identified as a critical regulatory mechanism in the immune system. Despite the great efforts made to discover and characterize ncRNAs, the functional role for most remains unknown. To facilitate discoveries in ncRNA regulation of immune system-related processes, we developed the database of immunologically relevant ncRNAs and target genes (IRNdb). We integrated mouse data on predicted and experimentally supported ncRNA-target interactions, ncRNA and gene annotations, biological pathways and processes and experimental data in a uniform format with a user-friendly web interface. The current version of IRNdb documents 12 930 experimentally supported miRNA-target interactions between 724 miRNAs and 2427 immune-related mouse targets. In addition, we recorded 22 453 lncRNA-immune target and 377 PIWI-interacting RNA-immune target interactions. IRNdb is a comprehensive searchable data repository which will be of help in studying the role of ncRNAs in the immune system. Database URL: http://irndb.org


Introduction
Rapid development of high-throughput technologies enabled identification of non-coding RNAs (ncRNAs) as a highly abundant class of transcripts in the pervasively transcribed eukaryotic genome (1). Among ncRNAs, microRNAs (miRNAs), long non-coding RNAs (lncRNAs) and PIWIinteracting RNAs (piRNAs) have been attracting an increasing interest over the last decade as master regulators of numerous genes and diverse biological processes.
MiRNAs constitute a family of 22 nucleotides small ncRNAs that bind to target mRNAs to mediate posttranscriptional repression or degradation of the mRNA (2). Greater than 20 000 miRNA loci are described to date in different species, including over 2000 miRNAs in mammals (3). Greater than 50% of the mammalian genome is thought to be under miRNA control (4). Recently, a number of miRNAs were found to be important regulators acting in both adaptive and innate immune cells (5). Cooperative action of multiple miRNAs contributes to a systemic regulation of development and homeostasis of the immune system and the host response to invading pathogens (6)(7)(8). However, for the majority of miRNAs their roles in immune-related processes still remain to be elucidated.
PiRNAs constitute another class of small ncRNAs whose well-established function is to silence transposable elements in complex with PIWI proteins in germline cells (9). Recent studies provided evidence that piRNA might possess a wider range of functions including proteincoding gene silencing (including immune system-related genes) and epigenetic regulation (10).
LncRNAs are defined as ncRNAs longer than 200 nucleotides (11). Several classification systems and numerous subclasses of lncRNA genes have been described (12). LncRNAs are believed to possess a wide range of molecular functions including transcriptional regulation, regulation of mRNA processing, control of post-transcriptional events and regulation of protein activity (13). A number of lncRNAs were shown to regulate the expression of their adjacent immune genes and be functionally important in both innate and adaptive immunity (14). LncRNAs synthesized by both host and pathogen are involved in the regulation of host-pathogen interactions and might be crucial for the infection outcome (14). Despite the clear importance of certain lncRNAs for regulatory mechanisms, the functionality and biological role of the vast majority of lncRNAs remains unknown (13).
The numbers of experimental studies on ncRNA-target interactions as well as the numbers of computational prediction approaches and tools have increased drastically over the years (15,16). Accordingly, many databases have been developed, providing information on different aspects of ncRNA biology (17,18). Among these, miRBase serves as a central repository for up-to-date miRNA annotation and sequences (3).
In this study, we report the development of IRNdb, the first, to our knowledge, specialized database for immune-related ncRNA in mice, a model organism for the study of the immune system. IRNdb was developed with a goal of establishing a universal resource for mouse ncRNAs in immunological research. We integrated information on mouse miRNAs, lncRNAs, piRNAs and their immunologically relevant target genes. We combined multiple sources of experimentally supported and predicted miRNA-target interactions, thus eliminating the user's need to browse multiple websites and run different computational tools. In IRNdb, we visualize information on experimentally supported and predicted interactions separately, thus, allowing users to choose a category of interest. NcRNAs and target genes are linked to external databases to offer additional information such as sequences and tissue-specific expression. We implemented a feature that enables users to perform functional analysis and identification of significant biological pathways for any selected subset of target genes using Enrichr, a popular tool which offers a large collection of gene set libraries (31). In addition to the identification of significant pathways, IRNdb provides a full list of biological pathways associated with the features of interest. This approach is particularly beneficial for short lists of targets, where over-representation analyses might fail to identify statistically significant pathways. In addition, we list transcription factor binding sites (TFBS) identified upstream of miRNA genes to study their transcriptional regulation. IRNdb has a simple user-friendly interface, is easily accessible and fully searchable and provides functionality to download any findings. Hence, IRNdb offers a new data repository to improve our understanding of the roles of ncRNAs in the immune system.

Biological entities and interactions
We integrated public domain data from various sources ( Figure 1). Experimentally supported interactions between miRNAs and their target genes were retrieved from two databases: miRTarBase (19) and miRecords (20). Predicted interactions were obtained from nine tools/databases: DIANA microT-CDS (32), ElMMo3 (33), MicroCosm (27), microRNA.org (24), miRDB (34), miRNAMap2 (35), PicTar2 (DoRiNA 2.0) (26), PITA (25) and TargetScanMouse (23). We downloaded lists of miRNAtarget predictions from the respective tool/database websites. These tools employ different prediction algorithms including seed sequence match, free energy of miRNA-mRNA duplex and evolutionary conservation. We collected predicted miRNA-target interactions derived by at least two different tools. Experimentally supported and predicted miRNA-target interactions are listed separately in different tables; all corresponding sources are specified for each target. We only retained miRNAs and their interactions from the interactions resources where either a current miRBase ID was readily available or where the resource miRBase ID could be unambigously mapped to a miRBase accession through an alias/outdated miRBase IDs from miRBase. We also included lncRNA-target interaction from LncRNADisease (30), LncRNA2Target (21) and LncReg (36) and piRNA-target interactions from piRBase (22).
The set of genes involved in immunological processes was retrieved from the following sources: InnateDB (37) and SepticShock (http://www.septicshock.org/) were used as sources of mouse-specific immunological gene information. Immunologically relevant human genes were extracted from ImmPort (38), the Immunome database (39,40), immunogenetic-related information source (41) and InnateDB and the MAPK/NFKB network (37) and converted to mouse genes via the NCBI Homologene database (http://www.ncbi.nlm.nih.gov/homologene). To date, IRNdb is focused solely on mouse data.
We retained ncRNAs in IRNdb if they could be connected to one of the immunologically relevant target genes through the earlier mentioned interaction databases. In total, we were able to extract 12 930 experimentally supported interactions between 724 miRNAs and 2427 target genes (Table 1). In addition, we collected 183 752 predicted interactions between 1115 miRNAs and 5297 targets (Table 1). We only retained interactions predicted by at least two sources. However, many interactions were predicted through multiple databases/tools (Table 2), e.g. over 42% of interactions were predicted by more than two sources and 0.22% of miRNA-gene interactions were predicted by all nine databases/tools. A miRNA interacts on average with 17.9 (164.8) target genes through experimentally supported (predicted) interactions. On the other hand, a target gene interacts on average with 5.3 (34.7) miRNAs through experimentally supported (predicted) interactions. In addition to miRNAs, we collected lncRNAs and piRNAs. We were able to extract 22 453 interactions between 163 lncRNAs and 4521 genes and 377 interactions between 319 piRNAs and 84 genes. On average a lncRNA (piRNA) is connected to 137.7 (1.2) genes, and a target gene is on average connected to 5 (4.5) lncRNAs (piRNAs).
In addition, lists of up-and down-regulated genes in immunologically relevant experiments were downloaded from the Molecular Signature Database (MSigDB) (46). We integrated information on relative expression of 455 mature miRNAs in 68 small RNA libraries (47) to indicate mouse cells and tissues where miRNAs were shown to be expressed.
TFBS information was retrieved from ENCODE (48) and HT-ChIP (49). We mapped raw sequencing data to the mm10 genome build for each tissue and cell type separately and called peaks using MACS2 (50). We retained TFBS peaks within 2500 bp upstream of the 5 0 end of primary miRNAs with a threshold of false discovery rate adjusted p-Value (FDR) < 0.01. For each mature miRNA and all corresponding precursors, we record the identified transcription factors (TF), distance between TFBS and miRNA gene, corresponding FDR and cell type.

Website usage
Database and web-interface overview IRNdb collects information on ncRNAs targeting immunologically relevant protein-coding genes and biological annotation data (Figure 1). To access the content in IRNdb, we developed a web-interface (accessible at http://irndb. org), which is divided into four main sections 'miRNAs', 'piRNAs', 'lncRNAs' and 'Target genes'. In addition, the 'Documentation' section provides instructions and examples for using the database, contact information and a  statistical summary of the IRNdb interaction data. Later, we illustrate IRNdb features and functionality using miRNAs mmu-miR-125b-5p and mmu-miR-873a-5p, as well as their targets as examples.

The ncRNA-centred view
IRNdb provides an interface for the convenient retrieval of ncRNA and target information. Users can browse ncRNAs in tabular form that is searchable, e.g. by identifier or names. Clicking the ncRNA name in the table leads to an individual ncRNA entry (e.g. 'miRNA view' in case of miRNAs). For illustrative purposes, we focus on the role of mmu-miR-125b-5p in mouse macrophages. One of the known functions of miR-125b is targeting tumor necrosis factor alpha mRNA in macrophages in response to mycobacterial infection (51). An individual 'miRNA view' for mmu-miR-125b-5p provides five tabs with information regarding the ncRNA (Figure 2A). The 'Targets'-tab contains separate tables for experimentally supported and predicted ncRNA targets. In IRNdb, we visualize information on experimentally supported and predicted interactions separately, thus, allowing users to choose a category of interest. We indicate sources of every interaction so that users can use all of them or focus on the most reliable interactions predicted by many tools. The tables are searchable and Tnf can be found among experimentally supported targets. Searching for 'Tnf' among predicted targets reveals that Tnf receptor, tumour necrosis factor receptor superfamily member 1b (Tnfrsf1b) is predicted to be a target of mmu-miR-125b-5p. The column 'Target source' indicates three different sources for this prediction. To evaluate this potential target, we can switch to IRNdb 'Target gene view' by clicking the gene symbol link (Figure 2A). The 'Expression' field of the 'Target gene view' provides three links to different external sources where we can check if Tnfrsf1b is expressed in macrophages ( Figure 2B). The first link opens the EBI Expression Atlas (52) entry for mouse Tnfrsf1b. Under the EBI Expression Atlas 'Cell type' header, we find that Tnfrsf1b has a medium expression level in mouse macrophages. The second link on the IRNdb target view opens the FANTOM5 SSTAR catalogue entry for Tnfrsf1b (http://fantom.gsc.riken.jp/5/sstar) which, again, shows that Tnfrsf1b is expressed in mouse macrophages. Finally, we provide a link to TBvis, a resource where we implemented visualization of cap analysis gene expression (53) data for mouse macrophages cultivated under different conditions and infected with Mycobacterium tuberculosis (Mtb) (54,55). TBvis data for Tnf and Tnfrsf1b indicate that both genes show an increase in their expression in macrophages upon infection with Mtb.
If regulation of miRNA transcription is of interest, users can browse the table in the 'TFBS'-tab of the 'miRNA view'. The table lists TFs for which TFBSs were found upstream of miRNA in question, along with additional information on cell type, TFBS location and significance. The 'Cell types'-tab shows relative cloning frequencies of corresponding miRNA across a set of mouse tissues and cell types (47). Macrophages were not profiled in this experiment; however, we find that mmu-miR-125b-5p is expressed in other immune cell types including T-, B-and dendritic cells, which might indicate a complex role of mmu-miR-125b-5p in response to the infection.

Browsing and searching target genes
IRNdb provides a separate search interface for immune system-related ncRNA target genes. Genes can be searched using gene symbol, gene name, Entrez Gene ID or their parts. Moreover, users can select any set of genes by clicking on the corresponding rows and perform the gene set enrichment analysis (GSEA) by Enrichr tool (31) with 'RUN GSEA' button. The table provides information on the numbers of experimentally supported and predicted miRNAs, lncRNAs and piRNAs that target each gene. Users can switch to the 'Target gene view' while browsing ncRNA targets by clicking the gene symbol link.

The target-centred view
We continue the demonstration of how IRNdb can be used to characterize Tnfrsf1b, the predicted target of mmu-miR-125b-5p. In addition to the aforementioned 'Expression' field with three links to the external sources of tissue and cell type-specific gene expression, the 'Target gene view' provides six tabs ( Figure 2B).
The 'miRNAs', 'lncRNAs' and 'piRNAs'-tabs show tables with corresponding ncRNAs that target the gene in question. To date, for Tnfrsf1b IRNdb lists seven lncRNAs, six experimentally supported miRNAs and 37 predicted miRNAs. In the case of miRNAs, each table can be sorted by the number of sources of miRNA-target information. Thus, if one aims at testing some of the predicted Tnfrsf1b-targeting miRNAs, the mmu-let-7 family and mmu-miR-98-5p might represent miRNAs of interest as they were predicted by the largest number of tools (see Data sources, integration and implementation).
The 'Pathways'-and 'GO'-tabs provide pathways and processes associated with the target gene. They indicate that Tnfrsf1b is involved in apoptosis and oxidative damage-related processes. The 'Experiments'-tab provides additional functional information on the target, listing immunologically relevant experiments where the target was shown to be up-or down-regulated. Searching for 'macrophages' reveals that Tnfrsf1b was found to be up-or down-regulated in multiple experiments on macrophages. The experiments can be searched by any key words to retrieve, for instance, disease, stimulation or cell type of interest. We integrated experiments performed for mouse and human which is indicated in the 'Species' field.

Pathway analysis of ncRNA targets
We implemented two complementary approaches for ncRNA functional analysis via analysis of the corresponding target genes. We demonstrate how IRNdb can be used for functional analysis of ncRNAs and their targets on the example of miRNA mmu-miR-873a-5p.
A list of targets of a ncRNA of interest can be subjected to the GSEA by Enrichr tool (31) by pressing the 'RUN GSEA' button of the 'Targets'-tab of the ncRNA view. In the case of miRNAs, experimentally supported and predicted targets are analysed separately. Pressing the 'RUN GSEA' button for predicted targets of mmu-miR-873a-5p launches Enrichr in a new window, where users can browse numerous different biological categories. Among KEGG and WikiPathways terms, predicted targets of mmu-miR-873a-5p were significantly enriched in B cell receptor, IL-4 and TNF signalling pathways.
The list of experimentally supported targets can be analysed in a similar manner. However, in the case of mmu-miR-873a-5p, only six experimentally supported targets are reported in IRNdb to date. In such case, it might be beneficial to browse all pathways associated with these targets instead of focusing on statistically significant ones, as with such a small target set, no significance can be reliably calculated. With this purpose, we created 'Pathways'-tab of ncRNA view. It allows users to browse WikiPathways, KEGG and Reactome pathways that contain at least one target of the ncRNA under investigation. Pathways are presented as a table for each pathway source, respectively. Using this feature, we found that one of mmu-miR-873a-5p experimentally supported targets, tumour necrosis factor, alpha-induced protein 3 (Tnfaip3) is associated with the WikiPathway entry TNF-alpha NF-jB Signalling Pathway. These data combined propose that mmu-miR-873a-5p might have a strong role in Tnf signalling which is in agreement with previously published data (56). Moreover, mmu-miR-873a-5p can be further studied to elucidate its possible involvement in the regulation of B cell receptor and IL-4 signalling pathways.
Similarly to the 'Pathways'-tab, the 'GO'-tab provides users with a list of GO-process and GO-function terms associated with the ncRNA targets. We believe that exploring the 'GO'-and 'Pathways'-tab, in addition to GSEA, enhances IRNdb capability of identification of biological pathways and processes possibly affected by the ncRNA of interest.

Biological pathways
The IRNdb repository includes information on biological pathways with immunologically relevant ncRNA target genes. Users can reach the pathway data in three different ways by selecting the pathway from the ncRNA view's 'Pathways'-tab, from the target view's 'Pathways'-tab or from the 'by pathways'-option for a ncRNA class of interest in the navigation panel. In the latter case, the link leads to a 'Browse pathways'-page, where IRNdb lists for each pathway the immunologically relevant experimentally supported ncRNA target genes and corresponding targeting ncRNAs.
Here, we focus on 'TNF-alpha NF-jB Signalling Pathway' of the WikiPathways catalogue. As discussed earlier, two miRNAs mmu-miR-125b-5p and mmu-miR-873a-5p could play a role in regulating this pathway. After searching for the pathway in the table on the 'Wikipathway'-tab, we enter a pathway-centric view by clicking on the pathway name. The 'miRNAs'-tab of the 'Pathway view' lists miRNAs with experimentally supported or predicted targets in the pathway. Searching for 'mmu-miR-873a-5p' shows that, while Tnfaip3 is its only experimentally supported target in the pathway, mmu-miR-873a-5p is predicted to target an additional 13 genes in the pathway, including Ikbkb and Nfkbie which regulate the NF-kappa-B TF (57). Interestingly, mmu-miR-873a-3p has two predicted targets in the pathway as well. Similarly, mmu-miR-125b-5p has three experimentally supported and an additional 18 predicted targets in the 'TNF-alpha NF-jB Signalling Pathway'. 'LncRNA' and 'piRNA'-tabs provide tables with corresponding ncRNAs and their pathway associated targets.

Conclusion
Over the past years, research into ncRNAs has increased significantly within the scientific community. For example, several miRNA databases with diverse scopes and goals have been recently introduced (28,(58)(59)(60). MirDIP (28) and miRGate (59) were developed to address the issue of poor overlap among target prediction tools. A group of resources aims at deciphering biological functions of miRNAs. DIANA-miRPath (61) implements enrichment analysis of miRNA targets in KEGG pathways (42) and GO terms (44). MiRGator combines three miRNA target prediction programmes and focuses on the expression analysis and GSEA to facilitate functional annotation of miRNAs (62). MiTALOS accounts for miRNA tissue specificity by introducing a novel pathway analysis methodology (63).
Undoubtedly, these efforts lead to a better data accessibility and understanding of ncRNA biology and function. One of the insights in ncRNA control is that it is exerted in a tissue-and condition-specific manner. Hence, without a specialized resource, an investigation of ncRNA roles under a certain condition might require time-consuming navigation between multiple data sources and gathering scattered condition-specific information from published literature. To bridge the gap in the understanding of ncRNA control of the immune system, we developed IRNdb. To our knowledge, this is the first resource to focus on immune-related mouse ncRNAs and their target genes. IRNdb is an integrative, user-friendly and easy-to-use platform with a wide range of applications (Figure 3). IRNdb brings together experimentally supported and predicted ncRNA-target interactions enriched with pathway, GO and experimental information, which will benefit ncRNA research and will help in the functional characterization of ncRNAs. All data tables in IRNdb are searchable and downloadable in a variety of formats. In the future, we aim at integrating information on additional types of ncRNAs and expand IRNdb to include more mammalian systems.