‘miR2Disease’, a manually curated database, aims at providing a comprehensive resource of microRNA deregulation in various human diseases. The current version of miR2Disease documents 1939 curated relationships between 299 human microRNAs and 94 human diseases by reviewing more than 600 published papers. Around one-seventh of the microRNA–disease relationships represent the pathogenic roles of deregulated microRNA in human disease. Each entry in miR2Disease contains detailed information on a microRNA–disease relationship, including a microRNA ID, the disease name, a brief description of the microRNA–disease relationship, an expression pattern of the microRNA, the detection method for microRNA expression, experimentally verified target gene(s) of the microRNA and a literature reference. miR2Disease provides a user-friendly interface for convenient retrieval of each entry by microRNA ID, disease name or target gene. In addition, miR2Disease offers a submission page that allows researchers to submit established microRNA–disease relationships that are not yet documented. Once approved by the submission review committee, the submitted records will be included in the database. miR2Disease is freely available at http://www.miR2Disease.org.
MicroRNAs are a class of endogenous single-stranded small noncoding RNAs that negatively regulate gene expression (1). MicroRNAs control mRNA degradation and/or translation inhibition through binding to complementary sites in the 3′-untranslated regions of target genes (2). In the past decade, hundreds of microRNAs have been identified in mammalian cells (3,4). Evidence suggests that microRNAs play critical roles in multiple biological processes, including cell cycle control, cell growth and differentiation, apoptosis, embryo development and so on (5–8).
In recent years, dozens of microRNA-related database systems have been developed. miRBase (9), miRGator (10) and miRGen (11) aim at providing complete repositories for microRNA annotation and nomenclature. TarBase (12), microRNAMap 2.0 (13) and microRNA.org (14) collect experimentally validated and/or computationally predicted microRNA–target relationships; microRNA.org also provides a collection of microRNA expression profiles in different tissues. Such database systems offer great resources in investigating the function of microRNA in gene regulation. In addition, several computational algorithms and web-based programs have been developed to computationally predict target genes/sites of microRNAs, such as TargetScan (15), PicTar (16), RNAhybrid (17) and PITA (18).
Accumulating evidence indicates that microRNAs play crucial roles in human disease development, progression, prognosis, diagnosis and evaluation of treatment response (19–24). Genome-wide association studies demonstrated that many human microRNA genes are located in genomic regions linked to cancer (25,26). Moreover, a recent study found that the absolute expression levels of many microRNAs were significantly reduced in tumors (27). The same study also revealed that the expression levels of 217 microRNAs classified cancers more effectively than mRNA microarrays containing more than 16 000 protein-coding genes (27). Together, this evidence underscores the need to understand the function of microRNAs in disease development.
In addition to these genome-wide studies, correlations between the deregulation of single microRNAs and the occurrence of human disease have been reported. Cimmino et al. (28) demonstrated that both miR-15a and miR-16-1 negatively regulate BCL2 at a post-transcriptional level, which induces apoptosis in a leukemic cell line model. Yang et al. (29) found that miR-1 overexpression slowed cardiac conduction and depolarized the cytoplasmic membrane by post-transcriptionally repressing KCNJ2 and GJA1, which accounted at least in part for its arrhythmogenic potential. Hitherto, over 100 review papers have been published summarizing the relationship between microRNA deregulation and dozens of diseases (30–36). Despite these developments, detailed information on microRNA–disease relationships is scattered across the literature, and there is no online repository for these relationships. Therefore, we developed a manually curated database entitled ‘miR2Disease’ (http://www.miR2Disease.org), which provides a comprehensive resource of microRNA deregulation in various human diseases.
DATA COLLECTION AND DATABASE CONTENT
Initial entries describing the relationships between microRNA deregulation and occurrences of human disease were collected manually. We searched the PubMed database (37) with a list of keywords, such as ‘microRNA disease’, ‘miRNA disease’, ‘microRNA cancer’, ‘miRNA cancer’, etc. In the current release of miR2Disease, more than 600 publications were reviewed, and 1939 curated relationships between 299 human microRNAs and 94 human diseases were documented. In the miR2Disease system, the disease terminology is organized according to a controlled medical vocabulary (Disease Ontology: http://diseaseontology.sourceforge.net/), which uses the Unified Medical Language System (UMLS) as its source vocabulary (38). Such organization provides a substantial advantage in terms of search and analysis. Each entry in the database contains detailed information on a microRNA–disease relationship, including a microRNA ID, the disease name, a brief description of the microRNA–disease relationship, the expression pattern of the microRNA (upregulated or downregulated) in the disease state, the detection method used to derive the microRNA expression pattern (microarray, northern blot, qRT-PCR, etc.), the experimentally validated target gene(s), extracted from the corresponding references and derived directly from TarBase, and a literature reference. miR2Disease provides a user-friendly interface for easy querying of each entry by microRNA ID, disease name or experimentally validated target genes. In addition, convenient links are provided to other microRNA databases, including microRNA sequence and annotation in miRBase, experimentally validated microRNA target genes in TarBase and computationally predicted microRNA target genes in TargetScan, miRanda and PicTar. A hyperlink to the literature reference in the NCBI PubMed database (37), together with an official PubMed ID and the complete citation, is also provided.
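The entry structure described above can be sketched as a small record type. This is only an illustrative sketch: the class name `Mir2DiseaseEntry` and all field names are assumptions, not the actual database schema. The example values are taken from the miR-10b study (39) discussed in this article, and the detection method is left empty because it is not stated here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Mir2DiseaseEntry:
    """One curated microRNA-disease relationship (hypothetical field names)."""
    mirna_id: str
    disease: str
    description: str
    expression_pattern: str          # "upregulated" or "downregulated"
    detection_method: Optional[str]  # e.g. "microarray", "northern blot", "qRT-PCR"
    target_genes: Tuple[str, ...]    # experimentally validated targets only
    reference: str                   # citation or PubMed ID

    def __post_init__(self):
        # Reject anything other than the two patterns named in the text.
        if self.expression_pattern not in ("upregulated", "downregulated"):
            raise ValueError(
                f"unexpected expression pattern: {self.expression_pattern!r}"
            )


# Example built from the miR-10b relationship described in this article.
entry = Mir2DiseaseEntry(
    mirna_id="hsa-miR-10b",
    disease="breast cancer",
    description="Highly expressed miR-10b initiates tumor invasion and metastasis.",
    expression_pattern="upregulated",
    detection_method=None,  # not stated in this article
    target_genes=("HOXD10",),
    reference="Ma et al. (39)",
)
```

Keeping the expression pattern as a validated string is a design choice of this sketch; an enumeration would serve equally well.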
In miR2Disease, around one-seventh of the microRNA–disease relationships represent the pathogenic roles of deregulated microRNA in human disease.
miR2Disease provides a search engine to query detailed information on each microRNA–disease relationship documented in the database. Users can query the database through microRNA ID, disease name or target genes.
miR2Disease offers a fuzzy search function. Combined with the controlled disease vocabulary (Disease Ontology), the fuzzy search allows users to retrieve meaningful microRNA–disease relationship information without knowing the exact disease name documented in the database. Once a disease name is received as a query term, the system first searches for all disease terms that contain the query words. Each matching disease term is listed in the context of a disease tree, which contains its ancestor nodes as well as all of its subcategories. In the matching disease tree, the categories that contain the query terms are highlighted in bold, and a hyperlink is created for every category that contains documented microRNA–disease relationships. Through these hyperlinks, users can retrieve all the microRNA entries that are related to the selected disease term. On the search results page, clicking the ‘more …’ link at the end of an entry leads to the detailed information on that relationship (Figure 1).
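The disease search described above (match terms containing the query words, then show each match with its ancestors and subcategories) can be sketched as follows. This is a minimal sketch under stated assumptions: the hierarchy below is a toy example and not the real Disease Ontology, the set of terms with records is invented for illustration, and the function names are hypothetical. It is not the implementation used by the miR2Disease server.

```python
from collections import defaultdict

# Toy hierarchy (child -> parent); NOT the real Disease Ontology.
PARENT = {
    "cancer": None,
    "carcinoma": "cancer",
    "hepatocellular carcinoma": "carcinoma",
    "prostate carcinoma": "carcinoma",
    "leukemia": "cancer",
}
# Terms assumed (for illustration only) to have documented relationships.
HAS_RECORDS = {"hepatocellular carcinoma", "prostate carcinoma"}

CHILDREN = defaultdict(list)
for child, parent in PARENT.items():
    if parent is not None:
        CHILDREN[parent].append(child)


def ancestors(term):
    """Return the ancestor chain of a term, root first."""
    chain = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain[::-1]


def descendants(term):
    """Return all subcategories of a term, sorted by name."""
    found = []
    stack = list(CHILDREN.get(term, []))
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(CHILDREN.get(node, []))
    return sorted(found)


def fuzzy_disease_search(query):
    """Find terms containing every query word and describe their tree context."""
    words = query.lower().split()
    hits = []
    for term in PARENT:
        if all(word in term.lower() for word in words):
            tree = ancestors(term) + [term] + descendants(term)
            hits.append({
                "term": term,                                      # shown in bold
                "tree": tree,
                "linked": [t for t in tree if t in HAS_RECORDS],   # get hyperlinks
            })
    return hits
```

For example, `fuzzy_disease_search("carcinoma")` matches the general category and both of its subcategories, so a user who does not know the exact documented name can still reach the relevant entries.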
As with disease terminology, microRNA nomenclature can be complicated and confusing. It is not unusual for the original publication to provide insufficient information to identify the exact microRNA within a large microRNA family. For instance, some publications simply report that let-7 is related to adenoma, while others conclude that the expression level of let-7a-3 is downregulated in breast cancer. Therefore, through the fuzzy search function, miR2Disease allows users to query microRNAs without knowing their exact identifiers. When one microRNA query returns multiple IDs, users can further select the microRNAs of greatest interest. Figure 1 also provides a snapshot of the query workflow of ‘search by miRNA ID’.
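One simple way to resolve an inexact microRNA name to candidate IDs is prefix matching on a normalized identifier. The sketch below assumes this strategy purely for illustration; the article does not specify how the server matches IDs, and the helper names are hypothetical. The ID list contains only identifiers mentioned in this article.

```python
# A handful of microRNA IDs mentioned in this article (not the full database).
KNOWN_IDS = [
    "hsa-let-7a",
    "hsa-let-7a-3",
    "hsa-let-7c",
    "hsa-miR-21",
    "hsa-miR-221",
    "hsa-miR-222",
    "hsa-miR-223",
]


def normalize(mirna_id):
    """Lower-case an ID and drop the human species prefix if present."""
    name = mirna_id.strip().lower()
    if name.startswith("hsa-"):
        name = name[len("hsa-"):]
    return name


def fuzzy_mirna_search(query, known_ids=KNOWN_IDS):
    """Return every known ID whose normalized name starts with the query."""
    prefix = normalize(query)
    return sorted(i for i in known_ids if normalize(i).startswith(prefix))


family = fuzzy_mirna_search("let-7")
# family now lists the let-7 members above, from which a user would pick.
```

Prefix matching is deliberately permissive: a query such as `miR-22` returns several distinct microRNAs, which mirrors the situation in which one query results in multiple IDs and the user makes the final selection.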
In the miR2Disease system, target genes are classified into three categories: the target genes reported in the original reference, the target genes documented in the TarBase system (experimentally validated targets) and the targets predicted by computational tools (miRanda, TargetScan or PicTar). The ‘search’ by ‘target genes’ functionality allows users to search microRNA–disease relationships using target genes of the first two types (reported targets and experimentally validated targets). Searching through the computationally predicted targets will be included in a future release.
The causal relationship between disease and microRNA is documented and listed in both the searching result page and the detailed relationship page (Figure 1). In addition, the system also allows users to filter the search results by only selecting causal relationships.
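The target-gene search and the causal filter can be illustrated together. In this sketch, which is an assumption-laden illustration rather than the server's code, each record carries the category of its target annotation, only the ‘reported’ and ‘tarbase’ categories are searchable, and a flag restricts results to causal relationships. The first two records reflect relationships described in this article; the records with `hsa-miR-X`, `hsa-miR-Z` and `disease Y` are placeholders invented to exercise the filters.

```python
# (microRNA, disease, target gene, target category, causal?)
RECORDS = [
    ("hsa-miR-10b", "breast cancer", "HOXD10", "reported", True),
    ("hsa-miR-373", "breast cancer", "CD44", "reported", True),
    ("hsa-miR-X", "disease Y", "CD44", "predicted", False),   # placeholder
    ("hsa-miR-Z", "disease Y", "HOXD10", "tarbase", False),   # placeholder
]

# Predicted targets are stored but not yet searchable.
SEARCHABLE_CATEGORIES = {"reported", "tarbase"}


def search_by_target(gene, causal_only=False, records=RECORDS):
    """Return records whose reported or TarBase target matches the gene."""
    wanted = gene.strip().upper()
    hits = []
    for mirna, disease, target, category, causal in records:
        if target.upper() != wanted:
            continue
        if category not in SEARCHABLE_CATEGORIES:
            continue
        if causal_only and not causal:
            continue
        hits.append((mirna, disease))
    return hits
```

With these toy records, searching for `CD44` skips the predicted-only placeholder, and setting `causal_only=True` on a `HOXD10` search keeps only the miR-10b relationship.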
miR2Disease provides a submission page that allows other researchers to submit important microRNA–disease relationships that are not documented. Once approved by the submission review committee, the submitted record will be included in the database, and made available to the public in the coming release. miR2Disease will be updated monthly.
Emerging evidence suggests that specific temporal and spatial microRNA expression is required for normal cellular development and differentiation, whereas many human diseases are associated with aberrant microRNA expression patterns. To provide a central resource for biologists who study the relationship between microRNAs and human disease, we developed miR2Disease, a database system that aims to provide a comprehensive resource of microRNA deregulation in various human diseases. miR2Disease not only offers an easy-to-use web interface for querying detailed information on the microRNA–disease relationships documented in the database, but also provides a submission page that allows other researchers to contribute to its content.
Figure 2A shows a histogram of the number of diseases associated with each microRNA archived in the database. Eighty-seven microRNAs (29.1%) are deregulated in only one disease, while 177 microRNAs (59.2%) are deregulated in no more than three diseases. The most prevalent microRNA in the database is hsa-miR-21, which is deregulated in 59 documented diseases (over half of all the disease types). Five additional microRNAs are deregulated in 25 or more types of disease: hsa-miR-155, hsa-miR-221, hsa-let-7a, hsa-miR-223 and hsa-miR-222, which are related to 45, 37, 28, 25 and 25 diseases, respectively. A histogram of the number of deregulated microRNAs associated with each disease is shown in Figure 2B. No more than three microRNA deregulations were documented for 37 diseases (39.4%), 21 of which contain only one record. In contrast, 13 diseases are related to more than 50 deregulated microRNAs. Hepatocellular carcinoma, prostate carcinoma and colorectal carcinoma top this list with 129, 98 and 87 associated deregulated microRNAs, respectively. Note that these numbers depend heavily on the experimental approaches used in the original publications. For instance, investigations using microarray technology usually observe more deregulated microRNAs than low-throughput studies.
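The counts behind histograms of this kind can be derived from a list of curated (microRNA, disease) pairs. The following is a minimal sketch that uses made-up placeholder pairs rather than database content; the function names `degree_histograms` and `percent` are hypothetical. The final lines recompute the percentages quoted above from the totals of 299 microRNAs and 94 diseases.

```python
from collections import Counter


def degree_histograms(pairs):
    """Build both histograms from (microRNA, disease) pairs.

    Returns two Counters mapping "number of partners" to "how many nodes
    have that number": one for microRNAs, one for diseases.
    """
    unique = set(pairs)  # a relationship curated twice counts once
    diseases_per_mirna = Counter(mirna for mirna, _ in unique)
    mirnas_per_disease = Counter(disease for _, disease in unique)
    return (Counter(diseases_per_mirna.values()),
            Counter(mirnas_per_disease.values()))


def percent(part, total):
    """Percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# Placeholder pairs for illustration only (not taken from the database).
toy_pairs = [
    ("miR-A", "D1"), ("miR-A", "D2"), ("miR-B", "D1"),
    ("miR-C", "D3"), ("miR-A", "D1"),
]
mirna_hist, disease_hist = degree_histograms(toy_pairs)

# Percentages quoted in the text, recomputed from the reported counts.
single_disease_share = percent(87, 299)     # microRNAs seen in one disease
up_to_three_share = percent(177, 299)       # microRNAs seen in <= 3 diseases
sparse_disease_share = percent(37, 94)      # diseases with <= 3 microRNAs
```

In the toy data, one microRNA is linked to two diseases and two microRNAs are linked to one disease each, so `mirna_hist` records one node of degree 2 and two nodes of degree 1.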
It remains to be seen whether the deregulation of microRNAs is a cause or consequence of the disease state. Around 1/7 of the microRNA–disease relationships in miR2Disease indicate the pathogenic role of microRNA deregulation in various diseases, including cancer, metabolic disease and cardiovascular disease. For instance, Ma et al. (39) reported that highly expressed miR-10b initiates tumor invasion and metastasis in breast cancer through translational inhibition of HOXD10, and eventually increases expression of RHOC, a pro-metastatic gene. Huang et al. (40) found that significant upregulation of miR-373 and miR-520c stimulates breast cancer cell migration and invasion by the suppression of CD44. Meng et al. (41) identified that the oncogene mitogen-activated protein kinase kinase kinase 8 (MAP3K8) is a target of miR-370, whose reduced expression elevates the MAP3K8 level and therefore contributes to tumor growth in cholangiocarcinoma. It has been reported that miR-375 controls insulin secretion by regulating the expression of myotrophin. Upregulation of miR-375 led to an enhanced inhibition of insulin release (42). It is also demonstrated that downregulation of miR-1 and miR-133 contributes to re-expression of HCN2/HCN4 and the electrical remodeling process in hypertrophic hearts (43).
Information in miR2Disease can be used to describe the relationships among multiple diseases. To this end, we created a bipartite network that describes the causal relationships between 85 microRNAs and 32 cancer-related diseases (Figure 3). Within this network, lung carcinoma, breast cancer and ovarian cancer show the highest connectivity, being linked to 25, 23 and 18 causal microRNAs, respectively. Among microRNAs, hsa-let-7a deregulation plays a causal role in nine types of cancer, while hsa-miR-21, hsa-miR-124a, hsa-let-7c, hsa-miR-145, hsa-miR-16 and hsa-miR-221 are each associated with five or more types of cancer (solid circles in the network). Using the miR2Disease database, one can create this type of bipartite network by searching for the diseases or microRNAs of interest.
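A bipartite network of this kind can be assembled from causal (microRNA, disease) pairs with two adjacency maps. The sketch below is illustrative only: it uses just the four causal relationships in cancer that are described in this article (39–41), not the 85 microRNAs and 32 diseases of Figure 3, and the function names are hypothetical.

```python
from collections import defaultdict

# Causal pairs described in this article; a small subset, for illustration.
CAUSAL_EDGES = [
    ("miR-10b", "breast cancer"),
    ("miR-373", "breast cancer"),
    ("miR-520c", "breast cancer"),
    ("miR-370", "cholangiocarcinoma"),
]


def build_bipartite(edges):
    """Return (microRNA -> diseases, disease -> microRNAs) adjacency maps."""
    mirna_to_diseases = defaultdict(set)
    disease_to_mirnas = defaultdict(set)
    for mirna, disease in edges:
        mirna_to_diseases[mirna].add(disease)
        disease_to_mirnas[disease].add(mirna)
    return mirna_to_diseases, disease_to_mirnas


def hubs(adjacency, min_degree):
    """Nodes connected to at least min_degree partners, sorted by name."""
    return sorted(
        node for node, partners in adjacency.items()
        if len(partners) >= min_degree
    )


mirna_to_diseases, disease_to_mirnas = build_bipartite(CAUSAL_EDGES)
connected_diseases = hubs(disease_to_mirnas, min_degree=2)
```

The degree of a disease node is the number of causal microRNAs linked to it, which is the connectivity measure used in the paragraph above; applying `hubs` to the microRNA side with a threshold of five would pick out the highly connected microRNAs in a full data set.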
miR2Disease documents several potential mechanisms that cause microRNA deregulation in various human diseases. First, microRNAs can be located in disease-causing genomic loci, including minimal regions of loss of heterozygosity, minimal amplicons or breakpoint cluster regions (19,25). For instance, miR-15 and miR-16 are located at chromosome 13q14, a region that is deleted in more than half of B-cell chronic lymphocytic leukemias (B-CLL); consequently, both genes are deleted or downregulated in the majority (68%) of CLL cases (44). In contrast, the miR-17-92 polycistron is located in a genomic region that is amplified in human B-cell lymphomas, and is therefore overexpressed (45). Second, microRNA deregulation can be caused by abnormal epigenetic patterns, including abnormal DNA methylation and histone-modification patterns (46). For instance, the putative promoter region of the let-7a-3 gene is heavily methylated in normal human tissues, but hypomethylated in lung adenocarcinoma. Such promoter demethylation induced reactivation of let-7a-3 with an oncogenic function and increased proliferation in lung cancer cell lines (47). In addition, aberrant hypermethylation leads to miR-9-1 inactivation in human breast cancer (48). Third, microRNA deregulation may be caused by abnormalities of the enzymes that are involved in microRNA biogenesis. For example, Otsuka et al. (49) showed that miR-24 and miR-93 could target the viral large protein (L protein) and phosphoprotein (P protein) genes. In Dicer1-deficient cells, a lack of host miR-24 and miR-93 was responsible for increased replication of vesicular stomatitis virus (VSV). In the miR2Disease system, such information is documented in the ‘analysis’ section (linked from the home page).
In addition to the mechanisms described above, miR2Disease also contains entries in which disease-causing genetic variations affect cellular functions through the disruption of microRNA target selection. For instance, loss-of-interaction variations in let-7:Hmga2 (50), miR-148a:HLA-G (51) and miR-433:FGF20 (52) are involved in myoma, asthma and Parkinson's disease, respectively. Chen et al. (53) elucidated that CCND1 mRNA is normally under the regulation of miR-16-1, and that truncated CCND1 mRNA escapes this regulation through deletion of the microRNA-binding sites in mantle cell lymphoma. In summary, loss-of-interaction variations in microRNA genes or in the binding site(s) of microRNAs represent a newly described pathogenic mechanism of disease.
Overall, miR2Disease provides a comprehensive resource of microRNA deregulation in human disease. We believe that miR2Disease will be of particular interest to the life science community and will help biologists unravel the role of microRNAs in the pathogenesis of human disease.
As stated in the ‘data collection and database content’ section, the microRNA–disease relationships documented in the current release were collected by searching the PubMed database with a list of keywords, such as ‘microRNA disease’, ‘miRNA disease’, ‘microRNA cancer’, ‘miRNA cancer’, etc. Although we collected ∼2000 relationships from over 600 publications, this approach is neither comprehensive nor systematic. We plan to adopt two strategies to improve our data collection. First, we will use text-mining tools to help us prescreen PubMed abstracts that potentially describe microRNA–disease relationships. Second, we will specifically target MeSH (Medical Subject Headings) vocabularies, which are created and maintained at the National Library of Medicine. We expect these strategies to increase the comprehensiveness of the data and to be reflected in the database records of future releases.
Not all microRNA–disease relationships are equally valid. Some studies demonstrate more credible microRNA–disease relationships than others.
The China National 863 High-Tech Program (2007AA02Z302 to Y.L. and 2007AA02Z329); the Indiana Genomics Initiative of Indiana University; the Lilly Endowment, Inc. (to Y.L., partial); National Natural Science Foundation of China (Grant No. 60671013). Funding for open access charge: The China National 863 High-Tech Program.
Conflict of interest statement. None declared.
The authors thank Dr Kenneth P. Nephew at Indiana University for his informative suggestions.