Over 2000 microRNA (miRNA) sequences from different species have been submitted to the miRBase, the central online repository for miRNAs, making a total of 5071 miRNA loci, expressing 5922 distinct mature miRNA sequences. In this review, we have addressed the importance of the genetic variations in humans affecting miRNAs, their target genes and the genes involved in miRNA processing for individual risk of cancer, with particular emphasis on colorectal cancer. In fact, the number of studies suggesting that individual predisposition to cancer is modulated by genetic polymorphisms affecting the biogenesis of miRNA and the interaction between miRNAs and targets has risen steeply in the last few years. We also report the first evidence that variant alleles of single-nucleotide polymorphisms (SNPs) within miRNA genes and miRNA targets, previously associated with the risk of cancer, behave differently when tested in functional studies. The SNPs belonging to the miRNA world are certainly contributing to new insights in the field of the genetic predisposition to disease.
MicroRNAs (miRNAs) are single-stranded non-coding RNA molecules of ∼22 to 25 nucleotides, involved in gene regulation at the post-transcriptional level. Their function is to induce degradation of mRNA targets or to inhibit their translation by interacting with the 3′ untranslated regions (3′-UTRs) (1,2). Lin-4, the first miRNA to be discovered, was identified in 1993 in the nematode Caenorhabditis elegans. It is transcribed into an RNA molecule that can repress the expression of lin-14, a protein-coding gene that is relevant to developmental timing (3). Seven years later, let-7, another miRNA important in the larval development of C. elegans and largely conserved in a variety of organisms, was discovered (4). In the following years, hundreds of miRNAs were identified in a wide range of species, including humans (5–8). Computational analyses predict that up to 1000 miRNAs exist in the human genome (9), and since a single miRNA can bind to ∼100 different targets, it has been estimated that miRNAs may regulate up to 30% of the protein-coding genes (10). MiRNAs are initially transcribed as precursor molecules in the nucleus, from genes that are either organised in genomic clusters or present individually. It has also been reported that miRNA genes could be located within exons or introns of non-coding genes (11), as well as in protein-coding genes (12). Biogenesis starts in the nucleus, where RNA polymerase II transcribes long primary RNAs (pri-miRNAs) (13). Then, pri-miRNAs are processed by Drosha, a member of the RNase III enzyme family, into precursors (pre-miRNAs) of 70 nucleotides in length with a stem–loop structure (14). Pre-miRNAs are exported from the nucleus to the cytoplasm by exportin-5 in a Ran-guanosine triphosphate (GTP)-dependent manner; in the cytoplasm, they are processed by another RNase III enzyme, Dicer. This causes the release of a double-stranded RNA duplex of ∼22 nucleotides that is incorporated into the RNA-induced silencing complex (15). 
In this complex, one strand is retained as the mature miRNA, whereas the other strand is generally degraded. The mature miRNA binds the 3′-UTR of target mRNAs through imperfect base pairing (16). Perfectly matched sequence complementarity is required only between the ‘seed’ region of the miRNA (nucleotides 2–7 of the mature sequence) and the target mRNA (17). Such binding leads to degradation, destabilisation or translational inhibition of the mRNA and consequently silences gene expression (18). Since the binding between miRNA and target mRNA does not require perfect complementarity, a single miRNA can affect a broad range of mRNAs and consequently the whole miRNA family has the potential to target and regulate thousands of genes (19–21). Approximately one-third of the protein-coding genes are controlled by miRNAs; thus, almost all cellular pathways are directly or indirectly influenced by miRNAs. miRNAs are involved in cell proliferation, cell differentiation (22), apoptosis (23) and metabolism (24). Moreover, recent studies have shown that miRNAs participate in human carcinogenesis as tumour suppressors or oncogenes (25–27). In cancer, aberrant miRNA expression and/or function is frequently observed (25,28,29). Furthermore, as miRNA expression patterns strongly correlate with tumour type and stage (30,31), miRNAs can be used as clinical markers for cancer diagnosis and prognosis (32,33).
Thus, given the importance of miRNAs for virtually all cellular processes, it is important to identify all the circuits in which miRNAs can act. The most immediate approach is to predict, by means of algorithms, each target for a given miRNA, as well as the miRNAs potentially binding to a given gene. A number of algorithms have been used to identify putative miRNA-binding sites. The miRBase database (http://microrna.sanger.ac.uk/targets/v3/) is divided into three parts: miRBase Registry, which covers the miRNA gene nomenclature; miRBase Sequence, the primary online repository for miRNA sequence data and annotation; and miRBase Targets, a comprehensive new database of predicted miRNA target genes (34). miRanda (http://www.microrna.org/) is an algorithm that considers the sequence complementarity between the mature miRNA and the target site, the binding energy of the miRNA–target duplex and the evolutionary conservation of the target position in aligned UTRs of homologous genes (35). PicTar (http://pictar.bio.nyu.edu/) computes a maximum likelihood score that a given RNA sequence (3′-UTR region) is targeted by a fixed set of miRNAs (36). The MicroInspector program (http://mirna.imbb.forth.gr/microinspector/) generates a list of possible target sites, sorted by free energy values. Adjusting the temperature and free energy settings, followed by a visual inspection of the secondary structures, allows a detailed analysis. The program uses an ‘miRNA database’ (in multifasta format) based on ‘the miRNA registry’ (http://www.sanger.ac.uk/Software/rfam/mirna/index.shtml) (37). DIANA-MicroT (http://www.diana.pcbi.upenn.edu/cgi-bin/micro_t.cgi) finds miRNA/target duplexes that are conserved in humans and mice with the minimum free energy (38). 
Finally, TargetScan (http://genes.mit.edu/targetscan) searches the 3′-UTRs for segments of perfect Watson–Crick complementarity to bases 2–8 of the miRNA (numbered from the 5′-end) and assigns a free energy to miRNA–target site interaction (39), given an internal database of miRNA and UTR sequences (Table I).
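The seed-matching step shared by several of these tools can be illustrated with a minimal sketch. It is not the implementation of TargetScan or of any other program listed above: it only scans a 3′-UTR for perfect Watson–Crick complements of the miRNA seed and ignores free energy and conservation. The function names and the toy UTR are hypothetical; the miRNA is the mature let-7a sequence.

```python
# Minimal sketch of a seed-match scan; illustrative only, not any tool's algorithm.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, start: int = 2, end: int = 8) -> list:
    """Return 0-based UTR positions where a perfect seed match begins.

    `start` and `end` are 1-based, inclusive positions counted from the
    miRNA 5' end (2-8 by default; pass end=7 for a 2-7 seed).
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)  # sequence the UTR must contain
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # allow overlapping matches
    return hits


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"       # mature let-7a
TOY_UTR = "AAGCUACCUCAUUGGCUACCUCA"    # hypothetical UTR fragment

print(seed_sites(LET7A, TOY_UTR))          # → [3, 15]
print(seed_sites(LET7A, TOY_UTR, end=7))   # → [4, 16]
```

Real predictors add the scoring layers described above (duplex free energy, cross-species conservation, likelihood models), which is why their outputs differ even when they start from the same seed definition.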
Since the sequence complementarity and thermodynamics of the binding play an essential role in the interaction of miRNAs with their targets, it is conceivable that sequence variations such as single-nucleotide polymorphisms (SNPs) in the seed region or in a target site could alter the miRNA–mRNA interaction and affect the expression of miRNA targets. A SNP may abolish, weaken or create a miRNA target site and would thus probably lead to a corresponding increase or decrease in protein translation (40). Thus, SNPs residing within the 3′-UTRs of genes implicated in cancer or within miRNAs that function as tumour suppressors or oncogenes could contribute to tumorigenesis, for example by affecting the individual risk of developing cancer (41).
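As a rough illustration of how a target-site SNP can abolish or create a seed match, the sketch below compares the two alleles of a 3′-UTR fragment against a 2–7 seed. It is a simplification under stated assumptions: only perfect seed complementarity is checked, so the 'weaken' case, which depends on duplex thermodynamics, is not captured. The function names and the UTR fragments are hypothetical; the miRNA is the mature let-7a sequence.

```python
# Illustrative sketch: classify a target-site SNP by seed-match presence only.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_site(mirna: str, start: int = 2, end: int = 7) -> str:
    """Sequence a target must contain to pair perfectly with the seed."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def classify_target_snp(mirna: str, utr_common: str, utr_variant: str) -> str:
    """Compare seed-match presence between the two alleles of a UTR."""
    site = seed_match_site(mirna)
    in_common = site in utr_common.upper().replace("T", "U")
    in_variant = site in utr_variant.upper().replace("T", "U")
    if in_common and not in_variant:
        return "abolished"
    if in_variant and not in_common:
        return "created"
    return "unchanged"


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"   # mature let-7a
COMMON = "AAGCUACCUCAUU"           # hypothetical allele with a seed match
VARIANT = "AAGCUACGUCAUU"          # hypothetical C>G change inside the site

print(classify_target_snp(LET7A, COMMON, VARIANT))   # → abolished
print(classify_target_snp(LET7A, VARIANT, COMMON))   # → created
```

The same comparison can be run with two miRNA alleles and a fixed UTR to mimic a SNP in the seed region rather than in the target.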
Polymorphisms in miRNA sequences and cancer risk
SNPs in miRNA sequences may alter miRNA expression and/or maturation. Several recent reports have shown that SNPs in miRNA sequences are associated with risk of cancer. In the Chinese population, associations have been identified between three SNPs in pre-miRNA regions (hsa-miR-196a2 rs11614913 C/T, hsa-miR-499 rs3746444 A/G and hsa-miR-146a rs2910164 G/C) and different forms of cancer, including hepatocellular carcinoma (HCC) (42), familial breast and ovarian cancers (43), breast cancer (44), prostate cancer (45), papillary thyroid carcinoma (46), cervical squamous cell carcinoma (47), gastric cancer (48,49) and lung cancer (50). These studies suggested that these three SNPs may affect the production of mature miRNAs.
The association between rs11614913 within pre-miRNA-196a2 and increased risk of lung cancer reported in a Chinese population (50) was subsequently replicated in a Korean population (51). The similar magnitude and same direction of association across two studies from different ethnic populations provide strong evidence that this SNP may play an important role in the development of lung cancer. Hishida et al. evaluated the combined influence of miR-146a rs2910164 and TLR4 rs11536889 in Helicobacter pylori-related gastric cancer and found that these two SNPs significantly increase the risk of severe gastric atrophy among H. pylori-infected subjects in the Japanese population (52).
Some pri-miRNAs, like protein-coding genes, have their own promoters. SNPs in these promoter regions are also ideal candidates as disease biomarkers. In a recent study, the authors hypothesised that the SNP rs4938723 in the promoter region of pri-miR-34b/c is associated with the risk of primary HCC. To test this hypothesis, they genotyped this SNP in a case–control study of 501 HCC patients and 548 cancer-free controls in a Chinese population and found that rs4938723 was associated with a significantly increased risk of HCC. This is the first report that a potentially functional SNP in the promoter region of a pri-miRNA may contribute to cancer susceptibility (53).
Polymorphisms in miRNA biogenesis genes and cancer risk
Defects in miRNA biogenesis may affect miRNA expression and, in turn, tumorigenesis and clinical outcomes (54,55). Associations between SNPs in miRNA biogenesis genes and cancer risk have also been reported in the literature. For example, a study conducted on 279 Caucasian patients with renal cell carcinoma and 278 matched controls showed that rs2740348 and rs7813 in the GEMIN4 gene were significantly associated with a reduced risk of this cancer (56). These SNPs were also found to be associated with a reduced risk of ovarian cancer (57). Moreover, the homozygous variant genotype of a non-synonymous SNP in the GEMIN3 gene (rs197414) was associated with a significantly increased risk of bladder cancer (58) and oesophageal cancer (59).
Finally, two SNPs were found to be associated with an increased oesophageal cancer risk: rs11077 and rs14035, within XPO5 and RAN, respectively (59). The XPO5 SNP was also associated with an increased risk of renal cell carcinoma (56). The direct interactions between XPO5 and the RAN proteins are essential for the transportation of pre-miRNAs from nucleus to cytoplasm through the nuclear pore complex in a GTP-dependent manner (60).
Polymorphisms in miRNA targets and cancer risk
In a Chinese study of 256 cases and 367 controls, a 9 bp insertion/deletion polymorphism (rs16405) in the 3′-UTR of betaTrCP, a target site for miR-920, was found to be significantly associated with a decreased risk of HCC (61). Similarly, again in the Chinese population, another insertion/deletion polymorphism (rs3783553) in the 3′-UTR of the IL1A gene was associated with a decreased risk of HCC. This ‘TTCA’ insertion disrupts a binding site for miR-122 and miR-378, thereby increasing the translation of IL1A in vitro and in vivo (62). Song et al. provided evidence that rs16917496, residing within the target site for miR-502 in the 3′-UTR of the SET8 gene, is associated with an early age of onset of breast cancer in a case–control study of 1110 breast cancer cases and 1097 controls in the Chinese population (63). Nicoloso et al. also hypothesised that SNPs in miRNA-binding sites could affect breast cancer susceptibility. In their investigation, two SNPs were associated with increased risk of both sporadic and familial breast cancer. The first SNP (rs799917) is within an exon of BRCA1, in a miR-638 binding site, and the second (rs334348) is in a binding site for the same miRNA but in the 3′-UTR of the TGFBR1 gene (64). This is the first study showing an association between cancer risk and miRNA-binding site SNPs located within exonic sequences rather than in the 3′-UTR of genes. Additional studies on breast cancer were performed to assess whether other polymorphisms in miRNA target sites are associated with the risk of this tumour. Kontorovich et al. identified 42 SNPs in 66 genes related to the BRCA1 and BRCA2 genes: 36 polymorphisms were in miRNA-binding sites and 8 were in pre-miRNAs. A SNP in the miR-320-binding site of ATF1 (rs11169571) was found to be associated with an increased risk of developing breast and ovarian cancer in carriers of BRCA2 mutations (65).
The combined effect of polymorphisms in target sites and in miRNA sequences could additionally affect individual susceptibility to cancer. To test this hypothesis, Zhou et al. genotyped two SNPs, one in pri-miR-218 (rs11134527) and the other in the 3′-UTR of the LAMB3 gene (rs2566), in a case–control study of 703 cervical cancer cases and 713 age-matched cancer-free controls in Chinese women (66). The LAMB3 gene is a target of miR-218, and its expression is increased in the presence of the oncoprotein HPV-16 E6 through the regulation of miR-218. The down-regulation of miR-218 by HPV-16 E6 and the consequent over-expression of LAMB3 may promote viral infection of the surrounding tissue and eventually contribute to cervical carcinogenesis (67). Zhou et al. showed that the pri-miR-218 variant GG homozygote was associated with a decreased risk of cervical cancer compared with the AA genotype, while the CT and TT genotypes of the LAMB3 gene were associated with a significantly increased risk of cervical cancer compared with the wild-type CC genotype (66).
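Genotype associations of this kind are usually summarised as an odds ratio (OR) relative to a reference genotype, with a 95% confidence interval (CI). The sketch below computes a crude OR with Woolf's logit interval from a 2×2 table. This is an assumption made for illustration: the review does not state which statistical model each cited study used, and published estimates are often adjusted for covariates. The counts are invented and do not come from any study discussed here.

```python
# Illustrative sketch: crude odds ratio with a Woolf (logit) confidence interval.
import math


def odds_ratio_ci(case_variant, case_ref, control_variant, control_ref, z=1.96):
    """Return (OR, lower, upper) from the four cells of a 2x2 table.

    `z` = 1.96 gives an approximate 95% interval.
    """
    counts = (case_variant, case_ref, control_variant, control_ref)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_variant * control_ref) / (case_ref * control_variant)
    # Standard error of log(OR) under Woolf's approximation
    se_log = math.sqrt(sum(1.0 / n for n in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Invented counts: variant homozygotes vs reference homozygotes
or_, low, high = odds_ratio_ci(case_variant=40, case_ref=200,
                               control_variant=20, control_ref=200)
print(f"OR = {or_:.2f}; 95% CI = {low:.2f}-{high:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level; small cell counts widen the interval considerably.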
Polymorphisms in miRNA targets and colorectal cancer risk
Our group was among the first to show that polymorphic target sites for miRNAs are important in modulating the individual risk of colorectal cancer (CRC). In a previous study, we selected the 3′-UTRs of 129 genes involved in pathways implicated in CRC and we identified putative miRNA-binding sites by means of the previously described specialised algorithms (PicTar, DIANA-MicroT, miRBase, miRanda, TargetScan and MicroInspector). We ranked 79 SNPs within these putative binding sites for their ability to affect or impair the binding with predicted miRNAs by assessing the variation in Gibbs free energy (ΔΔG) between the common alleles and their corresponding rare alleles (68). The eight top-ranking SNPs (i.e. rs17281995 in CD86, rs1368439 in IL-12B, rs3135500 in NOD2, rs11677 in PLA2G2A, rs1051690 in INSR, rs16870224 in PTGER4, rs1131445 in IL16 and rs916055 in ALOX15) were the candidates most likely to alter the miRNA-binding sites. Thus, they were further evaluated in a case–control association study of a series of CRC cases and controls from the Czech Republic, one of the countries with the highest incidence of CRC (69). Interestingly, two polymorphisms (rs17281995 in CD86 and rs1051690 in INSR) were associated with an increased risk of this cancer (70). We replicated the study in a distinct case–control setting, with subjects recruited from Barcelona, Spain. A weak association was found for the CD86 SNP, whereas a statistically significant association was found for homozygous carriers of the INSR SNP rare allele [odds ratio (OR) = 3.20; 95% confidence interval (CI) = 1.05–9.78] (71). In a more recent study, we added to the 129 genes involved in CRC an additional 134 genes derived from the work of Wood et al. (72). They carried out a mutome study on 11 CRC patients and selected a list of somatically mutated genes thought to be crucial in driving the development of CRC (72). 
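The ΔΔG ranking applied to the 79 SNPs earlier in this section can be sketched as follows. This is a minimal illustration, not the published procedure: the sign convention (variant minus common), the ranking by absolute value, the SNP labels and the energy values are all assumptions, and the ΔG values would in practice come from a duplex-folding program.

```python
# Illustrative sketch: rank SNPs by the change in duplex free energy (ddG).
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteEnergies:
    snp_id: str        # placeholder label, not a real rs number
    dg_common: float   # dG of the miRNA-target duplex, common allele
    dg_variant: float  # dG of the miRNA-target duplex, rare allele

    @property
    def ddg(self) -> float:
        # Assumed convention: positive ddG means the rare allele binds
        # less stably (its dG is less negative) than the common allele.
        return self.dg_variant - self.dg_common


def rank_by_ddg(sites):
    """Sort sites by |ddG|, largest predicted effect first."""
    return sorted(sites, key=lambda s: abs(s.ddg), reverse=True)


# Made-up energies, all in the same (arbitrary) unit
sites = [
    SiteEnergies("snp_A", dg_common=-20.0, dg_variant=-14.5),
    SiteEnergies("snp_B", dg_common=-18.0, dg_variant=-17.5),
    SiteEnergies("snp_C", dg_common=-12.0, dg_variant=-16.0),
]

for site in rank_by_ddg(sites):
    print(site.snp_id, site.ddg)
```

Ranking on the absolute value treats sites that are weakened and sites that are strengthened by the rare allele as equally interesting; ranking on the signed value would single out one direction.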
For all the 263 genes, we selected the 3′-UTRs according to the UCSC genome browser (http://genome.ucsc.edu) and we identified the putative miRNA-binding sites by means of the specialised algorithms mentioned earlier. The predicted miRNA-binding sites were screened for the presence of SNPs by extensive data mining in the SNP database (dbSNP; http://www.ncbi.nlm.nih.gov/SNP/). We excluded SNPs having a minor allele frequency (MAF) <0.24 in Caucasians: 99 SNPs in 67 genes were kept. The genes selected with this criterion were further evaluated for their expression in normal colonic mucosa. By using the database of the University of Tokyo (available at the URL http://www.lsbm.org/site_e/database/index.html), we found that only 32 genes are significantly expressed in normal colonic mucosa; thus, only 48 SNPs were retained. Finally, we kept only those SNPs within target sites for miRNAs specifically expressed in CRC. These miRNAs were taken from a study on the microRNAome of the colorectum by Cummins et al. (73). We were thus left with only five SNPs: rs709805 (KIAA0182), rs712 (KRAS), rs9266 (KRAS), rs14804 (NRAS) and rs354476 (NUP210) (Figure 1). For the selected SNPs, the algorithm RNAcofold (http://rna.tbi.univie.ac.at/cgi-bin/RNAcofold.cgi) was employed to assess the Gibbs binding-free energy (ΔG, expressed in kilojoules per mole), both for the common and for the variant alleles. The algorithm RNAcofold computes the hybridisation energy and base pairing pattern of two RNA sequences (74). The SNPs rs712 (KRAS), rs709805 (KIAA0182) and rs354476 (NUP210) were genotyped in a case–control association study on 717 colorectal cases and 1171 controls from the Czech Republic, and we found statistically significant associations between the risk of CRC and the variant alleles of KIAA0182 (rs709805) (OR = 1.57; 95% CI = 1.06–2.78, for the variant homozygotes) and NUP210 (rs354476) (OR = 1.36; 95% CI = 1.02–1.82, for the variant homozygotes).
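The successive selection steps described above (MAF threshold, expression of the gene in normal colonic mucosa, expression of the miRNA in CRC) amount to a filter cascade, sketched below. The record layout, the function name and every identifier in the toy data are hypothetical; only the 0.24 MAF threshold is taken from the text.

```python
# Illustrative sketch of the SNP selection cascade; all records are toy data.
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSNP:
    snp_id: str         # placeholder label, not a real rs number
    gene: str           # gene whose 3'-UTR carries the SNP
    maf: float          # minor allele frequency in the reference population
    mirnas: frozenset   # miRNAs predicted to bind the site


def select_snps(candidates, min_maf, expressed_genes, expressed_mirnas):
    """Apply the three filters in the order given in the text."""
    kept = []
    for snp in candidates:
        if snp.maf < min_maf:
            continue  # too rare
        if snp.gene not in expressed_genes:
            continue  # gene not expressed in the tissue of interest
        if not (snp.mirnas & expressed_mirnas):
            continue  # no predicted miRNA is expressed in the tumour
        kept.append(snp)
    return kept


candidates = [
    CandidateSNP("snp_A", "GENE1", 0.31, frozenset({"miR-x"})),
    CandidateSNP("snp_B", "GENE1", 0.10, frozenset({"miR-x"})),  # fails MAF
    CandidateSNP("snp_C", "GENE2", 0.40, frozenset({"miR-x"})),  # gene filter
    CandidateSNP("snp_D", "GENE1", 0.45, frozenset({"miR-y"})),  # miRNA filter
]

selected = select_snps(candidates, min_maf=0.24,
                       expressed_genes={"GENE1"},
                       expressed_mirnas={"miR-x"})
print([snp.snp_id for snp in selected])   # → ['snp_A']
```

Because each filter only removes candidates, the order of the steps does not change the final list, only the intermediate counts reported at each stage.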
To date, >2000 miRNA sequences from different species have been included in miRBase, the central online repository for miRNAs, making a total of 5071 miRNA loci, expressing 5922 distinct mature miRNA sequences (75). In this review, we have stressed the importance of genetic variations affecting miRNAs, their target genes and the genes involved in miRNA processing for the individual risk of cancer, with particular emphasis on CRC. In fact, the number of studies suggesting that the individual predisposition to cancer is modulated by genetic polymorphisms affecting the biogenesis of miRNAs and the interaction between miRNAs and their targets has risen rapidly in recent years.
Undoubtedly, in silico predictions and case–control association studies are useful tools for screening large numbers of SNPs, but they are not fully satisfactory for describing the complete picture of the mechanisms. In fact, not only can the prediction algorithms be imprecise, but they also cannot take into account other factors that could modify the action of miRNAs (such as RNA-binding proteins that also bind to the 3′-end of messenger RNAs or pre-messenger RNAs in the vicinity of miRNA-binding sites). Thus, there is a need to perform in vitro and in vivo experiments to ascertain the function of relevant SNPs previously found to be associated with risk of cancer. Only a few published studies have focused on the function of polymorphisms within miRNA genes. Hoffman et al. (76) transfected breast cancer cells with expression vectors containing either the common or the rare variant of the precursor for miR-196a2 and showed that the rare variant leads to a decreased efficiency of the maturation of the pre-miRNA, as well as to a diminished capacity to regulate target genes. Another functional study revealed that the variant allele of the SNP rs11614913, associated with the risk of congenital heart disease, affects the binding between the 3′-UTR of HOXB8 and miR-196a2 (42). In this respect, we have provided preliminary evidence that variant alleles of SNPs associated with CRC risk (i.e. rs17281995 within CD86 and rs1051690 within INSR) also behaved differently in HeLa cells transfected with a chimeric vector, where the 3′-UTRs were placed after the stop codon of a reporter gene (luciferase) (71). Unfortunately, there have been no other such reports on polymorphisms of the human genome. Thus, more studies are warranted in order to further understand the role of SNPs related to the miRNA world in the context of cancer/disease susceptibility.
Associazione Italiana Ricerca Cancro, investigator grant year 2008.
Conflict of interest statement: None declared.