Epigenetics is one of the hottest topics in cancer research. We know that human tumors undergo a major disruption of their DNA methylation and histone modification patterns. The aberrant epigenetic landscape of the cancer cell is characterized by a massive genomic hypomethylation, CpG island promoter hypermethylation of tumor suppressor genes, an altered histone code for critical genes and a global loss of monoacetylated and trimethylated histone H4. But what we know is just a minimal percentage of the epigenetic ‘earthquake’ present in the transformed cell. We need to make an ambitious step to understand the DNA methylation and histone changes underlying tumorigenesis. The launching of an International Human Epigenome Project should be the response to this necessity.
Introduction to the human epigenome
The massive sequencing effort by public and private laboratories provided us with a wonderful first draft of our genome, an extraordinary accomplishment comparable, in my modest opinion, only to the characterization of the double helix of DNA. However, many questions remained unanswered. We have this huge phone book, but it is time to organize it and make some calls to be sure that the names and addresses correspond to those numbers. Where should this post-genomic era take us? Many possible answers flourish in the feverish minds of researchers, but epigenetics is the most logical line of ambitious research. The call has already been made ( 1 – 4 ) and the race has started ( 5 ). We need a large-scale epigenetic mapping of our cells that moves biology into this new century with a fresh face. We need a world-scale Human Epigenome Project for the normal and the cancer cell.
Epigenetics can be understood as the set of mechanisms that initiate and maintain heritable patterns of gene expression and gene function without changing the sequence of the genome. Epigenetics can also be understood as the interplay between environment and genetics. In this latter sense, epigenetics provides the best explanation of how the same genotype can be translated into different phenotypes. Examples of the powerful modulating effects of epigenetics in this scenario are emerging in rapidly increasing numbers: strains of Agouti mice can undergo changes in the DNA methylation status of an inserted IAP (intracisternal A-particle) element that changes the animal's coat color ( 6 ); cloned animals show inefficient epigenetic reprogramming of the transplanted nucleus, which is associated with aberrations in imprinting, aberrant growth and lethality beyond a threshold of faulty epigenetic control ( 7 ); in assisted reproductive technology, a surprising set of recent observations suggests an increased rate of epigenetic errors, such as Beckwith–Wiedemann syndrome, Angelman syndrome and retinoblastoma ( 8 ); and monozygotic twins, who share the same DNA sequence, can present anthropomorphic differences and distinct disease susceptibility related to epigenetic differences such as DNA methylation and histone modifications ( 9 ). These findings underscore the relevance of a Human Epigenome Project as indicated in its statement: ‘The goal of the Human Epigenome Project is to identify all the chemical changes and relationships among chromatin constituents that provide function to the DNA code, which will allow a fuller understanding of normal development, aging, abnormal gene control in cancer, and other diseases as well as the role of the environment in human health’ ( 2 ).
This is an ambitious objective that will require the effort of many laboratories around the world, the decisive support of public research and health agencies, and the involvement of private biotechnology and pharmaceutical companies.
The epigenome of a healthy cell
It is important first to assess what we know about epigenetics in health and in disease. The epigenetic network has many layers of complexity, which can be summarized in four: DNA methylation, histone modifications, chromatin remodeling and microRNAs. The order is not arbitrary: I start with the best understood and end with the least elucidated.
The most studied epigenetic modification in humans is the methylation of the cytosine located within the dinucleotide CpG. 5-methylcytosine (5mC) constitutes 0.75–1% of all nucleotide bases in normal human tissue DNA, and we should remember that ∼3–4% of all cytosines are methylated in normal human DNA ( 10 ). CpG dinucleotides are not randomly distributed throughout the vast human genome. There are CpG-rich regions, known as CpG islands, which are usually unmethylated in all normal tissues and frequently span the 5′ end region (promoter, untranslated region and exon 1) of a number of genes: they are excellent markers of the beginning of a gene. If the corresponding transcription factors are available, the histone modifications are in a permissive state and the CpG island remains unmethylated, that particular gene will be transcribed.
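As an illustration of what ‘CpG-rich’ means in practice, the sketch below computes the GC fraction and the observed/expected CpG ratio of a sequence and applies one commonly used working definition of a CpG island (length of at least 200 bp, GC content of at least 50% and an observed/expected CpG ratio of at least 0.6). These thresholds, and the function names, are assumptions introduced here for illustration; they are not taken from this article, and real annotation pipelines scan a genome with sliding windows rather than testing one fragment.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected number of CpG dinucleotides if C and G were placed independently
    expected = c * g / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the illustrative (assumed) thresholds to a single fragment."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    cpg_rich = "ACGT" * 50   # toy 200 bp fragment, 50% GC, many CpGs
    cpg_poor = "AT" * 100    # toy 200 bp fragment with no C or G at all
    print(looks_like_cpg_island(cpg_rich))
    print(looks_like_cpg_island(cpg_poor))
```

The observed/expected ratio is used rather than a raw CpG count because CpG dinucleotides are depleted in the bulk of the genome, so an island stands out by having roughly as many CpGs as its base composition would predict.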
Of course, there are exceptions to the general rule. We can find normally methylated CpG islands in at least four cases: imprinted genes, X-chromosome genes in women, germline-specific genes and tissue-specific genes ( 10 ). Genomic or parental imprinting is a process involving acquisition of DNA hypermethylation in one allele of a gene early on in the male and female germline that leads to monoallelic expression. A similar phenomenon of gene-dosage reduction can also be invoked with regard to the methylation of CpG islands in one X-chromosome in women, which renders these genes inactive in order to avoid redundancy. Although DNA methylation is not a widely occurring system for regulating ‘normal’ gene expression, sometimes it does indeed accomplish this purpose. This is the case, for example, for those genes whose expression is restricted to the male or female germline and that are not expressed later in any adult tissue, such as the MAGE gene family. Finally, methylation has been postulated as a mechanism for silencing tissue-specific genes in cell types in which they should not be expressed. However, it is still not clear whether this type of methylation is secondary to a lack of gene expression owing to the absence of the particular cell-type-specific transcription factor or whether it is the main force behind tissue-specific transcriptional silencing.
What is the significance of the presence of DNA methylation outside the CpG islands? One of the most exciting possibilities for the normal function of DNA methylation is its role in repressing parasitic DNA sequences ( 11 ). Our genome is plagued with transposons and endogenous retroviruses acquired throughout the history of the human species. We can control these imported sequences thanks to direct transcriptional repression mediated by several host proteins, but our main line of defense against the large burden of parasitic sequence elements (>35% of our genome) may be DNA methylation. Methylation of the promoters of our intragenomic parasites inactivates these sequences and, over time, will destroy many transposons.
The DNA methylation landscape of a normal cell occurs in the context of all the other epigenetic marks. DNA methylation is associated with the formation of nuclease-resistant chromatin, and methyl-CpG binding proteins and DNA methyltransferases are associated with histone deacetylases and histone methyltransferases, two enzyme superfamilies that are key regulators of histone function ( 12 – 14 ). Histone modification is another important member of the epigenetic network. The status of acetylation and methylation of specific lysine residues contained within the tails of nucleosomal core histones is known to play a critical role in chromatin packaging and gene expression ( 12 – 14 ). Overall, histone hypoacetylation and hypermethylation are characteristic of DNA sequences that are methylated and repressed in normal cells, such as the X-chromosome in females, imprinted genes and tissue-specific genes. However, each particular lysine residue can be a marker for a different signal. For example, underacetylation of lysines K5, K8 and K12 of histone H4 is characteristic of heterochromatic X-chromosomes, whilst the distribution of acetylated K16 is similar in the X-chromosome and autosomes. In this regard, recent data suggest that acetyl-K16 behaves differently from the other acetylated residues ( 4 , 14 ). It provides a barrier to the spreading of Sir proteins, histone hypoacetylation and silencing within adjacent subtelomeric DNA regions. With regard to histone H4 methylation, the only lysine methylation event in this tail occurs at position K20. This modification seems to trigger many biological processes: trimethylation of histone H4 at K20 has recently been identified as a marker of constitutive heterochromatin and gene silencing, and has also been found to be associated with aging ( 15 ).
Finally, we should not forget that DNA methylation and histone modifications occur in the context of a higher-order chromatin structure. Nucleosomes, formed by the DNA wrapping around the octamer of histones, are the champions of that league. Multi-subunit complexes, such as those constituted by the SWI/SNF proteins, use the energy of ATP to mobilize nucleosomes and allow the access of the transcriptional machinery ( 16 ); or massive repressive complexes counteract SWI/SNF functions, as does the polycomb group gene family ( 17 ).
In the end, gene expression and function, and the overall activity of the genome, in the healthy cell result from the balance between these massive forces shaping our human epigenome.
The epigenome of a sick cell
Most human diseases have an epigenetic cause. The fine control of our cells by DNA methylation, histone modifications, chromatin remodeling and microRNAs becomes dramatically distorted in the sick cell. The ground-breaking discoveries were initially made in cancer cells, but this is just the beginning of the characterization of the aberrant epigenomes underlying neurological, cardiovascular and immunological pathologies.
In human cancer, the DNA methylation aberrations observed fall into two categories: transcriptional silencing of tumor suppressor genes by CpG island promoter hypermethylation, and a massive global genomic hypomethylation ( 10 , 18 , 19 ). CpG islands become hypermethylated, with the result that the expression of the contiguous gene is shut down. If this aberration affects a tumor suppressor gene, it confers a selective advantage on that cell, which is then selected generation after generation. We and other researchers have contributed to the identification of a long list of hypermethylated genes in human neoplasias, and this epigenetic alteration is now considered to be a common hallmark of all human cancers, affecting all cellular pathways ( 10 , 18 , 19 ). Extremely important genes in cancer biology, such as the cell-cycle inhibitor p16INK4a, the p53-regulator p14ARF, the DNA-repair genes hMLH1, BRCA1 and MGMT, the cell-adherence gene E-cadherin, or the estrogen and retinoid receptors, undergo methylation-associated silencing in cancer cells ( 10 , 18 , 19 ). The profiles of CpG island hypermethylation are known to depend on the tumor type ( 20 , 21 ). Each tumor subtype can now be assigned a DNA hypermethylome that almost completely defines that particular malignancy, in a similar fashion to genetic and cytogenetic markers. Establishing a DNA hypermethylome can be very useful for classifying these malignancies according to their aggressiveness or sensitivity to chemotherapy. Single-gene approaches can also be extremely useful, as was first demonstrated with the DNA repair gene MGMT ( 22 ).
DNA methylation can be exploited on two additional translational fronts for clinical purposes in cancer patients. First, hypermethylation can be used as a molecular biomarker of cancer cells, because the CpG island hypermethylation of the tumor suppressor genes described is specific to transformed cells. One of the best-accepted cases is the hypermethylation of the glutathione S-transferase P1 (GSTP1) gene in prostate cancer ( 23 ). Hypermethylation could also be used as a tool for detecting cancer cells in multiple biological fluids, or even for monitoring hypermethylated promoter loci in serum DNA from cancer patients ( 24 ). Second, unlike genetic changes in cancer, epigenetic changes are potentially reversible. For years, in cultured cancer cell lines, we have been able to re-express genes that had been silenced by methylation by using DNA demethylating agents such as 5-aza-2′-deoxycytidine, 5-azacitidine or zebularine ( 25 ). Today, given at lower doses, these drugs have shown significant antitumoral activity, and the FDA has approved the use of Vidaza (5-azacitidine) for the treatment of a pre-leukemic disease, myelodysplastic syndrome.
At the same time as the aforementioned CpG islands become hypermethylated, the genome of the cancer cell undergoes global hypomethylation ( 26 ). The malignant cell can have 20–60% less genomic 5mC than its normal counterpart. The loss of methyl groups is accomplished mainly by hypomethylation of the ‘body’ (coding region and introns) of genes and through demethylation of repetitive DNA sequences, which account for 20–30% of the human genome. How does global DNA hypomethylation contribute to carcinogenesis? Three mechanisms can be invoked as follows: chromosomal instability, reactivation of transposable elements and loss of imprinting. Undermethylation of DNA might favor mitotic recombination, leading to loss of heterozygosity as well as promoting karyotypically detectable rearrangements. Additionally, extensive demethylation in centromeric sequences is common in human tumors and may play a role in aneuploidy. As evidence of this, patients with germline mutations in DNA methyltransferase 3b ( DNMT3b ) are known to have numerous chromosome aberrations. Hypomethylation of malignant cell DNA can also reactivate intragenomic parasitic DNA, such as L1 (Long Interspersed Nuclear Elements, LINEs) and Alu (recombinogenic sequence) repeats. These, and other previously silent transposons, may now be transcribed and even ‘moved’ to other genomic regions, where they can disrupt normal cellular genes. Finally, the loss of methyl groups can affect imprinted genes and genes from the methylated-X chromosome of women. The best-studied case is of the effects of the H19/IGF-2 locus on chromosome 11p15 in certain childhood tumors ( 27 ).
DNA methylation also occupies a place at the crossroads of many pathways in immunology, providing us with a clearer understanding of the molecular network of the immune system. From the classical genetic standpoint, two immunodeficiency syndromes, the ICF (immunodeficiency-centromeric regions instability–facial anomalies) and ATRX (X-linked form of syndromal retardation associated with alpha thalassemia) syndromes, are caused by germline mutations in two epigenetic genes: the DNA methyltransferase DNMT3b and the ATRX genes. Autoimmunity and DNA methylation can also go hand in hand. Classical autoimmune diseases, such as systemic lupus erythematosus or rheumatoid arthritis, are characterized by massive genomic hypomethylation ( 28 ). This phenomenon is highly reminiscent of the global demethylation observed in the DNA of cancer cells compared with their normal-tissue counterparts. Several other examples are also worth mentioning, such as the proposed epigenetic control of the histo-blood group ABO genes and the silencing of human leukocyte antigen (HLA) class I antigens.
Aberrant DNA methylation patterns go beyond oncology and immunology to touch a wide range of biomedical fields. In neurology and autism research, for example, it was surprising to discover that germline mutations in the methyl-binding protein MeCP2 (a key element in the silencing of gene expression mediated by DNA methylation) cause the common neurodevelopmental disease known as Rett syndrome ( 29 ). This leads us to wonder how many DNA methylation alterations underlie other, more prevalent neurological pathologies, such as schizophrenia or Alzheimer's disease. Beyond that, DNA methylation changes are also known to be involved in cardiovascular disease, the biggest killer in western countries. For example, aberrant CpG island hypermethylation has been described in atherosclerotic lesions ( 30 ). Germline variants and mutations in genes involved in the metabolism of the methyl group (such as MTHFR) cause changes in DNA methylation, and changes in the levels of methyl-acceptors and methyl-donors are responsible for the pathogenesis of diseases related to homocysteinemia and spina bifida. Imprinting disorders, which represent another huge area of research, are the perfect example of methylation-dependent epigenetic human diseases. Precisely localized DNA methylation changes cause Beckwith–Wiedemann syndrome, Prader–Willi/Angelman syndromes, Russell–Silver syndrome and Albright hereditary osteodystrophy. This highlights the absolute necessity of maintaining the correct DNA methylome in order to achieve harmonized development.
Regarding histone modifications, we are largely ignorant of how these markers are disrupted in human diseases. In cancer cells, it is known that hypermethylated promoter CpG islands of transcriptionally repressed tumor suppressor genes are associated with hypoacetylated and hypermethylated histones H3 and H4 ( 10 , 18 , 19 ). It is also recognized that certain genes with tumor suppressor-like properties, such as p21WAF1, are silent at the transcriptional level in the absence of CpG island hypermethylation, in association with hypoacetylated and hypermethylated histones H3 and H4 ( 31 ). However, until very recently there was no profile of overall histone modifications and their genomic locations in the transformed cell. The need to determine the histone modification pattern of tumors was even more urgent given the rapid development of histone deacetylase inhibitors as putative anticancer drugs ( 32 ). We have provided this missing link by demonstrating that human tumors undergo an overall loss of monoacetylation of lysine 16 and trimethylation of lysine 20 in the tail of histone H4 ( 33 ). These two histone modification losses can be considered almost universal epigenetic markers of malignant transformation ( 33 ), as has now been accepted for global DNA hypomethylation and CpG island hypermethylation. Certain histone acetylation and methylation marks may have prognostic value ( 34 ). For other human pathologies, the definition of histone modification signatures is still in its infancy.
Pursuing the human epigenome
If we are brave enough to tackle the whole human epigenome, it is in part thanks to a methodological revolution in epigenetics. The emergence of a new technology for studying DNA methylation, based on bisulfite modification coupled with PCR, has been decisive in the expansion of the field. Until a few years ago, the study of DNA methylation was almost entirely based on the use of enzymes that distinguished unmethylated and methylated recognition sites. This approach had many drawbacks, from incomplete restriction cutting to limitation of the regions of study. Furthermore, it usually involved Southern blot technologies, which required relatively substantial amounts of DNA of high molecular weight. The popularization of the bisulfite treatment of DNA (which changes unmethylated ‘C’ to ‘T’ but maintains the methylated ‘C’ as a ‘C’), associated with amplification by specific polymerase chain reaction primers (methylation-specific polymerase chain reaction), TaqMan, restriction analysis and genomic sequencing, has made it possible for every laboratory and hospital in the world to have a fair opportunity to study DNA methylation, even using pathological material from old archives. We may call this change the ‘universalization of DNA methylation’. The techniques described, which are ideal for studying biological fluids and the detailed DNA methylation patterns of particular tumor suppressor genes, can also be coupled with global genomic approaches for establishing molecular signatures of tumors based on DNA methylation markers, such as CpG island microarrays, Restriction Landmark Genomic Scanning and Amplification of Intermethylated Sites ( 35 ).
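The logic of bisulfite-based methylation analysis can be sketched in a few lines. The example below is a toy model under stated assumptions: it treats a single DNA strand as a string, assumes complete conversion, and represents the chemistry by its net readout after amplification (bisulfite deaminates unmethylated cytosine to uracil, which is read as ‘T’ after PCR). The function names and the example sequence are hypothetical and are not taken from any published protocol or software.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus PCR on one strand.

    Unmethylated C is read as T; methylated C (0-based positions given in
    methylated_positions) is protected and stays C. Assumes 100% conversion.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted):
    """Return 0-based positions where a reference C is still read as C."""
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference.upper(), converted.upper()))
        if ref_base == "C" and read_base == "C"
    ]


if __name__ == "__main__":
    reference = "ACGTCGAC"                      # cytosines at positions 1, 4 and 7
    treated = bisulfite_convert(reference, [1])  # only the first C is methylated
    print(treated)
    print(call_methylation(reference, treated))
```

Comparing the treated sequence with the untreated reference is what turns an epigenetic mark into a sequence difference, which is why the same readout can then be obtained by primer-specific PCR, restriction analysis or sequencing.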
Moreover, we now have good reason to believe that we can study the content and distribution of 5mC in cellular nuclei and in the whole genome, thanks to two new tools: the improved immunohistochemical staining of 5mC, which allows its localization in the chromatin structure and can also be used to methyl-immunoprecipitate DNA for further hybridization to a CpG island or promoter microarray ( 36 ); and high performance capillary electrophoresis (HPCE), which is a reliable and affordable technique for measuring total levels of 5mC ( 35 ).
For the epigenomic study of histone modifications and chromatin-remodeling factor occupancy we have another breakthrough: chromatin immunoprecipitation (ChIP) associated with hybridization to a microarray platform (ChIP-on-chip). ChIP is currently the most powerful technique for investigating in vivo interactions between a nuclear factor and its genomic target sequences. The technique consists of immunoprecipitating chromatin with specific antibodies to isolate the DNA sequences that are bound by the nuclear proteins against which the antibodies are raised. The immunoprecipitated DNA is then typically analyzed by PCR with specific primers to investigate the presence of a candidate DNA sequence. In practice, two different aspects can be explored. One is the binding of different nuclear factors to their binding sites. The other is that, since histones are associated with DNA throughout the entire genome, it is possible to explore the association of different post-translational modifications of histones with specific genomic sequences by using antibodies that recognize these modifications. Consequently, ChIP provides dynamic information not only about nuclear factor occupancy at target binding sites but also about specific histone modification patterns in selected DNA sequences.
Microarrays are the other part of the story. The first microarrays to be designed and used were cDNA microarrays, which have been routinely used to characterize variations in gene expression. More recently, genomic microarrays have become available as the entire genome has been sequenced and the gene regulatory regions have become better known. The development of novel types of genomic microarray, such as CpG island and promoter microarrays, provides an exceptional opportunity for a new application: hybridization of ChIP samples on a microarray. With this elegant combination of techniques it is now possible to uncover, on a genomic scale, novel binding target sequences for nuclear factors, such as E2F ( 37 ) or MBDs ( 38 ), or DNA sequences with specific histone modification patterns.
In summary, we have the right tools to unfold the epigenome, and there are many biological and medical questions that could be answered by launching a large-scale project ( Figure 1 ). How many tumor suppressor genes undergo CpG island promoter hypermethylation in transformation? How does our epigenome change with the aging process? What is the impact of the environment in modulating the epigenetic marks and gene function? What is the contribution of DNA methylation and histone modifications to cell and tissue type-specific differentiation? What is the epigenetic environment of a stem cell? What is the epigenome of a breast cancer cell, of an endothelium affected by atheroma formation, or of a neuron undergoing Alzheimer degeneration? If the right funding is there, the scientists are ready to go ahead with a Human Epigenome Project.
Conflict of Interest Statement : None declared.