Abstract

The biological significance of 5-methylcytosine was in doubt for many years, but is no longer. Through targeted mutagenesis in mice it has been learnt that every protein shown by biochemical tests to be involved in the establishment, maintenance or interpretation of genomic methylation patterns is encoded by an essential gene. A human genetic disorder (ICF syndrome) has recently been shown to be caused by mutations in the DNA methyltransferase 3B (DNMT3B) gene. A second human disorder (Rett syndrome) has been found to result from mutations in the MECP2 gene, which encodes a protein that binds to methylated DNA. Global genome demethylation caused by targeted mutations in the DNA methyltransferase-1 (Dnmt1) gene has shown that cytosine methylation plays essential roles in X-inactivation, genomic imprinting and genome stabilization. The majority of genomic 5-methylcytosine is now known to enforce the transcriptional silence of the enormous burden of transposons and retroviruses that have accumulated in the mammalian genome. It has also become clear that programmed changes in methylation patterns are less important in the regulation of mammalian development than was previously believed. Although a number of outstanding questions have yet to be answered (one of these questions involves the nature of the cues that designate sites for methylation at particular stages of gametogenesis and early development), studies of DNA methyltransferases are likely to provide further insights into the biological functions of genomic methylation patterns.

Received 22 June 2000; Accepted 13 July 2000.

FORM AND FUNCTION OF GENOMIC METHYLATION PATTERNS

The mammalian genome contains ∼3 × 10⁷ residues of 5-methylcytosine (m5C), mostly within 5′-m5CG-3′ dinucleotides. Cytosine methylation raises the coding capacity of the genome and could, in principle, play any number of roles, although the real functions have been elusive and the subject of warm controversy over the past 25 years. It is still popularly believed that reversible methylation and demethylation regulate the normal development of mammals, although no single tissue-specific gene has been proven to be regulated in this way. The promoters of tissue-specific genes that are contained within CpG islands are largely unmethylated in both expressing and non-expressing tissues under normal conditions (except in the case of certain imprinted genes and genes on the inactive X chromosomes in females). The promoters of genes that lack CpG islands (and are therefore less sensitive to the inhibitory effects of methylation) are not heavily methylated in non-expressing tissues (1). The mere binding of certain transcription factors, or even the Escherichia coli lac repressor in cells transfected with the lac operator, can drive the loss of methylation from flanking CpG dinucleotides in dividing cells (2,3). The demethylation that is sometimes observed to accompany the onset of transcription of such genes is therefore more likely to represent a consequence, and not a cause, of gene activation (1,2). Several additional lines of evidence conflict with the methylation-development hypothesis; these have been discussed elsewhere (1). It has become increasingly difficult to maintain that reversible cytosine methylation has a major role in the regulation of mammalian development, which instead depends on well-conserved regulatory networks that operate even in organisms that do not modify their DNA.

Cytosine methylation is certainly involved in irreversible promoter silencing of many imprinted genes and genes subject to X-inactivation and of the promoters of transposons and endogenous retroviruses. The potential for expression of imprinted genes is irreversibly set in the germline of the preceding generation and that of X-linked genes in early post-implantation embryos. Irreversible promoter silencing appears to be restricted to organisms whose genomes contain modified bases, although a subset of imprinted genes have been reported to show methylation-independent imprinted expression (4,5). It may be useful to think of cytosine methylation as an additional regulatory signal that endows organisms that have modified bases with new abilities, rather than as an ancillary of conserved regulatory networks that operate in all metazoa, including those (such as Caenorhabditis elegans and Drosophila) which lack DNA modification.

Outcrossing sexual reproduction favors the evolution of aggressive and harmful transposons, which in turn selects for the evolution of host functions that repress those transposons (6). Imprinted genes and genes subject to X-inactivation (in females) account for <10% of the m5C in the genome; the large majority is actually in transposons (7), which are abundant [>10⁶ elements, and >40% of the genome (8)] and relatively rich in CpG dinucleotides. Most cellular genes contain multiple transposons within introns, where their transcriptional activation would be expected to interfere with regulated expression of the host gene (7). Transposons also destabilize the genome by insertional mutagenesis and by favoring rearrangements via recombination between non-allelic repeats (7). A large and expanding body of evidence confirms that the role of the majority of the m5C in the mammalian genome is host defense against transposons. This provides short-term protection through the strong repressive effects of cytosine methylation, and permanent inactivation through the accumulation of C→T mutations that occur at high frequency at methylated sites (7). Transposons are heavily methylated in all cell types (8); this includes germ cells of both sexes (with the exception of the primordial germ cell, which is short-lived and in which alternative repressive mechanisms operate). A direct test of the host-defense hypothesis showed that demethylation in DNA methyltransferase-deficient mouse embryos caused fulminating transcription of a class of retroposon that is still capable of transposition in mice (9). Mechanisms that the cell may employ to identify and silence transposons have been described (10).
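The permanent inactivation described above can be pictured with a small Python sketch: C→T transitions at methylated CpG cytosines remove CpG dinucleotides from a sequence. The function names, the toy sequence and the chosen positions are hypothetical and serve only to illustrate the idea; nothing here models real mutation rates.

```python
def count_cpg(seq):
    """Count CpG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def deaminate_methylated_cpg(seq, methylated_positions):
    """Apply C->T transitions at the listed 0-based positions.

    A position is changed only if it holds a C that is followed by a G,
    i.e. only if it is the cytosine of a CpG dinucleotide.
    """
    bases = list(seq.upper())
    for pos in methylated_positions:
        is_cpg = (
            bases[pos] == "C"
            and pos + 1 < len(bases)
            and bases[pos + 1] == "G"
        )
        if is_cpg:
            bases[pos] = "T"
    return "".join(bases)


# Toy element with CpG cytosines at positions 1, 4 and 7 (illustrative only).
element = "ACGTCGACG"
mutated = deaminate_methylated_cpg(element, [1, 7])
print(count_cpg(element), count_cpg(mutated))
```

Each transition replaces a CpG with a TpG on the affected strand, so a heavily methylated element would be expected to lose CpG dinucleotides over time.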

Disruption of global methylation patterns is lethal to mammals (11). Even focal demethylation or hypermethylation at imprinted loci can cause developmental abnormalities (12), and demethylation of classical satellite DNA in ICF (immunodeficiency, centromere instability and facial anomalies) syndrome can cause chromosome instability and fatal immunodeficiency (13). Ectopic de novo methylation of tumor suppressor genes may also contribute to oncogenesis (12). However, rather little is known of the cues that trigger de novo methylation or of the identity of the factors that respond to these cues. Much of what has been learned of the biological roles of genomic methylation patterns has come from studies of the DNA (cytosine-5)-methyltransferases themselves and the phenotypes that result from mutations in DNA methyltransferase genes.

THE UNUSUAL CATALYTIC MECHANISM OF DNA (CYTOSINE-5)-METHYLTRANSFERASES

The 5 position of cytosine is relatively unreactive, and its methylation in neutral aqueous solution has been called a ‘chemically improbable’ reaction (14). The catalytic mechanism of DNA (cytosine-5)-methyltransferases is correspondingly unusual (Fig. 1a). Santi et al. (15) proposed that the DNA cytosine-methyltransferases might use a mechanism similar to that of thymidylate synthetase, in which an enzyme cysteine thiolate adds covalently to the 6 position, thereby pushing electrons to the 5 position to make the carbanion, which could then attack the methyl group of N5,N10-methylenetetrahydrofolate. After methyl transfer, abstraction of a proton from the 5 position could allow reformation of the 5–6 double bond and release of enzyme by β-elimination (Fig. 1a). A similar mechanism could be used by DNA methyltransferases, except that the substrate is cytosine in DNA (rather than free dUMP) and the methyl donor is S-adenosyl-l-methionine (AdoMet). Erlanson et al. (16) pointed out that the approach trajectories to both the 5 and 6 positions of cytosine in DNA were occluded by neighboring nucleotides and suggested that the target base was extrahelical during the methyl transfer reaction; they also suggested that covalent addition of enzyme created a reactive 4–5 enamine rather than a 5-carbanion, which is too high in energy to exist under physiological conditions. The steric embarrassments were relieved by the remarkable DNA–DNA methyltransferase co-crystal structures of Klimasauskas et al. (17), who found that the target cytosine is everted from the DNA helix and inserted deep into the active site of the enzyme (Fig. 1b).

All enzymes that modify the 5 position of pyrimidines appear to use a variant of the reaction mechanism described above, and most [including all known DNA (cytosine-5)-methyltransferases] have a conserved prolylcysteinyl active site dipeptide that provides the cysteine thiolate (18). The DNA cytosine-methyltransferases bear ten characteristic sequence motifs (19,20), six of which are strongly conserved. Motifs I and X fold together to form most of the AdoMet binding site, motif IV contains the prolylcysteinyl dipeptide that provides the thiolate at the active site, motif VI contains the glutamyl residue that protonates the 3 position of the target cytosine, and motif IX has a role in maintaining the structure of the target recognition domain (usually located between motifs VIII and IX) that makes base-specific contacts in the major groove (18). All or most of these motifs are discernable in all DNA cytosine-methyltransferases of bacteria, fungi, plants and mammals, and a number of DNA methyltransferases have been identified from searches of anonymous expressed sequence tags (ESTs). With few exceptions, the set of motifs has proven to be a reliable diagnostic template in EST searches.
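As a rough illustration of how conserved motifs can act as a diagnostic template, the Python sketch below flags protein sequences in which a prolylcysteinyl (PC) dipeptide is followed, further toward the C-terminus, by an ENV tripeptide (the motif VI tripeptide named in the legend to Figure 1). The function name, the gap limits and the toy sequence are assumptions made for illustration; actual database searches compare all of the motifs by alignment or profile methods and do not rely on two short strings.

```python
import re

# Hypothetical, simplified screen: motif IV is represented only by its
# Pro-Cys dipeptide and motif VI only by its Glu-Asn-Val tripeptide.
# The gap limits are illustrative assumptions, not measured spacings.
MIN_GAP = 5
MAX_GAP = 80
PATTERN = re.compile(r"PC(?=.{%d,%d}?ENV)" % (MIN_GAP, MAX_GAP))


def candidate_sites(protein):
    """Return 0-based positions of PC dipeptides followed by ENV
    within MIN_GAP..MAX_GAP residues."""
    return [m.start() for m in PATTERN.finditer(protein.upper())]


# Toy sequence (not a real protein): PC at position 3, ENV 12 residues later.
toy = "MKTPC" + "A" * 12 + "ENVLL"
print(candidate_sites(toy))  # prints [3]
```

A screen this crude would produce many false positives on real data, which is one reason the full set of motifs and their order matter.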

STRUCTURE AND FUNCTION OF Dnmt1

The wide conservation of DNA cytosine-methyltransferases was first revealed by the purification and cloning of the first eukaryotic DNA methyltransferase (21), which remains the sole mammalian DNA methyltransferase to have been identified by biochemical assay. This enzyme is now properly termed Dnmt1 (OMIM 126375), although many other names have been invented by various laboratories. This protein contains 1620 amino acids (an interesting form that lacks 118 N-terminal amino acids is found in oocytes and will be discussed later). Dnmt1 has a 5- to 30-fold preference for hemimethylated substrates (22), and as a result has been assigned a function in the maintenance of methylation patterns. However, this assignment was made largely to satisfy predictions made in the mid 1970s, and there is no direct evidence that Dnmt1 is not also involved in certain types of de novo methylation. Dnmt1 exerts the overwhelming majority of de novo methylation activity in embryo lysates and has little sequence specificity beyond the CpG dinucleotide (22). Homologs of Dnmt1 have been found in nearly all eukaryotes whose DNA bears m5C, but not in those that lack it. A report of a Dnmt1 homolog in Drosophila (23) may not have been completely accurate, as the genome sequence contains no evidence of such a homolog.

As shown in Figure 2a, Dnmt1 has a C-terminal domain that is related to bacterial restriction methyltransferases (21); the C-terminal domain is in fact more closely related to many of the bacterial enzymes than to mammalian DNA methyltransferases of the Dnmt2 and Dnmt3 families (Fig. 3). A large N-terminal domain has accreted multiple domains that provide functions specialized to eukaryotes; these functions include import into nuclei, the co-ordination of replication and methylation during S-phase, and the partial suppression of de novo methylation (18). The domain that targets Dnmt1 to replication foci (24) mediates a dramatic redistribution of Dnmt1 during S-phase: a uniform nucleoplasmic distribution in G1-phase is followed in S-phase by a coalescence into discrete foci that are organized around aggregations of γ satellite DNA in mouse fibroblast nuclei (Fig. 2b and c). During mid and late S-phase these large toroidal foci are the major sites of DNA replication; early in S phase there are many small replication foci throughout the nucleus (24).

Targeted mutations of the Dnmt1 gene (11,25) are recessive lethals that produce a number of unique phenotypes in mice. First, the Dnmt1 mutation produces a lethal differentiation phenotype in which homozygous mutant embryonic stem (ES) cells grow normally with severely demethylated genomes but undergo cell-autonomous apoptosis when induced to differentiate (11). Second, embryos homozygous for mutations at Dnmt1 show biallelic expression of several (but not all) imprinted genes (25,26). Third, homozygous Dnmt1-null embryos show transient ectopic expression of all copies of Xist and evidence of at least transient inactivation of all X chromosomes (27). Demethylation in ES cells has also been reported to cause an increased frequency of deletion and rearrangement mutations (28), probably through an increased rate of homologous recombination among demethylated and unmasked repeated sequences. Trace amounts of m5C persist in the genomes of Dnmt1-null ES cells and the capacity to methylate newly integrated retroviral DNA is partially retained, which requires the existence of one or more additional DNA methyltransferases (29).

SEX-SPECIFIC PROMOTERS AND EXONS AT THE Dnmt1 LOCUS

The Dnmt1 gene is unique in that expression is driven by sex-specific promoters and 5′ exons (Fig. 4). The 5′-most promoter introduces an oocyte-specific 5′ exon (exon 1o) which causes translation to initiate at an ATG codon in exon 4; the resulting protein is shorter than the somatic form by 118 N-terminal amino acids (30). This truncated oocyte-specific form of Dnmt1 (Dnmt1o) is enzymatically active and accumulates to very high levels in the oocyte; it is nuclear only at the earliest stages of oocyte growth, and just prior to ovulation comes to be localized in a cytoplasmic shell just within the oocyte cortex (31). Dnmt1o protein is cytoplasmic in pre-implantation embryos, but specifically enters and then exits nuclei at the 8-cell stage (30,31), and does not become fully nuclear until after implantation, where it is soon replaced by the full-length somatic form. Dnmt1 is localized largely or exclusively to nuclei of all somatic cells examined and is cytoplasmic only in the oocyte and pre-implantation embryo. The biological function of the elaborate and unprecedented nuclear–cytoplasmic trafficking of Dnmt1 during oogenesis and early development is currently unknown; the brief entry into and exit from nuclei at the 8-cell stage is especially intriguing. It should also be noted that somatic nuclei contain relatively large amounts of a form of Dnmt1 that is absent from the oocyte and pre-implantation embryo, and early development in the presence of this ectopic Dnmt1 may contribute to the poor success rates and developmental abnormalities commonly seen in offspring derived by transplantation of somatic nuclei into ooplasts.

A promoter and exon (exon 1s) that are active in all somatic cells are located ∼7 kb 3′ of exon 1o (Fig. 4). Promoter 1s functions as the housekeeping promoter (22), and exon 1s contains the ATG codon that initiates the full-length Dnmt1 in somatic cells (30). Promoter 1s is activated shortly after implantation and by post-coitum day 7 all detectable Dnmt1 protein is the full-length form (30). Promoter 1s is active in all cycling cells but is downregulated under conditions of growth arrest. There were early reports of massive overexpression of Dnmt1 in tumor cells (32), but later reports made it clear that expression in tumor cells is at most only slightly elevated over that of non-transformed cells (33–35). It has also been reported that the Dnmt1 gene contains >11 transcriptional start sites spread over many kilobases (36), and that some of these promoters respond to the products of oncogenes (37,38). However, the promoters that were reported to respond to c-Jun and RB1 are just 5′ of exon 4, and several kilobases 3′ of the initiation codon in exon 1s. Transcription initiation at the putative oncogene-sensitive sites could only yield the truncated protein that is found in oocytes. This form of Dnmt1 has not been observed in somatic cells. For this and other reasons, the identification of the many transcriptional start sites described by Bigey et al. may have been in error (36). Most data indicate that there is a single transcriptional start site in adult somatic cells and the mode of regulation of the somatic promoter of Dnmt1 is consistent with that of other genes whose products are associated with DNA replication (39). The more remarkable attributes ascribed to the somatic Dnmt1 promoter have been difficult to confirm. Repression of DNMT1 has been suggested as a new therapy for certain cancers, but DNMT1 is not significantly overexpressed in tumors, and the loss of DNMT1 function is lethal to normal cells (29). These findings greatly reduce the promise of DNMT1 as a target of repression in the clinical management of cancer.

Just 3′ of exon 1s lies promoter and exon 1p (Fig. 4), which is active only in the pachytene spermatocyte, where it gives rise to the major or sole transcript (30). Exon 1p contains multiple short open reading frames which would be expected to interfere with translation of the Dnmt1 open reading frame (which in this mRNA is the same as that of the oocyte-specific mRNA). In keeping with this expectation, the pachytene spermatocyte does not contain detectable Dnmt1 protein and the abundant mRNA that contains exon 1p is not associated with polyribosomes (30,40). It is not known why the spermatocyte should have evolved a combined transcriptional and post-transcriptional mechanism for the downregulation of Dnmt1 protein at the pachytene stage. It should be noted that Dnmt1 protein is absent from both oocytes and spermatocytes at the time of meiotic recombination (30). Dnmt1 is produced to high levels after this stage of oogenesis but does not reappear after recombination during spermatogenesis (30).

The relationship between Dnmt1 protein and mRNA levels in germ cells is unusual. Dnmt1 protein is present at very high levels in mature oocytes and pre-implantation embryos, but mRNA levels are low at these stages. Conversely, Dnmt1 mRNA levels in the pachytene spermatocyte are high, but protein levels are low (30). Analysis of mRNA levels therefore gives a large underestimate of protein levels in oocytes and early embryos and a large overestimate in pachytene spermatocytes.

THE ENIGMATIC Dnmt2 FAMILY

For a decade Dnmt1 was the only DNA methyltransferase to have been identified in a mammal. New candidate DNA methyltransferases identified by searches of EST databases were reported in 1998. The first of these encodes Dnmt2 (41), which is most similar to pmt1p (42) of Schizosaccharomyces pombe, an organism not known to methylate its DNA (Fig. 3b). Disruption of the pmt1+ gene in S.pombe gave no discernible phenotype, and transmethylation activity could not be detected when recombinant pmt1p was subjected to biochemical assays (42). Disruption of Dnmt2, the mouse homolog of pmt1+, had no obvious effect on genomic methylation patterns in embryonic stem cells, nor did it affect the ability of such cells to methylate newly integrated retroviral DNA (43). There are well-conserved Dnmt2 homologs in plants, vertebrates, D.melanogaster and S.pombe, but no related sequence is found in the genomes of Saccharomyces cerevisiae or C.elegans (none of the latter four species are known to methylate their DNA and none have other DNA methyltransferase homologs). The surprising phylogenetic distribution of Dnmt2 homologs might provide a hint as to biological role: centromere structure and function is conserved among the organisms that contain Dnmt2 homologs, but is quite different in the organisms that lack them. Saccharomyces cerevisiae has compact centromeres very different from those of other eukaryotes, and C.elegans has holocentric chromosomes without discrete centromeres. Although a biological role remains to be demonstrated for any member of the Dnmt2 family in any species, a role in some aspects of centromere function is a possibility.

THE Dnmt3 FAMILY

Additional DNA methyltransferases soon appeared in EST databases. These enzymes [Dnmt3A and Dnmt3B (44)] are distantly related to the Dnmt1 and Dnmt2 families (Fig. 3) and, in fact, are most closely related to the multispecific DNA methyltransferases encoded by bacteriophages that infect Bacillus species (Fig. 3). Both DNMT3A and DNMT3B had been mapped by the Unigene consortium via polymorphisms in 3′-untranslated region sequences. DNMT3B mapped to the region of chromosome 20q that contains the trait for ICF syndrome (45). This syndrome presents with variable combined immunodeficiency, mild facial anomalies and extravagant cytogenetic abnormalities which largely affect the pericentric regions of chromosomes 1, 9 and 16. These pericentric regions contain a type of satellite DNA termed classical satellite, or satellites 2 and 3. It is normally heavily methylated, but is nearly completely unmethylated in DNA of ICF patients (46). It was soon found that ICF patients had mutations in the C-terminal DNA methyltransferase domain of DNMT3B (13). Although classical satellite sequences were completely demethylated, none of the patients were homozygous for null alleles of DNMT3B (13). This suggested that null alleles might be lethal for reasons other than their loss of DNA methyltransferase activity, which may explain the lethality of targeted null alleles of Dnmt3B in mice (47). In addition to classical satellite, demethylation of DNA in ICF patients is also seen at CpG islands on the inactive X chromosome in females and at two repeat families, one of which (D4Z4) has been tied to facioscapulohumeral muscular dystrophy (48). ICF patients have not been observed to suffer from this condition and the lack of methylation of CpG islands on the inactive X chromosome does not cause the symptoms of ICF syndrome to differ notably between male and female patients (49). Whereas inactivation of Dnmt1 causes global demethylation of the genome (11), DNMT3B appears to be specialized for the methylation of a particular compartment of the genome; loss of DNMT3B activity in ICF syndrome causes demethylation of only specific families of repeated sequences and CpG islands on the inactive X chromosome. Classical satellite DNA and CpG islands on the inactive X normally undergo de novo methylation soon after implantation (6), at which time DNMT3B may be especially active. DNMT3B also remains the only DNA methyltransferase shown to be mutated in a human disease. Disruption of Dnmt3A is also lethal to mice and the Dnmt3A/Dnmt3B double mutant has been reported to be unable to methylate newly integrated retroviral DNA, whereas each single mutant retains this ability (47). The locations of DNA methyltransferase genes, classical satellite tracts and genes currently known to influence methylation patterns are shown in Figure 5.

As mentioned previously, it is strongly held in some quarters that maintenance and de novo methylation must be performed by separate enzymes. In this model, sequence-specific de novo methyltransferases act at specific stages of gametogenesis and early development to establish methylation patterns, which would then be maintained during cell division by sequence-independent DNA methyltransferases that can methylate only hemimethylated substrates. In order to satisfy this expectation, Dnmt3A and Dnmt3B have been assigned the former role, and Dnmt1 the latter (47). However, the real situation is not nearly so simple. No mammalian DNA methyltransferase has been shown to be sequence specific. The preference of Dnmt1 for hemimethylated substrates is not large (22) and the specific activity of Dnmt1 on unmethylated DNA substrates is much greater than that of Dnmt3A or Dnmt3B. Dnmt1 is also present at much higher levels than either of the latter enzymes. Furthermore, Dnmt3A and Dnmt3B are present in somatic cells and (if they are dedicated de novo DNA methyltransferases responsible for the establishment of methylation patterns) would be expected to eliminate allele-specific methylation patterns at imprinted loci during development. It is not unlikely that methylation imprints are established by as-yet undiscovered DNA methyltransferases. Methylation patterns are established in males by de novo methylation in prospermatogonia at 14–20 days post-coitum and in females during growth of dictyate oocytes at >5 days post-partum (9). It is not clear that EST libraries enriched in the relevant cell types have been prepared, but it is not unlikely that when examined such libraries will be found to contain new and possibly sex-specific DNA methyltransferases. The features that designate a particular region for de novo methylation, and the factors that respond to these cues, have yet to be identified.
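The two-enzyme model summarized above can be made concrete with a short Python sketch. It represents each CpG site as a pair of booleans (one per strand), replicates the duplex semi-conservatively, and applies an idealized maintenance activity that acts only on hemimethylated sites. The function names and the three-site pattern are hypothetical, and the strict hemimethylated-only rule is the model's assumption, which the text argues is an oversimplification for Dnmt1.

```python
def replicate(duplex):
    """Semi-conservative replication of a list of (top, bottom) CpG sites.

    Each parental strand is paired with a newly made, unmethylated strand,
    so fully methylated sites become hemimethylated in both daughters.
    """
    daughter_a = [(top, False) for top, _ in duplex]
    daughter_b = [(False, bottom) for _, bottom in duplex]
    return daughter_a, daughter_b


def maintain(duplex):
    """Idealized maintenance methylation: convert hemimethylated sites to
    fully methylated sites and leave all other sites unchanged."""
    return [(True, True) if top != bottom else (top, bottom)
            for top, bottom in duplex]


# Toy pattern: methylated, unmethylated, methylated (illustrative only).
parent = [(True, True), (False, False), (True, True)]
daughter_a, daughter_b = replicate(parent)
print(maintain(daughter_a) == parent, maintain(daughter_b) == parent)
```

In this idealized scheme the pattern is inherited only if maintenance acts after every round of replication; without it, strands synthesized in successive rounds carry no methylation, and no step in the sketch can create a new pattern, which is the role the model assigns to de novo enzymes.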

A few years ago the biological significance of cytosine methylation was widely doubted and among the small group of believers a role in tissue-specific gene expression was most often invoked. It is now clear that perturbations of genomic methylation patterns can have diverse and severe effects on phenotype. Genome stability, allele-specific expression of imprinted genes and those subject to X-inactivation, the transcriptional silencing and masking of transposons and the assembly of higher-order chromatin structures on classical satellite DNA are all clearly dependent on cytosine methylation. A direct role in developmental gene control has come to seem increasingly unlikely. The current rate of progress is quite rapid and we can expect more surprises (and perhaps more controversy) as experimental studies of cytosine methylation and other aspects of epigenetic phenomena in mammals continue to gain momentum.

ACKNOWLEDGEMENTS

I apologize to those authors whose work could not be cited due to length limitations. I thank M. Goll for comments on the manuscript and R. Chaillet, X. Cheng, J. Trasler and M. Yanagida for discussions. This work was supported by grants GM59377 and HD37687 from the NIH and by a grant from the Leukemia and Lymphoma Society.

Tel: +1 212 305 5331; Fax: +1 212 740 0992; Email: thb12@columbia.edu

Figure 1. Catalytic mechanism and structure of DNA (cytosine-5)-methyltransferases. (a) Enzyme mechanism originally proposed by Santi et al. (14) and modified by Chen et al. (13) and Erlanson et al. (15). Protonation of the N3 position is mediated by the highly conserved ENV tripeptide in motif VI. The base that abstracts the 5-proton after methyl transfer is not known in all cases. (b) Structure of the bacterial enzyme M. HhaI with cognate DNA (16). The target recognition domain (in green at the top) makes specific contacts with base edges in the major groove and is responsible for sequence discrimination; the large catalytic domain (in blue at the bottom) contains the motifs responsible for the methyl transfer reaction shown in (a). The target cytosine is completely everted from the helix and inserted deep into the active site of the enzyme [not visible in this view, but near the cofactor S-adenosyl-l-methionine (AdoMet)].

Figure 2. Functional domains in Dnmt1 and intranuclear trafficking of the enzyme. (a) Organization of Dnmt1. The C-terminal domain is related to the bacterial DNA methyltransferases shown in Figure 1b; the large N-terminal domain has accreted domains that serve functions required by eukaryotes. These include nuclear localization, co-ordination of methylation and replication via a domain that targets Dnmt1 to replication foci at S phase, and the partial suppression of de novo methylation (49). The function of the zinc-binding region is not clear; it is closely related to a motif in the human oncogene known as ALL1, MLL or HRX which is a homolog of Drosophila trithorax (17), but the motif is not present in the latter protein. Another motif is related to a domain in the Polybromo-1 protein of chicken, but is not a bromo domain. (b) Trafficking behavior of Dnmt1 in mouse 3T3 fibroblasts. Dnmt1 is present at nearly constant levels during the cell cycle (it is downregulated in growth-arrested cells) and is nucleoplasmic during G1 and G2 phases but associates with replication foci during S-phase. The nucleus at the bottom is in mid to late S-phase; the nucleus at top is in G1. Dnmt1 can be seen to have concentrated in the vicinity of replication foci; these are organized around concentrations of γ satellite DNA. (c) Co-localization of Dnmt1 and sites of active DNA synthesis in mouse fibroblast nuclei. As in (b), the nucleus at the bottom is in S-phase, the nucleus at the top in G1. Cells were pulse-labeled for 5 min with BrdU, then fixed and stained for Dnmt1 and BrdU (23). (d) Distinct and non-overlapping distributions of Dnmt1 and concentrations of splicing factors in S-phase nuclei.

Figure 3. The three Dnmt families of mammals. (a) Dnmt1 was described earlier. Dnmt2 contains the full set of sequence motifs that are almost invariably diagnostic of DNA cytosine-methyltransferases but has not been shown to have transmethylase activity by biochemical or genetic tests. It also lacks the N-terminal domain characteristic of eukaryotic DNA cytosine-methyltransferases. The N-terminal regions of Dnmt3A and Dnmt3B are highly divergent on the N-terminal side of the Cys-rich region but in their C-terminal regions are more closely related to the multispecific DNA methyltransferases encoded by phages of the Bacillus species than to the Dnmt1 or Dnmt2 families. Dnmt3L lacks canonical DNA cytosine-methyltransferase motifs but is otherwise closely related to the C-terminal domain of Dnmt3A and Dnmt3B. (b) ClustalW analysis of DNA cytosine-methyltransferases from bacteria, fungi, plants and metazoa. The Dnmt1, Dnmt2 and Dnmt3 families can all be seen to be no more closely related to each other than to the bacterial restriction methyltransferases, which implies a radiation of DNA methyltransferases prior to or early in the evolution of eukaryotes. The regions of the proteins spanning motifs I and VI were compared to avoid unreasonable penalties arising from the variable spacing between the more C-terminal motifs. All upper case gene names indicate human proteins. X, Xenopus laevis; At, Arabidopsis thaliana; Dm, Drosophila melanogaster. Masc1 and Masc2 are from the fungus Ascobolus immersus, and pmt1p from Schizosaccharomyces pombe. M.SPR and M.φ3T are encoded by phages that infect Bacillus species (20). (c) Sequence comparison of the Cys-rich regions characteristic of metazoan DNA methyltransferases. Enzymes known to be active DNA methyltransferases are highlighted in red. The Cys-rich region contains 8± Cys residues within a tract of ∼40 amino acids. The function is not known. The C.elegans protein CE24669 and the D.melanogaster protein DCG11033 contain domains closely related to the Cys-rich domain of DNMT1 and of methylated DNA binding protein-1 (MDB1), but these organisms do not have m5C in their genomes; the domain is therefore unlikely to be involved in a methylation-dependent process. The divergent promoters of the autoimmune regulator (AIRE) and DNMT3L genes are separated by only 5 kb on chromosome 21. Dr, Danio rerio; Ce, C.elegans. Other prefixes are as in (b).

Figure 4. Sex-specific exons and mRNAs from the Dnmt1 gene. (a) 5′ region of Dnmt1 on proximal mouse chromosome 9. Exon 1o is oocyte-specific, exon 1s is specific to somatic cells of both sexes and exon 1p is restricted to pachytene spermatocytes (30). The ATG codon in exon 1s is used for initiation of translation in somatic cells; a truncated form arises from use of the ATG codon in exon 4 in oocytes (30). (b) mRNA products of sex-specific exons. The effects of alternative promoter use and splicing on the organization of mature Dnmt1 mRNAs are indicated. Heavy horizontal bars indicate open reading frames; short vertical bars indicate ATG initiation codons.

Figure 5. Human genes that affect genomic methylation patterns. Three confirmed DNA methyltransferases (DNMT1, DNMT3A and DNMT3B) map to chromosomes 19p, 2p and 20q, respectively; the DNA methyltransferase homolog DNMT2 maps to 10p and the DNMT3 family member named DNMT3L maps to 21q, in the distal part of the Down’s syndrome critical region. It is not yet known whether increased dosage of DNMT3L is involved in the etiology of Down’s syndrome. Mutations in ATRX cause both demethylation and hypermethylation of different sequences (51). Expression of XIST causes de novo methylation of CpG islands in cis on the expressing chromosome through an unknown mechanism (27). Pericentric tracts of classical satellite 2 on chromosomes 1 and 16 and classical satellite 3 on chromosome 9 are indicated; these are severely or completely unmethylated in patients with ICF syndrome, which is caused by mutations in DNMT3B (12).

References

1 Walsh, C.P. and Bestor, T.H. (1999) Cytosine methylation and mammalian development. Genes Dev., 13, 26–34.
2 Matsuo, K., Silke, J., Georgiev, O., Marti, P., Giovannini, N. and Rungger, D. (1998) An embryonic demethylation mechanism involving binding of transcription factors to replicating DNA. EMBO J., 17, 1446–1453.
3 Lin, I.G., Tomzynski, T.J., Ou, Q. and Hsieh, C.L. (2000) Modulation of DNA binding protein affinity directly affects target site demethylation. Mol. Cell. Biol., 20, 2343–2349.
4 Caspary, T., Cleary, M.A., Baker, C.C., Guan, X.J. and Tilghman, S.M. (1998) Multiple mechanisms regulate imprinting of the mouse distal chromosome 7 gene cluster. Mol. Cell. Biol., 18, 3466–3674.
5 Dao, D. et al. (1999) Multipoint analysis of human chromosome 11p15/mouse distal chromosome 7: inclusion of H19/IGF2 in the minimal WT2 region, gene specificity of H19 silencing in Wilms’ tumorigenesis and methylation hyper-dependence of H19 imprinting. Hum. Mol. Genet., 8, 1337–1352.
6 Bestor, T.H. (1999) Sex brings transposons and genomes into conflict. Genetica, 107, 289–295.
7 Yoder, J.A., Walsh, C.P. and Bestor, T.H. (1997) Cytosine methylation and the ecology of intragenomic parasites. Trends Genet., 13, 335–340.
8 Smit, A.F.A. (1999) Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev., 9, 657–663.
9 Walsh, C.P., Chaillet, J.R. and Bestor, T.H. (1998) Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nature Genet., 20, 116–117.
10 Bestor, T.H. and Tycko, B. (1996) Creation of genomic methylation patterns. Nature Genet., 12, 363–367.
11 Li, E., Bestor, T.H. and Jaenisch, R. (1992) Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell, 69, 915–926.
12 Tycko, B. (2000) Epigenetic gene silencing in cancer. J. Clin. Invest., 105, 401–407.
13 Xu, G.-L. et al. (1999) Chromosome instability and immunodeficiency syndrome caused by mutations in a DNA methyltransferase gene. Nature, 402, 187–191.
14 Chen, L., MacMillan, A.M., Chang, W., Ezaz-Nikpay, K., Lane, W.S. and Verdine, G.L. (1991) Direct identification of the active site nucleophile in a DNA (cytosine-5)-methyltransferase. Biochemistry, 30, 11018–11025.
15 Santi, D.V., Garrett, C.E. and Barr, P.J. (1983) On the mechanism of inhibition of DNA-cytosine methyltransferase by cytosine analogs. Cell, 33, 9–10.
16 Erlanson, D., Chen, L. and Verdine, G.L. (1993) Enzymatic DNA methylation through a locally unpaired intermediate. J. Am. Chem. Soc., 115, 12583–12584.
17 Klimasauskas, S., Kumar, S., Roberts, R.J. and Cheng, X. (1994) HhaI methyltransferase flips its target base out of the DNA helix. Cell, 76, 357–369.
18 Bestor, T.H. and Verdine, G.L. (1994) DNA methyltransferases. Curr. Opin. Cell Biol., 6, 3803–3809.
19 Posfai, J., Bhagwat, A.S., Posfai, G. and Roberts, R.J. (1989) Predictive motifs derived from cytosine methyltransferases. Nucleic Acids Res., 17, 2421–2435.
20 Lauster, R., Trautner, T.A. and Noyer-Weidner, M. (1989) Cytosine-specific type II DNA methyltransferases. A conserved enzyme core with variable target-recognizing domains. J. Mol. Biol., 206, 305–312.
21 Bestor, T.H., Laudano, A., Mattaliano, R. and Ingram, V. (1988) Cloning and sequencing of a cDNA encoding DNA methyltransferase of mouse cells. The carboxyl-terminal domain of the mammalian enzyme is related to bacterial restriction methyltransferases. J. Mol. Biol., 203, 971–983.
22 Yoder, J.A., Soman, N., Verdine, G.V. and Bestor, T.H. (1997) DNA methyltransferases in mouse tissues and cells. Studies with a mechanism-based probe. J. Mol. Biol., 270, 385–395.
23 Hung, M.-S., Karthikeyan, N., Huang, B., Koo, H.-C., Kiger, J. and Shen, C.-K.J. (1999) Drosophila proteins related to vertebrate DNA (5-cytosine) methyltransferases. Proc. Natl Acad. Sci. USA, 96, 11940–11945.
24 Leonhardt, H., Page, A.W., Weier, H.-Ul. and Bestor, T.H. (1992) A targeting sequence directs DNA methyltransferase to sites of DNA replication in mammalian nuclei. Cell, 71, 865–874.
25 Li, E., Beard, C., Forster, A.C., Bestor, T.H. and Jaenisch, R. (1993) DNA methylation, genomic imprinting, and mammalian development. Cold Spring Harb. Symp. Quant. Biol., LVIII, 297–305.
26 Li, E., Beard, C. and Jaenisch, R. (1993) Role for DNA methylation in genomic imprinting. Nature, 366, 362–365.
27 Beard, C., Li, E. and Jaenisch, R. (1995) Loss of methylation activates Xist in somatic but not in embryonic cells. Genes Dev., 9, 2325–2334.
28 Chen, R.Z., Pettersson, U., Beard, C., Jackson-Grusby, L. and Jaenisch, R. (1998) DNA hypomethylation leads to elevated mutation rates. Nature, 395, 89–93.
29 Lei, H., Oh, S.P., Okano, M., Juttermann, R., Goss, K.A., Jaenisch, R. and Li, E. (1996) De novo DNA cytosine methyltransferase activities in mouse embryonic stem cells. Development, 122, 3195–3205.
30 Mertineit, C., Yoder, J.A., Takedo, T., Laird, D., Trasler, J. and Bestor, T.H. (1998) Sex-specific exons control DNA methyltransferase in mammalian germ cells. Development, 125, 889–897.
31 Carlson, L.L., Page, A.W. and Bestor, T.H. (1992) Localization and properties of DNA methyltransferase in preimplantation mouse embryos: implications for genomic imprinting. Genes Dev., 6, 2536–2541.
32 el-Deiry, W.S., Nelkin, B.D., Celano, P., Yen, R.W., Falco, J.P., Hamilton, S.R. and Baylin, S.B. (1991) High expression of the DNA methyltransferase gene characterizes human neoplastic cells and progression stages of colon cancer. Proc. Natl Acad. Sci. USA, 88, 3470–3474.
33 Lee, P.J., Washer, L.L., Law, D.J., Boland, C.R., Horon, I.L. and Feinberg, A.P. (1996) Limited up-regulation of DNA methyltransferase in human colon cancer reflecting increased cell proliferation. Proc. Natl Acad. Sci. USA, 93, 10366–10370.
34 Schmutte, C., Yang, A.S., Nguyen, T.T., Beart, R.W. and Jones, P.A. (1996) Mechanisms for the involvement of DNA methylation in colon carcinogenesis. Cancer Res., 56, 2375–2381.
35 Warnecke, P.M. and Bestor, T.H. (2000) Cytosine methylation and human neoplasia. Curr. Opin. Oncol., 12, 68–73.
36 Bigey, P., Ramchandani, W., Theberge, J., Araujo, F.D. and Szyf, M. (2000) Transcriptional regulation of the human DNA methyltransferase (dnmt1) gene. Gene, 242, 407–418.
37 Rouleau, J., MacLeod, A.R. and Szyf, M. (1995) Regulation of the DNA methyltransferase by the RAS-AP1 signaling pathway. J. Biol. Chem., 270, 1595–1601.
38 Rouleau, J., Tanigawa, G. and Szyf, M. (1992) The mouse DNA methyltransferase 5′ region. A unique housekeeping gene promoter. J. Biol. Chem., 267, 7368–7377.
39 Yoder, J.A., Chiu Yen, R.-Y., Vertino, P.M., Bestor, T.H. and Baylin, S.B. (1996) New 5′ regions of the murine and human genes for DNA (cytosine-5) methyltransferase. J. Biol. Chem., 271, 31092–31097.
40 Trasler, J.M., Alcivar, A.A., Hake, L.E., Bestor, T.H. and Hecht, N.B. (1992) DNA methyltransferase is developmentally expressed in replicating and non-replicating male germ cells. Nucleic Acids Res., 20, 2541–2545.
41 Yoder, J.A. and Bestor, T.H. (1998) A candidate mammalian DNA methyltransferase related to pmt1p of fission yeast. Hum. Mol. Genet., 7, 279–284.
42 Wilkinson, C.R., Bartlett, R., Nurse, P. and Bird, A.P. (1995) The fission yeast gene pmt1+ encodes a DNA methyltransferase homologue. Nucleic Acids Res., 23, 203–210.
43 Okano, M., Xie, S. and Li, E. (1998) Dnmt2 is not required for de novo and maintenance methylation of viral DNA in embryonic stem cells. Nucleic Acids Res., 26, 2536–2540.
44 Okano, M., Xie, S. and Li, E. (1998) Cloning and characterization of a family of novel mammalian DNA (cytosine-5) methyltransferases. Nature Genet., 19, 219–220.
45 Wijmenga, C. et al. (1998) Localization of the ICF syndrome to chromosome 20 by homozygosity mapping. Am. J. Hum. Genet., 63, 803–809.
46 Jeanpierre, M. et al. (1993) An embryonic-like methylation pattern of classical satellite DNA is observed in ICF syndrome. Hum. Mol. Genet., 2, 731–735.
47 Okano, M., Bell, D.W., Haber, D.A. and Li, E. (1999) DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell, 99, 247–257.
48 Kondo, T. et al. (2000) Whole-genome methylation scan in ICF syndrome: hypomethylation of non-satellite DNA repeats D4Z4 and NBL2. Hum. Mol. Genet., 9, 597–604.
49 Bourc’his, D. et al. (1999) Abnormal methylation does not prevent X inactivation in ICF patients. Cytogenet. Cell Genet., 84, 245–252.
50 Bestor, T.H. (1992) Activation of mammalian DNA methyltransferase by cleavage of a Zn-binding regulatory domain. EMBO J., 11, 2611–2618.
51 Aapola, U. et al. (2000) Isolation and initial characterization of a novel zinc finger gene, DNMT3L, on 21q22.3, related to the cytosine-5-methyltransferase 3 gene family. Genomics, 65, 293–298.
52 Gibbons, R.J. et al. (2000) Mutations in ATRX, encoding a SWI/SNF-like protein, cause diverse changes in the pattern of DNA methylation. Nature Genet., 24, 368–371.