Abstract

Changes in gene expression resulting from epigenetic and/or genetic changes play an important role in the evolutionary divergence of phenotypes. To explore how epigenetic and genetic changes are linked during primate evolution, we compared the genome-wide DNA methylation profiles (methylomes) of sperm, the frontal cortex, B cells, and neutrophils from humans and chimpanzees, whose DNA sequences diverge by 1.2%. We found that species-specific differentially methylated regions (S-DMRs), ranging from several hundred base pairs (bp) to several kilo base pairs (kb), were frequently associated with sequence changes in transcription factor-binding sites and insertions of Alu and SVA retrotransposons. We then generated a reference macaque sperm methylome map and found that, in sperm, both human and chimpanzee S-DMRs arose more frequently through methylation loss than through gain. Moreover, we observed that the sperm methylomes contained many more hypomethylated domains (HMDs), ranging from 20 to 500 kb, than did the somatic methylomes. Interestingly, the sperm HMDs changed rapidly during primate evolution; hundreds of sperm HMDs were specific to humans, whereas most somatic HMDs were highly conserved between humans and chimpanzees. Notably, these human-specific sperm HMDs frequently occurred in regions exhibiting copy number variations. Our findings indicate that primate evolution, particularly in the germline, is significantly impacted by reciprocal changes in the genome and epigenome.

Introduction

In the genome, cytosine methylation at CpG sites is an epigenetic modification that plays a critical role in gene regulation, the silencing of transposable elements, and the pathogenesis of diseases such as cancer (1,2). Most CpG sites in the mammalian genome are generally highly methylated; however, some regions, such as CpG-rich islands and active regulatory elements (promoters and enhancers), are known to have reduced or no methylation (3,4). Acquisition of methylation at promoters and enhancers is strongly associated with transcriptional silencing, and changes in DNA methylation at these regulatory regions often occur during embryonic development, in response to the environment, and during disease progression (2,5,6).

Between closely related species, differences in proteins are more often found at the expression level rather than at the sequence level (7), and it has therefore been proposed that evolutionary changes in gene regulation play a major role in phenotypic diversification (8,9). These changes in gene regulation may arise through genetic mutation/drift but also via alterations in DNA methylation state. To understand how DNA methylation patterns change during evolution, several studies have compared the genome-wide DNA methylation profiles (methylomes) in closely related great-ape species (10–14). Depending on the tissues analyzed, human–chimpanzee comparisons estimated that 12–18% of inter-species differences in gene expression could be explained by the changes in promoter methylation (12). Our previous study on human and chimpanzee chromosomes 21 and 22 revealed that genetic changes that create or disrupt binding sites for CCCTC-binding factor (CTCF), a nuclear protein that regulates chromatin folding, are often associated with DNA methylation changes (10). This suggested that inter-species epigenetic differences can arise via genetic differences in transcription factor (TF)-binding sites (TFBSs). Indeed, recent studies on methylation quantitative trait loci using human lymphoblastoid cell lines identified many intra-species methylation variations that were associated with single nucleotide polymorphisms within TFBSs, including those for CTCF (15,16). Similarly, strain-specific DNA methylation and histone acetylation patterns in mice are likely determined mainly by genetic changes (17,18).

It has been generally thought that hypomethylated regions in the mammalian epigenome are small in size [<a few kilo base pairs (kb)] as exemplified by CpG islands and enhancers. However, large genomic domains (>several tens of kb) with relatively low methylation levels were recently identified in particular cells and were designated hypomethylated domains (HMDs), partially methylated domains, or DNA methylation valleys (4,19–22). HMDs in normal somatic tissues and pluripotent stem cells are gene-rich, G + C rich, and are often marked by trimethylation of histone H3 lysine-27 (H3K27me3) and trimethylation of histone H3 lysine-4 (H3K4me3). Changes in HMD methylation levels are often associated with cell differentiation and disease conditions (19,21–24). Conversely, in cancer cells and in cultured fibroblastic cells, HMDs are G + C poor and gene-poor (4,23) and marked by H3K9me3 (25). The mechanisms underlying the establishment and maintenance of HMDs remain poorly understood, and the evolutionary conservation/alteration of HMDs and its effect on phenotypic diversification remains unaddressed.

To study the mechanisms underlying inter-species epigenetic differences, we analyzed existing DNA methylome datasets from human and chimpanzee tissues. In addition, we determined the sperm methylome of the Japanese macaque and used it as a reference to infer ancestral methylation patterns. Our study demonstrated that genetic changes involving TFBSs and retrotransposons were often associated with local methylation differences. Moreover, we identified human-specific sperm HMDs, which were frequently associated with copy number variations (CNVs). Combined, our study revealed a tight interaction between DNA sequence and methylation and suggested that loss of DNA methylation is linked to chromosomal instability during human evolution.

Results

Identification of species-specific differentially methylated regions (S-DMRs) in sperm, the frontal cortex, B cells and neutrophils

To study differences in the methylome of various tissue/cell types from humans and chimpanzees, we analyzed publicly available whole-genome bisulfite sequencing (WGBS) data from sperm, the frontal cortices, B cells and neutrophils of both species (11,13,26). The average methylation levels at CpG sites ranged from 65 to 80% (Fig. 1A and Supplementary Material, Table S1). Hierarchical clustering grouped the methylome data (1-kb windows) by tissue type rather than by species (Fig. 1B).

Features of the species-specific differentially methylated regions (S-DMRs) identified in various tissues/cells. (A) The global CpG methylation level in each tissue or cell type of the two species. Species are indicated in parentheses: H, human; C, chimpanzee. (B) Hierarchical clustering of the methylome data analyzed in 1-kb genomic windows. (C) Number of S-DMRs identified in each tissue. H<C and C<H indicate lower methylation in humans and in chimpanzees, respectively. (D) Fractions of the S-DMRs of each tissue that are shared by sperm. (E) Overall genomic locations of the S-DMRs. Promoters were defined as genomic regions spanning –2 to +2 kb of the transcription start site of each RefSeq gene. The upstream region (–2 to 0 kb) of each promoter was excluded from the corresponding intergenic region. (F) Fractions of the S-DMRs containing repeats identified by RepeatMasker (http://www.repeatmasker.org/); date last accessed on June 24, 2017. (G) Repeat occupancy within the S-DMRs, defined as the proportion of the total length of repeats in the S-DMRs relative to their total length.
Figure 1

We identified species-specific differentially methylated regions (S-DMRs), defined as containing ≥10 CpG sites with an average methylation difference of ≥40%, in each tissue by comparing the methylation levels at orthologous CpG sites (see Materials and Methods). This approach identified 12 727, 1875, 3128 and 3210 S-DMRs in sperm, the frontal cortex, B cells and neutrophils, respectively (Fig. 1C and Supplementary Material, Table S2). The median size of the S-DMRs was 1002, 478, 484 and 548 bp in sperm, the frontal cortex, B cells and neutrophils, respectively. To validate these results, we analyzed the methylation states of selected neutrophil S-DMRs in our human and chimpanzee peripheral white blood cell samples (where roughly 40–60% of the cells were neutrophils) by bisulfite-polymerase chain reaction (PCR) (Supplementary Material, Fig. S1). The results were consistent with our S-DMR calls with the WGBS data. Interestingly, sperm S-DMRs were ∼2-fold larger than somatic S-DMRs and outnumbered them by 4- to 7-fold (Fig. 1C). Moreover, almost all sperm S-DMRs were specific to this cell type (Fig. 1D). Note that, in these analyses, we utilized identical CpG sites with comparable sequencing depth across cells/tissues, minimizing the possible contribution by data biases (Supplementary Material, Table S1 and see Materials and Methods). Importantly, the methylation states of the sperm S-DMRs were highly conserved between different individuals of the same species (Supplementary Material, Fig. S2A and B). Genomic distributions of the S-DMRs differed considerably between sperm and somatic tissues (P < 10⁻¹⁵³ by χ2 test) (Fig. 1E): sperm S-DMRs tended to be more frequently located in intergenic regions, whereas the somatic S-DMRs were more frequently found within promoters, exons, and introns. The sperm S-DMRs rarely overlapped with promoters (Fig. 1E), but more frequently contained repeats such as retrotransposons than did the somatic S-DMRs (Fig. 1F). Indeed, the repeat occupancy (in length) was higher in the sperm S-DMRs (Fig. 1G).
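The two S-DMR criteria stated above (≥10 CpG sites and an average methylation difference of ≥40%) can be illustrated with a minimal Python sketch. This is not the calling procedure described in Materials and Methods; the `CpG` record, the function name `is_sdmr` and the assumption that candidate regions have already been delimited are all hypothetical and serve only to show how the thresholds apply to one region.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CpG:
    """One orthologous CpG site (hypothetical record layout)."""
    pos: int
    human: float  # methylation level in human, 0..1
    chimp: float  # methylation level in chimpanzee, 0..1


def is_sdmr(cpgs, min_sites=10, min_diff=0.40):
    """Return 'H<C', 'C<H' or None for one candidate region.

    The thresholds are the ones quoted in the text; how candidate
    regions are delimited is outside the scope of this sketch.
    """
    if len(cpgs) < min_sites:
        return None
    # Mean human level minus mean chimpanzee level over the same sites
    diff = mean(c.human for c in cpgs) - mean(c.chimp for c in cpgs)
    if diff <= -min_diff:
        return "H<C"  # lower methylation in human
    if diff >= min_diff:
        return "C<H"  # lower methylation in chimpanzee
    return None


# Toy region: 12 sites, low in human and high in chimpanzee
region = [CpG(pos=100 * i, human=0.1, chimp=0.9) for i in range(12)]
label = is_sdmr(region)
```

Because the same orthologous sites are used for both species, the difference of the two means equals the mean of the per-site differences, so either reading of "average methylation difference" gives the same label.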

Evolution of the sperm methylome

The unique and rapid changes in the sperm methylome after the human–chimpanzee split prompted us to study the sperm S-DMRs in more detail. The methylation levels of the sperm S-DMRs were high in all somatic tissues (Fig. 2A), suggesting a sperm- and species-specific loss of methylation in both species. We then constructed a third sperm methylome, in Japanese macaque (Macaca fuscata), by performing WGBS, which yielded a mapped read number comparable to those of human and chimpanzee sperm (Supplementary Material, Table S1). Using the Japanese macaque methylome as a reference, the ancestral methylation state and the species-specificity of the changes were inferred in a fraction of the sperm S-DMRs for which orthologous regions could be analyzed (n = 7160). Here, it was assumed that the dynamics of the methylome are parsimonious, just as those of the genome, during evolution. This analysis identified 3675 (3112 hypomethylated and 563 hypermethylated) human-specific and 3202 (2524 hypomethylated and 678 hypermethylated) chimpanzee-specific sperm S-DMRs (Fig. 2B). The macaque sperm methylome was not informative for the remainder of S-DMRs (n = 283, labeled as unknown in Fig. 2B) owing to their intermediate methylation levels. The species-specific S-DMRs arose more frequently by a loss (5636 = 3112 + 2524), rather than by a gain (1241 = 678 + 563) of methylation. The sperm S-DMRs that were hypomethylated in humans (H < C) were frequently associated with H3K27me3 enrichment in human sperm (27) (Fig. 2C, P < 10⁻⁵⁰ by χ2 test), whereas such an association was not observed for H3K4me3 (Fig. 2D). This may suggest that species-specific hypomethylation is accompanied by a species-specific histone modification state in sperm. However, we could not assess this possibility because histone modification data for chimpanzee sperm are lacking.
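The parsimony argument above can be sketched in a few lines of Python. The sketch assumes that the macaque level approximates the ancestral state and that the species departing from it is the one that changed. The function name `polarize` and the cutoffs for an "intermediate" macaque level (0.3 and 0.7) are illustrative assumptions, not values taken from this study. The tallies at the end simply re-check the counts quoted in the text.

```python
def polarize(human, chimp, macaque, low=0.3, high=0.7):
    """Infer which lineage changed at one sperm S-DMR.

    Levels are methylation fractions (0..1). The low/high cutoffs that
    make the outgroup uninformative are assumed for illustration only.
    """
    if low < macaque < high:
        return "unknown"  # intermediate outgroup level
    # The species farther from the outgroup is taken as the changed one
    if abs(human - macaque) > abs(chimp - macaque):
        changed, level = "human", human
    else:
        changed, level = "chimpanzee", chimp
    direction = "hypomethylated" if level < macaque else "hypermethylated"
    return f"{changed}-specific {direction}"


# Counts reported in the text for the 7160 analyzable sperm S-DMRs
counts = {
    ("human", "hypo"): 3112,
    ("human", "hyper"): 563,
    ("chimpanzee", "hypo"): 2524,
    ("chimpanzee", "hyper"): 678,
}
unknown = 283
loss = sum(n for (_, d), n in counts.items() if d == "hypo")
gain = sum(n for (_, d), n in counts.items() if d == "hyper")
total = loss + gain + unknown
```

For instance, a region at 0.1 in human and 0.9 in chimpanzee with a macaque level of 0.95 would be labeled a human-specific loss of methylation under these assumptions.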

Evolution of the sperm methylome. (A) Box plots of the methylation levels of the sperm S-DMRs in various tissues. (B) Heatmap of the methylation levels of the sperm S-DMRs in human, chimpanzee, and Japanese macaque sperm. Only sperm S-DMRs with analyzable orthologous regions in macaques were subject to this analysis (n=7160). Numbers in parentheses indicate the numbers of S-DMRs belonging to the indicated categories. (C, D) Numbers of human sperm S-DMRs containing regions with H3K27me3 (C) or H3K4me3 enrichment (D) identified previously (27). P values are by chi-square test. H, Human; C, chimpanzee.
Figure 2

Loss and gain of TFBSs and S-DMRs

To investigate the potential correlations between epigenetic and genetic changes, we first performed an in silico search for species-specific TF-binding sites (TFBSs) in the S-DMRs (Supplementary Material, Fig. S3A, B and Table S3). A species-specific gain of a CTCF-binding site was significantly associated with hypomethylation of the S-DMR in all tissues (Bonferroni-adjusted P < 0.05 by χ2 test) (Supplementary Material, Table S3). Further analyses using the publicly available ChIP-seq data (28) confirmed that the neutrophil S-DMRs with a human-specific CTCF-binding site identified in silico were indeed bound by CTCF more abundantly in human lymphoblast cells than in chimpanzee lymphoblast cells (Supplementary Material, Fig. S3C). We also observed frequent association between species-specific hypomethylation at the S-DMRs and TFBSs for YY1 (sperm), NRF1 (sperm), SPI1 (B cells and neutrophils), PO2F2 (B cells) and C/EBP (neutrophils). The latter three TFs are active in the same tissues (29–31) where we identified S-DMRs, indicating that these species-specific genetic changes are associated with the generation or disruption of TFBSs and with tissue-specific TF functionality. Among the species-specific TFBSs associated with the S-DMRs (Supplementary Material, Fig. S3A and Table S3), those for ITF2 were associated with hypermethylation, not hypomethylation. Gene ontology enrichment analysis using GREAT (32) revealed that genes located in and around the somatic S-DMRs, but not the sperm S-DMRs, were mostly related to tissue-specific functions (Supplementary Material, Fig. S3D).

Retrotransposons and sperm S-DMRs

Because retrotransposon insertion or deletion can cause local epigenetic changes (33–35), we examined whether retrotransposons were enriched in the S-DMRs (Supplementary Material, Fig. S4A). In somatic S-DMRs, only one retrotransposon, LTR12C, was significantly enriched (Bonferroni-adjusted P < 10⁻⁴ by χ2 test); however, sperm S-DMRs exhibited enrichment for many retrotransposons (Supplementary Material, Fig. S4A). Approximately 40% of the sperm S-DMRs (H < C) contained AluY (34%, n = 2679), SVA (4%, n = 322), or THE1C family retrotransposons (2%, n = 174) (Fig. 3A and Supplementary Material, Fig. S4B).

Association of AluY and SVA retrotransposons with sperm species-specific differentially methylated regions (S-DMRs). (A) Enrichment of individual retrotransposons in the sperm S-DMRs. Asterisks indicate statistically significant enrichment (Fisher’s exact test, Bonferroni-adjusted P < 10⁻⁴). (B) Species-specific methylation differences in individual retrotransposons from human and chimpanzee sperm. The differences determined are shown for all copies (black) and for only orthologous copies (gray). (C, D) Interspecies differences in methylation of individual retrotransposons in the regions surrounding human-specific AluY (C) and SVA (D) insertion sites. H, Human; C, chimpanzee.
Figure 3

We found that the methylation levels of the regions around AluYa5 and AluYb8 copies were generally lower in human sperm than in chimpanzee sperm (Fig. 3B). Because humans have many more copies of AluYa5 and AluYb8 (36), and because those copies that were shared by both species exhibited smaller methylation differences in their own sequences (Fig. 3B), we speculated that the sperm S-DMRs (H < C) could be associated with the human-specific AluY copies. Indeed, such AluY copies were associated with hypomethylation of the adjacent regions (Fig. 3C), and 80% (n = 130) of those found in the sperm S-DMRs (H < C) (n = 164) were human-specific copies (Supplementary Material, Table S4). These results suggest a possible role of AluY in sperm S-DMR formation, but it is also possible that AluY preferentially retrotransposed to the sperm S-DMRs (H < C).

In contrast to AluY, most SVA copies found in the sperm S-DMRs (H < C) were shared by both species (Supplementary Material, Table S4). This is consistent with previous findings that shared SVA copies can serve as S-DMRs (13) and implies a differential regulation of methylation by the two species. Indeed, we found that the upstream regions (up to –2 kb) of the SVA copies tended to be hypomethylated in human but not in chimpanzee sperm (Supplementary Material, Fig. S4C and D).

Because SVA_E and SVA_F insertions are highly polymorphic even within the human population (37,38), we genotyped our human sperm samples (n = 7) and identified several polymorphic copies (Supplementary Material, Table S5). Among them was an SVA_E copy identified at chr7:1186736–1187834 (Fig. 4A): one sample was homozygous for this insertion, five were heterozygous, and the last one was a non-carrier. We then observed that the upstream region of this insertion showed 43% methylation in the homozygote but 98% methylation in the non-carrier (Fig. 4B and C). In one heterozygote, the same region was less methylated on the SVA-carrying allele compared with the non-carrying allele (22% versus 95% methylation, P = 1.6 × 10⁻³² by χ2 test) (Fig. 4D), suggesting a cis effect of this insertion on local methylation. These results indicate that de novo SVA insertions generate epigenetic variations within the current human population.
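The allele-specific comparison above is a 2 × 2 test of methylated versus unmethylated CpG calls on the two alleles. A minimal sketch of a Pearson χ2 test (1 degree of freedom, no continuity correction) is given below. The underlying CpG counts are not stated in the text, so the counts used here (22/78 and 95/5) are invented solely to mirror the reported 22% and 95% methylation levels. They are not the study data and will not reproduce the reported P value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    if min(r1, r2, c1, c2) == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)
    # Survival function of chi-square with 1 d.f.: erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts (methylated, unmethylated) per allele
sva_allele = (22, 78)      # assumed: 22% methylated calls
non_carrier = (95, 5)      # assumed: 95% methylated calls
chi2, p = chi_square_2x2(*sva_allele, *non_carrier)
```

The P value scales strongly with the number of CpG calls, so the same two percentages can yield very different significance levels depending on how many clones and sites were scored.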

Association between differential methylation and polymorphic SVA insertion in the human population. (A) A polymorphic SVA_E copy on human chr7:1186736–1187834. (B–D) Methylation patterns of the regions upstream and downstream of the insertion site in sperm from a homozygous individual (B), a non-carrier individual (C) and a heterozygous individual (D). Closed and open circles indicate methylated and unmethylated CpG sites, respectively. Each row represents a clone of bisulfite-PCR products. Numbers in parentheses indicate the methylation level. Allele-specific primers were used to assess differences in allelic methylation.
Figure 4

Large HMDs in sperm

During the above studies, we noticed that some of the sperm S-DMRs, but none of the somatic S-DMRs, were very large (≥20 kb) (Fig. 5A and Supplementary Material, Fig. S5A and B). Although HMDs similar to these have been reported in several tissues and cells (4,19–22), their evolutionary conservation has been poorly understood. We thus attempted to identify HMDs of ≥20 kb in human and chimpanzee tissues (regardless of whether they were S-DMRs) using a hidden Markov model segmentation algorithm (39). We found that both human and chimpanzee sperm contain many more HMDs than somatic cells (Fig. 5B), and that a majority of the human sperm HMDs were highly methylated in somatic tissues (Fig. 5C). Interestingly, while a large fraction of the somatic HMDs was shared between humans and chimpanzees, the sperm HMDs were much more specific to each species (Fig. 5B). This suggests that the sperm HMDs evolved more rapidly than the somatic HMDs. Moreover, the human sperm HMDs (n = 629) outnumbered the chimpanzee sperm HMDs (n = 236).
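To illustrate the general idea of hidden Markov model segmentation followed by a ≥20-kb size filter, a generic two-state Viterbi decoder is sketched below. It is not the algorithm of reference (39). The Gaussian emission model, the state means (0.85 and 0.30), the standard deviation, the self-transition probability and the 1-kb window size are all assumptions chosen only to make the example run.

```python
import math


def viterbi_hmd(levels, mu=(0.85, 0.30), sigma=0.15, stay=0.95):
    """Label each window 0 (background) or 1 (hypomethylated).

    `levels` holds mean methylation per window; all model parameters
    are illustrative assumptions.
    """
    if not levels:
        return []

    def log_emit(x, m):
        # Gaussian log density, dropping the constant shared by both states
        return -((x - m) ** 2) / (2 * sigma ** 2)

    log_stay, log_switch = math.log(stay), math.log(1 - stay)
    score = [math.log(0.5) + log_emit(levels[0], m) for m in mu]
    back = []
    for x in levels[1:]:
        new, ptr = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new.append(cands[best] + log_emit(x, mu[s]))
        score = new
        back.append(ptr)
    # Trace back from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def domains(path, window_kb=1, min_kb=20):
    """Return (start, end_exclusive) window runs of state 1 >= min_kb."""
    out, start = [], None
    for i, s in enumerate(path + [0]):  # sentinel closes a trailing run
        if s == 1 and start is None:
            start = i
        elif s != 1 and start is not None:
            if (i - start) * window_kb >= min_kb:
                out.append((start, i))
            start = None
    return out


# Toy profile: 30 methylated windows, 25 hypomethylated, 30 methylated
profile = [0.9] * 30 + [0.2] * 25 + [0.9] * 30
hmds = domains(viterbi_hmd(profile))
```

In this toy profile the 25-window low-methylation run passes the 20-kb filter, whereas a run of only a few windows would be decoded as hypomethylated but discarded by the size threshold.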

Human-specific evolution of sperm hypomethylated domains (HMDs). (A) Example of large S-DMRs in sperm. The methylation levels at individual CpG sites in human (blue) and chimpanzee sperm (red) are shown. The S-DMRs are indicated by black boxes at the bottom. (B) Venn diagrams of HMDs identified in each species with the total numbers of the HMDs shown in parentheses. (C) Heatmap of the methylation levels of human sperm HMDs in four human tissues. (D) Numbers of the species-specific (≥30% methylation difference) (green), intermediate (<30% methylation difference) (red), and shared sperm HMDs (blue) in humans and chimpanzees. (E) Heatmaps of the methylation levels of the shared and human-specific sperm HMDs in human, chimpanzee, and macaque sperm.
Figure 5

To explore the species specificity of the sperm HMDs in detail, we identified species-specific sperm HMDs as those that were present only in one species and were more than 30% less methylated when compared with the corresponding region of the other species. We found that 234 of the 629 human sperm HMDs (37%) were human-specific, whereas only 9 of the 236 chimpanzee sperm HMDs (4%) were chimpanzee-specific (Fig. 5D). These species-specific sperm HMDs frequently overlapped with one or more of the sperm S-DMRs: 173 of the 234 human-specific HMDs (74%) and 6 of the 9 chimpanzee-specific HMDs (67%) overlapped with the sperm S-DMRs (H < C) and (C < H), respectively. Analysis of the 234 human-specific sperm HMDs that were identified across 4 human and 2 chimpanzee specimens yielded similar results (Supplementary Material, Fig. S5C). To begin to understand the origin of the methylation states observed in the human-specific sperm HMDs, we examined the methylation levels in orthologous regions in macaque sperm and found that these regions were highly methylated in macaques (Fig. 5E; Supplementary Material, Fig. S5D), indicating that many sperm HMDs were acquired specifically in the human lineage.
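The three categories used for sperm HMDs (shared, species-specific and intermediate; Fig. 5D) can be expressed as a small decision rule. The sketch below is an illustration only: the function name and inputs are hypothetical, and it uses the ≥30% cutoff given in the legend of Fig. 5D.

```python
def classify_hmd(called_in_other, meth_self, meth_other, min_diff=0.30):
    """Classify one sperm HMD called in a focal species.

    called_in_other: True if an HMD was also called at the orthologous
                     region in the other species.
    meth_self, meth_other: mean methylation (0..1) of the region in the
                     focal species and in the other species.
    """
    if called_in_other:
        return "shared"
    if meth_other - meth_self >= min_diff:
        return "species-specific"
    # Called in one species only, but the methylation gap is small
    return "intermediate"


examples = [
    classify_hmd(True, 0.35, 0.40),    # called in both species
    classify_hmd(False, 0.30, 0.85),   # large gap, one species only
    classify_hmd(False, 0.45, 0.60),   # small gap, one species only
]
```

The intermediate class captures regions that narrowly missed being called as an HMD in the second species, which keeps borderline calls from inflating the species-specific count.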

Unique features of the human-specific sperm HMDs

Although about a quarter (49/176) of the sperm HMDs shared between humans and chimpanzees overlapped with somatic HMDs (constitutive HMDs) (Fig. 6A), most human-specific sperm HMDs were found only in sperm. The constitutive HMDs were located in gene-rich regions and marked with H3K27me3 and/or H3K4me3 in the sperm chromatin (27) (Fig. 6A), as previously reported for somatic HMDs (19,21,22). In contrast, the human-specific sperm HMDs were located in regions that were GC-poor, gene-poor and L1-rich, and they did not show enrichment for H3K27me3, H3K4me3 or MNase-sensitive sites (Fig. 6A and B). This observation appeared inconsistent with the above finding that the sperm S-DMRs (H < C) showed H3K27me3 enrichment (Fig. 2C); however, the sperm S-DMRs located in the human-specific HMDs (n = 242) were not enriched with H3K27me3 (Supplementary Material, Fig. S6). These findings imply that the human-specific sperm HMDs evolved through a mechanism different from that of the constitutive HMDs. Notably, about half of the human-specific sperm HMDs were located on the X chromosome (Supplementary Material, Table S6), which is L1-rich.

Features of the human-specific sperm hypomethylated domains (HMDs). (A) Heatmaps of shared and human-specific sperm HMDs for AT content, L1 density, somatic HMD density, gene density, CpG island (CGI) density, H3K27me3 peak density, H3K4me3 peak density and MNase-sensitive site density. The latter three values were from human sperm chromatin (27). (B) L1 occupancies (in length) of the shared and human-specific sperm HMDs located on all chromosomes (middle) and autosomes (right). Data are presented as mean±standard deviation. Statistical significance was determined by t-test. (C) Expected and observed numbers of shared and human-specific sperm HMDs that overlap with regions exhibiting copy number variations (CNVs). P-value was determined by χ2 test. (D) Examples of overlapping human-specific HMDs and human-specific CNVs identified by Perry et al. (45). Methylation levels in human sperm (blue) and chimpanzee sperm (red) are shown. Blue and black boxes indicate human-specific sperm HMDs and human-specific CNVs, respectively. The chromosomal positions are according to hg19.
Figure 6

Hypomethylation is associated with chromosomal instability in diseases such as ICF (immunodeficiency, centromere instability and facial anomalies) syndrome (40) and cancer (41,42). Moreover, it has been proposed that hypomethylation in human sperm can be associated with CNVs (43). Using publicly available human and chimpanzee CNV data (44,45), we found that both human-specific and shared sperm HMDs were frequently associated with human CNVs (Fig. 6C). Moreover, the human-specific sperm HMDs were associated with human-specific CNVs, but not with chimpanzee-specific CNVs (Supplementary Material, Fig. S7A), although the association was not significant, likely owing to the small number of human-specific CNVs. Likewise, the chimpanzee-specific sperm HMDs were frequently associated with chimpanzee-specific CNVs (Supplementary Material, Fig. S7B). Thus, we observed a link between the species-specific sperm HMDs (many of which overlapped with the sperm S-DMRs) and genetic changes. Interestingly, some of the chromosomal breakpoints responsible for Turner syndrome (46), a condition in females where one X chromosome is completely or partially lost, were located within or close to the human-specific sperm HMDs (Supplementary Material, Fig. S7C).

Discussion

In this study, we explored how genetic and epigenetic changes are linked during primate evolution, by first identifying S-DMRs in humans and chimpanzees using DNA methylome data and then characterizing the sequence features of the S-DMRs. We revealed closer methylome relationships according to tissue type, not species, similar to the reported tissue-dependent clustering of gene expression patterns (47). Moreover, most S-DMRs identified in sperm were not S-DMRs in somatic tissues, underscoring the robustness of the developmental program regulating somatic methylomes.

It was previously reported that, in mouse embryonic stem cells, the DNA methylation states of exogenous DNA fragments were autonomously and reproducibly determined by their sequence features, such as the presence/absence of TFBSs (48). Consistent with this, we observed in our S-DMRs that sequence variations at TFBSs were often associated with their methylation changes in the same direction. This finding was also consistent with recent reports on the effects of TFBS variations on methylation states within human populations (15,16,34). Therefore, it is conceivable that evolutionary losses and gains in TFBSs are involved in the formation of S-DMRs and in species-specific transcriptomes, in tissues that express the TFs.

We also found that retrotransposons are linked with the methylome divergence in primates. Retrotransposons increase their copies in the genome through retrotransposition events, but most copies present in the descendant species accumulate numerous sequence changes and are no longer active. However, in humans and chimpanzees, some retrotransposons, including AluY, SVA_E, SVA_F, and L1Hs, are still active and generate genetic diversity via retrotransposition (36,49). Among the extinct retrotransposons, we found that LTR12C is enriched in S-DMRs both in germ and somatic cells. It has been inferred that the LTR12 family became inactive in retrotransposition about 6 million years ago, immediately before the human–chimpanzee split, as almost all human LTR12C copies have orthologues in the chimpanzee genome (50). However, while some of the S-DMRs associated with LTR12C copies are more methylated in humans, others are more methylated in chimpanzees. Because the LTR12 family sequences can serve as enhancers (51), their differential methylation could cause differential expression of their neighboring genes.

Of the currently active retrotransposons, AluYa5/Yb8 exhibited a clear enrichment in S-DMRs. As such, human-specific AluYa5/Yb8 insertions were frequently associated with human-specific hypomethylation of the flanking regions in sperm DNA. A previous study reported a link between Alu insertions and DNA methylation divergence in the somatic cells of humans and chimpanzees (34). However, the effect was opposite to what we observed in sperm (i.e. insertions were associated with hypermethylation). This discrepancy can be resolved by the fact that AluYa5/Yb8 copies tend to be hypermethylated in somatic tissues but hypomethylated in sperm. Human-specific SVA subfamilies are another class of active retrotransposon associated with human-specific hypomethylation in sperm. The 5′ regions of young SVA families are highly CpG-rich, resembling CpG islands, and thus SVA retrotranspositions could create a hypomethylation center, extending beyond the 5′ junction, in the human genome. Indeed, we observed an SVA_E insertion that was polymorphic even within the human population and was associated with hypomethylation. Taken together, our results indicate that retrotransposon insertions are one of the driving forces for methylome evolution in primates.

Recent methylome studies have revealed the presence of a substantial number of HMDs in mammalian somatic cells (20,24), but their evolutionary conservation has been poorly understood. We found that, although HMDs in somatic tissues are highly conserved between humans and chimpanzees, HMDs in sperm are more specific to the respective species. Many new sperm HMDs appeared only in the human lineage. Interestingly, while most somatic HMDs were located in GC-rich and gene-rich regions (corresponding to chromosome R bands), many human-specific sperm HMDs were located in GC-poor, gene-poor and L1-rich regions (G bands). In somatic cells, the GC-poor and L1-rich genomic regions tend to be located close to the nuclear periphery (52,53), where H3K9me2, a repressive mark, is highly enriched (54,55). In contrast, male germ cells may have a specific subnuclear positioning pattern and/or chromatin environment, which could result in the formation of sperm-specific HMDs. It has been proposed that hypomethylation in sperm is associated with chromosomal instability (43). Consistent with this hypothesis, our results demonstrated a significant overlap between sperm HMDs and CNVs. In general, this type of study must carefully control for potential confounders (56). We believe that our study presents a better case than the previous ones: for example, our HMDs were precisely defined and depleted of CpG islands (Fig. 6A), which were previously shown to be a strong confounder (56). Although it is tempting to speculate that the human-specific sperm HMDs are involved in CNV formation, establishment of the causal relationship awaits future studies.

In conclusion, our study revealed that genetic changes in TFBSs and retrotransposon insertions have significantly contributed to the evolutionary changes in the primate methylome. Moreover, we found that humans have acquired a number of species-specific HMDs in male germ cells, which may cause large genetic variations such as CNVs. These results underscore a tight link between the evolutionary changes in DNA sequence and those in the epigenetic state, and suggest that they can synergistically contribute to the diversification of gene expression and phenotype.

Materials and Methods

Whole-genome bisulfite sequencing

WGBS of sperm from a Japanese macaque was approved by the experimental ethics committee of the Primate Research Institute, Kyoto University (2010-122 and 2011-007). Genomic DNA was prepared using a standard procedure. The WGBS library was constructed using the post-bisulfite adaptor tagging method (57) and subjected to 100-bp single-end sequencing on an Illumina HiSeq 2500 platform (HCS and RTA versions 2.0.12.0 and 1.17.21.3, respectively), generating 422 million sequence reads. The sequencing data are available from the Gene Expression Omnibus (GEO) under the accession number GSE64830.

Published data sets

WGBS data for cells and tissues were retrieved under the following accession numbers: sperm data of four humans and two chimpanzees, GSE30340 (13) and GSE49624 (58); frontal cortex data of a male human and a male chimpanzee, GSE37202 (11); and B cell and neutrophil data of a female human and a female chimpanzee, GSE31971 and SRP021118 (26). ChIP-seq data of human sperm (H3K27me3 and H3K4me3) were from GSE15690 (27). Human and chimpanzee CNV data were from refs (44,45).

Construction of methylomes and identification of S-DMRs and HMDs

Low-quality bases and adaptor sequences were trimmed from the ends of WGBS reads using Trim Galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/; date last accessed on June 24, 2017) with default parameters. The reads were then aligned with Bismark (59) to the reference human (hg19), chimpanzee (panTro4) or rhesus macaque (rheMac3) genome. The option ‘pbat’ was used for WGBS reads of the Japanese macaque. Only uniquely aligned reads were used to construct methylomes. S-DMRs were identified by comparing human and chimpanzee methylomes. The positions of CpG sites in the chimpanzee genome were converted to orthologous CpG positions in the human genome by liftOver (60). S-DMR candidates were identified using the ‘Commet’ command in Bisulfighter (61). Only CpG sites covered by at least one read in all samples were used for the Bisulfighter analysis. To enhance the confidence of the S-DMR calls, we calculated the average methylation levels of the candidates using CpG sites with ≥5 reads in both humans and chimpanzees; among the candidates, those containing ≥10 successive analyzable CpG sites and showing a ≥40% methylation difference were designated as S-DMRs. Human-specific sperm S-DMRs were identified using the following criteria, where |A–B| denotes the absolute difference in methylation level between species A and B: (1) |human–macaque| >20% and (2) |human–macaque|/|chimpanzee–macaque| >1.5. Chimpanzee-specific S-DMRs were identified in a similar fashion.
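The thresholds in this paragraph can be written out as a short Python sketch. This is an illustration only, not the code used in the study: the `Candidate` record, the function names and the handling of a zero denominator are assumptions, and methylation levels are taken to be percentages averaged over analyzable CpG sites.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical record for one S-DMR candidate (methylation in percent)."""
    n_cpgs: int      # successive analyzable CpG sites (>=5 reads in both species)
    human: float     # mean methylation level in human
    chimp: float     # mean methylation level in chimpanzee
    macaque: float   # mean methylation level in macaque (outgroup)


def is_sdmr(c: Candidate, min_cpgs: int = 10, min_diff: float = 40.0) -> bool:
    """Apply the >=10 CpG and >=40% methylation-difference filters."""
    return c.n_cpgs >= min_cpgs and abs(c.human - c.chimp) >= min_diff


def lineage(c: Candidate, min_change: float = 20.0,
            min_ratio: float = 1.5) -> Optional[str]:
    """Assign an S-DMR to the lineage that changed relative to macaque.

    Criterion 1: |species - macaque| > 20%.
    Criterion 2: the ratio to the other species' change is > 1.5.
    A zero change in the other species is treated as passing criterion 2
    (an assumption; the text does not state how this case was handled).
    """
    h = abs(c.human - c.macaque)
    ch = abs(c.chimp - c.macaque)
    if h > min_change and (ch == 0 or h / ch > min_ratio):
        return "human"
    if ch > min_change and (h == 0 or ch / h > min_ratio):
        return "chimpanzee"
    return None
```

For example, a candidate with 12 CpG sites and methylation levels of 10% (human), 80% (chimpanzee) and 75% (macaque) passes the S-DMR filter and is assigned to the human lineage, because the human value differs from macaque by 65 points and the chimpanzee value by only 5.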

CpG sites covered by at least one read in all samples were used for the identification of HMDs. HMD candidates were identified using a hidden Markov model segmentation algorithm by choosing the ‘pmd’ command in methpipe (39) with default parameters. Among the candidates, those spanning ≥20 kb, having a methylation level <50% and carrying ≥50 CpG sites with ≥5 depth were identified as HMDs. We designated species-specific HMDs as those that were present in only one species at a rate of 30% or less methylation, when compared with the corresponding region of the other species.
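The candidate filter described above (length, methylation level and CpG coverage) can be sketched as follows. The function name and argument layout are hypothetical, methylation is assumed to be expressed in percent, and the sketch does not cover the segmentation step or the species-specificity criterion.

```python
def is_hmd(start: int, end: int, mean_methylation: float,
           n_covered_cpgs: int, min_length: int = 20_000,
           max_methylation: float = 50.0, min_cpgs: int = 50) -> bool:
    """Filter one segmentation candidate with the thresholds from the text.

    start, end       : candidate coordinates in bp (end exclusive)
    mean_methylation : mean methylation level of the candidate, in percent
    n_covered_cpgs   : number of CpG sites in the candidate with >=5 reads
    """
    long_enough = (end - start) >= min_length
    hypomethylated = mean_methylation < max_methylation
    well_covered = n_covered_cpgs >= min_cpgs
    return long_enough and hypomethylated and well_covered
```

Keeping the thresholds as keyword arguments makes it easy to check how sensitive the HMD set is to each cutoff.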

Species-specific TFBSs and their association with S-DMRs

TF binding motifs were obtained from HOCOMOCO (http://autosome.ru/HOCOMOCO/; date last accessed on June 24, 2017), and putative TFBSs were identified in human and chimpanzee S-DMR sequences by RSAT (http://rsat.ulb.ac.be/; date last accessed on June 24, 2017). For each TFBS-containing S-DMR, the lowest P-value obtained by RSAT was taken as the representative value in each species. A TFBS was defined as species-specific if the P-value obtained in one species was <10⁻⁴ and also at least 10 times lower than that of the other species. The association of species-specific TFBSs with hypermethylation or hypomethylation was assessed by counting S-DMRs (H < C) and S-DMRs (C < H) containing species-specific TFBSs for particular TFs. Statistical significance was determined by χ² test.
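A minimal Python sketch of the species-specificity rule and of a χ² test on a 2 × 2 table is given below. The function names are hypothetical, the layout of the contingency table is an assumption (the text does not specify how the counts were arranged), and the test is the plain Pearson statistic with one degree of freedom and no continuity correction.

```python
import math
from typing import Optional, Tuple


def species_specific(p_human: float, p_chimp: float,
                     p_max: float = 1e-4, fold: float = 10.0) -> Optional[str]:
    """Call a TFBS species-specific if its P-value in one species is
    below p_max and at least `fold` times lower than in the other."""
    if p_human < p_max and p_human * fold <= p_chimp:
        return "human"
    if p_chimp < p_max and p_chimp * fold <= p_human:
        return "chimpanzee"
    return None


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Assumed layout: rows are the species carrying the TFBS and columns
    are S-DMRs (H < C) and S-DMRs (C < H). All margins must be non-zero.
    Returns (statistic, P-value); for 1 degree of freedom the P-value
    equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))
```

With counts of 30 and 10 in the first row and 10 and 30 in the second, the statistic is 20, which corresponds to a P-value well below 10⁻⁴.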

Identification of retrotransposons enriched in S-DMRs

The RepeatMasker annotation data for the human genome were downloaded from the UCSC table browser (60). A set of genomic regions was selected at random, subject to the constraint that each region corresponded to an S-DMR with the same length and the same or closest number (a 10% difference allowed) of analyzable CpGs (read depth ≥5 in both human and chimpanzee), and regions containing a given retrotransposon were counted. We generated 1000 such data sets and calculated the average number of regions containing that retrotransposon, which was used as the expected number. Retrotransposons residing in >0.5% of the S-DMRs and enriched by >2-fold over the expectation were identified as S-DMR-enriched retrotransposons. The statistical significance of enrichment was examined by χ² test.
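The resampling logic can be illustrated with a simplified Python sketch. It is not the procedure used in the study: the genome is reduced to a single coordinate axis, random regions are matched to S-DMRs by length only (the CpG-count matching is omitted), and all function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open interval (start, end) in bp


def overlaps(region: Interval, elements: Sequence[Interval]) -> bool:
    """True if the region overlaps at least one annotated element."""
    start, end = region
    return any(s < end and start < e for s, e in elements)


def expected_count(lengths: List[int], elements: Sequence[Interval],
                   genome_size: int, n_sets: int = 1000,
                   seed: int = 0) -> float:
    """Average number of randomly placed, length-matched regions that
    contain the element, over n_sets random data sets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sets):
        for length in lengths:
            start = rng.randrange(0, genome_size - length + 1)
            if overlaps((start, start + length), elements):
                total += 1
    return total / n_sets


def is_enriched(observed: int, expected: float, n_sdmrs: int,
                min_fraction: float = 0.005, min_fold: float = 2.0) -> bool:
    """Apply the >0.5% of S-DMRs and >2-fold enrichment thresholds.
    An expected count of zero is treated as infinite fold enrichment
    (an assumption for this sketch)."""
    if observed / n_sdmrs <= min_fraction:
        return False
    return expected == 0 or observed / expected > min_fold
```

In this sketch, a retrotransposon observed in 30 of 1000 S-DMRs with an expected count of 10 passes both thresholds (3% of S-DMRs, 3-fold enrichment), whereas one observed in 15 S-DMRs with the same expectation does not.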

Analysis of enrichment of histone marks in S-DMRs

Histone mark enrichment analyses were performed in the same way as the retrotransposon enrichment analyses.

Analysis of enrichment of CNVs in HMDs

Human CNV annotation data (44) were obtained from the UCSC table browser (60), and species-specific CNV annotation data were obtained from Supplementary Table 7 of Perry et al. (45). The ethnic groups used in Conrad et al. (44) were Yoruba, Japanese, Chinese and Northern European, and those used in Perry et al. (45) were Yoruba, Biaka and Mbuti. CNV enrichment analyses were performed in the same way as the retrotransposon enrichment analyses. As in Perry et al. (45), we used only autosomal sperm HMDs for the analysis, and we used only CNVs ≥20 kb.

Identification of species-specific retrotransposon insertions

Human/chimpanzee genomic sequence alignments were downloaded from the UCSC genome browser (60), and insertions/deletions (unaligned segments) were identified. Species-specific retrotransposon insertions were then identified using the following criteria: (1) ≥80% of the insertion should be spanned by part of a single retrotransposon, and (2) if a partial deletion of a retrotransposon copy was the cause of an apparent retrotransposon insertion in the other species, the deletion should span ≥80% of that copy. The second criterion effectively excluded apparent insertions resulting from small species-specific deletions within retrotransposons.
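One possible reading of the two criteria is a reciprocal 80% overlap between the unaligned segment and the annotated retrotransposon copy. The Python sketch below encodes that reading; it is an interpretation rather than the authors' implementation, and the function names and interval representation are assumptions.

```python
from typing import Tuple

Interval = Tuple[int, int]  # half-open interval (start, end) in bp


def overlap_len(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def is_specific_insertion(segment: Interval, copy: Interval,
                          min_fraction: float = 0.8) -> bool:
    """segment: unaligned segment present in one species only.
    copy   : annotated retrotransposon copy overlapping the segment.

    Criterion 1: >=80% of the segment is covered by the single copy.
    Criterion 2: the segment spans >=80% of the copy, which excludes
                 small deletions that occurred inside an older copy.
    """
    seg_len = segment[1] - segment[0]
    copy_len = copy[1] - copy[0]
    if seg_len <= 0 or copy_len <= 0:
        return False
    shared = overlap_len(segment, copy)
    return (shared / seg_len >= min_fraction
            and shared / copy_len >= min_fraction)
```

Under this reading, a 50-bp unaligned segment lying inside a 400-bp copy satisfies the first criterion but fails the second, so it is treated as a small deletion within an existing retrotransposon rather than as a new insertion.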

Identification of polymorphic SVA insertions and bisulfite-PCR analysis

Genomic DNA was prepared from sperm obtained from human donors aged 25–35 years. The presence or absence of an SVA copy at chr7: 1186736–1187834 was determined by genotyping PCR. The DNA methylation state was determined by bisulfite treatment of the genomic DNA, followed by touchdown PCR, as described previously (62). PCR products were cloned, and the resultant plasmids were sequenced. Primers used for genotyping and bisulfite analyses are listed in Supplementary Material, Table S5. The study procedures were approved by the Ethics Committee of Human Genome Analysis, Kyushu University (approval numbers 508-00 and 600-00).

Supplementary Material

Supplementary Material is available at HMG online.

Acknowledgements

We thank Dr Robert Feil for critically reading the manuscript, Ms Junko Kitayama and Miho Miyake for technical assistance, and Dr Munehiro Okamoto for collecting macaque sperm. We also thank the members of the Center for Human Evolution Modeling Research for sampling assistance.

Conflict of Interest statement. None declared.

Funding

This work was supported in part by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science to K.F. (grant number 13J03253) and from the Ministry of Education, Culture, Sports, Science, and Technology to K.I. (21200037) and H.S. (23249019), a Research Grant from the Takeda Science Foundation to K.I., and a Research Grant from the Uehara Memorial Foundation to H.S. This work was also supported in part by the Cooperation Research Program of the Primate Research Institute, Kyoto University, and a Collaborative Research Project organized by the Interuniversity Bio-Backup Project.

References

1. Bird A. (2002) DNA methylation patterns and epigenetic memory. Genes Dev., 16, 6–21.
2. Bergman Y., Cedar H. (2013) DNA methylation dynamics in health and disease. Nat. Struct. Mol. Biol., 20, 274–281.
3. Stadler M.B., Murr R., Burger L., Ivanek R., Lienert F., Scholer A., van Nimwegen E., Wirbelauer C., Oakeley E.J., Gaidatzis D. et al. (2011) DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature, 480, 490–495.
4. Lister R., Pelizzola M., Dowen R.H., Hawkins R.D., Hon G., Tonti-Filippini J., Nery J.R., Lee L., Ye Z., Ngo Q.M. et al. (2009) Human DNA methylomes at base resolution show widespread epigenomic differences. Nature, 462, 315–322.
5. Jaenisch R., Bird A. (2003) Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat. Genet., 33 Suppl, 245–254.
6. Li E. (2002) Chromatin modification and epigenetic reprogramming in mammalian development. Nat. Rev. Genet., 3, 662–673.
7. King M.C., Wilson A.C. (1975) Evolution at two levels in humans and chimpanzees. Science, 188, 107–116.
8. Wray G.A. (2007) The evolutionary significance of cis-regulatory mutations. Nat. Rev. Genet., 8, 206–216.
9. Carroll S.B. (2005) Evolution at two levels: on genes and form. PLoS Biol., 3, e245.
10. Fukuda K., Ichiyanagi K., Yamada Y., Go Y., Udono T., Wada S., Maeda T., Soejima H., Saitou N., Ito T. et al. (2013) Regional DNA methylation differences between humans and chimpanzees are associated with genetic changes, transcriptional divergence and disease genes. J. Hum. Genet., 58, 446–454.
11. Zeng J., Konopka G., Hunt B.G., Preuss T.M., Geschwind D., Yi S.V. (2012) Divergent whole-genome methylation maps of human and chimpanzee brains reveal epigenetic basis of human regulatory evolution. Am. J. Hum. Genet., 91, 455–465.
12. Pai A.A., Bell J.T., Marioni J.C., Pritchard J.K., Gilad Y. (2011) A genome-wide study of DNA methylation patterns and gene expression levels in multiple human and chimpanzee tissues. PLoS Genet., 7, e1001316.
13. Molaro A., Hodges E., Fang F., Song Q., McCombie W.R., Hannon G.J., Smith A.D. (2011) Sperm methylation profiles reveal features of epigenetic inheritance and evolution in primates. Cell, 146, 1029–1041.
14. Hernando-Herraez I., Prado-Martinez J., Garg P., Fernandez-Callejo M., Heyn H., Hvilsom C., Navarro A., Esteller M., Sharp A.J., Marques-Bonet T. (2013) Dynamics of DNA methylation in recent human and great ape evolution. PLoS Genet., 9, e1003763.
15. Gutierrez-Arcelus M., Lappalainen T., Montgomery S.B., Buil A., Ongen H., Yurovsky A., Bryois J., Giger T., Romano L., Planchon A. et al. (2013) Passive and active DNA methylation and the interplay with genetic variation in gene regulation. Elife, 2, e00523.
16. Banovich N.E., Lan X., McVicker G., van de Geijn B., Degner J.F., Blischak J.D., Roux J., Pritchard J.K., Gilad Y. (2014) Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet., 10, e1004663.
17. Schilling E., El Chartouni C., Rehli M. (2009) Allele-specific DNA methylation in mouse strains is mainly determined by cis-acting sequences. Genome Res., 19, 2028–2035.
18. Heinz S., Romanoski C.E., Benner C., Allison K.A., Kaikkonen M.U., Orozco L.D., Glass C.K. (2013) Effect of natural genetic variation on enhancer selection and function. Nature, 503, 487–492.
19. Jeong M., Sun D., Luo M., Huang Y., Challen G.A., Rodriguez B., Zhang X., Chavez L., Wang H., Hannah R. et al. (2014) Large conserved domains of low DNA methylation maintained by Dnmt3a. Nat. Genet., 46, 17–23.
20. Hon G.C., Rajagopal N., Shen Y., McCleary D.F., Yue F., Dang M.D., Ren B. (2013) Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat. Genet., 45, 1198–1206.
21. Nakamura R., Tsukahara T., Qu W., Ichikawa K., Otsuka T., Ogoshi K., Saito T.L., Matsushima K., Sugano S., Hashimoto S. et al. (2014) Large hypomethylated domains serve as strong repressive machinery for key developmental genes in vertebrates. Development, 141, 2568–2580.
22. Xie W., Schultz M.D., Lister R., Hou Z., Rajagopal N., Ray P., Whitaker J.W., Tian S., Hawkins R.D., Leung D. et al. (2013) Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell, 153, 1134–1148.
23. Hansen K.D., Timp W., Bravo H.C., Sabunciyan S., Langmead B., McDonald O.G., Wen B., Wu H., Liu Y., Diep D. et al. (2011) Increased methylation variation in epigenetic domains across cancer types. Nat. Genet., 43, 768–775.
24. Schroeder D.I., Blair J.D., Lott P., Yu H.O., Hong D., Crary F., Ashwood P., Walker C., Korf I., Robinson W.P. et al. (2013) The human placenta methylome. Proc. Natl. Acad. Sci. U.S.A., 110, 6037–6042.
25. Berman B.P., Weisenberger D.J., Aman J.F., Hinoue T., Ramjan Z., Liu Y., Noushmehr H., Lange C.P., van Dijk C.M., Tollenaar R.A. et al. (2012) Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat. Genet., 44, 40–46.
26. Hodges E., Molaro A., Dos Santos C.O., Thekkat P., Song Q., Uren P.J., Park J., Butler J., Rafii S., McCombie W.R. et al. (2011) Directional DNA methylation changes and complex intermediate states accompany lineage specificity in the adult hematopoietic compartment. Mol. Cell, 44, 17–28.
27. Hammoud S.S., Nix D.A., Zhang H., Purwar J., Carrell D.T., Cairns B.R. (2009) Distinctive chromatin in human sperm packages genes for embryo development. Nature, 460, 473–478.
28. Schwalie P.C., Ward M.C., Cain C.E., Faure A.J., Gilad Y., Odom D.T., Flicek P. (2013) Co-binding by YY1 identifies the transcriptionally active, highly conserved set of CTCF-bound regions in primate genomes. Genome Biol., 14, R148.
29. Fisher R.C., Scott E.W. (1998) Role of PU.1 in hematopoiesis. Stem Cells, 16, 25–37.
30. Corcoran L.M., Karvelas M., Nossal G.J., Ye Z.S., Jacks T., Baltimore D. (1993) Oct-2, although not required for early B-cell development, is critical for later B-cell maturation and for postnatal survival. Genes Dev., 7, 570–582.
31. Cloutier A., Guindi C., Larivee P., Dubois C.M., Amrani A., McDonald P.P. (2009) Inflammatory cytokine production by human neutrophils involves C/EBP transcription factors. J. Immunol., 182, 563–571.
32. McLean C.Y., Bristor D., Hiller M., Clarke S.L., Schaar B.T., Lowe C.B., Wenger A.M., Bejerano G. (2010) GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol., 28, 495–U155.
33. Morgan H.D., Sutherland H.G., Martin D.I., Whitelaw E. (1999) Epigenetic inheritance at the agouti locus in the mouse. Nat. Genet., 23, 314–318.
34. Prendergast J.G., Chambers E.V., Semple C.A. (2014) Sequence-level mechanisms of human epigenome evolution. Genome Biol. Evol., 6, 1758–1771.
35. Rebollo R., Karimi M.M., Bilenky M., Gagnier L., Miceli-Royer K., Zhang Y., Goyal P., Keane T.M., Jones S., Hirst M. et al. (2011) Retrotransposon-induced heterochromatin spreading in the mouse revealed by insertional polymorphisms. PLoS Genet., 7, e1002301.
36. Mikkelsen T.S., Hillier L.W., Eichler E.E., Zody M.C., Jaffe D.B., Yang S.-P., Enard W., Hellmann I., Lindblad-Toh K., Altheide T.K. et al. (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature, 437, 69–87.
37. Hancks D.C., Kazazian H.H. Jr. (2010) SVA retrotransposons: evolution and genetic instability. Semin. Cancer Biol., 20, 234–245.
38. Wang H., Xing J., Grover D., Hedges D.J., Han K., Walker J.A., Batzer M.A. (2005) SVA elements: a hominid-specific retroposon family. J. Mol. Biol., 354, 994–1007.
39. Song Q., Decato B., Hong E.E., Zhou M., Fang F., Qu J., Garvin T., Kessler M., Zhou J., Smith A.D. (2013) A reference methylome database and analysis pipeline to facilitate integrative and comparative epigenomics. PLoS One, 8, e81148.
40. Xu G.L., Bestor T.H., Bourc'his D., Hsieh C.L., Tommerup N., Bugge M., Hulten M., Qu X., Russo J.J., Viegas-Pequignot E. (1999) Chromosome instability and immunodeficiency syndrome caused by mutations in a DNA methyltransferase gene. Nature, 402, 187–191.
41. Eden A., Gaudet F., Waghmare A., Jaenisch R. (2003) Chromosomal instability and tumors promoted by DNA hypomethylation. Science, 300, 455.
42. Rodriguez J., Frigola J., Vendrell E., Risques R.A., Fraga M.F., Morales C., Moreno V., Esteller M., Capella G., Ribas M. et al. (2006) Chromosomal instability correlates with genome-wide DNA demethylation in human primary colorectal cancers. Cancer Res., 66, 8462–9468.
43. Li J., Harris R.A., Cheung S.W., Coarfa C., Jeong M., Goodell M.A., White L.D., Patel A., Kang S.H., Shaw C. et al. (2012) Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome. PLoS Genet., 8, e1002692.
44. Conrad D.F., Pinto D., Redon R., Feuk L., Gokcumen O., Zhang Y., Aerts J., Andrews T.D., Barnes C., Campbell P. et al. (2010) Origins and functional impact of copy number variation in the human genome. Nature, 464, 704–712.
45. Perry G.H., Yang F., Marques-Bonet T., Murphy C., Fitzgerald T., Lee A.S., Hyland C., Stone A.C., Hurles M.E., Tyler-Smith C. et al. (2008) Copy number variation and evolution in humans and chimpanzees. Genome Res., 18, 1698–1710.
46. Scott S.A., Cohen N., Brandt T., Warburton P.E., Edelmann L. (2010) Large inverted repeats within Xp11.2 are present at the breakpoints of isodicentric X chromosomes in Turner syndrome. Hum. Mol. Genet., 19, 3383–3393.
47. Brawand D., Soumillon M., Necsulea A., Julien P., Csardi G., Harrigan P., Weier M., Liechti A., Aximu-Petri A., Kircher M. et al. (2011) The evolution of gene expression levels in mammalian organs. Nature, 478, 343–348.
48. Lienert F., Wirbelauer C., Som I., Dean A., Mohn F., Schubeler D. (2011) Identification of genetic elements that autonomously determine DNA methylation states. Nat. Genet., 43, 1091–1097.
49. Hancks D.C., Kazazian H.H. Jr. (2012) Active human retrotransposons: variation and disease. Curr. Opin. Genet. Dev., 22, 191–203.
50. Lopez-Sanchez P., Costas J.C., Naveira H.F. (2005) Paleogenomic record of the extinction of human endogenous retrovirus ERV9. J. Virol., 79, 6997–7004.
51. Yu X., Zhu X., Pi W., Ling J., Ko L., Takeda Y., Tuan D. (2005) The long terminal repeat (LTR) of ERV-9 human endogenous retrovirus binds to NF-Y in the assembly of an active LTR enhancer complex NF-Y/MZF1/GATA-2. J. Biol. Chem., 280, 35184–35194.
52. Guelen L., Pagie L., Brasset E., Meuleman W., Faza M.B., Talhout W., Eussen B.H., de Klein A., Wessels L., de Laat W. et al. (2008) Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature, 453, 948–951.
53. Solovei I., Kreysing M., Lanctot C., Kosem S., Peichl L., Cremer T., Guck J., Joffe B. (2009) Nuclear architecture of rod photoreceptor cells adapts to vision in mammalian evolution. Cell, 137, 356–368.
54. Wu R., Terry A.V., Singh P.B., Gilbert D.M. (2005) Differential subnuclear localization and replication timing of histone H3 lysine 9 methylation states. Mol. Biol. Cell, 16, 2872–2881.
55. Yokochi T., Poduch K., Ryba T., Lu J., Hiratani I., Tachibana M., Shinkai Y., Gilbert D.M. (2009) G9a selectively represses a class of late-replicating genes at the nuclear periphery. Proc. Natl. Acad. Sci. U.S.A., 106, 19363–19368.
56. Watson C.T., Garg P., Sharp A.J. (2013) Comment on "genomic hypomethylation in the human germline associates with selective structural mutability in the human genome". PLoS Genet., 9, e1003332.
57. Miura F., Enomoto Y., Dairiki R., Ito T. (2012) Amplification-free whole-genome bisulfite sequencing by post-bisulfite adaptor tagging. Nucleic Acids Res., 40, e136.
58. Hammoud S.S., Low D.H., Yi C., Carrell D.T., Guccione E., Cairns B.R. (2014) Chromatin and transcription transitions of mammalian adult germline stem cells and spermatogenesis. Cell Stem Cell, 15, 239–253.
59. Krueger F., Andrews S.R. (2011) Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics, 27, 1571–1572.
60. Kuhn R.M., Haussler D., Kent W.J. (2013) The UCSC genome browser and associated tools. Brief Bioinform., 14, 144–161.
61. Saito Y., Tsuji J., Mituyama T. (2014) Bisulfighter: accurate detection of methylated cytosines and differentially methylated regions. Nucleic Acids Res., 42, e45.
62. Ichiyanagi K., Li Y., Watanabe T., Ichiyanagi T., Fukuda K., Kitayama J., Yamamoto Y., Kuramochi-Miyagawa S., Nakano T., Yabuta Y. et al. (2011) Locus- and domain-dependent control of DNA methylation at mouse B1 retrotransposons during male germ cell development. Genome Res., 21, 2058–2066.
