Abstract

Meiotic crossovers in the human genome cluster into highly localized hotspots identifiable indirectly from patterns of DNA diversity and directly by high-resolution sperm typing. Little is known about factors that control hotspot activity and the apparently rapid turnover of hotspots during recent evolution. Clues can, however, be gained by characterizing variation in sperm crossover activity between men. Previous studies have identified single nucleotide polymorphisms within hotspots that appear to suppress crossover activity and which may be involved in hotspot attenuation/extinction. We now analyse a closely spaced pair of hotspots (MSTM1a, MSTM1b) on chromosome 1q42.3, the former being a candidate for a young hotspot that has failed to leave a significant mark on haplotype diversity. Extensive surveys of different men revealed substantial polymorphism in sperm crossover frequencies at both hotspots, but with very different patterns of variation. Hotspot MSTM1b was active in all men tested but with widely differing crossover frequencies. In contrast, MSTM1a was active in only a few men and appeared to be recombinationally inert in the remainder, providing the first example of presence/absence polymorphism of a human hotspot. Haplotype analysis around both hotspots identified active and suppressed men sharing identical haplotypes, establishing that these major variations in the presence/absence of a hotspot and in quantitative activity are not caused by local DNA sequence variation. These findings suggest a role for distal regulators or epigenetic factors in hotspot activity and provide the first direct evidence for the rapid evolution of recombination hotspots in humans.

INTRODUCTION

Meiotic recombination is of fundamental importance in reshuffling haplotypes between generations and in shaping patterns of human DNA diversity. Fine-scale distributions of meiotic crossover events can be characterized indirectly across the entire human genome from surveys of DNA diversity, using approaches such as linkage disequilibrium (LD) mapping (1,2) and coalescent analysis (3–5) to infer patterns of historical recombination from genotype or haplotype data. Direct analysis is possible by batch screening of sperm DNA for crossover molecules (6–8). This approach, although laborious and limited to short intervals of the human genome, is capable of screening millions of sperm per man for recombination events and characterizing crossover distributions at very high resolution. It also circumvents potential problems in population genetic estimation of recombination frequencies and distributions resulting from factors such as genetic drift, bottlenecks, admixture and selection that might perturb patterns of DNA diversity and thus inferences of historical recombination profiles (9–12).

Population genetic and sperm typing approaches have both shown that meiotic crossovers cluster into highly localized hotspots occurring roughly every 50 kb in the human genome (4,5,7,13). These hotspots profoundly influence human DNA diversity, creating haplotype blocks between hotspots that offer the promise of efficient methods of association analysis (5). Different hotspots can vary widely in activity (7,13,14), though all typed to date in sperm show a constant width of 1–2 kb. Sperm analysis has shown that these hotspots are also active in gene conversion involving the exchange of information between haplotypes but without exchange of flanking DNA markers (15). This provides strong evidence that hotspots mark narrow zones within which recombination is initiated.

Evidence is accumulating that crossover hotspots may be relatively dynamic features of the human genome. Comparison of LD landscapes in humans and chimpanzees has revealed major divergence in haplotype block structures and thus in the positions of hotspots inferred by population genetic approaches (16–19). This implies that many hotspots have turned over within the last 6 million years (Myr) and between genomes that show very little DNA sequence divergence. Hotspot turnover within human populations is more equivocal, with evidence both for conserved LD landscapes implying that most hotspots are shared across populations (3,5,18,20,21) and for hotspots detected in sperm that have failed to leave their full mark on haplotype diversity and which may therefore have arisen very recently in human evolution (13).

Factors controlling hotspot activity and the processes whereby hotspots come into existence and eventually go extinct remain enigmatic. A recent genome-wide survey of human DNA diversity has revealed short DNA sequence motifs strongly associated with putative hotspots (4). However, although these motifs may be necessary for hotspot activity, they cannot be sufficient because many exist outside hotspots and most will not have changed on the time-scale of human/chimpanzee divergence. The alternative approach to identifying hotspot-controlling factors is to identify polymorphisms between individuals in hotspot activity. This cannot be analysed from population DNA diversity data but can be explored by sperm typing. Evidence to date on a limited number of crossover hotspots suggests that such polymorphisms are not unusual (13,22–24). In two cases, the cause of variation in sperm crossover frequency could be traced to a single nucleotide polymorphism (SNP) within the hotspot that seems to influence recombination initiation rate (23,24). In both cases, the recombination-suppressing allele disrupts a DNA sequence motif preferentially found in hotspots (4) and is strongly over-transmitted to recombinant sperm, creating a level of meiotic drive that can be sufficient to promote population fixation of the recombination-suppressing allele. This drive of any recombination suppressor arising within a hotspot results in the ‘hotspot paradox’, namely how a hotspot can arise and persist in the face of a deterministic drive to attenuate hotspot activity (25). However, this drive is unlikely to eliminate a hotspot completely; the recombination suppressors identified to date reduce but do not eliminate recombination initiation activity, and as the activity of a hotspot diminishes, so will the strength of meiotic drive at the population level until the point is reached where the population frequency of a suppressor is determined by genetic drift alone.

Although recombination-suppressing variants give clues about likely processes involved in the disappearance of hotspots from a population, nothing is known about how hotspots come into existence. One possible way of analysing this problem is to identify either population-specific hotspots or preferably hotspots functional in only a proportion of individuals in a population. Such polymorphic hotspots might include young hotspots that, through comparisons of inactive and active haplotypes, could give clues about processes involved in the origin of hotspot activity.

Surveys to date on polymorphisms in crossover hotspot activity in sperm have been limited to very small numbers of men, often just two to three individuals (6,7,13,14,22,24). To explore such variation in more detail, we have conducted an extensive survey of crossover activity at a pair of hotspots, termed MSTM1a and MSTM1b, located in a region of chromosome 1q42.3 well characterized for hotspots detected by population genetic approaches and by direct analysis in sperm (13). Both contain sequence motifs associated with hotspots (4); MSTM1a shows a CTCCTCC motif close to its centre, whereas MSTM1b has CCTTCCC within 400 bp of the centre. These hotspots are located close to each other and can be assayed in sperm in a single test interval. Hotspot MSTM1a is also a candidate for a young hotspot, being active in sperm yet located in a region of intense marker association (13).

RESULTS

Sperm crossover activity at hotspots MSTM1a and MSTM1b

A survey of 80 semen donors of north European origin identified 26 men with sufficient SNP heterozygosities to allow recovery of MSTM1a and MSTM1b crossover molecules and to map crossover exchange points. These molecules were recovered from sperm DNA by repulsion-phase allele-specific PCR across an 8.1–10.2 kb interval spanning both hotspots (13), the precise length of the interval depending on available SNP heterozygosities in each man. All except seven of the 26 men were assayed for reciprocal crossovers (Fig. 1A), yielding a total of 5743 crossover molecules recovered from 39×10⁶ amplifiable molecules of each haplotype. The mean recombination frequency over all men was 14.7×10⁻⁵ per sperm but varied substantially between men, from 1.6 to 90×10⁻⁵. There was no significant difference in the frequency of reciprocal A- and B-type crossovers in any man (P>0.04 for each man, without Bonferroni correction), consistent with reciprocal exchange. The observed disparities in the numbers of A- and B-type crossovers (on average only 1.3-fold over 19 men tested, range 1.0–1.6) also established that these crossover assays gave reproducible and reliable measures of recombination frequency.

Variation in crossover profiles

Mapping exchange points in these MSTM1a,b crossover molecules also revealed considerable variation between men in crossover distribution. Examples of these differences are shown in Figure 1B. Analysis of cumulative crossover frequency distributions across the test interval (Fig. 2) showed that hotspot MSTM1b was active in all men, though with widely varying crossover frequencies. Despite the lack of markers in this hotspot, distributions consistently located its centre to position 23.3 kb and gave a hotspot width of ∼1.6 kb, typical of other hotspots characterized by sperm typing (7,13,14). These distributions also revealed additional crossovers apparently randomly distributed across the test interval; these were particularly evident 3′ to MSTM1b. These rare ‘background’ crossovers, some of which may be PCR artefacts (6), were as expected most noticeable in men with weakest activity at hotspot MSTM1b and gave a consistent background crossover activity of 0.09 cM/Mb, comparable to previous estimates of crossover activity outside hotspots (7,13,26).

Only three men showed significant activity at hotspot MSTM1a (Figs 1B and 2). All showed a 1.2 kb wide hotspot centred at 21.3 kb, 2 kb away from the centre of MSTM1b. The remaining 23 men yielded a total of only 44 crossovers in this region, very similar to the 37 crossovers expected from the background crossover activity of 0.09 cM/Mb. Furthermore, there was no evidence of clustering of these rare exchanges within the MSTM1a hotspot region (Fig. 2). More detailed analysis (see Materials and Methods) showed that these exchanges arose with an apparently uniform frequency across these 23 men (P = 0.14) and that crossover locations were more compatible with random distribution than clustering into hotspot MSTM1a (likelihood ratio = 2.6). Therefore, there is no evidence of any crossover activity at MSTM1a in any of these 23 men.
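The comparison of observed with expected background crossovers rests on a unit conversion from cM/Mb to crossovers per molecule screened. A minimal Python sketch of that arithmetic follows; the function name and the molecule count in the example are hypothetical, chosen only for illustration, and are not the numbers screened in this survey.

```python
def expected_crossovers(rate_cm_per_mb, interval_kb, molecules_screened):
    """Expected crossover count in an interval at a given recombination rate.

    1 cM corresponds to a crossover frequency of 0.01 per molecule, so
    rate (cM/Mb) * 0.01 * interval length (Mb) is the frequency per molecule.
    """
    freq_per_molecule = rate_cm_per_mb * 0.01 * (interval_kb / 1000.0)
    return freq_per_molecule * molecules_screened


# Hypothetical example: a 1.2 kb interval at the background rate of
# 0.09 cM/Mb, with one million molecules screened (illustrative figure only).
print(expected_crossovers(0.09, 1.2, 1_000_000))
```

At this background rate, roughly one crossover is expected per million molecules screened across a 1.2 kb interval, which is why tens of millions of molecules are needed before background exchanges become countable.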

The cumulative crossover frequency distributions shown in Figure 2 allowed crossovers to be assigned to each hotspot in each man. MSTM1a and MSTM1b crossover activities are summarized in Figure 3. MSTM1a is active in only three men, all of whom show a similar crossover frequency of ∼9×10⁻⁵ per sperm. MSTM1b is active in all men, but with widely varying activities (median 9×10⁻⁵ per sperm, range 1.2–90×10⁻⁵).

Hotspot MSTM1b activity and haplotype variation

To investigate the DNA sequence basis of the extensive variation seen between men in crossover activity at hotspot MSTM1b, experimentally determined haplotypes were established for all 26 men analysed for sperm crossovers. One hundred SNPs were analysed over a 37 kb interval extending across the MSTM1 region and neighbouring sperm crossover hotspots, with SNP density maximized over MSTM1 by resequencing haplotypes from the semen donors. Comparison of pairs of haplotypes revealed sets of men with very high or very low crossover activity at MSTM1b, yet sharing exactly the same haplotypes across the hotspot (Fig. 4). For example, men 2 and 24 show a 50-fold difference in crossover activity, yet share exactly the same haplotypes and DNA sequences across a 15 kb interval spanning the hotspot. Therefore, there is no DNA sequence variant within or near the hotspot that is responsible for triggering these differences in crossover activity.

Variation in crossover activity at hotspot MSTM1b is likely to reflect differences between haplotypes in the efficiency of crossover initiation. As shown previously (23), men heterozygous for relatively active and suppressed haplotypes can be identified by the crossover asymmetry test, in which reciprocal crossovers arise in sperm at the same frequency but map to different locations within the hotspot. However, hotspot MSTM1b lacks markers within the hotspot, with the nearest SNP at 23.8 kb, close to the edge of the hotspot (Figs. 1B and 2). As a result, only 6% of MSTM1b crossovers map distal to 23.8 kb in the nine men informative for this marker, in a 1.8 kb interval that will also contain background crossovers. Analysis of the proportion of crossovers that mapped 5′ and 3′ to 23.8 kb showed no significant deviation between A- and B-type reciprocal crossovers in any of these nine men (P>0.1 for each man), but could not exclude deviations of the magnitude seen at the edge of hotspots showing reciprocal crossover asymmetry (discussed subsequently). Identification of relatively active and suppressed haplotypes at MSTM1b was therefore not possible due to this lack of statistical power.

Reciprocal crossover asymmetry at hotspot MSTM1a

Hotspot MSTM1a contains many more markers than MSTM1b, allowing the asymmetry test to be applied to the three men showing MSTM1a activity (Fig. 5A and B). Sperm crossovers in all three active men showed highly significant asymmetry, with reciprocal crossover distributions displaced by ∼320 bp, similar to that seen at other mammalian hotspots showing asymmetry most likely triggered by variation in recombination initiation rates (23,24,27). This is consistent with initiation occurring in cis, with these three active men each carrying an active and a suppressed haplotype. Under this model, the 23 men who are inactive at MSTM1a should be homozygous for suppressed haplotypes.

These differences in recombination initiation rates result in over-transmission of alleles from the suppressed haplotype into recombinant progeny, as seen in humans (23,24), mice (27) and fungi (28,29). The transmission distortion seen in hotspot MSTM1a is highest (on average 72:28 versus 50:50 for Mendelian transmission) for markers nearest the centre of the hotspot (Fig. 5C) and is similar in intensity to that seen at other mammalian hotspots showing distortion (23,24,27). This meiotic drive in favour of alleles from the suppressed haplotype allowed us to identify the putative MSTM1a-active haplotype in each of the three men.

MSTM1a active and suppressed haplotypes

If the difference between active and suppressed haplotypes involved a recent DNA sequence change in or near hotspot MSTM1a, then these three active haplotypes should be similar or identical. Surprisingly, extended haplotype and sequence analysis across this hotspot and its three neighbouring hotspots (Fig. 6A) showed that these active haplotypes were in fact very different (Fig. 6B). This excludes any hotspot-controlling mutation restricted to these three haplotypes from most of the 100 kb haplotyped region. Although the active haplotypes were identical over a 3.5 kb interval extending across hotspots MSTM1a and MSTM1b, precisely the same 3.5 kb sequence was seen in many of the suppressed haplotypes (Fig. 6D). Therefore there is no DNA sequence change even in this 3.5 kb region that is associated with MSTM1a activity. Similarly, hotspot activity cannot be governed by sequence differences between haplotypes (Fig. 6C); for example, man 10 carries the same two haplotypes as man 14, and almost the same as man 24, yet only man 10 is active at MSTM1a.

Haplotype diversity and evolution across hotspot MSTM1a

We have previously shown from genotype data that hotspot MSTM1a, even though active in some men, lies in a region of intense LD and has left little mark on haplotype diversity (13). This was confirmed from the present haplotype data by LD mapping (1) which revealed a clear step of LD breakdown across hotspot MSTM1b but no corresponding step for MSTM1a other than a very minor increment at the extreme 5′ edge of the hotspot (Fig. 7A and B). More detailed analysis of haplotypes and the ancestral state of each SNP (Fig. 6D) showed blocks of markers for which there was no evidence of obligate historical recombination events, with boundaries corresponding to the approximate positions of historical exchanges (Fig. 7C). Hotspot MSTM1a was located almost entirely within such a block, allowing a unique phylogeny of haplotypes in this region to be deduced (Fig. 7D). Using human/chimpanzee divergence over this region (2.2%), the time to the most recent common ancestor of these haplotypes was estimated at ∼1.4 Myr. The three MSTM1a-active haplotypes belong to a single branch of this phylogeny, which also includes many suppressed haplotypes. If the change in hotspot activity was due to a change (not based on DNA sequence) within the MSTM1a region, which had occurred only once during human evolution, then the hotspot must have been initially inactive and only recently become activated.

DISCUSSION

This study shows that polymorphism between men in sperm crossover frequency at recombination hotspots is a common phenomenon, having been seen in the MHC (23,30) and in all of the hotspots shown in Figure 6 (13,22,24) (this work). Of the 16 hotspots surveyed to date by sperm typing, six show significant variation between men in crossover frequencies as detected by rate measurements in sperm and/or the demonstration of disparity of recombination initiation rates between haplotypes using the crossover asymmetry test. Most of these polymorphism surveys have been limited to very small numbers of men (6,7,13,14,22,24,26) and the true incidence of crossover frequency variation in hotspots is likely to be even greater.

These polymorphisms in crossover frequency can give important clues about processes involved in the evolutionary turnover of recombination hotspots. In two, possibly three, cases, variation in crossover frequency can be traced to a recombination-attenuating SNP allele located very close to the centre of the hotspot that reduces the recombination initiation rate by a factor estimated at about 3–6-fold (22–24). These recombination-suppressing alleles are over-transmitted to recombinant progeny, both to crossovers and to gene conversions without exchange of flanking markers, and the resulting meiotic drive provides an explanation of how hotspots can become attenuated in activity during evolution (23,25,31).

Hotspot sequence motifs and hotspot suppressors provide evidence that hotspot activity can be influenced by primary DNA sequence determinants (4,23,24). The present survey reveals an additional layer of complexity, namely that major rate variation within a hotspot, as well as the presence/absence of a hotspot, can occur without any change in local DNA sequence. Hotspot MSTM1b provides the most dramatic example to date of quantitative variation in activity in a hotspot, with men showing a 75-fold range of crossover activity. The lack of SNP markers in the hotspot prevented identification, via the asymmetry test, of classes of haplotype that might show different recombination initiation activities and which could carry some variable feature that causes this polymorphism in activity. Nevertheless, the identification of strongly and weakly active men sharing identical haplotypes proved that local variation in primary DNA sequence is not a major determinant of hotspot MSTM1b polymorphism. Further, the extent of haplotype sharing between high and low activity men (Fig. 4) appears to exclude a DNA sequence-based cis-acting regulator of hotspot activity from an interval of 20 kb extending across this hotspot.

The neighbouring hotspot MSTM1a has provided the first example of a human crossover hotspot that shows what appears to be complete presence/absence polymorphism. Reciprocal crossover asymmetry in the three active men provided strong evidence that this hotspot is autonomous and marks a zone of recombination initiation separate from hotspot MSTM1b (23). It also strongly suggests that crossover activity at this hotspot is controlled in cis, with active men each carrying a haplotype active in recombination initiation plus a suppressed haplotype. As with MSTM1b, analysis of active and suppressed haplotypes showed that there cannot be a DNA-based cis-acting regulator of hotspot activity anywhere within the 100 kb haplotyped region spanning not only MSTM1a but also several neighbouring crossover hotspots. Furthermore, men exist with similar activities at MSTM1b but with MSTM1a either active or inactive, as well as with widely differing MSTM1b activities but with MSTM1a consistently inactive (Fig. 3). Although the three men active at MSTM1a show similar activities at MSTM1b, rank order analysis showed that this correlation is not significant (P=0.08). There is therefore no evidence of co-ordinate regulation of crossover activity at these hotspots.

So what controls the variable activity of MSTM1b and the on/off state of MSTM1a? One possibility is that recombination activity is controlled by a polymorphic trans-acting factor, for example a protein that binds to a hotspot and regulates its activity in spermatocytes. Given the apparently independent regulation of the two hotspots, this would require more than one factor. Also, for MSTM1a, it would need a factor, present in only three of the 26 men analysed, that activates MSTM1a but in cis on only one of the two haplotypes in an active man, implying that activation is by chance blocked on one haplotype perhaps by variant(s) in or near the hotspot. The possibility that both haplotypes are equally activated, with reciprocal crossover asymmetry arising instead by biases in heteroduplex repair during recombination, is unlikely for reasons discussed elsewhere (23).

Another possibility is that hotspot activity is regulated by DNA sequence-based cis-acting regulators specifically found on active haplotypes but which map remotely, outside the haplotyped region. This model is plausible given evidence in mice for regulators of crossover activity that map outside hotspots (32). However, it would require activation at a considerable distance and without obvious effect on the activity of intervening hotspots. It would also require separate regulation of MSTM1a on/off status and MSTM1b activity.

The third plausible explanation is that hotspot activity is regulated locally in cis but by epigenetic changes, for example, alterations in DNA methylation that could influence chromatin conformation locally in the vicinity of the hotspot in germ cells, thus affecting the initiation of meiotic recombination (33–35). The inability to define active and suppressed haplotypes at hotspot MSTM1b prevented the further definition of haplotypes that might carry such an imprint. In contrast, the three active haplotypes at MSTM1a do contain a 3.5 kb region of haplotype identity spanning the hotspot that might be fortuitous but which could be a prime candidate for containing such a hotspot-activating imprint.

However, there are problems associated with this epigenetic model for hotspot MSTM1a. The simplest scenario is that the imprint changed once in evolution, in which case the hotspot must have originally been inactive (Fig. 7D). The activating epigenetic change would have occurred most likely on haplotype α (Fig. 6D), incidentally the most common haplotype seen in the survey, followed by exchanges in hotspot MSTM1b and the now-active MSTM1a to create the two additional active haplotypes seen in men 5 and 9. As predicted, these two haplotypes are unique and are not seen among the suppressed haplotypes and both could have been created by exchanges with other haplotypes similar or identical to those seen in the survey (Fig. 6D). Two lines of evidence suggest that this hypothetical activating epigenetic change in MSTM1a must have occurred fairly recently during human evolution. First, the estimated population frequency of active haplotypes is low (3/52), consistent with an origin estimated at very roughly 140 thousand years (kyr) (see Materials and Methods). Secondly, the three active haplotypes have failed to accumulate base substitutions over the 3.5 kb region of haplotype identity, yet two of them show exchanges at both hotspots. Information on base substitution and crossover rates again suggests that this epigenetic signal would have arisen recently, roughly 90 kyr ago (see Materials and Methods). Although both age estimates are crude, they imply that this activating epigenetic signal must have been subsequently inherited in a stable fashion over very many generations. Although possible, this seems unlikely given the generally rapid erasure of epigenetic modifications during passage through the germline (36).

Maybe instead there is a local cis-acting epigenetic signal that activates the hotspot but only transiently. For instance, the 3.5 kb region of active haplotype identity might facilitate the alteration of an epigenetic signal (an epimutation) on transmission from parent to son. This signal would be present in the son's soma and germline, leading to hotspot activation in his spermatocytes, but would be subsequently eliminated by germline reprogramming. Such epimutations have been described in plants (37,38) and recently in the human germline (39). Under this model, hotspot activity would be probabilistic, being expressed both in the context of local DNA sequence and by chance epigenetic activation, with only 11% (3/26) of men with the active haplotype actually expressing the hotspot. The resulting population recombination frequency would be only 10⁻⁵ (9×10⁻⁵ crossover frequency in active men, 11% of men active). This is so low that the hotspot would be unlikely to have left a significant mark on haplotype diversity, explaining its presence in a region of intense LD (13). If the 3.5 kb region of identity shared by the three active haplotypes is fortuitous (the probability of sampling three haplotypes that by chance are identical over the 3.5 kb region is 0.13), then perhaps any haplotype could be prone to epimutation. If so, then the antiquity of the trigger for epimutation, and indeed the age of the hotspot, becomes indeterminate. Unfortunately, the three men showing activity at hotspot MSTM1a were all anonymous semen donors and it was not possible to access somatic DNA from these men to test directly for epimutational changes nor to investigate the transmissibility of hotspot activity from father to son.

In summary, this survey has shown that hotspot activity can be influenced by factors other than local DNA sequence determinants, consistent with human/chimpanzee divergence in the locations of putative hotspots disproportionate to DNA sequence divergence (18,19). It has also revealed the first example of a hotspot present in only some men that may have arisen very recently in human evolution, possibly on a time-scale within the recent diversification of mankind and conceivably appearing and disappearing from one generation to the next. Further work is needed to explore the dynamics of hotspot turnover in human populations and to define the highly enigmatic processes whereby hotspots come into existence.

MATERIALS AND METHODS

Genotyping and haplotyping

Semen samples were collected, with approval from the Leicestershire Health Authority Research Ethics Committee and with informed consent, from 200 UK men of north European descent. Eighty men showing high yields of sperm DNA were selected for further analysis. SNP genotyping of these 80 men, by PCR amplification from whole-genome amplified DNA followed by allele-specific oligonucleotide (ASO) hybridization, is described elsewhere (6,7,13). Unambiguous haplotypes in the 26 men analysed for sperm crossovers were established using allele-specific long PCR directed to known heterozygous SNP sites in each man to amplify 8–12 kb long intervals of each haplotype from genomic DNA, followed by ASO hybridization to determine SNP status on each separated haplotype. This procedure was repeated sequentially at distal SNP heterozygosities whose linkage phase had been established to build up extended haplotypes. These separated haplotypes were also used as templates for resequencing over the interval 14–26 kb. All additional SNPs discovered in this interval were typed across all separated haplotypes. The ancestral state of each SNP was established by comparison with the chimpanzee genome sequence, release 30.2 (http://www.ensembl.org/Pan_troglodytes/). Details of SNPs and haplotypes are provided at http://www.le.ac.uk/ge/ajj/MS32/.

Sperm crossover analysis

Sperm crossover molecules at hotspots MSTM1a and MSTM1b were recovered using nested PCR with allele-specific primers (ASPs) in repulsion phase directed to heterozygous SNP sites upstream of MSTM1a and downstream of MSTM1b. By developing ASPs to six different upstream SNPs and to five downstream SNPs and selecting appropriate combinations of ASPs according to available SNP heterozygosities in each man, it proved possible to carry out crossover analysis on 26 of the 80 semen donors. Depending on crossover rate, 1000–26 000 amplifiable molecules of each progenitor haplotype (12–310 ng sperm DNA containing 0.4 to 2 crossover molecules) were screened per PCR for crossover molecules. Details of crossover recovery, crossover breakpoint mapping and Poisson correction for more than one crossover molecule per PCR reaction are given elsewhere (6,7). Details of ASPs and crossover assay conditions are available at http://www.le.ac.uk/ge/ajj/MS32/.
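The Poisson correction itself is described in the cited papers rather than here. As a hedged illustration only, the Python sketch below uses the standard form of such a correction, in which the mean number of crossover molecules per reaction is estimated from the fraction of negative reactions as −ln(fraction negative). Whether this matches the cited procedure in every detail is an assumption, and the function name and example numbers are hypothetical.

```python
import math


def poisson_corrected_frequency(positive, total, molecules_per_reaction):
    """Crossover frequency per molecule from counts of positive PCRs.

    Assumes crossover molecules are Poisson distributed across reactions,
    so that P(negative reaction) = exp(-mean) and therefore
    mean = -ln(fraction of negative reactions).
    """
    if not 0 <= positive < total:
        raise ValueError("at least one negative reaction is required")
    mean_per_reaction = -math.log(1.0 - positive / total)
    return mean_per_reaction / molecules_per_reaction


# Hypothetical example: 48 of 96 reactions positive, each reaction seeded
# with 10 000 amplifiable molecules of each progenitor haplotype.
print(poisson_corrected_frequency(48, 96, 10_000))
```

Simply counting positive reactions would underestimate the frequency whenever some reactions contain two or more crossover molecules, which is why input DNA is kept low enough to leave a substantial fraction of reactions negative.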

Analysis of crossover distributions

Crossover frequency uniformity in the 23 men showing no activity at hotspot MSTM1a was tested as follows. Crossovers mapping to the interval 20.8–22.0 kb spanning this hotspot were counted over all these men to give an average frequency per sperm. The probability P_o over all men of obtaining the observed number of crossovers in each man at this frequency was then estimated, taking into account the number of sperm screened per man. Simulated data sets were then generated assuming a uniform crossover frequency in all men and again taking into account the sperm tested per man. For each simulation, the probability P_u of obtaining the entire simulated data set was estimated. Of 1000 simulations, 140 gave P_u < P_o, indicating no significant clustering of exchanges into a subset of these MSTM1a-suppressed men.
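The simulation test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: crossover counts per man are treated as Poisson distributed, the pooled frequency is reused unchanged for every simulated data set, and all function names and any input data are hypothetical.

```python
import math
import random


def log_poisson_pmf(k, mean):
    """Natural log of the Poisson probability of k events (mean > 0)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def dataset_log_prob(counts, sperm, freq):
    """Log probability of all per-man counts at one shared frequency."""
    return sum(log_poisson_pmf(k, n * freq) for k, n in zip(counts, sperm))


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means here."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def uniformity_p_value(counts, sperm, n_sim=1000, seed=0):
    """Fraction of simulated data sets less probable than the observed one."""
    if sum(counts) == 0:
        raise ValueError("at least one crossover is required")
    rng = random.Random(seed)
    freq = sum(counts) / sum(sperm)  # pooled frequency per sperm
    log_po = dataset_log_prob(counts, sperm, freq)
    below = 0
    for _ in range(n_sim):
        simulated = [sample_poisson(n * freq, rng) for n in sperm]
        if dataset_log_prob(simulated, sperm, freq) < log_po:
            below += 1
    return below / n_sim
```

A small returned value means the observed counts are less probable than almost all uniform simulations, as would happen if crossovers were concentrated in a few men; working with log probabilities avoids numerical underflow when many men are combined.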

Crossover locations in the hotspot MSTM1a region in the 23 suppressed men were analysed under two models, one (uniform) with exchanges uniformly distributed across this region and the second (hotspot) with all crossovers clustered into a 1.2 kb wide hotspot centred at 21.3 kb with the morphology seen in MSTM1a-active men. The probability of obtaining the entire data set in each of the 23 men, using the number of crossovers seen in each informative interval in each man, was then estimated for each model. The likelihood ratio P(data|uniform)/P(data|hotspot) was 2.6.
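A minimal Python sketch of this two-model comparison is given below. The hotspot shape is an assumption: the text specifies only the morphology seen in MSTM1a-active men, so the sketch substitutes a normal curve whose central 95% spans the 1.2 kb width, renormalized over the intervals analysed. Function names and the data layout (interval boundaries in kb plus a crossover count per interval for each man) are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def uniform_probs(edges):
    """Probability of each interval if exchanges are spread evenly."""
    span = edges[-1] - edges[0]
    return [(b - a) / span for a, b in zip(edges, edges[1:])]


def hotspot_probs(edges, centre=21.3, width=1.2):
    """Probability of each interval under an assumed normal-shaped hotspot."""
    sigma = width / 3.92  # assumption: width covers the central 95%
    mass = [normal_cdf(b, centre, sigma) - normal_cdf(a, centre, sigma)
            for a, b in zip(edges, edges[1:])]
    total = sum(mass)
    return [m / total for m in mass]


def log_likelihood(counts, probs):
    # The multinomial coefficient is omitted because it cancels in the ratio.
    return sum(k * math.log(p) for k, p in zip(counts, probs) if k > 0)


def likelihood_ratio(men):
    """P(data|uniform) / P(data|hotspot) over (edges, counts) pairs."""
    log_lr = 0.0
    for edges, counts in men:
        log_lr += log_likelihood(counts, uniform_probs(edges))
        log_lr -= log_likelihood(counts, hotspot_probs(edges))
    return math.exp(log_lr)
```

Treating each man separately allows for the fact that informative intervals differ between men according to their SNP heterozygosities; a ratio above 1 favours the uniform model.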

LD mapping and analysis of divergence times

LD maps were determined from haplotype data as described elsewhere (1), using LDmap software downloaded from http://cedar.genetics.soton.ac.uk/public_html/. The age, in generations, of a mutation at current frequency x in the population was estimated as −4N_e[x/(1−x)] log_e x, where N_e is the effective diploid population size (40); N_e was assumed to be 10 000, an appropriate value for north Europeans (41), and the generation time was set at 20 years.
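As a worked check of this formula, the short Python sketch below evaluates it for the active-haplotype frequency of 3/52 with the parameter values stated above. The function name is hypothetical; the formula and parameters are those given in the text.

```python
import math


def allele_age_years(x, n_e=10_000, generation_years=20):
    """Age of a mutation at frequency x: -4*N_e*[x/(1-x)]*ln(x) generations."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    generations = -4.0 * n_e * (x / (1.0 - x)) * math.log(x)
    return generations * generation_years


# Three active haplotypes among the 52 haplotypes carried by 26 men.
age = allele_age_years(3 / 52)
print(f"{age / 1000:.0f} kyr")
```

This gives approximately 7000 generations, or about 140 kyr at 20 years per generation, in line with the rough estimate quoted in the Discussion.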

The age of a hypothetical MSTM1a-activating epigenetic signal arising on the inactive progenitor haplotype α was estimated by simulating genealogies of the inactive progenitor and its inactive plus three active descendants, assuming a constant-sized randomly mating population of 10 000 and with 20 years per generation. Genealogies were conditioned on those sharing a unique signal on the three active haplotypes. We estimated base substitution rates in the 3.5 kb region of haplotype identity from human–chimpanzee divergence over this region (2.2%) assuming a divergence time of 6 Myr. Sperm crossover frequencies at MSTM1a were averaged over the three active semen donors and at MSTM1b over all men. Population average rates were estimated from these assuming that both hotspots are twice as active in females as in males (13). These crossover rates were corrected for the proportion that would lead to detectable exchanges, based on haplotype frequency data. We recorded activation ages from those genealogies that showed no base substitutions over the 3.5 kb region in any of the four descendant haplotypes, nor crossovers in haplotype α nor in one of the three active haplotypes, and with the other two active haplotypes showing different exchanges at both MSTM1a and MSTM1b. Data from 1000 such genealogies showed that a local hotspot-activating signal arising as a single historical event would have appeared most likely 90 kyr ago (95% C.I. 40–150 kyr).
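The rate parameters entering these simulations can be illustrated with a short sketch. It assumes that the human–chimpanzee divergence accumulated equally along both lineages, so that the per-lineage substitution rate is the divergence divided by twice the divergence time, and that the population average crossover frequency is the mean of the male and female frequencies. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def substitutions_per_generation(divergence=0.022, divergence_years=6e6,
                                 generation_years=20, region_bp=3500):
    """Expected substitutions per haplotype per generation over the region."""
    per_site_per_year = divergence / (2.0 * divergence_years)  # two lineages
    return per_site_per_year * generation_years * region_bp


def sex_averaged_frequency(male_frequency, female_to_male_ratio=2.0):
    """Population average crossover frequency from the sperm estimate."""
    return male_frequency * (1.0 + female_to_male_ratio) / 2.0


def prob_no_substitution(generations, rate_per_generation):
    """Poisson probability that a haplotype acquires no substitution."""
    return math.exp(-rate_per_generation * generations)
```

With the values stated above, this gives roughly 1.3×10⁻⁴ substitutions per haplotype per generation across the 3.5 kb region, and a population average crossover frequency 1.5 times the male frequency.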

ACKNOWLEDGEMENTS

We thank J. Blower and volunteers for providing semen samples, S. Mistry for oligonucleotide synthesis, A. Webb for implementing LD mapping software, C. May for web site design, Y. Dubrova for insights and colleagues for helpful discussions. This work was supported by grants to A.J.J. from the Medical Research Council, the Royal Society and the Louis-Jeantet Foundation.

Conflict of Interest statement. None declared.

Figure 1. Variation between men in sperm crossover profiles across hotspots MSTM1a and MSTM1b. (A) Detecting crossovers. Nested PCR using ASPs (triangles) directed to heterozygous SNP sites (ovals) in repulsion phase was used to selectively amplify orientation A- or B-type crossover molecules from sperm DNA. Crossover breakpoints were then mapped by typing internal markers. (B) Sperm crossover distributions in two men. A total of 1.4×10⁶ and 1.8×10⁶ amplifiable molecules of each haplotype (17 µg and 22 µg sperm DNA) was assayed for reciprocal (A, B) crossovers in man 10 and man 4, respectively. The numbers of A plus B crossovers mapping to each interval between adjacent heterozygous SNPs, marked as ticks above, are indicated in italics and were used to estimate the local recombination activity in cM/Mb. The centre point of each hotspot is marked with an arrow. Co-ordinates in kb are taken from Jeffreys et al. (13). The test interval in man 4 is shorter due to lack of 5′ SNP heterozygosities. Note the lack of MSTM1a crossovers in this man.

Figure 2. Cumulative crossover frequencies across hotspots MSTM1a and MSTM1b in 26 men assayed for sperm exchanges. Depending on crossover frequency, 0.7–3.8×10⁶ amplifiable molecules of each haplotype were assayed for reciprocal crossovers in sperm DNA from each man, yielding 47–772 exchanges per man. Men are ranked by crossover activity at MSTM1b from highest (man 1) to lowest (man 26). Men showing high, medium and low activity at MSTM1b are analysed separately, with their range of recombination frequencies over the entire test interval shown. The bottom panel shows the three men also active at hotspot MSTM1a. Least-squares best-fit cumulative frequency distributions (7) are shown for a model incorporating a hotspot in which crossover breakpoints are normally distributed (two hotspots for the bottom panel) plus additional crossovers randomly distributed across the test interval. Hotspot MSTM1b is centred at a similar location in each panel (mean 23.3 kb, range 23.1–23.4 kb) and with similar widths (mean 1.6 kb, range 1.2–2.1 kb). The incidence of randomly distributed crossovers corresponds to a background crossover rate of ∼0.09 cM/Mb (range 0.04–0.18 over the four panels).

Figure 3. Variation between men in sperm crossover frequencies at hotspots MSTM1a and MSTM1b. Crossovers mapping upstream of 22.0 kb were assigned to hotspot MSTM1a, and those between 22.0 and 25.5 kb to MSTM1b (Fig. 2). Bars indicate 95% confidence intervals.

Figure 4. Similar haplotypes shared by men showing high and low crossover activity at hotspot MSTM1b and no activity at MSTM1a. (A) The region analysed, with locations of known crossover hotspots and SNPs shown. Intervals marked in red have been analysed for sperm crossovers (13,22) and markers within hotspots are indicated by coloured shading below. Note the change in scaling at 10 kb. (B) Examples of men with high and low MSTM1b crossover frequencies (given at the right) who share similar haplotypes. Alleles in the most common haplotype are coloured pink and alternative alleles blue. Regions where men show identical pairs of haplotypes are boxed. All these haplotypes were fully resequenced over the region 18.2–26.2 kb indicated by the bar below.

Figure 5. Reciprocal crossover asymmetry in hotspot MSTM1a. (A) Haplotypes across the hotspot plus examples of structures of reciprocal (A-, B-type) crossovers. (B) Cumulative frequencies of A- and B-type sperm crossovers across the hotspot in the three men with the active hotspot (black, man 5; blue, man 9; red, man 10) (Fig. 2). Lines show the least-squares best-fit cumulative distributions assuming that A and B exchange points are normally distributed (7). (C) Transmission of alleles from haplotype 1 into crossover progeny in these three men, normalized to equal frequencies of A- and B-type crossovers (23) and with 95% confidence intervals. These data were derived from 58 A-type and 32 B-type MSTM1a crossovers recovered, respectively, from 0.6×10⁶ and 0.4×10⁶ amplifiable sperm DNA molecules of each haplotype from man 5, plus 74 A-type and 71 B-type crossovers from 1.4×10⁶ and 0.9×10⁶ molecules of each haplotype from man 9, and 166 A-type and 191 B-type crossovers from 2.0×10⁶ and 2.1×10⁶ molecules from man 10. Crossovers from men 5 and 9 were recovered from sperm DNA using 5′ selector primers located at 20.8 kb and thus lack markers further upstream.

Figure 6. Active and suppressed haplotypes at hotspot MSTM1a. (A) The region analysed, as shown in Figure 4A. (B) The three MSTM1a-active haplotypes identified from crossover frequency (Fig. 3) and crossover asymmetry data (Fig. 5), with alleles coloured as in Figure 4. The 3.5 kb region of haplotype identity is marked by a box. (C) Comparison of haplotypes in the active men 5 and 10 with other men carrying similar or identical haplotypes but inactive at MSTM1a. Haplotype discordancies between active and inactive men are indicated with asterisks. (D) The three active and 49 suppressed haplotypes in all 26 men assayed for crossovers. Haplotypes are grouped by similarities over the 3.5 kb region, and those marked in green at the right were fully resequenced over the interval 14–26 kb. The ancestral state of each SNP was deduced from human/chimpanzee comparisons; ambiguous states or unknown states arising from gaps in the chimpanzee sequence are indicated in white.

Figure 7. Haplotype diversity across the MSTM1 region. (A) LD map across this region, determined from the haplotypes in Figure 6 as described elsewhere (1). (B) Best-fit sperm crossover distributions in men active at hotspot MSTM1a, from data in Figure 2. The region upstream of 20 kb has not been assayed for sperm exchanges but lies within a region of complete LD. (C) Location of SNPs and the 3.5 kb region of identity shared by the three haplotypes that are active at hotspot MSTM1a, together with blocks of markers that show no evidence for historical recombination. The latter were identified from the haplotypes shown in Figure 6, together with information on the ancestral state of each SNP. The non-recombining block spanning MSTM1a is marked with an asterisk. (D) Phylogeny of this MSTM1a block, with intermediate haplotypes not seen in the survey marked with brackets and with the number of nucleotide changes on each branch indicated. The numbers of each haplotype observed are shown at the right.

References

1. Maniatis, N., Collins, A., Xu, C.F., McCarthy, L.C., Hewett, D.R., Tapper, W., Ennis, S., Ke, X. and Morton, N.E. (2002) The first linkage disequilibrium (LD) maps: delineation of hot and cold blocks by diplotype analysis. Proc. Natl Acad. Sci. USA, 99, 2228–2233.
2. Tapper, W., Collins, A., Gibson, J., Maniatis, N., Ennis, S. and Morton, N.E. (2005) A map of the human genome in linkage disequilibrium units. Proc. Natl Acad. Sci. USA, 102, 11835–11839.
3. McVean, G.A., Myers, S.R., Hunt, S., Deloukas, P., Bentley, D.R. and Donnelly, P. (2004) The fine-scale structure of recombination rate variation in the human genome. Science, 304, 581–584.
4. Myers, S., Bottolo, L., Freeman, C., McVean, G. and Donnelly, P. (2005) A fine-scale map of recombination rates and hotspots across the human genome. Science, 310, 321–324.
5. Altshuler, D., Brooks, L.D., Chakravarti, A., Collins, F.S., Daly, M.J., Donnelly, P.; International HapMap Consortium. (2005) A haplotype map of the human genome. Nature, 437, 1299–1320.
6. Jeffreys, A.J., Ritchie, A. and Neumann, R. (2000) High resolution analysis of haplotype diversity and meiotic crossover in the human TAP2 recombination hotspot. Hum. Mol. Genet., 9, 725–733.
7. Jeffreys, A.J., Kauppi, L. and Neumann, R. (2001) Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat. Genet., 29, 217–222.
8. Kauppi, L., Jeffreys, A.J. and Keeney, S. (2004) Where the crossovers are: recombination distributions in mammals. Nat. Rev. Genet., 5, 413–424.
9. Wang, N., Akey, J.M., Zhang, K., Chakraborty, R. and Jin, L. (2002) Distribution of recombination crossovers and the origin of haplotype blocks: the interplay of population history, recombination, and mutation. Am. J. Hum. Genet., 71, 1227–1234.
10. Phillips, M.S., Lawrence, R., Sachidanandam, R., Morris, A.P., Balding, D.J., Donaldson, M.A., Studebaker, J.F., Ankener, W.M., Alfisi, S.V., Kuo, F.S. et al. (2003) Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat. Genet., 33, 382–387.
11. Zhang, K., Akey, J.M., Wang, N., Xiong, M., Chakraborty, R. and Jin, L. (2003) Randomly distributed crossovers may generate block-like patterns of linkage disequilibrium: an act of genetic drift. Hum. Genet., 113, 51–59.
12. Stumpf, M.P. (2004) Haplotype diversity and SNP frequency dependence in the description of genetic variation. Eur. J. Hum. Genet., 12, 469–477.
13. Jeffreys, A.J., Neumann, R., Panayi, M., Myers, S. and Donnelly, P. (2005) Human recombination hot spots hidden in regions of strong marker association. Nat. Genet., 37, 601–606.
14. May, C.A., Shone, A.C., Kalaydjieva, L., Sajantila, A. and Jeffreys, A.J. (2002) Crossover clustering and rapid decay of linkage disequilibrium in the Xp/Yp pseudoautosomal gene SHOX. Nat. Genet., 31, 272–275.
15. Jeffreys, A.J. and May, C.A. (2004) Intense and highly localized gene conversion activity in human meiotic crossover hot spots. Nat. Genet., 36, 151–156.
16. Wall, J.D., Frisse, L.A., Hudson, R.R. and Di Rienzo, A. (2003) Comparative linkage-disequilibrium analysis of the β-globin hotspot in primates. Am. J. Hum. Genet., 73, 1330–1340.
17. Ptak, S.E., Roeder, A.D., Stephens, M., Gilad, Y., Paabo, S. and Przeworski, M. (2004) Absence of the TAP2 human recombination hotspot in chimpanzees. PLoS Biol., 2, e155.
18. Winckler, W., Myers, S.R., Richter, D.J., Onofrio, R.C., McDonald, G.J., Bontrop, R.E., McVean, G.A., Gabriel, S.B., Reich, D., Donnelly, P. et al. (2005) Comparison of fine-scale recombination rates in humans and chimpanzees. Science, 308, 107–111.
19. Ptak, S.E., Hinds, D.A., Koehler, K., Nickel, B., Patil, N., Ballinger, D.G., Przeworski, M., Frazer, K.A. and Paabo, S. (2005) Fine-scale recombination patterns differ between chimpanzees and humans. Nat. Genet., 37, 429–434.
20. Kauppi, L., Sajantila, A. and Jeffreys, A.J. (2003) Recombination hotspots rather than population history dominate linkage disequilibrium in the MHC class II region. Hum. Mol. Genet., 12, 33–44.
21. Crawford, D.C., Bhangale, T., Li, N., Hellenthal, G., Rieder, M.J., Nickerson, D.A. and Stephens, M. (2004) Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat. Genet., 36, 700–706.
22. Jeffreys, A.J., Murray, J. and Neumann, R. (1998) High-resolution mapping of crossovers in human sperm defines a minisatellite-associated recombination hotspot. Mol. Cell, 2, 267–273.
23. Jeffreys, A.J. and Neumann, R. (2002) Reciprocal crossover asymmetry and meiotic drive in a human recombination hot spot. Nat. Genet., 31, 267–271.
24. Jeffreys, A.J. and Neumann, R. (2005) Factors influencing recombination frequency and distribution in a human meiotic crossover hotspot. Hum. Mol. Genet., 14, 2277–2287.
25. Boulton, A., Myers, R.S. and Redfield, R.J. (1997) The hotspot conversion paradox and the evolution of meiotic recombination. Proc. Natl Acad. Sci. USA, 94, 8058–8063.
26. Kauppi, L., Stumpf, M.P. and Jeffreys, A.J. (2005) Localized breakdown in linkage disequilibrium does not always predict sperm crossover hot spots in the human MHC class II region. Genomics, 86, 13–24.
27. Yauk, C.L., Bois, P.R.J. and Jeffreys, A.J. (2003) High-resolution sperm typing of meiotic recombination in the mouse MHC Eβ gene. EMBO J., 22, 1389–1397.
28. Nicolas, A., Treco, D., Schultes, N.P. and Szostak, J.W. (1989) An initiation site for meiotic gene conversion in the yeast Saccharomyces cerevisiae. Nature, 338, 35–39.
29. Nicolas, A. and Petes, T.D. (1994) Polarity of meiotic gene conversion in fungi: contrasting views. Experientia, 50, 242–252.
30. Cullen, M., Perfetto, S.P., Klitz, W., Nelson, G. and Carrington, M. (2002) High-resolution patterns of meiotic recombination across the human major histocompatibility complex. Am. J. Hum. Genet., 71, 759–776.
31. Pineda-Krch, M. and Redfield, R. (2005) Persistence and loss of meiotic recombination hotspots. Genetics, 169, 2319–2333.
32. Shiroishi, T., Sagai, T., Hanzawa, N., Gotoh, H. and Moriwaki, K. (1991) Genetic control of sex-dependent meiotic recombination in the major histocompatibility complex of the mouse. EMBO J., 10, 681–686.
33. Ohta, K., Shibata, T. and Nicolas, A. (1994) Changes in chromatin structure at recombination initiation sites during yeast meiosis. EMBO J., 13, 5754–5763.
34. Wu, T.C. and Lichten, M. (1994) Meiosis-induced double-strand break sites determined by yeast chromatin structure. Science, 263, 515–518.
35. Fan, Q.Q. and Petes, T.D. (1996) Relationship between nuclease-hypersensitive sites and meiotic recombination hot spot activity at the HIS4 locus of Saccharomyces cerevisiae. Mol. Cell. Biol., 16, 2037–2043.
36. Rakyan, V.K., Preis, J., Morgan, H.D. and Whitelaw, E. (2001) The marks, mechanisms and memory of epigenetic states in mammals. Biochem. J., 356, 1–10.
37. Fedoroff, N., Masson, P. and Banks, J.A. (1989) Mutations, epimutations, and the developmental programming of the maize suppressor–mutator transposable element. Bioessays, 10, 139–144.
38. Cubas, P., Vincent, C. and Coen, E. (1999) An epigenetic mutation responsible for natural variation in floral symmetry. Nature, 401, 157–161.
39. Suter, C.M., Martin, D.I. and Ward, R.L. (2004) Germline epimutation of MLH1 in individuals with multiple cancers. Nat. Genet., 36, 497–501.
40. Kimura, M. and Ota, T. (1973) The age of a neutral mutant persisting in a finite population. Genetics, 75, 199–212.
41. Morton, N.E. (1982) Outline of Genetic Epidemiology. Karger, Basel.