Abstract
Recombination plays a fundamental role in meiosis. Non-exchange gene conversion (non-crossover, NCO) may facilitate homologue pairing, while reciprocal crossover (CO) physically connects homologues so they orientate appropriately on the meiotic spindle. In males, X–Y homologous pairing and exchange occur within the two pseudoautosomal regions (PARs) together comprising <5% of the human sex chromosomes. Successful meiosis depends on an obligatory CO within PAR1, while the nature and role of exchange within PAR2 is unclear. Here, we describe the identification and characterization of a typical ∼1 kb wide recombination hotspot within PAR2. We find that both COs and NCOs are strongly modulated in trans by the presumed chromatin remodelling protein PRDM9, and in cis by a single nucleotide polymorphism (SNP) located at the hotspot centre that appears to influence recombination initiation and which causes biased gene conversion in SNP heterozygotes. This, the largest survey to date of human NCOs, reveals for the first time substantial inter-individual variation in the NCO:CO ratio. Although the extent of biased transmission at the central marker in COs is similar across men, it is highly variable among NCO recombinants. This suggests that cis-effects are mediated not only through recombination initiation frequencies varying between haplotypes but also through subsequent processing, with the potential to significantly intensify meiotic drive of hotspot-suppressing alleles. The NCO:CO ratio and extent of transmission distortion among NCOs appear to be inter-related, suggesting the existence of two NCO pathways in humans.
INTRODUCTION
Our understanding of the fundamental process of meiotic recombination in humans has greatly advanced through complementary genome-wide analyses and focused fine-scale sperm DNA approaches. Most crossover events (COs) in our genome cluster into narrow hotspots 1–2 kb in width (1,2), and many of these hotspots contain degenerate GC-rich sequence motifs which appear to influence the initiation of recombination (3). Recently, the meiosis-specific protein PRDM9 has been identified as a major specifier of hotspots in humans and mice (4–11). PRDM9 contains a zinc-finger (ZnF) domain, encoded by a variable minisatellite that is capable of binding hotspot sequences, plus a SET domain implicated in histone methylation and thus perhaps activation of hotspots via chromatin remodelling. The analysis of de novo events in sperm DNA has shown that hotspot activity is highly sensitive to variation in the ZnF array of PRDM9 (5), to the extent that individuals with differing genotypes can use substantially different sets of hotspots (4,11). Sperm DNA typing has also revealed cis-acting effects within hotspots that down-regulate CO activity (12,13), and has been used to explore a second class of meiotic recombination event, highly localized non-reciprocal gene conversions or NCOs between homologues (14).
The peak numbers of COs and NCOs at individual human and mouse hotspots co-localize (14–17), suggesting that these events are alternative outcomes of the same recombination-initiating lesions, programmed double strand breaks (DSBs) induced by the endonuclease Spo11 at leptotene (18). However, mouse studies have shown that CO and NCO have different genetic requirements, indicating that they likely result from distinct pathways (19,20), possibly comparable to those in Saccharomyces cerevisiae that diverge shortly after recombination initiation (21,22). It is widely held that in the budding yeast, COs are formed through resolution of double Holliday junction (dHJ) intermediates, whereas NCOs are predominantly formed earlier via single-end invasion intermediates by synthesis-dependent strand annealing (SDSA) (23).
Observed as chiasmata during diplotene, the physical linkages that remain between homologues following CO are required for appropriate orientation on the first meiotic spindle and thus correct disjunction of chromosome pairs. Cytological data indicate that only ∼10% of mammalian DSBs genome-wide are repaired to generate COs (19), suggesting that the remainder are repaired as NCOs. It has been proposed that these homology-dependent interactions might directly enforce proper homologous pairing (24–26) and/or be the indirect consequence of induction of sufficient DSBs to ensure properly regulated placement of COs (17). To date, observed relative rates of CO and NCO show major variation between human hotspots (15) yet we have little insight into the factors that influence this balance at the hotspot level either in the mouse or man.
The critical role of CO is evident from the behaviour of the sex chromosomes during male meiosis. In humans, normal pairing and CO between the X and Y is mainly limited to the 2.7 Mb region of shared homology at the tips of short arms, the major pseudoautosomal region (PAR) or PAR1 (27,28). Family data are consistent with a single obligatory PAR1-mediated CO per male meiosis (29), and diminished recombination in PAR1 has been associated with an increased frequency of 24, XY aneuploid sperm (i.e. a single copy of each autosome, but both X and Y chromosomes) (30,31). The human sex chromosomes also harbour a second much smaller (∼330 kb) PAR, PAR2, at the ends of the long arms of X and Y (32). PAR2 is human-specific, having resulted from an L1-mediated ectopic recombination event that transferred the subtelomeric region of the X onto the Y chromosome after the human and chimpanzee lineages diverged (33–35). Cytological studies have shown that ∼50% of early pachytene spermatocytes synapse at PAR2, but curiously these associations disappear by mid-pachytene such that chiasmata have only been observed at PAR1 (36–39). This suggests that the two human PARs are not functionally equivalent and raises questions about the nature of genetic exchanges in PAR2: are they genuine CO events or atypically long non-reciprocal NCOs (33)? Very low-resolution molecular studies have identified a few de novo paternal exchanges within families, at least some of which are consistent with reciprocal CO (40,41). However, as noted by Flaquer et al. (42,43), PAR2 has been largely neglected in recent marker-dense linkage studies. To address this, we have used population diversity data from the International HapMap project (44) and sperm recombination assays (45) to examine genetic exchange in PAR2 at high resolution. We demonstrate that exchange is dominated by a single highly active hotspot previously shown on a subset of data to be regulated by PRDM9 (5).
This hotspot has proved to be highly informative for analysing recombination events and for the first time has allowed us to explore in detail inter-individual variation in the ratio of COs to NCOs.
RESULTS
CO activity in PAR2: identification of the SPRY3 hotspot
Published historical CO rates for PAR2, estimated by coalescent analysis of CEU HapMap data (44,46), showed major clustering of COs with two clear peaks of activity in the first 100 kb of PAR2, the more extreme coinciding with the 5′ region of the most proximal gene SPRY3 some 66 kb from the PAR2 boundary (34,35) (Fig. 1A). All exchange events previously identified in CEPH families mapped to a 145 kb interval entirely encompassing these two peaks of historical recombination (40,41) (Fig. 1B). We attempted to refine the mapping of these COs in five of these six key CEPH families by single nucleotide polymorphism (SNP) typing. In one family, we could not identify a recombinant child, while for four families we could map the events to short intervals (0.9–30.7 kb) that all overlapped each other and the extreme peak in historical recombination (Fig. 1B).
Recombination activity in PAR2 inferred from population and pedigree data. (A) Published historical recombination map of PAR2 as estimated by coalescent analysis of CEU genotypes from phase II HapMap, with peaks of historical activity arrowed. This map does not cover the proximal and distal regions of PAR2 (shaded). Coding sequences and the PAR2 boundary PAB2 are indicated below. (B) Mapping de novo recombination events identified in CEPH pedigrees, with individual identifiers to the right. All four events had been previously mapped to the interval between markers JXYQ and sDF-1 (40,41) shown by thin lines between triangles. Further mapping localized the events to the regions marked by black boxes. Partial pedigree data for the two key markers for family 1332 are shown beneath; the recombinant child 133206 (asterisk) receives his father's Y-allele at rs700442, but his father's X-allele at rs71190340. [Note: despite a previous report (41) we were unable to identify a recombinant child within CEPH pedigree 1416.] (C) Metric LD maps over a 15 kb interval encompassing the 5′ end of SPRY3, established from high-resolution SNP genotype data from 97 men of North European descent (filled circles) and 74 men of Southern East-African descent (open circles). Co-ordinates are given for the X chromosome (NCBI36/hg18). (D) Cumulative frequency distributions of sperm COs from seven men are shown for the assay interval (0 kb = chrX:154,644,494), with filled symbols representing Europeans and open symbols Africans. The least-squares best-fit cumulative normal distribution is shown by the black curve. Note that despite an apparent double LD hotspot in Africans (Fig. 1C), there is only one active hotspot in the African men tested over this interval.
We homed in on a 15 kb interval centred on the candidate hotspot upstream of SPRY3 and genotyped 97 North European semen donors for all available SNP markers (Supplementary Material, Tables S1 and S2). Linkage disequilibrium (LD) mapping (47) confirmed a single interval of LD breakdown ∼2 kb wide (Fig. 1C). A similar LD profile was seen in 74 African semen donors, suggesting that this LD hotspot has been active in both African and non-African populations. Corresponding analysis of the second more distal peak of activity (Fig. 1A) failed to identify an LD hotspot (C.A.M., unpublished data). Also, this peak disappeared when using coalescent analysis on HapMap genotype data filtered for markers unambiguously typed in >70% of individuals, suggesting that this HapMap peak is an artefact.
We designed long-range polymerase chain reaction (PCR) assays spanning the major LD hotspot to detect de novo CO molecules in sperm DNA (45) (Supplementary Material, Table S2). Seventeen men (9 north Europeans and 8 Africans) were screened, and 7–587 COs were identified per man (median 94). Sperm CO frequencies showed ∼50-fold variation among these semen donors (0.017 to 0.845%), with the highest activity comparable to that of the most active autosomal hotspots described to date (48). Reciprocal assays performed on seven of the men showed no significant difference in CO frequency between orientations as expected for classical reciprocal CO.
All CO events were mapped by typing intervening SNPs. More than 99% of recombinants involved a single switch from one haplotype to the other. These simple exchange points were clustered into a single 1 kb wide recombination hotspot in both African and European men, with CO breakpoints apparently normally distributed across the hotspot (Fig. 1D).
CO asymmetry triggered by heterozygosity at a hotspot SNP
SNP rs700442 (hereafter known as ‘0442’) is the most central marker in the hotspot, located ∼45 bp away from the centre (5). All 0442C/T heterozygotes assayed in both orientations showed highly significant (P = 1 × 10⁻⁷ to 5.5 × 10⁻¹¹) over-transmission of the C allele to recombinant sperm, with on average 66% of COs inheriting the C allele (Fig. 2A). Similar though less intense transmission distortion (TD) was also seen at other heterozygous SNPs within the hotspot in two of these men. In contrast, 0442C and 0442T homozygotes showed no significant TD at any hotspot SNP (Fig. 2B), suggesting that this biased gene conversion is specifically triggered by heterozygosity at 0442. As a consequence of distortion, reciprocal events in 0442 heterozygotes map to different locations and the offset of the centre points gives an indication of the average conversion tract associated with CO (19). A mean displacement of 329 bp was seen at the SPRY3 hotspot, very similar to ∼350 bp shifts seen at autosomal hotspots displaying CO asymmetry (12,13,48,49).
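To illustrate how over-transmission at a heterozygous SNP can be tested against the 50:50 expectation, the sketch below runs an exact two-sided binomial test. The counts are hypothetical (chosen only to give 66% transmission of the C allele), and the choice of test is an assumption: the text does not state which test produced the reported P-values.

```python
from math import comb

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial P-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null proportion p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= threshold))

# Hypothetical example: 66 of 100 recombinant molecules carry the C allele.
c_count, total = 66, 100
td = c_count / total
p_value = binomial_two_sided_p(c_count, total)
print(f"TD = {td:.2f}, two-sided P = {p_value:.2g}")
```

With real data, the much larger numbers of recombinants scored per man would give far smaller P-values for the same degree of distortion.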
TD at the hotspot centre is associated with heterozygosity at SNP rs700442. (A) Transmission frequencies, with 95% confidence intervals, of SNP 0442 allele C and linked alleles into normalised numbers of reciprocal sperm COs. Data are shown for two 0442 heterozygotes each informative for an additional marker within the hotspot (indicated by grey shading). Significant TD is seen for all hotspot markers. (B) Hotspot markers carried by men homozygous for 0442 show no significant deviation from 50% transmission regardless of their location. Man 10 is heterozygous for both rs306882 and rs56750753, markers that show distortion in 0442 heterozygous men. (C) Sequence context of SNP 0442 showing the close match to the hotspot motif CCNCCNTNNCCNC (3).
CO asymmetry is most readily explained by differences in the frequency of recombination-initiating DSBs between haplotypes, with 0442C marking the less active haplotype (12). Gene conversion accompanying CO ensues as the overhangs generated by subsequent resection are returned to the double-stranded state using information from the other intact chromosome. This implies that 0442T-bearing haplotypes experience more DSBs than 0442C haplotypes and consistent with this, 0442T homozygotes tend to have higher overall CO frequencies than 0442C homozygotes among the 17 men analysed (∼3-fold higher).
SNP 0442 falls within a near-perfect match to the CCNCCNTNNCCNC motif found in 40% of European LD hotspots (3) that likely serves as a binding site for the common ‘A’ form of PRDM9. Consistent with this, the PRDM9 A variant specifically activates this hotspot, as previously shown on a subset of the present data (hotspot ‘PAR2’ in ref. 5), and the derived recombination-suppressing allele 0442C disrupts the motif (Fig. 2C).
High levels of inter-individual variation in CO frequency
PRDM9 and SNP 0442 have major influences on SPRY3 hotspot activity but cannot explain all the variation in activity between men. For example, the seven men tested who were homozygous for the PRDM9 A variant and heterozygous at 0442 showed ∼100-fold variation in CO frequency (Supplementary Material, Table S3). We quantified the combined effects of the PRDM9 ZnF array and 0442 status on CO frequency using a linear regression model on data from all 29 men analysed. Assuming an additive effect for each, the proportion of the variance explained by these two factors was found to be 29%. This increased modestly to 34% when a term was included to model the biological interaction between these two co-variates. In relative terms, the trans-acting factor PRDM9 was found to account for 1.4 times more of the variance in CO frequency than the cis-acting SNP 0442. We did attempt to identify other factors that might be influencing CO frequencies at the SPRY3 hotspot, including additional variants within the hotspot as well as trans-acting loci reported to influence overall CO frequencies in the human genome (50,51) (Supplementary Material, Tables S3–S5, Supplemental Note). However, once we took PRDM9 and 0442 status into account, we lacked statistical power to detect or exclude influences of these other candidate factors on CO frequency.
NCO events are also prevalent and regulated by PRDM9
To look at interhomologue recombination without exchange, we designed assays capable of detecting NCO as well as CO molecules in sperm DNA (45). Analysis of two 0442 heterozygotes showed comparable CO frequencies and the same TD as seen in regular CO assays, thus validating the approach. Data on 14 men (7 north Europeans and 7 Africans) revealed NCO frequencies ranging from 0.01 to 0.49%, with ∼85% of events encompassing only a single SNP site. True NCO frequencies will be higher since markerless conversion tracts cannot be detected. Indeed mean tract lengths estimated across all men analysed were heavily affected by differences in marker distribution (see Supplementary Material, Table S6). Nonetheless, detectable conversions appear to have short tract lengths (shortest 1–99 bp in length, longest 613–2910 bp) and relatively steep gradients extending from the hotspot centre, and are likely consistent with those seen at other human hotspots (14).
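Because a conversion tract is only visible where it covers a heterozygous marker, each detected NCO yields a minimum and a maximum tract length rather than a single value. A minimal sketch of that bookkeeping follows; the function name and the marker coordinates are hypothetical and are not the real SNP positions.

```python
from typing import List, Optional, Tuple

def tract_bounds(positions: List[int], converted: List[bool]) -> Tuple[int, Optional[int]]:
    """Return (min_len, max_len) in bp for one contiguous conversion tract.

    positions: sorted marker coordinates on the recipient haplotype.
    converted: True where the marker carries the donor allele.
    max_len is None when the tract reaches an outermost marker, because
    no unconverted flanking marker bounds it on that side.
    """
    idx = [i for i, c in enumerate(converted) if c]
    if not idx:
        raise ValueError("no converted marker")
    first, last = idx[0], idx[-1]
    if idx != list(range(first, last + 1)):
        raise ValueError("converted markers are not contiguous")
    # The tract must at least span the outermost converted markers.
    min_len = positions[last] - positions[first] + 1
    if first == 0 or last == len(positions) - 1:
        return min_len, None
    # The tract cannot include either flanking unconverted marker.
    max_len = positions[last + 1] - positions[first - 1] - 1
    return min_len, max_len

# Hypothetical marker coordinates (bp).
markers = [100, 400, 500, 812, 1200]
single = tract_bounds(markers, [False, False, True, False, False])
double = tract_bounds(markers, [False, False, True, True, False])
print(single, double)
```

Sparse markers widen the gap between the two bounds, which is one way to see why estimated mean tract lengths depend so strongly on marker distribution.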
For direct comparisons between men, we focussed on NCO frequencies at the central SNP 0442 in thirteen 0442 heterozygotes (366 NCO events detected in total). NCO frequencies ranged from 0 to 0.35% and, as seen for COs at this hotspot, appear to be regulated by PRDM9, with the highest activities seen in men homozygous for the activating PRDM9 A variant (Fig. 3A). As with COs, NCOs showed significantly biased gene conversion, with most NCOs showing transfer of the 0442C allele (plus linked hotspot alleles) to the 0442T haplotype and not vice versa (Fig. 3B). These findings are consistent with the PRDM9 A variant acting upstream of the CO/NCO decision, with both types of recombinant being different outcomes of the same initiating events. However, there was only a marginally significant correlation between NCO and CO frequencies, with NCO:CO ratios varying from ∼1:10 to 3.8:1 between men and with no evidence that PRDM9 influences the NCO/CO decision (Fig. 4).
Trans and cis factors influence NCO frequencies at SNP 0442 in the SPRY3 hotspot. (A) Variation in NCO frequency, with 95% confidence intervals, in 0442C/T heterozygotes carrying two PRDM9 A alleles (A/A, black), one A allele (A/N, grey) and two non-A alleles (N/N, white). The median conversion frequency per group is indicated by a dotted line. The low ranking of both N/N men amongst all men tested is significant (Mann-Whitney test, P < 0.05). (B) Preferential over-transmission of the SNP 0442C allele among NCOs. Data are displayed for man 8 with the upper graph showing events detected by amplifying the parental haplotype bearing the 0442T allele (white circles) and screening for the presence of markers from the parental haplotype bearing the 0442C allele (black circles), and vice versa for the lower graph. The hotspot region is shaded in grey.
Variation in the frequency of NCOs and COs between men. Frequencies are shown with associated 95% confidence intervals and the PRDM9 status of each man is indicated as in Figure 3. Using arcsine-transformed data, there is a weak correlation between NCO and CO frequency but of only borderline significance (PMCC r = 0.553, P = 0.0502). There is no correspondence between PRDM9 homozygosity and the ratio of NCO to CO in these men (Mann–Whitney U-test, P > 0.05).
Qualitative as well as quantitative NCO differences between men
Not only did the NCO:CO ratio differ between men but also the level of biased gene conversion seen at SNP 0442 in COs versus NCOs, with COs showing a fairly constant 2:1 C:T bias across all men but with NCOs showing biases ranging from ∼1:1 to >85:1 (Fig. 5A). Curiously, men with the highest NCO:CO ratios tended to show the strongest conversion bias specifically in NCOs (Fig. 5B). For example, man 3 showed similar 0442 bias in NCOs and COs and ∼2.5 times fewer NCOs than COs, whereas man 22 showed extreme 0442 bias in NCOs and around 3.5 times more NCOs than COs. Comparison of conversion tracts in these two men revealed further differences: 23% of NCOs (13/57) had co-converted at SNP rs56750753 (‘753’) in man 3, versus only 4% (3/78) in man 22 (Fig. 5C, Fisher exact test, P = 0.001). Interestingly, co-conversion at SNPs 0442 and 753 implies a tract length of at least 313 bp, very similar to mean conversion tract lengths seen in COs (see above). Despite these differences between NCOs, the distribution of COs between these two men was nonetheless comparable (Fisher exact test, P = 0.792).
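The comparison of co-conversion at SNP 753 between men 3 and 22 can be laid out as a 2 × 2 table and checked with a two-sided Fisher exact test. The sketch below is a plain-Python version using the counts quoted above (13/57 versus 3/78); the "sum of tables no more probable than the observed one" definition of two-sidedness is an assumption, since the text does not say which variant was used.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    threshold = prob(a) * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= threshold))

# Rows: man 3, man 22. Columns: co-converted at 753, not co-converted.
man3 = (13, 57 - 13)
man22 = (3, 78 - 3)
p_coconv = fisher_exact_two_sided(*man3, *man22)
print(f"co-conversion 13/57 vs 3/78: P = {p_coconv:.2g}")
```

The printed value can be compared directly with the reported P = 0.001.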
TD at SNP 0442 and the CO/NCO decision. (A) Variation in the SNP 0442 C:T transmission ratio among sperm NCOs and COs in different men. Circle size reflects the number of each type of recombination event scored for a given man, as shown in the key below. Men showing significant differences in the C:T ratio in COs versus NCOs are shown in black. For clarity, 95% CIs are shown only for men 3 and 22. Man 22 showed extreme bias, with all NCO events involving transfer of 0442C to the 0442T haplotype; we conservatively plotted this assuming one NCO event in the opposite direction. (B) Relationship between NCO:CO ratios and the relative C:T transmission bias in NCOs versus COs in different men. Extreme TD in NCOs is significantly associated with higher NCO:CO ratios (Mann–Whitney U test, P < 0.01). (C) Men 3 and 22 show a significant difference in the structures of NCO events involving 0442. Both men are also heterozygous at hotspot marker rs56750753 (‘753’), allowing single and co-conversion events to be scored. The structures of the four types of NCO events are shown with marker alleles from the parental haplotype bearing 0442C in black. The numbers of convertants of each type, together with the numbers of sperm molecules screened, are indicated to the right. Man 22 was also heterozygous for an intervening SNP rs306882; this marker was ignored when scoring single-site conversions at SNP 0442 (hatched circles). No NCOs were scored in one orientation for man 22, although CO frequencies estimated from this assay were indistinguishable between the two orientations (9/17 865 versus 9/10 729, Fisher's exact test, P = 0.33).
DISCUSSION
We describe the first in-depth analysis of recombination in the minor human PAR, PAR2, more than 20 years after its discovery (32). European population diversity data together with high-resolution mapping of familial recombinants showed that most CO events in PAR2 co-localize to an interval encompassing the promoter region of the most proximal gene, SPRY3. Sperm DNA typing, on more than 1 million haploid genome equivalents from 29 men, resulted in characterization of 3000 recombinant gametes and localized recombination to a ∼1 kb wide hotspot active in both CO and NCO. The tight clustering of both types of event is similar to that seen at autosomal hotspots (1,48,52) and at a hotspot characterized in PAR1 (53). These findings allay early concerns that conventional COs might not occur in this human-specific PAR (33). The activity of the SPRY3 recombination hotspot is similar to that of the hottest human hotspots, but nonetheless, the maximum CO frequency seen in this study (0.8%) is entirely compatible with the more limited cytological studies of the XY bivalent that have failed to reveal definitive signs of CO such as chiasmata or MLH1 foci in PAR2.
Kauppi et al. (54) have recently shown that the appearance of RAD51/DMC1 foci in the mouse PAR, which mark the DSBs that mediate pairing and successful exchange, is temporally delayed compared with the rest of the genome, possibly being predominantly induced by Spo11α rather than the conventional Spo11β isoform. They speculated that this conserved splice form might also be responsible for DSBs in the human major PAR, PAR1. Cytological studies have shown that human X–Y pairing is initiated at PAR1, and that it can go beyond the region of homology and even lead to pairing at PAR2 (36–39). It is therefore possible that DSBs within PAR2 might also be induced predominantly by Spo11α rather than Spo11β. Our data cannot address this issue but do suggest that recombination in PAR2 is subject to the same types of cis- and trans-acting factors as the autosomes. For example, both CO and NCO frequencies at the SPRY3 hotspot are strongly influenced by PRDM9 status, suggesting similar hotspot activation mechanisms (possibly chromatin remodelling) in PAR2 and the rest of the genome. In addition, we have shown that recombination at the SPRY3 hotspot is also modulated in cis by a motif-disrupting SNP that appears to down-regulate CO frequency. The scale of this suppression, and the resulting displacement in the locations of reciprocal COs, is very similar to that seen at autosomal hotspots, strongly suggesting that downstream processing of PAR2 and autosomal DSBs is likely to be the same too, at least for COs.
Motif-disrupting variants like 0442C are systematically over-transmitted to CO progeny at other human hotspots too (12,13,49). This constitutes a form of meiotic drive that is predicted to lead to the demise of the hotspots as the recombination-suppressing variants become fixed in the population (55). While the degree of over-transmission to COs at the SPRY3 hotspot is relatively modest (66:34 in favour of the motif-disrupting allele), it is significantly enhanced to 74:26 by the stronger bias seen in NCOs. The very high activity of this hotspot predicts a 0442 C:T gametic ratio of 50.13:49.87 and thus the most extreme level of meiotic drive seen to date at a human hotspot (Table 1). Population simulations indicate that the recombination-suppressing C allele is not only virtually guaranteed to go to fixation (P > 0.999), leading to hotspot attenuation, but will do so rapidly, within ∼60 000 ± 16 000 years, ∼10% of the time required for an undriven allele. Thus, NCOs can play a significant role in determining the lifespan of a recombination hotspot (17).
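The logic of such a simulation can be sketched with a simple Wright-Fisher model in which heterozygotes transmit the driven allele to slightly more than half of their gametes. This is an illustrative sketch only, not the simulation actually used: the function names, starting frequency, population size and the assumption that every heterozygote shows the same gametic ratio are all hypothetical simplifications.

```python
import random

def drive_step(p: float, k: float) -> float:
    """Expected frequency of the driven allele after one generation.

    p: current allele frequency; k: share of gametes from heterozygotes
    that carry the driven allele (0.5 means no drive). Random mating assumed.
    """
    q = 1.0 - p
    return p * p + 2.0 * p * q * k

def generations_to_reach(p0: float, target: float, k: float, max_gen: int = 1_000_000) -> int:
    """Generations needed to go from p0 to target with drive but no drift."""
    p, gen = p0, 0
    while p < target and gen < max_gen:
        p = drive_step(p, k)
        gen += 1
    return gen

def wright_fisher(p0: float, k: float, n_chrom: int, max_gen: int, rng: random.Random) -> float:
    """One stochastic trajectory: drive, then binomial sampling of n_chrom gametes."""
    p = p0
    for _ in range(max_gen):
        if p in (0.0, 1.0):  # allele lost or fixed
            break
        expected = drive_step(p, k)
        p = sum(rng.random() < expected for _ in range(n_chrom)) / n_chrom
    return p

K_SPRY3 = 0.5013  # from the 50.13:49.87 gametic ratio quoted in the text
gens = generations_to_reach(0.01, 0.99, K_SPRY3)  # hypothetical start and end frequencies
final = wright_fisher(0.5, K_SPRY3, n_chrom=200, max_gen=50, rng=random.Random(1))
print(gens, final)
```

Converting generations to years would require an assumed generation time, and a realistic fixation probability would require an effective population size far larger than the toy value used here.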
Table 1. Comparison of the nature, extent and consequence at the population level of TD seen among COs at human recombination hotspots
| Hotspot^a | TD at central SNP | P-value^b | Central SNP | Approx. dist. from centre | Over-transmitted allele^c | Mean CO freq | Gametic ratio^d |
|---|---|---|---|---|---|---|---|
| NID1 | 0.740 | <0.001 | Y | 70 bp | T(der) | 0.03% | 50.009:49.991 |
| DNA2 | 0.870 | <0.001 | R | 5 bp | G(der) | 0.004% | 50.00049:49.99951 |
| B | 0.810 | <0.001 | S | 68 bp | C(der) | 0.04% | 50.012:49.988 |
| J1 | 0.850 | <0.001 | R | 63 bp | A(anc) | 0.04% | 50.014:49.986 |
| S2 | 0.743 | <0.001 | R | 60 bp | A(anc) | 0.08% | 50.020:49.980 |
| SPRY3 | 0.660 | <0.0001 | Y | 45 bp | C(der) | 0.31% | 50.049:49.950 |
| SPRY3 including NCOs | | | | | | | **50.132:49.870** |
<sup>b</sup>Typical probabilities in each man tested that alleles at the central SNP showed 50:50 transmission to reciprocal COs.
<sup>c</sup>‘der’, derived; ‘anc’, ancestral.
<sup>d</sup>Gametic ratios in favour of the over-transmitted allele. The gametic ratio for the SPRY3 hotspot taking NCOs as well as COs into account is shown in bold.
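As a rough illustration of how the gametic ratios in Table 1 relate to TD and CO frequency, the following Python sketch assumes a simple first-order model: non-recombinant gametes of a heterozygote carry the two alleles 50:50, while recombinants carry the favoured allele at the observed TD. The function name `gametic_ratio` is hypothetical, and this is not necessarily the calculation used for the published values. With the rounded table inputs it reproduces the rows for hotspots B and J1 but not every row exactly, so the published figures presumably rest on unrounded data or additional considerations.

```python
def gametic_ratio(td, recomb_freq):
    """Return (favoured %, disfavoured %) among all gametes of a heterozygote.

    td          -- proportion of recombinants carrying the favoured allele
    recomb_freq -- recombination frequency per gamete (0.04% -> 0.0004)

    Assumption: non-recombinant gametes transmit the two alleles 50:50.
    """
    favoured = 0.5 * (1.0 - recomb_freq) + td * recomb_freq
    return 100.0 * favoured, 100.0 * (1.0 - favoured)


# Hotspot B in Table 1: TD 0.810, mean CO frequency 0.04%.
fav, dis = gametic_ratio(0.810, 0.0004)
print(f"{fav:.3f}:{dis:.3f}")
```

Under this model the departure from 50:50 is the product of the excess transmission (TD minus 0.5) and the recombination frequency, which is why a very active hotspot such as SPRY3 can show strong drive despite a relatively modest TD.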
This study provides the largest survey to date of NCOs at a human recombination hotspot. The ratio of NCOs to COs at SPRY3 is surprisingly variable, and is comparable to that seen between other human hotspots surveyed in just one or two men. Until now, these differences in ratio between hotspots have been interpreted as different hotspot-specific biases in resolution of recombination-initiation events (15). However, it is possible that this apparent variation between hotspots reflects variation between men at each hotspot, as seen at SPRY3. This can be tested by more extensive surveys of NCO activity at autosomal hotspots.
If variable NCO:CO ratios prove to be unique to the SPRY3 hotspot, then this would provide evidence for a PAR2-specific process of recombination. One possibility is that successful reciprocal exchange in PAR1 on the short arms of X and Y confers strong CO interference and diverts the repair of DSBs in PAR2 down an NCO route. If so, then men who exhibit more COs than NCOs at SPRY3 may be less proficient at exchange within PAR1 than those who show the opposite trend at this hotspot. Thus, the balance of CO and NCO events at SPRY3 for a given man might be related to successful reciprocal exchange in the major PAR. Unfortunately, our existing sperm data from PAR1 do not allow us to test this, as they come from a limited number of men, none of whom were tested at SPRY3 (53).
Variation in the NCO:CO ratio at SPRY3 appears to be associated with differences in the degree of TD at SNP 0442, particularly in NCOs. This could reflect both quantitative and qualitative variation in the biochemical behaviour of the NCO pathway, or alternatively indicate that NCOs are produced by two distinct pathways. Since there is no significant difference in TD between men among COs, we assume that this distortion most closely reflects the initiation bias caused by the different 0442 SNP alleles. Men with low NCO:CO ratios show similar levels of bias in NCOs and COs, plus relatively long conversion tracts in NCOs apparently comparable to those seen in COs. These data are compatible with COs and NCOs in these men being produced from the same recombination intermediate, for example by alternative resolution of dHJs mediated by different resolvase complexes (56). Both types of event would then have to be subject to the same heteroduplex repair bias to generate the same level of TD seen in COs and NCOs. In contrast, men with a high NCO:CO ratio show greater transmission bias and shorter conversion tracts in NCOs but not apparently in COs. This might reflect a predominant and distinct NCO pathway that is variable in activity between men and differentially repairs heteroduplex DNA compared with the CO pathway, leading to greater TD. It is tempting to speculate that the SDSA pathway (23) might fulfil this role. Interestingly, a recent genome-wide analysis of recombination intermediates challenged current thinking by indicating that at least two separate pathways are frequently used to generate NCO events in budding yeast (57).
We have presented data that for the first time implicate differential processing in the formation of NCOs in humans. The factors that influence the frequency and nature of such events have yet to be elucidated, but may come to light when NCOs are similarly investigated in detail at other human hotspots.
MATERIALS AND METHODS
DNA samples
North European semen samples were collected with informed consent and approval from Leicestershire Health Authority Research Ethics Committee. Collection of African samples is detailed in ref. (58). DNAs were extracted and handled to minimize contamination risk (45), and concentrations quantified on a NanoDrop 1000 spectrophotometer. Aliquots (40 ng) of each DNA were whole-genome amplified using the GenomiPhi HY DNA amplification kit (GE Healthcare Bio-Sciences) and the resulting products used for routine genotyping. Recombination assays were performed on high molecular weight total genomic DNA.
SNP genotyping
Genotypes were established by ASO hybridization against dot blots of suitable PCR products (45). The PAR2 markers typed in this study are listed in Supplementary Material, Table S1 and details of the primers and PCR conditions are listed in Supplementary Material, Table S2.
Population analysis
Historical recombination activity was estimated by coalescent-based analysis of genotype data using LDhat (46). Historical hotspot locations were inferred from metric LD maps generated using LDMAP (47).
Sperm CO assays
Semen donors with suitable SNP heterozygosities flanking the region of LD breakdown were chosen for analysis. Allele-specific primers for rs5940571, rs17653343 and rs9645286 were designed and initially tested on DNA from men homozygous for one or the other SNP allele at various annealing temperatures to achieve maximum discrimination and PCR efficiency. For each man, linkage phases were established by ASO hybridization against allele-specific PCR products. CO molecules were selectively amplified by repulsion-phase allele-specific long PCR (Supplementary Material, Table S2). Specifically, for a given man and orientation of COs, between 94 and 190 PCR reactions each containing typically 2.5 ng of DNA were amplified using the buffer described previously (59) supplemented with 12 mM Tris base, 0.2 μM of each primer, 0.03 U/μl Taq polymerase (KAPA Biosystems) and 0.0015 U/μl cloned Pfu polymerase (Agilent Technologies). Secondary allele-specific PCRs were seeded with 1/86th vol primary PCR products, and CO-positive PCRs identified by agarose gel electrophoresis. CO breakpoints were mapped by reamplifying the secondary PCR products using internal universal primers and typing the resulting products by ASO hybridization. The numbers of each type of CO molecule observed were Poisson adjusted to account for PCR reactions containing more than one CO, and a single-molecule PCR efficiency of 50% (one amplifiable molecule of each haplotype per 12 pg sperm DNA) was assumed when calculating CO frequencies (45).
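The Poisson adjustment and frequency calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact procedure: it applies the standard limiting-dilution correction to the number of positive reactions, counts one amplifiable molecule of each haplotype per 12 pg of sperm DNA as stated above, and expresses the frequency per amplifiable molecule for a single CO orientation. The function names and the example counts are hypothetical.

```python
import math


def poisson_corrected_count(positive, total):
    """Estimate the number of recombinant molecules from positive reactions.

    If a fraction positive/total of reactions is positive, the Poisson mean
    number of molecules per reaction is -ln(1 - positive/total).
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -total * math.log(1.0 - positive / total)


def amplifiable_molecules(ng_per_reaction, reactions, pg_per_molecule=12.0):
    """Amplifiable molecules of one haplotype screened across all reactions."""
    return ng_per_reaction * 1000.0 / pg_per_molecule * reactions


def co_frequency(positive, total, ng_per_reaction):
    """CO frequency per amplifiable molecule (assumed denominator)."""
    corrected = poisson_corrected_count(positive, total)
    return corrected / amplifiable_molecules(ng_per_reaction, total)


# Made-up example: 40 CO-positive reactions out of 190, 2.5 ng DNA each.
print(f"{co_frequency(40, 190, 2.5):.5f}")
```

The correction matters most when a large fraction of reactions is positive, since a positive reaction may then contain more than one recombinant molecule.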
Detecting sperm NCOs
For a given man, molecules derived from one parental haplotype were amplified from between 187 and 457 pools of sperm DNA containing typically 38 amplifiable molecules each (range ∼20–50), using two rounds of PCR with allele-specific primers directed to SNPs on one side of the hotspot in conjunction with universal primers on the other side of the hotspot (Supplementary Material, Table S2). COs and NCOs were simultaneously detected against the background of non-recombinant molecules by ASO hybridization using probes corresponding to the opposite parental haplotype from that selectively amplified in the PCR reactions. Single-molecule PCR efficiencies and Poisson corrections were as for COs.
Assigning proportion of variance in recombination activity to PRDM9 and 0442
Linear regression was undertaken on inverse-normal transformed CO frequencies using PRDM9 and 0442 status as covariates. An additive effect was assumed for each PRDM9 A allele, and similarly for each 0442C allele.
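The data preparation for such a regression might look like the sketch below. It is illustrative only: the text does not specify which inverse-normal transform was used, so a rank-based transform with the Blom offset (c = 3/8) is assumed here, and the function names and genotype encoding are hypothetical. The transformed frequencies and allele doses could then be passed to any ordinary least-squares routine.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=3 / 8):
    """Rank-based inverse-normal transform; tied values share an average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


def allele_dose(genotype, allele):
    """Additive coding: number of copies (0, 1 or 2) of an allele."""
    return sum(1 for a in genotype if a == allele)


# Made-up CO frequencies and genotypes for three men.
co_freqs = [0.0010, 0.0031, 0.0022]
prdm9 = [("A", "B"), ("A", "A"), ("A", "A")]
snp0442 = [("C", "T"), ("T", "T"), ("C", "T")]

y = inverse_normal_transform(co_freqs)
x_prdm9 = [allele_dose(g, "A") for g in prdm9]
x_0442 = [allele_dose(g, "C") for g in snp0442]
```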
Hotspot sequencing
Separated haplotypes generated using allele-specific primers (see Supplementary Material, Table S2) from each of the men in this study were used in standard BigDye Terminator v3.1 Cycle Sequencing reactions (Applied Biosystems) with primers 5′-GGCTGAGTAGATTGGTAGA-3′ and 5′-AGTGGCGTCATCCAGATGA-3′. Extension products were purified using Performa Gel Filtration columns (EdgeBiosystems) and run on a 3730 DNA Analyzer (Applied Biosystems). Potential additional cis-acting effects were evaluated by rank order analysis controlling for PRDM9 and 0442 status.
Meiotic drive simulations
Population simulations were carried out as described previously (12) taking into account the biased transmission of 0442 in both COs and NCOs averaged over five PRDM9 A/A homozygotes. The chances of fixation of the favoured allele and mean time to fixation were estimated assuming a starting population frequency of 50%, an effective human population size Ne of 10 000 and a generation time of 20 years. The same parameters were also estimated assuming no drive of the favoured allele.
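A simulation of this general kind can be sketched as a Wright-Fisher model with biased transmission in heterozygotes. The sketch below is an assumption-laden illustration, not the procedure of ref. (12): it assumes random mating, a constant diploid population of size Ne, and that every heterozygote transmits the favoured allele with probability `d` (for example 0.5013 for the 50.13:49.87 gametic ratio). All function and parameter names are hypothetical.

```python
import random


def next_expected_freq(p, d):
    """Expected favoured-allele frequency among gametes.

    Homozygotes (frequency p*p) always transmit the allele; heterozygotes
    (frequency 2p(1-p)) transmit it with probability d.
    """
    return p * p + 2.0 * p * (1.0 - p) * d


def simulate(d, ne=10_000, p0=0.5, rng=None, max_generations=1_000_000):
    """Run one population until the favoured allele is fixed or lost.

    Returns (fixed, generations); fixed is None if neither outcome
    occurred within max_generations.
    """
    rng = rng or random.Random()
    n_gametes = 2 * ne
    p = p0
    for generation in range(1, max_generations + 1):
        expected = next_expected_freq(p, d)
        # Binomial sampling of 2*Ne gametes (genetic drift).
        copies = sum(rng.random() < expected for _ in range(n_gametes))
        p = copies / n_gametes
        if p == 0.0 or p == 1.0:
            return p == 1.0, generation
    return None, max_generations


# Small population so the example runs quickly; Ne = 10 000 is far slower.
fixed, generations = simulate(0.5013, ne=200, rng=random.Random(1))
years = generations * 20  # generation time of 20 years, as in the text
```

Repeating `simulate` many times with and without drive (`d = 0.5`) would give estimates of the fixation probability and mean time to fixation. The pure-Python binomial sampling is slow at Ne = 10 000, so a faster sampler would be needed for realistic runs.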
SUPPLEMENTARY MATERIAL
Supplementary Material is available at HMG online.
FUNDING
This work was supported by the Medical Research Council (grant number G0601068).
ACKNOWLEDGEMENTS
We thank J. Wetton and colleagues for helpful discussions.
Conflict of Interest statement. None declared.

![Recombination activity in PAR2 inferred from population and pedigree data. (A) Published historical recombination map of PAR2 as estimated by coalescent analysis of CEU genotypes from phase II HapMap, with peaks of historical activity arrowed. This map does not cover the proximal and distal regions of PAR2 (shaded). Coding sequences and the PAR2 boundary PAB2 are indicated below. (B) Mapping de novo recombination events identified in CEPH pedigrees, with individual identifiers to the right. All four events had been previously mapped to the interval between markers JXYQ and sDF-1 (40,41) shown by thin lines between triangles. Further mapping localized the events to the regions marked by black boxes. Partial pedigree data for the two key markers for family 1332 are shown beneath; the recombinant child 133206 (asterisk) receives his father's Y-allele at rs700442, but his father's X-allele at rs71190340. [Note: despite a previous report (41) we were unable to identify a recombinant child within CEPH pedigree 1416.] (C) Metric LD maps over a 15 kb interval encompassing the 5′ end of SPRY3, established from high-resolution SNP genotype data from 97 men of North European descent (filled circles) and 74 men of Southern East-African descent (open circles). Co-ordinates are given for the X chromosome (NCBI36/hg18). (D) Cumulative frequency distributions of sperm COs from seven men are shown for the assay interval (0 kb = chrX:154,644,494), with filled symbols representing Europeans and open symbols Africans. The least-squares best-fit cumulative normal distribution is shown by the black curve. Note that despite an apparent double LD hotspot in Africans (Fig. 1C), there is only one active hotspot in the African men tested over this interval.](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/hmg/21/9/10.1093_hmg_dds019/2/m_dds01901.jpeg?Expires=1500654801&Signature=IXyE89ofTCDIwlz0fIemyELRh9XnUcv3vjhTRgYLul-QqeM7D9OOR~pToI75v6zcvyjKjJCPgE6bmSUfLgEITY6J0S-9VCBxeBce0jRAwp95iPVze1kqOvYOR26Fv~PsnYOtGm530CxHraOaZCN0P3jm2GGwoc0lzrcewhbYuRIfMIOYM-MJlovd4QEJX958f2dmvNur4lE~UidAIe802Hu-aoMUKB1-cPACiDs0SASCNcthEXdIpdzEQAtHPW8TDMCmJ10ilX9suouD~AxPcwBxfAqvNSP-xwSy8LKIR46RMz8V9ZvoRCcuYTIGwkxt7skapQIlxXidhTwnJWZZYw__&Key-Pair-Id=APKAIUCZBIA4LVPAVW3Q)



