We analyzed the polymorphic (CAG)n and (GGC)n repeats of the androgen receptor gene in 113 unrelated X-linked spinal and bulbar muscular atrophy (SBMA) X chromosomes and 173 control X chromosomes in Japanese males. The control chromosomes had an average CAG repeat number of 21 ± 3 with a range from 14–32 repeat units, and SBMA chromosomes had a range from 40–55 with a median of 47 ± 3 copies. The control chromosomes had seven different alleles of the (GGC)n repeat with the range of 11 to 17; the most frequent size of (GGC)n was 16 (79%), while (GGC)17 was very rare (1%). However, in SBMA chromosomes only two alleles were seen; the most frequent size of (GGC)n was 16 (61%) followed by 17 (39%). (GGC)n size distribution was significantly different between SBMA and control chromosomes (P <0.0001), indicating the presence of linkage disequilibrium. There was no allelic association between the (CAG)n and (GGC)n microsatellites among control subjects as well as SBMA patients, which suggests that a founder effect makes a more significant contribution to generation of Japanese SBMA chromosomes than new mutations.
Several neurogenetic diseases caused by triplet repeat mutations, including myotonic dystrophy (DM) (1), fragile X syndrome (Fra X) (2,3), spinocerebellar ataxia type 1 (SCA1) (4), Machado-Joseph disease (MJD) (5), Huntington's disease (HD) (6–9) and dentatorubral and pallidoluysian atrophy (DRPLA) (10), have been reported to be associated with founder chromosomes, based on linkage disequilibrium with flanking polymorphic markers. Recent studies have suggested that new mutations occur in a subgroup of unstable high-normal triplet repeat alleles with a particular founder chromosomal haplotype (7–10).
The molecular abnormality of X-linked spinal and bulbar muscular atrophy (SBMA) is expansion of a triplet repeat, (CAG)n, located in the first exon of the androgen receptor (AR) gene (11). Another triplet repeat, (GGC)n, is located approximately 1.2 kb downstream from (CAG)n in the first exon of the AR gene (12).
In order to investigate the origin of the SBMA mutations in the Japanese population, we performed (CAG)n and (GGC)n analysis of the AR gene locus in unrelated SBMA and normal X chromosomes in Japanese males, and found linkage disequilibrium between (GGC)n haplotype and (CAG)n mutation. We further investigated the distribution of (CAG)n repeat sizes in a large cohort of Japanese SBMA patients and controls and analyzed the association between the two microsatellites on the normal and SBMA chromosomes separately.
One hundred and fifty-three Japanese control chromosomes displayed an average CAG repeat number of 21 ± 3 with a range from 14–32 repeat units, 20 repeat units being the most common (Table 1, Fig. 1). The analysis of 113 SBMA chromosomes revealed 15 alleles ranging in size from 40–55 with a median of 47 ± 3 (CAG)n copies (Table 2, Fig. 1).
The GGC repeat size ranged from 11–17 repeats in 173 controls, but was restricted to 16 and 17 repeats in 92 SBMA chromosomes (Tables 1, 2; Figs 2, 3). In both control and SBMA samples, the most frequent allele size of the (GGC)n repeat was 16, found in 137 (79%) of 173 control chromosomes and 56 (61%) of 92 SBMA chromosomes (Tables 1, 2; Fig. 3). Although (GGC)17 was present in over one third of SBMA chromosomes (39%), only 1% of control chromosomes carried it (Fig. 3). Comparing SBMA chromosomes with all controls, the GGC repeat length distributions were significantly different (χ2 = 81.2, d.f. = 6, P <0.0001), consistent with linkage disequilibrium (Fig. 3).
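The comparison above can be illustrated with a collapsed 2 × 2 table, (GGC)17 versus all other alleles, using only the counts that the text reports. This is a minimal sketch and not the seven-category test (d.f. = 6) reported in the paper, so it is not expected to reproduce χ2 = 81.2. The SBMA counts follow from 56 of 92 chromosomes carrying (GGC)16; the control count of two (GGC)17 chromosomes is an assumption obtained by rounding 1% of 173. The function names are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# SBMA chromosomes: 92 typed for (GGC)n, 56 with (GGC)16, so 36 with (GGC)17.
sbma_ggc17, sbma_other = 36, 56
# Control chromosomes: ASSUMPTION, "1%" of 173 is taken as 2 chromosomes.
control_ggc17, control_other = 2, 171

chi2 = chi2_2x2(sbma_ggc17, sbma_other, control_ggc17, control_other)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.1f} (1 d.f.), P = {p:.1e}")
```

Even in this reduced form, the excess of (GGC)17 on SBMA chromosomes gives a P value far below 0.0001, which agrees in direction with the full-table result.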
There was no allelic association between the (CAG)n and (GGC)n microsatellites among control subjects (Kruskal-Wallis test: 6 d.f., P = 0.18) (Table 1), although the distribution variance of CAG repeat size with an allele of (GGC)16 was significantly wider than that with (GGC)11–15,17 (F test, F = 0.57, P <0.05) (Table 1). There was also no allelic association among SBMA subjects (Mann-Whitney U test: U = 790.5, P = 0.45) (Table 2).
A cohort of 153 Japanese control chromosomes displayed an average CAG repeat number of 21 ± 3 with a range from 14–32 repeat units (Table 1, Fig. 1), which is consistent with a previous report of 39 Asian controls with an average repeat number of 22 ± 3 ranging in size from 15–29 (13) and with that of a Caucasian control population of 37, with a range of 16–26 and a median of 22 ± 3 (14). The 113 SBMA chromosomes, ranging in size from 40–55 with a median of 47 ± 3 (CAG)n copies, were also consistent with a previous report of a mixed SBMA population with 47 ± 4 (CAG)n (14), although the range of size for Caucasians was somewhat larger (40–62). The expanded CAG repeats of the SBMA AR gene in the present study showed a narrower range of distribution (16 different repeats, 40–55) than those of the other CAG repeat diseases (35–50 different repeats in HD, SCA1 and DRPLA, except 21–24 repeats in MJD) (15–17). These findings also indicate relative stability of (CAG)n in SBMA, consistent with the relatively limited meiotic instability (14) and the relatively limited somatic tissue mosaicism (18) that have been previously reported.
The size-frequency distribution of the GGC repeats differs significantly among control subjects of different racial-ethnic groups (13). In one series, 70% of Asian controls had (GGC)16 compared to 57% of Caucasian and 20% of African-American controls, and very few Asians (3%) had alleles longer than 16, which otherwise were common in Caucasians [32% (13) or 42% (12)]. The remaining 27% of Asian controls had (GGC)n ranging from 10–15 (13). The distribution of GGC repeat sizes in our Japanese control subjects (Table 1, Fig. 3) is consistent with that of reported Asian controls (13).
We observed no association between the two microsatellites among control subjects (Kruskal-Wallis test: 6 d.f., P = 0.18) (Table 1). These findings are consistent with the previously reported lack of allelic association between the (CAG)n and (GGC)n microsatellites among normal subjects in any of the racial groups (13,19), which indicates that one or both of the repeats mutate occasionally. The mutation rate at the CAG site, measured using single-cell assays of sperm, was 1–4% (20). Such mutations would lead to a lack of allelic association between the (CAG)n and (GGC)n microsatellites among normal subjects despite the close proximity of the two microsatellites.
Furthermore, we observed no association between the two microsatellites among SBMA patients (Table 2). The mutation rate of the expanded (CAG)n allele in sperm is higher than that of control alleles; the average expansion was 2.7 repeats (21), and the average gain of (CAG)n repeat units in paternal transmission was 1.4 in SBMA (22). This instability would account for the lack of allelic association between the two microsatellites among SBMA patients (Table 2).
Our results demonstrate that the distribution of (GGC)n repeat size differs between control and SBMA chromosomes (χ2 = 81.2, P <0.0001 with d.f. = 6) (Fig. 3), providing evidence for linkage disequilibrium in SBMA. The linkage disequilibrium has three possible explanations. The first is the presence of a founder effect in the SBMA mutation; the second is the generation of new (GGC)n microsatellite mutations with expansion of the (CAG)n repeat; the third is susceptibility to pathologic expansion of the (CAG)n repeat associated with the (GGC)n alleles with 16 and 17 repeats. The second explanation has been suggested for fragile X syndrome (3). Large FMR-1 CGG repeats may be associated with new adjacent microsatellite mutations, which lead to microsatellite heterogeneity in patients and a different distribution in controls and patients (3). In the present study, control chromosomes showed seven different (GGC)n haplotypes while SBMA chromosomes showed a less heterogeneous distribution of only two (GGC)n types (Fig. 3), which cannot be explained by this hypothesis. The third explanation has been suggested for HD and DRPLA, especially as a multi-step model (7–10). In this model, the expanded triplet repeats evolve from a founder chromosomal haplotype with long-normal repeats, and the repeat number occasionally increases further. In HD, a block of CCG repeats lies immediately adjacent to the block of CAG repeats that is responsible for the disease, and a significant inverse relationship between CCG and CAG repeat size on normal chromosomes has been demonstrated (9). A CCG of 7 repeats is overrepresented on HD chromosomes and associated with long-normal (CAG)n alleles; new mutations for HD likely arise from these long-normal alleles (9). In the present study, however, we could not observe allelic association between the (CAG)n and (GGC)n microsatellites among control subjects (Kruskal-Wallis test: 6 d.f., P = 0.18) (Table 1), which is consistent with the previous reports (13,19).
Single sperm data also show that expansion is less common than contraction in long-normal CAG alleles in the AR gene (20), which is markedly different from the HD locus, where expansion is more frequent in the long-normal allele (30 repeats) and in the intermediate allele (36 repeats) (23). These findings suggest that new mutations for SBMA are less likely to arise from long-normal CAG alleles than those for HD. However, since the distribution of CAG repeat sizes with (GGC)16 in the control chromosomes was significantly wider than that with other GGC haplotypes (F test, F = 0.57, P <0.05) (Table 1), it is possible that the Japanese SBMA mutation could have arisen more than once on a (GGC)16 background. The (GGC)17 allele is common in control Caucasian subjects [32% (13) or 42% (12)], whereas it was rare (1%) in Japanese control subjects. Nonetheless, there is no evidence that SBMA is more prevalent in the Caucasian than in the Japanese population. Japanese SBMA with (GGC)17 could have been introduced from Caucasian ancestry, although there is no evidence to support such a view.
Based on this evidence, a founder effect may make a more significant contribution to linkage disequilibrium in (CAG)n-(GGC)n haplotypes in Japanese SBMA subjects than new mutations.
In the original report of (CAG)n expansion in SBMA, there was no evidence of allelic association with a nearby restriction endonuclease polymorphism in an ethnically diverse population (11). Later, an absence of linkage disequilibrium between SBMA and the (GGC)n repeat was reported in ethnically mixed subjects (24). In ethnically mixed control chromosomes, in contrast to Japanese ones, there are two main alleles at 16 and 17 copies of GGC, which amount to about 90% of all control chromosomes (24). Similar results were obtained in Caucasian normal subjects (12,13). This (GGC)n size distribution in normal Caucasian subjects could have masked linkage disequilibrium between SBMA and (GGC)n and may account for the different results.
Our results, demonstrating a founder effect for SBMA in a Japanese population, indicate that de novo pathological expansion is probably rarer than in other triplet repeat diseases, despite instability in normal and mutant chromosomes.
Materials and Methods
The Japanese control and SBMA subjects were drawn from widely dispersed geographical areas of Japan.
To determine the GGC repeat sizes by polymerase chain reaction (PCR), we used previously described methods (12,13) with some modifications: the reaction volumes were 10 µl containing ∼150 ng of purified genomic DNA, 0.4 µM of each fluorescein-labeled primer flanking the GGC repeat in exon 1 of the AR gene (5′-ACACTCTCTTCACAGCCGA-3′ and 5′-ACTGGGATAGGGCACTCTGCT-3′), 1 × PCR buffer [50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5 mM MgCl2], 200 µM dATP, dCTP and dTTP, 50 µM dGTP, 150 µM 7-deaza dGTP and 0.1 U of Taq polymerase (Takara, Japan). The Taq polymerase was added to the PCR reactions after the genomic DNA had been denatured at 95°C for 5 min; PCR conditions were 35 cycles of denaturation at 95°C for 1 min, annealing at 60°C for 1 min, and elongation at 72°C for 1 min, with a final extension of 7 min. The PCR products were analyzed by electrophoresis in 6% Hydro-Link Long Ranger gels (AT Biochem, PA, USA) with an autoread sequencer (ALFred, Pharmacia, Sweden). Their sizes were determined by comparison to M13 DNA dideoxy sequencing ladders. The size of (CAG)n was determined in SBMA subjects and controls as previously described (25).
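As an aid to setting up the reaction described above, the following sketch converts the stated final concentrations into absolute amounts per 10 µl reaction and adds up the programmed cycling time. It assumes that "200 µM dATP, dCTP and dTTP" means 200 µM of each nucleotide and that ramp times between temperature steps are ignored; the variable and function names are hypothetical.

```python
REACTION_VOLUME_UL = 10.0

# Final concentrations in µM as listed in the protocol.
# ASSUMPTION: the 200 µM figure applies to each of dATP, dCTP and dTTP.
concentrations_um = {
    "each primer": 0.4,
    "dATP": 200.0,
    "dCTP": 200.0,
    "dTTP": 200.0,
    "dGTP": 50.0,
    "7-deaza dGTP": 150.0,
}


def pmol_in_reaction(conc_um, volume_ul):
    """1 µM equals 1 pmol per µl, so µM times µl gives pmol."""
    return conc_um * volume_ul


amounts_pmol = {
    name: pmol_in_reaction(conc, REACTION_VOLUME_UL)
    for name, conc in concentrations_um.items()
}

# Share of the guanine nucleotide pool supplied as the 7-deaza analog.
deaza_fraction = concentrations_um["7-deaza dGTP"] / (
    concentrations_um["7-deaza dGTP"] + concentrations_um["dGTP"]
)

# Programmed time in minutes: initial denaturation, 35 three-step cycles
# of 1 min each, and the final extension (ramp times not included).
total_minutes = 5 + 35 * (1 + 1 + 1) + 7

for name, pmol in amounts_pmol.items():
    print(f"{name}: {pmol:g} pmol")
print(f"7-deaza dGTP share of G pool: {deaza_fraction:.0%}")
print(f"Programmed cycling time: {total_minutes} min")
```

The guanine pool (dGTP plus 7-deaza dGTP) sums to the same 200 µM as the other nucleotides, with three quarters supplied as the analog, a substitution commonly used to help amplify GC-rich templates such as the (GGC)n tract.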
The size of (GGC)n was analyzed for association with the SBMA mutation using a χ2 test. In addition, to examine the association between CAG and GGC size, the Kruskal-Wallis test was used among controls and the Mann-Whitney U test was used among SBMA subjects. The variance of the CAG repeat size distribution among controls was analyzed using an F test.
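The two-group comparison used for SBMA subjects can be sketched as follows. This is a minimal illustration of the Mann-Whitney U statistic, not the software used in the study. The CAG sizes below are hypothetical toy values invented for the example, and the P value uses the normal approximation without tie or continuity correction, which is crude for samples this small.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided P) for two independent samples.

    U is the smaller of the two U statistics. P comes from the normal
    approximation without tie or continuity correction.
    """
    n1, n2 = len(x), len(y)
    # Count pairs in which the x value exceeds the y value; ties count one half.
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# HYPOTHETICAL CAG repeat sizes, grouped by the (GGC)n allele on the same
# chromosome. These numbers are not data from the study.
cag_with_ggc16 = [46, 47, 48, 45, 50]
cag_with_ggc17 = [47, 49, 44, 46]

u, p = mann_whitney_u(cag_with_ggc16, cag_with_ggc17)
print(f"U = {u}, P = {p:.2f}")
```

A large P value in such a comparison, as reported for the SBMA chromosomes, means the CAG sizes on (GGC)16 and (GGC)17 backgrounds cannot be distinguished. With more than two GGC groups, as among controls, the Kruskal-Wallis test plays the same role.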
Part of this work was supported by grants from the Ministry of Welfare and Health of Japan, the Uehara Memorial Research Foundation, the Muscular Dystrophy Association and the National Institutes of Health.