We report the identification of a partial duplication of GABRA5, a gene within the imprinted 15q11–q13 region. The duplicated locus maps to the pericentromeric region of 15q, proximal to the large deletions associated with Angelman and Prader-Willi syndromes. We also observed variation in the number of copies of this locus in different individuals, indicating that the duplication is part of a variable repeat. Investigation of the duplication in individuals with a normal karyotype revealed between one and four copies of the repeat on each chromosome 15, whereas from eight to 20 copies were found in individuals possessing a cytogenetically detectable elongation of the 15q region. The variable region is roughly 1 Mb in size and contains two other non-processed duplications, the immunoglobulin heavy chain (IgH) D segment gene and the neurofibromatosis type 1 (NF1) gene. One unit of the pericentromeric repeat is thus composed of duplications of genes from different chromosomal regions. Moreover, we have found replication asynchrony across the GABRA5 duplication, suggesting for the first time that the imprinted part of chromosome 15q extends proximal to the region commonly deleted in Angelman and Prader-Willi syndromes.
Chromosome 15q11–q13 is an imprinted region of the human genome and is characterized by allele-specific transcription, DNA replication asynchrony and allelic methylation. The extent of the imprinted region is defined by the large cytogenetic deletions associated with the oppositely imprinted disorders, Prader-Willi syndrome (PWS) and Angelman syndrome (AS) (1,2). In patients with large deletions, the deletions are remarkably homogeneous in size, with a distal breakpoint lying within a yeast artificial chromosome (YAC) across D15S12 (IR10), and a proximal breakpoint lying either just distal, or just proximal to D15S18/542 (2–4). In addition to these large interstitial deletions, there have been descriptions of interstitial duplications and inverted duplications involving material from proximal 15q. Generally, duplications of the Angelman/Prader-Willi syndrome region (APWR) can be associated with normal or abnormal phenotypes (5–8), while duplications involving the region of chromosome 15 proximal to D15S18 do not seem to correlate with an abnormal phenotype (9–14). However, a precise phenotype-genotype correlation has been difficult to establish, owing to differences in the size and parental origin of the duplicated region, and to the poor characterization of these rearrangements due to a paucity of probes proximal to D15S18 (IR39).
Our laboratory has previously reported the mapping of a cluster of GABAA receptor subunit genes (GABRB3, GABRA5 and GABRG3) to the imprinted region on chromosome 15 (15–18). The data concerning the imprinted expression of the 15q11–q13 GABAA receptors are contradictory (19,20; C. Rougeulle and M. Lalande, submitted). While it is not clear whether these genes show allele-specific expression, as might be expected for imprinted genes, the chromosomal segment containing the GABAA receptor genes displays DNA replication asynchrony, a property of imprinted regions (21–23).
In the present report, a duplication of the GABRA5 gene has been identified and mapped to proximal 15q. The duplication lies outside the large region most frequently deleted in AS and PWS individuals and proximal to D15S18, and may be the most proximal marker on 15q yet reported. We found that the GABRA5 duplication is contained within a large, low level repeat spanning ∼1 Mb. The repeat contains two other non-processed gene duplications, is variable in copy number and is greatly amplified on chromosomes with an elongation of the proximal 15q region. Finally, as observed in other imprinted regions, the GABRA5 duplication, within the large repeat, shows asynchronous replication, suggesting that the imprinted region of 15q extends proximal to the region commonly deleted in PWS and AS.
Identification of a duplication of the GABRA5 5′ region
During the process of constructing a detailed physical map of GABRA5 (18), we observed hybridization to a second site in genomic DNA when using a probe across exon 1B of GABRA5. Preliminary mapping of the duplication to proximal chromosome 15 was performed using somatic cell hybrids (data not shown). A 7.5 kb HindIII genomic clone encompassing the duplicated region was subsequently isolated from a size-selected bacteriophage library using a probe from GABRA5 exon 1B. Nucleotide sequencing of the clone revealed a 5.0 kb duplication (gb:AF061786, nucleotides 178–5269) of the GABRA5 5′-untranslated region (5′-UTR) (gb:AF061785, nucleotides 1324–6899), including the promoter region, the three alternatively spliced first exons (1A, 1B and 1C) and exon 2 (Fig. 1c) (24). Homology between GABRA5 and its duplication was relatively low, with the exons showing slightly higher homology than the introns, suggesting that the duplication had not arisen recently in evolution (Table 1). At relatively low stringency washing (0.5× SSC/0.1% SDS, 65°C), probes from either of the two loci detect both loci, although, at higher stringency (0.1× SSC/0.1% SDS, 65°C), the same probes detect only their own locus (data not shown).
The duplicated region is flanked by Alu elements (gb:AF061786, nucleotides 178–396 and 5281–5527) and contains a (CA)n microsatellite which lies just 5′ of the GABRA5 locus (Fig. 1c). The Alus at this locus are all arranged in a head-to-tail orientation (Fig. 1c), and belong to old Alu sub-families: Sb, or Sc in the case of the middle Alu. However, each Alu also contains sequence differences that do not appear to be specific to any particular Alu family, which may reflect random mutation at this locus. No direct repeats, which might indicate recent retrotransposition, were found at any of the Alus in the duplicated region.
GABRA5 duplication maps proximal to AS/PWS deletions of chromosome 15q
The location of the new locus was determined by fluorescence in situ hybridization (FISH) using the P1 clones (π765, π766, π767) or bacteriophage clones (λ884/5, λ954) from the region (Fig. 1b). Co-hybridization of probe π766 (green) and D15S18 (red) to normal metaphase chromosomes confirmed our localization to chromosome 15 by somatic cell hybrid analysis and also showed that it maps proximal to D15S18 (Fig. 2a). The localization to proximal 15q was refined by hybridization of a bacteriophage λ954 (green) spanning the GABRA5 duplication and chromosome 15 centromeric probe D15Z1 (red) to metaphase chromosomes from cells from a class I deletion AS individual (WJK36) (Fig. 2b) who displays a heterozygous deletion of D15S18 and the more distal markers in 15q11–q13 (3). A signal from probe λ954 (green) is detected on each chromosome 15 homolog (Fig. 2b). Thus, not only does the duplicated region lie proximal to D15S18, but it also lies proximal to the typical deletions associated with AS and PWS (Fig. 1a).
The GABRA5 duplication is part of a variable repeat
In the course of characterizing the GABRA5 duplication, we observed apparent differences in the copy number of the duplication on different chromosomes. First, PCR across the duplicated (CA)n repeat yields a very complex banding pattern suggesting the presence of multiple loci (25). Second, P1, bacteriophage and plasmid probes from across the duplicated region gave signals of different intensity or size, or gave several signals, when used in FISH analysis on chromosomes from different individuals (Fig. 2a and b). To investigate this possible variation in the copy number of the duplication, we performed FISH analysis on released chromatin from nuclei from 10 cytogenetically normal individuals. We found that the number of repeats ranged from one to four in these individuals. For example, probe λ885 (green) detects two signals on each chromosome 15 (red) in individual B (Fig. 2c). In addition, FISH using probes λ885 (green) and λ954 (red) on interphase nuclei allowed determination of the number of repeats. Each repeat is identified by a green and a red signal or, when the signals overlap, a yellow signal. In individual C, two repeats are seen on one chromosome and four repeats on the other (Fig. 2d). The number of repeats was then analyzed in individuals who possess a cytogenetic elongation of the proximal 15q region, and we found that these individuals possessed a larger than normal number of repeats. For example, using probe λ885 (green), a total of 10 signals, nine on one chromosome 15 (red) and one on the other (Fig. 2e), were detected in individual A, who possesses a cytogenetic elongation of proximal 15q. In all individuals analyzed, probes λ954 and λ884/885 gave identical hybridization signals.
Characterizing the contents of the variable repeat
Two other duplications were also mapped to within this repeat. While performing exon trapping with P1 clones isolated using the GABRA5 duplication, we found that clone π765 contained exon 14 of the neurofibromatosis (NF1) gene duplication which had been mapped previously to proximal 15q (26–29). Subsequent PCR analysis of clone π765 using primers from exons 13, 15 and 21 confirmed the presence of these exons, and of introns 13 and 14, suggesting that one of the NF1 non-processed pseudogenes was contained in π765 (Fig. 1b). An IgH D segment gene duplication had also been mapped previously to proximal 15q (30,31), and experiments were performed to determine whether this duplication was also contained within the repeat. We obtained P1s (π923, π924 and π925) across the D segment duplication and confirmed their localization to proximal chromosome 15q by FISH. Subsequent FISH analysis with these probes in the individuals already studied gave hybridization patterns identical to those seen with probes across the GABRA5 duplication, indicating that the IgH D segment is also contained within the repeated block. Hybridization of π925 to metaphase chromosomes from individual A shows a cluster of signals on the elongated chromosome 15q, with only one signal on the other homolog (Fig. 2f). Thus, the non-processed IgH D segment duplication is also part of the repeat region and exhibits variable copy number (Fig. 2e and f). The size of the repeat is estimated to be ∼1 Mb, since the extent of the proximal elongation in individual A is approximately equal to one cytogenetic band on metaphase chromosomes at 300-band resolution, or the equivalent of ∼10 Mb (M.-G. Mattei, unpublished data), and this individual is thought to possess nine repeats on this chromosome. The large size of the repeat is also suggested by the observation that the GABRA5/NF1 P1 contig is distinct from the contig across the IgH duplication, and that multiple signals could be distinguished in our FISH analysis.
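The size estimate above is simple arithmetic, and a minimal Python sketch can make its assumption explicit: the ∼10 Mb elongation is attributed entirely to the nine repeat copies thought to lie on that chromosome. The function name and its arguments are illustrative only and are not part of the original analysis.

```python
def estimate_repeat_unit_mb(elongation_mb: float, repeat_copies: int) -> float:
    """Approximate size of one repeat unit in Mb.

    Assumes (as a simplification) that the whole cytogenetic elongation
    is made up of the counted repeat copies.
    """
    if repeat_copies <= 0:
        raise ValueError("repeat_copies must be a positive integer")
    return elongation_mb / repeat_copies


# Values taken from the text: ~10 Mb elongation, nine repeats (individual A).
unit_size = estimate_repeat_unit_mb(10.0, 9)
print(f"Estimated repeat unit: {unit_size:.2f} Mb")
```

Under these assumptions the estimate comes out slightly above 1 Mb, in line with the rounded figure quoted in the text; both inputs are themselves approximate.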
The repeat region is amplified in an individual with a proximal 15q amplification
To determine whether the repeat region could account for the cytogenetic amplification of the proximal region of chromosome 15q seen in some individuals (9,10,32), FISH analysis using probe λ885 (green) was performed on individual D, who is known to possess an amplification of the region proximal to D15S18 (32). Figure 3a clearly shows that multiple copies of the repeat region are present on one of the metaphase chromosomes 15 from individual D. FISH analysis using probe π925 from the IgH region on interphase chromosomes from this individual also shows multiple copies of the repeat region. Analysis of >100 nuclei in the G1–G2 stage of the cell cycle makes it possible to determine the approximate number of copies of the duplicated region (Fig. 3b). In the G2 nucleus in Figure 3b, ∼40 copies of the repeat region can be seen on one replicated chromosome, suggesting the presence of 20 copies on that chromosome. Probe λ954 (green) hybridized to a G1 nucleus from the same individual detects ∼20 copies of the GABRA5 repeat on one chromosome and one on the other (Fig. 3c). However, the precise number of copies becomes more difficult to determine when large numbers of repeats are present, owing to the three-dimensional conformation of the nucleus.
In addition to our FISH analysis, variability in the copy number of the repeat could be detected by Southern blotting using probes from the duplicated region. Co-hybridization of BglII-digested DNA with probe p834 from the GABRA5 duplication and probe 438S/P from the GABRA5 gene detected a 5.0 and a 3.9 kb fragment, respectively. The ratio of intensity of the 5.0 kb band to the 3.9 kb band clearly varies in different individuals (Fig. 3d). The different sizes and nucleotide compositions of the probes, and hence their different hybridization efficiencies, make it difficult to determine the exact number of copies of the duplicated locus from these ratios. Densitometric analysis using the PhosphorImager confirmed the variation in the duplication:GABRA5 hybridization band ratio. Individuals F, G and H have ratios of 3:1, 5:1 and 6:1, respectively (Fig. 3d), while individual D and her brother (individual E), who possess the same cytogenetic amplification, were found to have a 14:1 and a 21:1 ratio of dup:GABRA5, respectively. We believe that the difference in ratio between individuals D and E is due to slight degradation of the DNA from individual D, although it is possible that the difference is due to instability of the repeats. Nevertheless, the Southern blot hybridization results using a probe across the GABRA5 duplication confirm the variation in repeat number seen in the FISH analysis with the IgH D segment probe and with probes from the GABRA5 duplication and adjacent region.
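Because the two probes hybridize with different efficiencies, the dup:GABRA5 ratios are most safely read relative to one another rather than as absolute copy numbers. The following Python sketch illustrates one such comparison, scaling the ratios reported in the text to the lowest ratio observed. The helper name and the choice of normalization are assumptions made for illustration and do not reproduce the densitometric software used in the study.

```python
def relative_to_lowest(ratios: dict[str, float]) -> dict[str, float]:
    """Scale each dup:GABRA5 band ratio to the smallest ratio in the set.

    The result is a relative signal, not a copy number, because probe
    hybridization efficiencies differ.
    """
    if not ratios:
        raise ValueError("no ratios supplied")
    lowest = min(ratios.values())
    if lowest <= 0:
        raise ValueError("ratios must be positive")
    return {name: value / lowest for name, value in ratios.items()}


# dup:GABRA5 ratios as reported in the text (Fig. 3d).
reported = {"F": 3.0, "G": 5.0, "H": 6.0, "D": 14.0, "E": 21.0}

for individual, fold in relative_to_lowest(reported).items():
    print(f"Individual {individual}: {fold:.1f}x the lowest ratio")
```

Scaling to the lowest ratio is only one possible convention; an independent single-copy control band would be a more robust reference if one were available.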
The GABRA5 duplication shows asynchronous replication
As GABRA5 has been shown previously to display allele-specific DNA replication (22,23), we investigated whether the GABRA5 duplication, or indeed any of the large repeat region, also displays asynchronous replication. An individual who possesses just one copy of the GABRA5 duplication on each chromosome (individual I) was identified. Using probe λ885 (Figure 4) in the FISH replication assay (33), singlet-singlet, singlet-doublet and doublet-doublet hybridization patterns were observed in 32, 36 and 32% of the nuclei, respectively. A singlet-doublet hybridization pattern is indicative of DNA replication asynchrony and is detected in >20% of nuclei in imprinted regions (21–23). Thus, the high fraction (36%) of nuclei displaying the singlet-doublet pattern using the λ885 probe suggests that the repeat region displays at least one characteristic of imprinted regions. Identical results were seen with probe λ954 (data not shown), revealing that both the GABRA5 duplication and the adjacent region show asynchronous replication. Replication asynchrony was also seen in an individual who possessed two repeats on each chromosome, although the hybridization pattern was much more difficult to interpret (data not shown).
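The scoring logic of the FISH replication assay can be sketched in a few lines of Python. The counts below are hypothetical: they are chosen only so that they reproduce the reported 32, 36 and 32% in a sample of 300 nuclei, and the raw counts are not given in the text. The 20% threshold is the figure quoted above for imprinted regions; the function names are illustrative.

```python
def pattern_percentages(ss: int, sd: int, dd: int) -> dict[str, float]:
    """Percentage of nuclei showing singlet-singlet (SS),
    singlet-doublet (SD) and doublet-doublet (DD) FISH patterns."""
    total = ss + sd + dd
    if total == 0:
        raise ValueError("no nuclei scored")
    return {
        "SS": 100 * ss / total,
        "SD": 100 * sd / total,
        "DD": 100 * dd / total,
    }


def shows_asynchrony(sd_percent: float, threshold: float = 20.0) -> bool:
    """True when the SD fraction exceeds the threshold used in the text
    (>20% of nuclei) as indicative of replication asynchrony."""
    return sd_percent > threshold


# Hypothetical counts consistent with the reported percentages.
percentages = pattern_percentages(ss=96, sd=108, dd=96)
print(percentages)
print("Asynchronous:", shows_asynchrony(percentages["SD"]))
```

A fixed threshold is a simplification; in practice the SD fraction would be compared with that of a known biallelically replicating control locus scored in the same experiment.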
We have identified a partial duplication of the GABRA5 receptor subunit gene including the promoter region and the alternatively spliced first exons (1A, 1B and 1C) and exon 2. The duplication shows 75–88% homology to the GABRA5 gene and is flanked by Alu repeats although we do not have evidence that the Alus were involved in the duplication event. The duplication is part of a large repeat on proximal 15q that varies in number in individuals with a normal karyotype and is increased in number in individuals with a cytogenetic elongation of chromosome 15q. In addition to the GABRA5 duplication, we found the repeat to contain a partial duplication of the NF1 gene and the IgH D segment gene. These loci have been mapped previously to this region and we have refined their localization to this large variable repeat proximal to D15S18 (IR39) and proximal to the large deletions associated with AS and PWS. The presence of other loci within the variable repeat provides an explanation of previous reports suggesting multiple copies of the NF1 duplication (26–29) and the presence of multiple alleles of the D segment duplications (30,31) on chromosome 15q. The cytogenetic and molecular probes described herein are probably the most proximal described to date and will permit characterization of chromosomal rearrangements of the proximal 15q region.
It is unlikely that amplification of the proximal 15q region is associated with an abnormal phenotype. Individual A, who possesses nine repeats on one chromosome, shows slight mental retardation and suffers from epilepsy (M.-G. Mattei, unpublished data), whereas individual D, who inherited ∼20 repeats from her paternal grandfather, appears to be normal except for growth retardation and short stature. However, as both parents of individual D have short stature, the phenotype of individual D is probably unrelated to the cytogenetic amplification of the repeat (32; M.-G. Mattei, unpublished data). In addition, her brother, individual E, and father also possess the amplification and are considered phenotypically normal. Thus, repeat size alone does not correlate with an abnormal phenotype and, as individual A probably inherited her nine repeats from her father, paternal transmission of large repeats is not associated with a consistent phenotype. We have not yet observed maternal transmission of large repeats; however, there does not seem to be a correlation between maternal transmission of a large number of repeats and an abnormal phenotype in individuals with proximal 15q duplications (9,10,12). It is possible that large repeats of this region could contribute to an abnormal phenotype either by disrupting normal chromatin conformation, which may in turn affect gene expression or replication in the repeat or in regions further distal, or by hindering normal recombination of the 15q11–q13 region. However, this seems unlikely since allelic methylation at D15S63 and expression of the imprinted gene encoding small nuclear ribonucleoprotein peptide N (SNRPN) were normal in individuals A and D (32). Nevertheless, more detailed characterization of the repeat region in normal individuals is now possible by Southern blot or FISH analysis, and should help clarify any phenotype-repeat number association.
Our observation of asynchronous replication in the proximal 15q region (Fig. 4) suggests that the 15q11–q13 imprinted region extends more proximal than currently thought. We have preliminary evidence from FISH replication assays that the maternal allele replicates earlier than the paternal allele, and we currently are confirming this in other individuals. Indeed, this result is consistent with our observations that the large paternally derived repeat in individual D is late replicating as determined by BrdU replication banding (32). It is possible that the maternal earlier than paternal replication pattern within the repeat is due to duplication of the GABRA5 gene which displays maternal earlier than paternal replication (22,23). Alternatively, it may be that as a result of the duplication event, the replication of the 5′-UTR of GABRA5 changed to maternal earlier than paternal from the predominant paternal earlier replication pattern in the remainder of 15q11–q13 (22,23). Confirmation that this proximal region of 15q is imprinted requires further study of the allele-specific transcription of genes and allelic methylation in the region. In this regard, our preliminary data suggest that a few sites within the GABRA5 duplication display allele-specific differences in methylation.
Recently two large repeat regions have been found which contain full-length, apparently functional genes (34,35). The 1 Mb repeat described here is large enough to contain functional genes, and previous reports suggest that both the NF1 and D segment duplications are transcribed. The NF1 duplication described on chromosome 15 was actually identified by sequencing and mapping an RT-PCR product across the GAP domain (exons 20–27b) of the NF1 gene (26), and the D segments in the D5-b cluster also have open reading frames, and apparently functional recombination signals (38). Although only the 5′-flanking region of GABRA5 is duplicated, it is possible that the promoter is active and affects the transcription of nearby sequences. We currently are searching the 1 Mb region for transcribed sequences in order to investigate whether these show allele-specific expression.
Finally, the identification and characterization of this repeat is interesting in light of several recent reports describing other large repeats within the genome (34,35). Gene duplications and pseudogenes are not uncommon; however, this is the first example where a cluster of duplications has become part of a large polymorphic tandem repeat. Based on sequence homologies and FISH on primate nuclei, the NF1 and IgH D segment duplications are thought to have arisen 16–30 Mya (28,29,31). Similarly, despite low homology to the original locus, the GABRA5 duplication probably occurred about the same time (R.J. Ritchie et al., in preparation). However, sequence data suggest that the large tandemly repeated region encompassing these duplications probably arose many millions of years after the duplication events. Based on the sequence divergence between the NF1 copies on chromosome 15 (28,29) and between the sequences within our P1 contig (data not shown), the region probably evolved into a repeat 1–2 Mya, around the time when the subtelomeric repeats described by Trask (35) are thought to have arisen. However, unlike the subtelomeric repeats, which are variably present in several different subtelomeric regions (35), pericentromeric repeats like those described here on 15q probably arose by a distinct mechanism.
It is not known how the gene duplications accumulated on this region of proximal 15q and then developed into a large variable repeat. It has been suggested that duplications may be targeted to the pericentromeric regions of chromosomes, and spread amongst these chromosomal regions (36,37). However, such duplications, with the possible exception of the NF1 duplication on chromosome 14q (28), do not appear to be associated with a repeat like that described here. Alternatively, perhaps the duplication event, or the subsequent repeat generation, is related to asynchronous replication. The repeat described here contains a duplication of the asynchronously replicating GABRA5 5′-UTR, and the repeat described by Trask (35) contains members of the olfactory receptor gene family, which also show asynchronous replication, where the maternal allele replicates earlier than the paternal (39). Furthermore, we have evidence that the IgH gene on chromosome 14 shows a similar replication pattern (M.-G. Mattei, unpublished data). Perhaps maternal early replicating regions are predisposed to duplication, and subsequent amplification. Alternatively, the asynchronous replication observed may arise in response to the duplication event. In this regard, it will be interesting to determine the replication timing of genes contained within such repeated blocks of DNA at other locations in the genome.
Materials and Methods
Individual A is a 25-year-old female who presented with a sub-normal phenotype with very slight mental retardation and seizures. One of her chromosomes 15 showed an excess of material in the proximal 15q region. Individuals B and I are normal females; individual C is male. Individual D, a 15-year-old female with short stature and a very long 15q+ chromosome, has been described elsewhere (32). Individual E is the normal brother of individual D. Individuals E and F are phenotypically normal and each possesses a supernumerary inv dup(15) which does not include the APWR. Individual G is a male with AS of unknown etiology. Individual H is a normal male.
Cytogenetics/Fluorescence in situ hybridization
Primary T lymphocytes were isolated from peripheral blood mononuclear cells (PBMC) which had been isolated by density centrifugation on Ficoll-Hypaque (Pharmacia). Lymphocytes were stimulated with 2 mg/ml PHA.P (Wellcome Diagnostics) at a concentration of 10^6/ml in media containing 10% pooled human serum (Sigma). Following 48 h in culture, interleukin-2 (IL-2; Human T Stim, Collaborative Research) was added at a 5% final concentration, and cells were restimulated with PHA.P and 10^6/ml irradiated allogeneic PBMC in the media described above. Cells were swollen in hypotonic solution and fixed with methanol:acetic acid (3:1) as described previously. Released chromatin was prepared from methanol/acetic acid-fixed cells with 0.05 M NaOH in 30% ethanol as described by Senger et al. (40). Metaphase chromosomes from all the patients were banded with RHG and CBG staining techniques, and 30 metaphases were analyzed for each of them. The replication timing analysis was performed on a total of 300 nuclei in three different experiments.
P1 clones (1 µg) were labeled with biotin-16-dUTP or digoxigenin-11-dUTP (Boehringer) by nick translation using standard protocols. Labeled DNA was hybridized to chromosomes as prepared above. Biotin-labeled probes were detected with fluorescein isothiocyanate (FITC)-avidin DCS (Vector Laboratories), and digoxigenin-labeled probes were detected with rhodamine anti-digoxigenin (Boehringer). Nuclei were counter-stained in 100 ng/ml 4′,6-diamidino-2-phenylindole (DAPI) or 500 ng/ml propidium iodide (PI) in Vectashield mounting media (Vector Laboratories). Chromosomes were visualized using a Zeiss Axioplan 2 microscope and captured with a Photometrics ‘SenSys’ camera. Images were collected and merged using IP Lab Spectrum software.
Library construction and screening
Genomic DNA from an individual containing an inverted duplication of the proximal 15q region was digested with HindIII and ligated into the HindIII-digested ZapExpress vector (Stratagene), which had been treated with calf intestine alkaline phosphatase (Boehringer Mannheim), according to the manufacturer's instructions. Ligated genomic DNA was packaged using the Giga Pack Gold packaging system (Stratagene). The library was plated, transferred to nitrocellulose and hybridized with a probe across GABRA5 exon 1B. Hybridization was carried out at 65°C for 2 h using Express-Hyb. Filters were washed at 2× SSC/0.1% SDS at 55°C.
Sequencing was performed on a 370A (Applied Biosystems) automated sequencing machine and analyzed using the DNASIS and Sequencher programs.
Southern Blot Analysis
DNA was isolated from resting blood and lymphocytes using the Puregene kit. Alternatively, DNA was isolated by saline extraction using standard protocols. Genomic DNA was digested according to the manufacturer's instructions, electrophoresed on a 1% TBE gel for 18–20 h and transferred to Hybond N membrane using standard protocols. DNA probes were labeled with [32P]dCTP using the Mega Prime labeling kit (Amersham). Probe p834 is a PCR product of 1.1 kb from the duplicated region. Probe 438S/P is a 1 kb SacII-PvuII fragment from clone p438 across the GABRA5 locus. All hybridizations were carried out at 65°C in a rotating Hybaid oven using Express-Hyb solution (Clontech) for 1.5–2 h. Blots were washed at 0.1× SSC/0.1% SDS at 65°C (p834) and 0.1× SSC/0.1% SDS at 65°C (438S/P) and exposed to film or a phosphorimaging plate overnight or as needed.
The authors thank Heather Glatt and Paulena Lieski for their assistance, Claire Rougeulle for helpful discussion, and Drs P. Collingnon and A.M. Frances for referring their patients. This work is supported by the INSERM, Association pour la Recherche contre le Cancer, NIH grant R01-NS30628 and the Howard Hughes Medical Institute.