Regions containing palindromic sequence are known to be susceptible to genomic rearrangement in prokaryotes and eukaryotes. Palindromic AT-rich repeats (PATRR) are hypervariable in the human genome, manifesting size polymorphisms and a propensity to rearrange. Size variations are mainly the result of internal deletions, while two PATRRs on 11q23 and 22q11 (PATRR11 and 22) contribute to generation of the t(11;22), a recurrent constitutional translocation. In this study, we analyzed the PATRR11 sequence of numerous polymorphic alleles in detail. Various types of shorter variants are likely derived from the most frequent ∼450 bp PATRR11 by deletion. Deletion variants possess a significant number of identical nucleotides at their two endpoints, indicating the possible involvement of direct repeats within the PATRR11. Rare variants with insertional alterations involve AT-rich sequences of unknown origin. This is in contrast to palindrome-mediated translocations between PATRRs that manifest smaller deletions and only a limited number of identical nucleotides at the breakpoints. Further, we identified a rare translocation product that has a non-AT-rich insertion of a transcribed gene segment at the translocation breakpoint. Our data suggest that the outcomes of palindrome-mediated re-arrangements reflect distinct molecular pathways; intra-palindrome re-arrangements are possibly dictated by a replication slippage or microhomology-directed repair pathway, and inter-palindrome translocations are likely driven by non-homologous end joining.
Genomic regions containing palindromic sequence are known to be susceptible to double-strand breaks (DSBs), contributing to the generation of diverse genomic re-arrangements. This genomic instability of palindromic regions has been consistently demonstrated in many experimental organisms, such as bacteria, yeast and mice. In E. coli, a palindromic region is either partially or completely deleted, regardless of whether it is transfected as a plasmid (1,2) or introduced into the bacterial genome (3). It is formally possible that a DSB results from stalling of DNA synthesis in the replication fork because of intra-strand base pairing. Another important pathway results from endonuclease cleavage of DNA secondary structure by the SbcCD, orthologue of the Rad50/Mre11 nuclease complex in higher organisms (4). In Saccharomyces cerevisiae, it has been generally acknowledged that palindromic sequences induce meiotic or mitotic recombination by creating a hotspot for DSBs (5,6). In mammals, analysis of transgenic mice has demonstrated frequent deletions or insertions at artificially created palindromic regions, indicative of substantial meiotic and mitotic instability (7,8). The fact that Alu inverted repeats are underrepresented in human genome databases also provides indirect evidence of the susceptibility of palindromes to deletion in humans (9,10).
The best studied example of palindrome-induced genomic instability in humans is the constitutional t(11;22)(q23;q11), the only known recurrent non-Robertsonian translocation in humans (11). Although balanced translocation carriers have no clinical symptoms, they occasionally manifest male infertility or, in females, recurrent pregnancy loss. They are often identified after the birth of unbalanced offspring with the supernumerary der(22)t(11;22) syndrome, which was recently named Emanuel syndrome (MIM# 609029). The breakpoints of numerous unrelated t(11;22) cases have been consistently shown to be located within palindromic AT-rich repeats (PATRRs) on 11q23 and 22q11 (PATRR11 and PATRR22) (12–16). The majority of the breakpoints have been localized at the center of the PATRRs, suggesting that the center of the palindrome is susceptible to DSBs inducing illegitimate chromosomal re-arrangement. PATRR22 has been known to be a hotspot for translocation breakpoints (17). Recent findings of PATRR-like sequences at the translocation breakpoints of other chromosome 22 partner chromosomes support the conclusion that palindrome-mediated chromosomal translocation is one of the universal pathways for human genomic re-arrangements (18–22).
PATRRs in humans manifest another type of genomic instability: central modification. This is evidenced by the fact that the PATRRs exhibit size polymorphisms due to deletion or insertion near the center of the palindrome (14,23–25). One way to account for these central modifications is replication stalling, as is observed for palindrome resolution in model organisms. Alternatively, deletion formation could be explained by a DSB repair model similar to that proposed for translocations.
In this study, we performed an extensive investigation to examine the extent of polymorphic sequence variation of the PATRR11 in normal healthy individuals. We compared the breakpoints of intrachromosomal re-arrangements (deletion/insertion) with those of interchromosomal re-arrangements (translocation). Our results imply that different mechanisms govern these two types of genomic instability that are induced by the PATRR.
Polymorphism in the PATRR11
A total of 396 individuals (792 chromosomes) were examined to determine their PATRR11 sequence. As reported previously, the PATRR11 was shown to be hypervariable in size among individual chromosomes (Fig. 1A) (25). Sequence analysis shows that the most frequent allele (88.0%) is a ∼450 bp PATRR11 (L-PATRR11), that comprises a nearly perfect palindrome although it includes minor nucleotide variations (23). Various types of short variants were also identified (S-PATRR11). These variations seem to occur primarily by deletion, near the symmetric center of the palindromic structure, and are likely to be derived from the typical longer version. We classified the S-PATRR11s into four subgroups according to the previous report (Fig. 1B) (25). The most frequent variant, Type1 S-PATRR11 (S1-PATRR11), consists of a 350 bp nearly perfect palindrome. It has a 50 bp deletion at both of the palindromic arms, but still remains completely symmetrical including an intact palindromic center. Since the sequence, including the deletion endpoints, is completely identical among multiple individual alleles, the divergence of this type from the L-PATRR11 appears to be old and stably transmitted.
Type2 and Type3 S-PATRR11s (S2-, S3-PATRR11s) are simple deletions that include the palindromic center. S2-PATRR11 has an asymmetric deletion at its center, but the new center manifests a quasi-symmetric palindrome. S3-PATRR11 does not retain palindromic features by virtue of an asymmetric deletion at the center of the original palindrome. A total of 10 types of S2- and S3-PATRR11 were identified, ranging in size from 130 to 410 bp (Fig. 2). It can be assumed that all of the minor alleles of the PATRR11 were independently derived from the most frequent L-PATRR11 during the course of human genome evolution. It appears that once it is partially deleted, the PATRR usually manifests a short asymmetric structure that rarely acts as a substrate for further deletion, since asymmetric deletion stabilizes the palindromic region (25).
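The distinction between symmetric, quasi-symmetric and asymmetric alleles can be illustrated with a minimal sketch. It assumes that palindromic symmetry is scored by comparing a sequence, position by position, with its own reverse complement. The function names and the toy sequences are hypothetical and are not taken from the PATRR11 alleles analyzed here. A position-wise score is also only a crude measure, because a single-base insertion or deletion shifts every downstream position.

```python
# Hypothetical helpers for scoring palindromic symmetry and AT-richness.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def palindrome_identity(seq):
    """Fraction of positions at which seq equals its reverse complement.

    A perfect DNA palindrome scores 1.0.
    """
    rc = reverse_complement(seq)
    matches = sum(a == b for a, b in zip(seq, rc))
    return matches / len(seq)


def at_content(seq):
    """Fraction of A and T bases in seq."""
    return (seq.count("A") + seq.count("T")) / len(seq)


if __name__ == "__main__":
    perfect = "AATTAATT"      # toy perfect palindrome (assumption)
    asymmetric = "AATTAATTG"  # toy sequence with one extra base at one end
    for name, seq in (("perfect", perfect), ("asymmetric", asymmetric)):
        print(name, palindrome_identity(seq), at_content(seq))
```

For real alleles, an alignment-based comparison of the two arms would probably be more informative than this position-wise score.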
We analyzed the breakpoints of these various PATRR11 deletions in detail by comparing the sequence of the short variants with that of the putative original version, L-PATRR11. There are 13 identical nucleotides at both endpoints of the deletion in the S1-PATRR11. All other deletion variants also possess a significant number of identical nucleotides (<14 nucleotides) at the breakpoints (Fig. 2). This finding suggests that the deletions occur using direct repeats within the PATRR.
We identified six Type4 S-PATRR11s (S4-PATRR11), which had sustained an asymmetric central deletion followed by an insertion (Fig. 1B). All of the inserted sequence was found to consist of an unknown AT-rich segment that does not usually appear in the PATRR11 (Fig. 2). To further investigate the origin of the unknown sequence, we examined the sequences of the PATRR11 syntenic region in several primates. However, we could not identify any sequences similar to the AT-rich insertion (Fig. 1C). Thus, the inserted AT-rich sequence may originate from elsewhere in the human genome, although we could not identify it in the human genome database. We also identified a rare allele with a duplication of the proximal palindromic arm (EL-PATRR11). The EL-PATRR11 is 603 bp in size, longer than the typical 450 bp L-PATRR11, and has an asymmetric structure. The EL-PATRR11 appears to have been generated by a duplication of the proximal arm of the PATRR11 with an AT-rich insertion at the junction (Fig. 2).
Analysis of de novo t(11;22) breakpoints
We previously examined junctions of PATRR-mediated translocations in t(11;22) balanced carriers (14). That initial analysis had intrinsic limitations with respect to accuracy because we did not have the original PATRR11 and PATRR22 sequences that the translocations were derived from. On the other hand, translocation-specific PCR detects a high frequency of de novo t(11;22)s in normal sperm samples from healthy males with a normal karyotype (26). Thus, we can obtain numerous translocation products generated independently from known chromosome 11 and chromosome 22 parental sequences. Using this technique, we characterized junction fragment sequences of de novo translocations derived from a single male homozygote for the typical L-PATRR11 to compare with the original PATRR11 and PATRR22 sequences. Each of 29 translocations appeared unique. Junction fragments often possess small deletions (<50 bp) at the breakpoints on both chromosomes as well as an occasional insertion of a single nucleotide at the junction (Fig. 4A). Consistent with previous sequence analysis of transmitted translocation products from balanced carriers, only a small number of identical nucleotides are found at the breakpoint junctions (Fig. 3).
We compared the estimated extent of microhomology between PATRR11 deletion and translocation products. The extent of microhomology at deletion endpoints was significantly greater than that for translocations (P = 0.0007) (Fig. 4B). To exclude the possibility that target sequence differences, rather than the molecular mechanism, might affect variation in the extent of microhomology, we established a random deletion or translocation model. We simulated translocations and deletions in silico on the assumption that they occur via blunt end ligation, and then compared the length of false microhomology between deletions and translocations (14). No apparent difference was observed between the in silico deletion (PATRR11 versus PATRR11) and the in silico translocation (PATRR11 versus PATRR22) (Fig. 4C). The data suggest that variation in target sequence did not contribute to the results of the microhomology analysis when comparing between the deletions and translocations. Then, we compared the extent of false microhomology in the simulation with the actual length of identical sequence. The extent of microhomology for translocation agreed with the results of blunt-end simulation, whereas the extent of microhomology for the deletion was greater than that of the simulation (Fig. 4D and E).
In the process of characterization of de novo translocations derived from an L-PATRR11 homozygote, we identified one atypical large PCR product in 112 de novo der(11)-specific products. Sequence analysis revealed that the translocation junction sustained an insertion of unknown sequence between the proximal arm of the PATRR11 and the distal arm of the PATRR22. The inserted 140 bp sequence was found to be part of the cAMP-specific phosphodiesterase 4A gene (PDE4A). The first 10 bp corresponds to intron 9, while the remainder is exon 10. Target site duplication, which is often seen in retrotransposition, was not observed. The observed microhomology between the insertion and the target sequence was only 1 bp at both ends.
In this study, we analyzed the endpoints of deletion polymorphisms and the breakpoints of translocations mediated by the PATRR11 in humans. Estimation of the extent of microhomology participating in the generation of such genomic rearrangements poses several inherent problems. Since a deletion or translocation mediated by PATRR11 and PATRR22 is innocuous to the individual who harbors the re-arrangement, such changes are likely to have been transmitted for many years. Thus, it is possible to misinterpret the location of the breakpoints, because (1) the rearranged junction fragments cannot be directly compared with the original sequence, for which information is lacking, and (2) other mutations due to genomic instability may have accumulated. By using de novo translocation products in sperm samples obtained from non-translocation carriers, we could circumvent these limitations. This allowed us to accurately analyze the translocation breakpoint by comparing it to its original sequence configuration. However, we could not apply this approach to the analysis of deletion endpoints, since detection of de novo deletion by PCR is technically difficult in the presence of a large amount of non-deleted PATRR. Further, de novo deletion within the PATRR is too rare to detect by a method similar to translocation-specific PCR (H.K., unpublished data).
Assuming that PATRRs are hotspots for DSBs, both deletions and translocations can be the products of DSB repair. PATRR-mediated translocation arises when DSBs are present at two PATRRs located on different chromosomes. When a DSB is present only in one of the PATRRs, it may be repaired within the PATRR, possibly leading to deletion. In this context, both PATRR-mediated deletions and translocations would arise through a similar mechanism for DSB repair. A similar situation is observed when one or two DSB(s), which are introduced by I-SceI in mammalian genomes, are repaired either by non-homologous end joining (NHEJ) or microhomology-mediated end joining (MMEJ) (27–29). However, our current breakpoint analysis data indicate that deletions and translocations differ in three crucial aspects regarding repair pathways: the extent of microhomology, the size of the deletion and the character of any inserted sequences. These data imply that different mechanisms govern PATRR-mediated deletions and translocations.
With respect to translocations, we observed only a few identical nucleotides at the point where the original PATRR11 and PATRR22 sequences were joined. Although there is only a weak homology between PATRR11 and PATRR22 (14), they are so highly AT-rich that the PATRRs include identical segments long enough to be used for repair in a homology-directed manner. However, the translocations do not utilize these homologous regions, suggesting that NHEJ is the molecular pathway that is selectively used in generating this translocation (14,30–32). The actual number of identical nucleotides parallels that of the NHEJ simulation, and is also consistent with the fact that NHEJ generally utilizes less than four nucleotides (29). It is likely that, in the few cases with more than four nucleotides of putative microhomology, several nucleotides may represent false microhomology due to the AT-richness of the original sequences. Translocation is often accompanied by small deletions (<50 bp) and occasionally small insertions, which are also in agreement with repair by NHEJ.
One exceptional translocation junction is that with the PDE4A gene insertion at the translocation breakpoint. Since a target site duplication typically observed in cases with retrotransposition is not present, it is likely that the insertion occurred concurrently with translocation formation, not during the transmission of the translocation. An insertion during DSB repair through the NHEJ pathway often comprises reverse-transcribed sequences including retrotransposons such as LINE elements (33–35). These data also lend support to the hypothesis that translocation is generated via NHEJ. We have proposed that the PATRR, under certain conditions, can adopt a cruciform configuration that induces genomic instability leading to the translocation (36). The translocation pathway might be analogous to that of immunoglobulin or T-cell receptor gene rearrangements, where a hairpin-capped intermediate is finally rejoined by an NHEJ pathway (37,38).
In contrast, these findings are remarkably different from those of deletions in the PATRR. The results indicate that the PATRR11 appears to utilize much more extensive microhomology for deletion formation. We did not observe any subtle central deletion that is often seen with the translocations. Although it is still possible that the deletion results from DSB repair at the center of the PATRR11 via MMEJ, it is rather easy to imagine that replication of the palindromic region is slowed or even stalled as a result of formation of a hairpin structure in the template of the replication fork. Formation of a free DNA end followed by a microhomology-directed restart of DNA synthesis might be the major consequence, which eventually produces a deletion (1,2,4,39). It is possible that the AT-rich insertion might originate as a result of template switch within the palindromic region or resumption of a stalled replication fork using an unknown AT-rich region located elsewhere in the human genome (40,41).
Such a homology-directed deletion, so-called replication slippage, has also been demonstrated in yeast, which utilizes a direct repeat located either inside or outside of the palindrome (6). In mice carrying a palindromic transgene, on the other hand, non-homology-based deletion is more frequent than homology-directed deletion (42). Such apparently inconsistent results might stem from the fact that no direct repeat is present in the palindromic sequence in the mouse model investigated. Thus, combined with the fact that insertion polymorphism of the PATRR also involves unknown AT-rich sequences, central modification of the PATRR appears to arise from replication slippage or, less likely, via MMEJ pathways.
Our results imply that the mechanism for generation of genomic rearrangements of the PATRR11 differs between deletions and translocations. Our previous observations have also shown that the PATRR22 is a translocation breakpoint hotspot, whereas the PATRR22s do not manifest size polymorphisms generated by deletion (14). In fact, a PATRR11-containing plasmid is susceptible to deletion within the palindromic region, whereas the PATRR22 plasmid is relatively stable in E. coli (39). Genomic rearrangement fundamentally consists of two steps: DNA damage and repair. In this study, we have demonstrated differences in the repair mechanisms between PATRR-mediated deletions and translocations. As far as the DNA damage is concerned, DNA replication might be responsible for the DNA breakage that precedes the deletion. In this sense, it is not unreasonable to think that PATRR-mediated deletions might arise in somatic cell replication cycles. On the other hand, PATRR-mediated translocations only occur in germ cells (26), suggesting that standard DNA replication is not responsible for the rearrangement and that other mechanisms dictate PATRR-mediated translocations.
MATERIALS AND METHODS
PCR and sequencing
For analysis of the PATRR11 polymorphism, genomic DNA was extracted from blood samples or cheek swabs using PureGene (Gentra). PCR for PATRR11 was performed using optimized conditions for amplification of the PATRR (24). Nested PCR products were used for sequencing. For the PATRR11 polymorphism, we compared sequences obtained with those of standard PATRR11 (AB235178).
For analysis of de novo t(11;22) translocations, genomic DNA was extracted from the sperm sample of an individual homozygous for the typical L-PATRR11. PCR primers for amplification of the t(11;22) junction fragments were described previously (13). The PCR conditions were 40 cycles of 10 s at 98°C and 5 min at 60°C for detection of a single molecule of a de novo translocation (25,38). We directly sequenced translocation-specific PCR products. We used the original PATRR11 (AB235178) and PATRR22 (AB261997 and AB261999) sequences as standards in the analysis of translocation products.
All human samples were provided from individual volunteers in a Japanese population after obtaining the appropriate informed consent. The study was approved by the Ethical Review Boards for Human Genome Studies at Fujita Health University.
For analysis of PATRR11-like sequences of primates, we first performed a homology search in the primate genome database using BLAST. We then designed PCR primers at well-conserved regions flanking the PATRR11-like sequence. We used genomic DNA from COS-7 cells as a template for the PCR. The primers used were as follows: 5′-GAGAGTAAAGAAATAGTTCAGAAAGG-3′ and 5′-GGTTGAAGAAGAATCTTGGCTGG-3′.
In silico deletion
We analyzed putative microhomology at the deletion endpoints using an in silico deletion method as was described previously (14). In brief, we cut the PATRR11 sequence at two sites at random in silico, and then connected the proximal and the distal segments as artificial deletion products. We attempted to determine, assuming that NHEJ occurs between completely blunt ends, how many nucleotides appeared to represent false microhomology. We compared the distal end sequence of the proximal breakpoint with that of the distal breakpoint. The number of identical nucleotides in the same orientation at the ends was recorded. We counted the total number of putative deletion events for each number of microhomologies. Similarly, we analyzed false microhomology for the translocations between the PATRR11 and the PATRR22. All of the analyses were performed using Microsoft Excel (Microsoft).
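The blunt-end simulation described above can be sketched in a few lines. This is one reading of the procedure, not the spreadsheet analysis actually used. It assumes that false microhomology is the run of identical bases immediately upstream of the two cut sites, read in the same orientation. The function names, the fixed random seed and the toy AT-rich sequences are hypothetical, and the orientation of the PATRR22 arm relative to the PATRR11 arm is not modeled.

```python
import random
from collections import Counter


def false_microhomology(left, i, right, j):
    """Count identical bases directly upstream of cut i in left and cut j in right.

    The joined product is left[:i] + right[j:]. For a deletion, left and
    right are the same sequence and i < j.
    """
    k = 0
    while k < min(i, j) and left[i - 1 - k] == right[j - 1 - k]:
        k += 1
    return k


def simulate_deletions(seq, n_events, seed=0):
    """Cut seq at two random sites and tally false microhomology lengths."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_events):
        i, j = sorted(rng.sample(range(1, len(seq)), 2))
        counts[false_microhomology(seq, i, seq, j)] += 1
    return counts


def simulate_translocations(seq_a, seq_b, n_events, seed=0):
    """Cut each sequence once at random and tally false microhomology lengths."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_events):
        i = rng.randrange(1, len(seq_a))
        j = rng.randrange(1, len(seq_b))
        counts[false_microhomology(seq_a, i, seq_b, j)] += 1
    return counts


if __name__ == "__main__":
    # Toy AT-rich strings (assumptions), not the PATRR11 or PATRR22 sequences.
    toy_a = "ATATTTAAATATATTAAATTTATATAAATATTTA"
    toy_b = "TTAAATATTATATAAATTTAATATATTTAAATAT"
    print(sorted(simulate_deletions(toy_a, 1000).items()))
    print(sorted(simulate_translocations(toy_a, toy_b, 1000).items()))
```

Each tally maps a microhomology length to the number of simulated events with that length, which corresponds to the counts of putative events per number of microhomologies described above.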
Intergroup comparisons were performed by the Mann–Whitney test.
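For readers unfamiliar with the test, the Mann–Whitney U statistic can be computed directly from pairwise comparisons, as in the minimal sketch below. The function name and the example values are hypothetical and are not data from this study. The sketch returns only the U statistics. A P value would additionally require the exact distribution or a normal approximation, as provided for example by SciPy's `scipy.stats.mannwhitneyu`.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two samples.

    U_x counts the pairs in which the x value exceeds the y value,
    with ties counted as one half. U_x + U_y equals len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


if __name__ == "__main__":
    # Hypothetical microhomology lengths for two groups of junctions.
    group_1 = [5, 6, 7]
    group_2 = [1, 2, 3]
    print(mann_whitney_u(group_1, group_2))
```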
These studies were supported by a grant-in-aid for Scientific Research, Genome, and for 21st Century COE program from the Ministry of Education, Culture, Sports, Science and Technology of Japan (H.K.) and by a grant (CA39926) from the National Institutes of Health, USA (B.S.E.) and funds from the Charles E.H. Upham Chair (B.S.E.).
The authors wish to thank Dr Hasbaira Bolor, Misses K. Nagaoka, T. Mori and E. Hosoba for technical assistance.
Conflict of Interest statement. None declared.