Jun Wan Shin, Kyung-Hee Kim, Michael J. Chao, Ranjit S. Atwal, Tammy Gillis, Marcy E. MacDonald, James F. Gusella, Jong-Min Lee, Permanent inactivation of Huntington's disease mutation by personalized allele-specific CRISPR/Cas9, Human Molecular Genetics, Volume 25, Issue 20, 15 October 2016, Pages 4566–4576, https://doi.org/10.1093/hmg/ddw286
Abstract
A comprehensive genetics-based precision medicine strategy to selectively and permanently inactivate only mutant, not normal allele, could benefit many dominantly inherited disorders. Here, we demonstrate the power of our novel strategy of inactivating the mutant allele using haplotype-specific CRISPR/Cas9 target sites in Huntington's disease (HD), a late-onset neurodegenerative disorder due to a toxic dominant gain-of-function CAG expansion mutation. Focusing on improving allele specificity, we combined extensive knowledge of huntingtin (HTT) gene haplotype structure with a novel personalized allele-selective CRISPR/Cas9 strategy based on Protospacer Adjacent Motif (PAM)-altering SNPs to target patient-specific CRISPR/Cas9 sites, aiming at the mutant HTT allele-specific inactivation for a given diplotype. As proof-of-principle, simultaneously using two CRISPR/Cas9 guide RNAs (gRNAs) that depend on PAM sites generated by SNP alleles on the mutant chromosome, we selectively excised ∼44 kb DNA spanning promoter region, transcription start site, and the CAG expansion mutation of the mutant HTT gene, resulting in complete inactivation of the mutant allele without impacting the normal allele. This excision on the disease chromosome completely prevented the generation of mutant HTT mRNA and protein, unequivocally indicating permanent mutant allele-specific inactivation of the HD mutant allele. The perfect allele selectivity with broad applicability of our strategy in disorders with diverse disease haplotypes should also support precision medicine through inactivation of many other gain-of-function mutations.
Introduction
Huntington's disease (HD; OMIM # 143100) (1–3) is one of many genetic disorders, in which a mutation causes disease by a dominant effect of the mutant protein (4). The HD mutation involves expansion of a CAG repeat in the huntingtin gene (HTT; OMIM # 613004) that results in an elongated polyglutamine tract in the huntingtin protein. Because this mutation has occurred independently many times in humans, it resides on many different DNA haplotypes in the HD population, in each case with a distinct array of associated alleles at polymorphic sites in HTT (5,6). Regardless of surrounding DNA haplotype, CAG repeat expansions longer than 35 elicit characteristic clinical symptoms, including involuntary movements, cognitive decline and psychiatric disturbance in a fully dominant fashion (2,7). Age at onset of motor signs and age at death are both determined primarily by the size of the expanded repeat (7–11), and through genome-wide association analysis, we recently discovered genetic loci significantly associated with the difference between observed age at onset of motor signs and that expected based upon the CAG repeat length of HD subjects (12). Although this gain-of-function mutation has been known for more than 20 years (1), and huntingtin has been implicated in many biological processes (2,13,14), effective mechanism-based treatments have yet to be developed.
The recognition in dominant disorders like HD that the presence of the mutant protein is the trigger of disease has stimulated the pursuit of gene silencing as a potential therapeutic avenue (15). For example, RNA-mediated interference and antisense oligonucleotide (ASO)-mediated silencing have emerged as direct solutions to counter the CAG triplet repeat expansion ‘polyglutamine’ diseases and amyotrophic lateral sclerosis, based on therapeutic efficacy in mouse and rat models (16–18). A major concern is the impact of the treatment on the normal allele and the potential for causing damage to the patient through loss of normal protein activity. Consequently, allele-specific targeting of polymorphic differences between the mutant and normal mRNAs has been attempted in HD model systems, where both ASO and single-stranded RNA strategies have demonstrated the feasibility of preferentially reducing the levels of the mutant allele (15,19,20). The potential applicability of such allele-specific strategies is enhanced in HD by the extensive genetic analysis that has been performed for HTT and its haplotypes, including in some cases full-sequence determination (21). This knowledge of the sequence diversity of HD disease chromosomes also provides the opportunity to extend beyond mRNA-lowering strategies to capitalize directly on the greater number of variants found in genomic DNA. Importantly, loss of one copy of HTT due to inheritance of a balanced translocation chromosome did not generate typical HD symptoms (22), supporting the conclusions that 1) HD is not caused by haploinsufficiency, and 2) one functional copy of HTT is sufficient to maintain cellular integrity. Thus, permanent and selective inactivation of the mutant HTT DNA could represent an alternative precision medicine approach to HD and provide a model for application of such a strategy to other dominant disorders.
In this study, we developed a strategy to target only the mutant HTT DNA, using CRISPR/Cas9 gene editing technology (23) to create a pre-specified inactivating deletion exclusively on the mutant allele.
We first identified individual DNA variations whose alleles generate or eliminate Protospacer Adjacent Motif (PAM) sequences on the eight most frequent HTT gene haplotypes. We then identified pairs of PAM sequences that are present on the mutant chromosome haplotype but absent from the normal chromosome haplotype in a given HD patient. As a proof-of-principle, we used a personalized CRISPR/Cas9 strategy to target two patient-specific PAM sites simultaneously to eliminate the promoter region, transcription start site and the CAG expansion mutation of the mutant allele without altering the normal allele. This CRISPR/Cas9 strategy using pairs of custom-designed mutant haplotype-specific PAM-altering SNP variations provides the means to inactivate mutant alleles at the source in a completely allele-specific manner, and with appropriate knowledge of background haplotype, can be used to permanently inactivate any gain-of-function mutation in the human genome.
Results
HTT gene haplotype-specific CRISPR PAM sites generated by PAM-altering SNPs
Our extensive characterization of many haplotypes carrying the HD mutation points to the need in such disorders for a personalized approach to mutant allele-specific strategies to eliminate production of the toxic mutant huntingtin protein. Because we have previously defined individual polymorphic sites that distinguish the common haplotypes in the HD and normal populations (6,21), we reasoned that we could achieve optimal selectivity to mutant allele inactivation using a CRISPR/Cas9 strategy targeting DNA variations that create PAM sequences on the mutant allele. We focused on SNPs altering the CRISPR PAM sites as our basis of allele discrimination since CRISPR SpCas9 nuclease tolerates single base mismatch between the target sequence and crRNA (24). We have elaborated a strategy for predictable haplotype-specific inactivation that relies on two gRNAs to simultaneously target locations with PAM-altering variations and thereby to generate a pre-designed knockout deletion only on the mutant chromosome.
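The notion of a PAM-altering SNP can be made concrete with a minimal sketch. It assumes that an SpCas9 PAM is an NGG triplet on either strand (seen as CCN on the forward strand) and that a SNP is PAM-altering only when it falls on one of the two G (or C) positions. The function names and the short example sequences are hypothetical illustrations, not the pipeline used in this study.

```python
def pam_sites_covering(seq, pos):
    """Return the set of (strand, start) NGG PAM sites whose GG dinucleotide
    (CC when the PAM lies on the reverse strand) includes index `pos`."""
    sites = set()
    first = max(0, pos - 2)
    last = min(pos, len(seq) - 3)
    for start in range(first, last + 1):
        triplet = seq[start:start + 3]
        # Forward-strand PAM: N at start, GG at start+1 and start+2.
        if triplet[1:] == "GG" and pos in (start + 1, start + 2):
            sites.add(("+", start))
        # Reverse-strand PAM read on the forward strand: CC at start and start+1.
        if triplet[:2] == "CC" and pos in (start, start + 1):
            sites.add(("-", start))
    return sites


def classify_snp(context, pos, ref, alt):
    """Compare the PAM sites present with the reference and alternative allele."""
    if context[pos] != ref:
        raise ValueError("reference allele does not match the sequence context")
    alt_context = context[:pos] + alt + context[pos + 1:]
    ref_sites = pam_sites_covering(context, pos)
    alt_sites = pam_sites_covering(alt_context, pos)
    return {"ref_only": ref_sites - alt_sites, "alt_only": alt_sites - ref_sites}


# Hypothetical examples: a G>A change that destroys a forward-strand AGG PAM,
# and a T>C change that creates a reverse-strand PAM (CCG on the forward strand).
destroyed = classify_snp("ACTAGGTCA", 5, "G", "A")
created = classify_snp("TTACTGAA", 4, "T", "C")
```

A site listed under `alt_only` would be targetable only on chromosomes carrying the alternative allele, which is the property the strategy relies on for allele discrimination.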

A comprehensive landscape of haplotype-specific NGG PAM sites. For 984 SNPs whose reference or alternative alleles generate NGG PAM sites (chr4: 3042474-3315873; hg19 coordinate), we identified haplotype-specific PAM sites. For a given HTT haplotype, 1000 Genomes Project chromosomes do not have entirely identical sequences. Therefore, for a given HTT haplotype and NGG PAM site, we calculated the percentage of chromosomes with the NGG PAM sequence. We then compared haplotypes in a pair-wise fashion to identify optimal sites with the maximum presence/absence on disease (diagonal-axis) and normal chromosomes (horizontal-axis) to target the disease chromosome in any given combination of haplotypes. Each PAM site on the disease chromosome is represented by a dot whose size reflects the difference in its frequency on the disease and normal haplotype of the pair. The large dots are exclusively on the disease chromosome (i.e., 100% differential). The vertical-axis represents the chromosomal location of the PAM site variant with an adjacent schematic diagram of HTT (RefSeq, NM_002111). A red dotted boundary line indicates the location of the CAG repeat (a filled red triangle on the Y-axis) relative to PAM sites in each haplotype. Each mutant haplotype is colour-coded. For example, red dots represent the differences between hap.01 and other chromosomes, in cases where hap.01 is the mutant chromosome. Red and black arrows point to hap.01/hap.08 and hap.08/hap.01 combinations, respectively.
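The pair-wise comparison described in this legend can be sketched as follows. The sketch assumes that, for each PAM site, we already know for every chromosome of a haplotype whether the NGG sequence is present. The site names and presence calls below are hypothetical toy data, not values from the 1000 Genomes Project.

```python
def pam_percentage(calls):
    """Percentage of chromosomes of one haplotype that carry the NGG PAM."""
    return 100.0 * sum(calls) / len(calls)


def differential(pam_calls, disease_hap, normal_hap):
    """Per-site difference (disease minus normal) in PAM frequency, in percent."""
    return {
        site: pam_percentage(by_hap[disease_hap]) - pam_percentage(by_hap[normal_hap])
        for site, by_hap in pam_calls.items()
    }


# Hypothetical presence/absence calls (True = NGG PAM present on that chromosome).
toy_calls = {
    "siteA": {"hap.01": [True, True, True, True],
              "hap.08": [False, False, False, False, False]},
    "siteB": {"hap.01": [True, True, False, True],
              "hap.08": [True, False, False, False, False]},
}

diffs = differential(toy_calls, disease_hap="hap.01", normal_hap="hap.08")
# Sites with a 100% differential are present on every disease chromosome
# and on no normal chromosome of the pair.
exclusive_sites = [site for site, d in diffs.items() if d == 100.0]
```

In this toy input, only `siteA` would be drawn as a large dot, since it is the only site found exclusively on the disease haplotype.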
Allele-specificity of PAM-altering SNP-based CRISPR/Cas9
For a proof-of-principle experiment, we specifically examined the potential for discrimination of disease and normal chromosomes based on variants present in the most frequent diplotype in the European HD population, namely hap.01/hap.08, accounting for approximately 9% of HD individuals. There are 9 locations where all 47 hap.01 chromosomes from the 1000 Genomes Project have an NGG PAM site that is absent from any of 382 hap.08 chromosomes (Fig. 1, red arrow). Conversely, 16 sites have NGG PAM sequences on hap.08 chromosomes that are not on hap.01 chromosomes (Fig. 1, black arrow). As the goal of this proof-of-principle test was to inactivate HTT on the hap.01 chromosome in primary fibroblast cells from an HD patient (Coriell ID, GM01169) with an expanded CAG allele on hap.01 and a normal CAG allele on hap.08, we chose targets among the 9 NGG PAM sites specific to hap.01.

Allele-specific PAM sites and monitoring SNPs on the hap.01/hap.08 diplotype. The most common HTT haplotypes on mutant and normal chromosomes are hap.01 and hap.08, respectively. Thus, we tested our novel strategy using primary fibroblast cells derived from an HD patient (Coriell cell line ID, GM01169) with the hap.01 and hap.08 haplotype combination. This HD patient carries 44 and 17 CAGs on hap.01 and hap.08, respectively. The percentage of PAM-containing chromosomes from the 1000 Genomes Project analysis was used to identify hap.01-specific PAM sites. Each triangle represents the location of an NGG PAM site and its frequency difference between hap.01 and hap.08 in the 1000 Genomes Project data. Subsequently, two NGG PAM sites that 1) are present only on the hap.01 haplotype, 2) enclose a region spanning the promoter and the HD mutation, and 3) are flanked by nearby SNPs with different genotypes for hap.01 and hap.08 were chosen to test our strategy. SNPs rs1313774 and rs16843804 were used for the NGG PAM sequence of gRNA 1 and gRNA 2, respectively. Successful targeting by gRNA 1 and gRNA 2 would excise approximately 44 kb of DNA including the promoter region, transcription start site, and the first 3 exons of HTT including the CAG expansion mutation. Two monitoring SNPs, rs1313773 and rs2024115, were used to evaluate the allele specificity of CRISPR/Cas9. Genotypes of those 4 SNP sites were determined by PCR assay followed by sequencing. Phased genotypes were independently validated in surrogate trio samples (Supplementary Material, Fig. S5).
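The second selection criterion, that the two PAM sites enclose the region to be removed, can be illustrated with a short sketch. It assumes a list of positions of PAM sites already known to be specific to the mutant haplotype and an interval that must be excised. All coordinates below are hypothetical toy values, and the function name is an illustration rather than part of the published method.

```python
from itertools import combinations


def flanking_pairs(site_positions, region_start, region_end):
    """List (upstream, downstream, excision_size) for every pair of
    mutant-specific PAM sites that encloses [region_start, region_end],
    ordered from the smallest excision to the largest."""
    pairs = []
    for upstream, downstream in combinations(sorted(site_positions), 2):
        if upstream < region_start and downstream > region_end:
            pairs.append((upstream, downstream, downstream - upstream))
    return sorted(pairs, key=lambda pair: pair[2])


# Hypothetical positions of mutant-specific PAM sites and of the interval
# (promoter through the repeat) that the deletion has to span.
toy_sites = [1000, 5000, 9000, 15000]
candidates = flanking_pairs(toy_sites, region_start=4000, region_end=8000)
```

Candidate pairs would then still need to be filtered for the third criterion, the availability of nearby monitoring SNPs that differ between the two haplotypes.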

Mutant allele-specific modifications recovered after dual gRNA CRISPR/Cas9 targeting. (A) Cells were treated simultaneously with gRNA 1 and gRNA 2. PCR assays were performed on the transfected pool to determine whether the ∼44 kb deletion had occurred. Based upon the assay design, a ∼700 bp PCR product would be generated only if the 44 kb were removed from the genomic DNA. EV and Dual gRNA represent transfection of empty CRISPR/Cas9 vector and co-transfection of gRNA 1 and gRNA 2 CRISPR/Cas9 vectors, respectively. (B) The 700 bp PCR product in Figure 3A was extracted from the gel and subjected to MiSeq sequence analysis to determine allele-specificity and to characterize modified alleles. (C) Five different modifications were recovered by MiSeq analysis in a representative transfection experiment. All of them carried the ‘G’, ‘C’ and ‘A’ alleles at rs1313773 (monitoring SNP 1), rs1313774 (PAM site 1) and rs2024115 (monitoring SNP 2), respectively, indicating that the mutant chromosome (hap.01) was targeted simultaneously by the two gRNAs. rs16843804 (PAM site 2) was part of the excised region, and therefore, not detected by MiSeq analysis. All alleles with modifications indicated a large (∼44 kb) deletion of the mutant chromosome. Breakpoints are connected by the black solid line in each targeted allele. Letters in black and grey represent observed and expected sequences, respectively.
Modifications of DNA by single gRNA. Cells were transfected with a single vector containing gRNA 1 (with rs1313774 NGG PAM site), and DNA samples (collected after 5 days) were amplified for MiSeq analysis of total reads to determine the types of modification of target sites on the mutant and normal chromosomes. Mutant and normal alleles represented in the sequence reads were identified by the ‘G’ and ‘A’ alleles of monitoring SNP 1 (i.e., rs1313773), respectively. Recovered w/o modification and recovered with modification indicate that the sequence was unaltered or altered, respectively, by CRISPR/Cas9. Number of reads and % indicate the number of unique MiSeq sequence reads and the proportion of such reads, respectively. Similarly, cells were transfected with a single vector containing gRNA 2 (with rs16843804 NGG PAM site), and DNA samples were amplified for MiSeq analysis. For this region, mutant and normal alleles were identified by the ‘A’ and ‘G’ alleles of monitoring SNP 2 (i.e., rs2024115), respectively.
gRNA | Allele | Description | Number of reads (%) |
---|---|---|---|
gRNA 1 | Mutant | Recovered w/o modification | 28,538 (46.5) |
gRNA 1 | Mutant | Recovered with modification | 861 (1.4) |
gRNA 1 | Normal | Recovered w/o modification | 31,942 (52.1) |
gRNA 1 | Normal | Recovered with modification | 0 (0) |
gRNA 2 | Mutant | Recovered w/o modification | 16,753 (29.7) |
gRNA 2 | Mutant | Recovered with modification | 8,824 (15.7) |
gRNA 2 | Normal | Recovered w/o modification | 30,103 (53.4) |
gRNA 2 | Normal | Recovered with modification | 671 (1.2) |
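The percentages in this table can be recomputed from the read counts, and the same counts give a simple summary of single-gRNA allele specificity. The sketch below uses the read counts reported in the table. The "mutant share of modified reads" is an illustrative metric we introduce here as an assumption, not a statistic reported in the study.

```python
# Read counts from the table: (allele, modified?) -> number of reads.
reads = {
    "gRNA 1": {("mutant", False): 28538, ("mutant", True): 861,
               ("normal", False): 31942, ("normal", True): 0},
    "gRNA 2": {("mutant", False): 16753, ("mutant", True): 8824,
               ("normal", False): 30103, ("normal", True): 671},
}


def percentages(counts):
    """Each category as a percentage of all reads for one gRNA (1 decimal)."""
    total = sum(counts.values())
    return {key: round(100.0 * n / total, 1) for key, n in counts.items()}


def mutant_share_of_modified(counts):
    """Illustrative metric: percentage of modified reads that come from the
    mutant allele (100 means no modified normal-allele reads were recovered)."""
    modified_mutant = counts[("mutant", True)]
    modified_total = modified_mutant + counts[("normal", True)]
    return round(100.0 * modified_mutant / modified_total, 1)


table = {grna: percentages(counts) for grna, counts in reads.items()}
specificity = {grna: mutant_share_of_modified(counts) for grna, counts in reads.items()}
```

By this measure, all modified reads for gRNA 1 derive from the mutant allele, whereas for gRNA 2 a small fraction derive from the normal allele, consistent with the near-perfect single-gRNA specificity discussed later in the text.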
Permanent inactivation of the CAG expansion mutation by mutant haplotype-specific CRISPR/Cas9

Characterization of the targeted mutant allele. (A) Genotypes of untreated cells at rs1313773 (monitoring SNP 1), rs1313774 (PAM site 1), rs16843804 (PAM site 2), and rs2024115 (monitoring SNP 2) were determined by Sanger sequencing analyses. All genotypes were heterozygous, as expected for the hap.01/hap.08 diplotype. (B) A targeted single cell clone (TSCC 1) was genotyped for the same variations to confirm the large deletion and to reveal its chromosome of origin. As anticipated, PCR assays revealed only the normal hap.08 allele at all 4 SNP sites, indicating that the deletion occurred on the hap.01 disease chromosome, removing binding sites for assay PCR primers. (C) DNA sequencing of the ∼700 bp product revealed that it was derived from the targeted disease chromosome, as evidenced by the presence of hap.01-specific alleles at the 2 monitoring sites and an ∼44 kb genomic deletion (i.e., from chr4:3059653 to chr4:3104394). The breakpoint junction is marked above the central sequence by an inverted triangle. Genomic coordinates are given below.

Consequences of the excised region. To determine whether the ∼44 kb excised DNA from chromosome 4 was incorporated into any other region of the genome, we PCR amplified and genotyped two SNPs that are informative in terms of distinguishing hap.01 from hap.08. Haplotype-discriminating SNPs rs3856973 and rs2285086 are in intron 1 and intron 2 of HTT, respectively; hap.01 and hap.08 carry the ‘G’ and ‘A’ alleles for rs3856973, and the ‘A’ and ‘G’ alleles for rs2285086, respectively. If these segments of the excised ∼44 kb DNA fragment were incorporated elsewhere in the genome without inversion, genotyping assays would show the presence of both alleles. Untreated cells showed heterozygous genotypes for both haplotype-discriminating SNPs, but the targeted cell clone exhibited only the alleles diagnostic of the normal HTT segment. In addition, primers designed to detect inversion of the excised region did not generate PCR products (data not shown), indicating that the segment was not incorporated in the genome in an inverted configuration.

Outcomes of dual gRNA-mediated mutant allele-specific CRISPR/Cas9 targeting. A targeted single cell clone (TSCC 1) was fully characterized to understand the outcomes of mutant allele-specific CRISPR/Cas9 targeting. Untreated cells (GM01169) were used as a reference sample. (A) A deletion of approximately 44 kb of DNA in TSCC 1 was confirmed by PCR analysis. (B) Removal of the expanded CAG repeat from the targeted hap.01 chromosome was confirmed by PCR assay of DNA from the untreated cells and a targeted clone. (C) Absence of HTT RNA with the expanded CAG repeat as a result of our targeting was confirmed by RT-PCR analysis of RNA from the untreated parental line and a targeted clone. (D) Total huntingtin protein levels (normal and mutant) were determined by western blot analysis using a pan-huntingtin antibody (MAB 2166) in the untreated parental line and targeted cell clone. (E) Western blot analysis was performed using a mutant huntingtin-specific antibody (1F8) to determine the expression levels of mutant huntingtin. (F) Loading of equal amounts of protein samples was confirmed by beta-actin (ACTB) antibody. (G) A summary of the DNA structure induced by CRISPR/Cas9 targeting is shown. The normal haplotype is intact and its RNA production is unhampered. However, the targeted disease chromosome now lacks the promoter region, transcription start site, the first exon with the mutation, and two additional exons, and did not produce mRNA with the expanded CAG repeat.
Applications of various combinations of allele-specific gRNAs to different HD patients’ cell lines

Application of allele-specific dual gRNA-mediated CRISPR/Cas9 to independent cell lines. Different combinations of allele-specific gRNAs based on PAM-altering SNPs were tested on independent HD cell lines. We tested 1) an NPC line from an iPS cell line (CS97iHD-180n1) (27,28) derived from the Coriell fibroblast line GM09197, and 2) an iPS cell line derived from the Coriell fibroblast line GM04723. The NPC line of GM09197 carries hap.01 and hap.08 as the expanded (CAG 180) and normal (CAG 18) chromosomes, respectively. The iPS line of GM04723 carries hap.03 and hap.08 as the expanded (CAG 69) and normal (CAG 15) chromosomes, respectively. Among other possible gRNA combinations, we tested 3 sets of gRNAs: (A) gRNA 1 and gRNA 2 (Fig. 2) to target the hap.01 mutant allele in the NPC line, (B) gRNA 3 and gRNA 4 to target the hap.01 mutant chromosome in the NPC line, and (C) gRNA 3 and gRNA 2 to target the mutant hap.03 chromosome of the iPS cell line. gRNA 3 and gRNA 4 depend on PAM sites on the mutant chromosomes, generated by SNPs rs2857935 and rs7659144, respectively (Supplementary Material, Table S4). After co-transfection of two CRISPR/Cas9 vectors expressing two gRNAs and puromycin selection, PCR assays were performed using the sets of primers described in the Methods section. The expected size of the excision for each experiment is shown under each gel picture. Based upon the PCR assay designs, each PCR product would be generated only if the large deletion occurred. EV represents an empty vector control transfection.
Discussion
Gene silencing approaches to lower the levels of gain-of-function mutant proteins have demonstrated some therapeutic benefit in animal models and are being moved forward to human trials (15–18). However, the degree to which the toxic protein must be reduced to eliminate any disease-producing effect is not known in any of these disorders. On the other hand, mRNA-lowering approaches that are not sufficiently allele-selective risk deleterious consequences due to insufficient normal protein activity. For example, with respect to HTT, huntingtin function is thought to be essential for the development and maintenance of mature neurons, but one copy of the gene is sufficient for normal development and viability (22,29–31). Consequently, the maximal therapeutic benefit is expected when the mutant allele is completely silenced while expression from the normal allele is left intact. In addition, the hurdle of balancing allelic effects of gene silencing at the RNA level may require repeated treatments which could increase the risk of complications in individuals with chronic progressive neurodegenerative diseases. These considerations make mutant allele-specific DNA targeting approaches highly promising, because the therapeutic modification of DNA represents a permanent change and therefore does not require repeated treatments. With some beneficial effects of RNA-lowering approaches in HD model systems, establishing a personalized DNA targeting strategy with perfect mutant allele specificity to avoid any permanent inactivation of the normal allele is critically important for the development of effective and safe therapeutics. The alternative approach that we propose is to directly target the DNA to completely inactivate only the mutant allele from the source, as we demonstrate that a personalized dual-gRNA CRISPR/Cas9 strategy of utilizing PAM-altering variants can achieve this complete selectivity to prevent the generation of mutant HTT mRNA.
Our novel strategy, which focuses on improving allele specificity rather than on increasing targeting efficiency, is significant for a number of reasons. Firstly, perfect allele specificity was achieved by simultaneously using two gRNAs targeting mutant allele PAM sites generated by SNP variations. Partial allele specificity may be obtained by targeting an SNP allele by crRNA hybridization (23,32). However, we hypothesize that better allele discrimination can be achieved by using PAM-altering SNPs, as we demonstrated near-perfect and perfect allele specificity by single gRNA and dual gRNA approaches, respectively. Non-canonical PAM sequences were able to generate DNA breaks, as previously reported (26) and as shown in our data. However, a dual gRNA approach overcame the limitation of non-canonical PAM-mediated CRISPR/Cas9 targeting, allowing perfect allele specificity. Very low levels of DNA breaks may occur at a single gRNA site of normal alleles, but such non-allele-specific modifications at a single site in the promoter region or in an intron may be tolerated because those changes do not impact the transcription of the normal allele dramatically. The use of two gRNAs simultaneously may increase the levels of off-targeting and lower the targeting efficiency, as observed in our data, but ongoing efforts in the field are expected to significantly improve the target specificity and efficiency of CRISPR/Cas9 systems (15,33). Secondly, our strategy of eliminating the promoter region, transcription start site and the disease mutation completely prevented the production of mRNA from the mutant allele, and therefore is not dependent on nonsense-mediated decay mechanisms for gene inactivation. Depending on the types of modifications generated by CRISPR/Cas9, nonsense-mediated decay may not be activated (e.g. in-frame deletion), or a truncated protein may even be produced, potentially leading to worse disease outcomes (34,35).
Unpredictable modification of DNA is a serious problem when CRISPR/Cas9 is directly applied to humans. However, this potential problem can be solved by eliminating the essential elements required for transcription of the mutant allele, as demonstrated here. Disease alleles caused by point mutations may be distinguished from normal alleles by crRNA hybridization only when nearby PAM sites are present (32,36). In addition, in rare cases disease mutations directly alter PAM sequences, permitting allele discrimination (37,38). Even in challenging situations where PAM sites are not present near the disease mutations or disease mutations do not directly generate PAM sites, our strategy of eliminating the promoter and transcription start site in an allele-specific fashion can be applied to knock out disease alleles regardless of the types and positions of the mutations, as long as the haplotypes of the genes are characterized. Our approach of targeting the disease allele-carrying haplotype guarantees the inactivation of the mutant allele regardless of the types of the modifications, providing a versatile mechanism for completely shutting down any gene of choice. Lastly, our novel approach of using PAM-altering SNPs and nearby DNA variations provides highly powerful experimental tools for directly monitoring the specificity of allele-selective CRISPR/Cas9 reagents. When haplotype-informative SNPs are used, determination of targeting efficiency, allele specificity and the possibility of genome integration is quite straightforward. In addition, when single cell clones with an inactivated mutant allele have to be established, this advantage makes the screening assays highly feasible. DNA polymorphisms that are widespread in humans make our strategy powerful and broadly applicable, permitting the application of our concept to virtually any place in the genome.
With prior detailed delineation of necessary haplotypes, this strategy can be applied to a wide variety of devastating dominant gain-of-function disorders; some notable examples are listed in Table 2, along with the array of polymorphisms that affect only NGG PAM sites which provide a rich collection for targeting sites using our dual-gRNA approach. In our proof-of-principle experiment, we chose not only to inactivate the mutant HTT allele, but in doing so, to remove the mutation itself to eliminate any opportunity for expressing a potentially toxic truncated polyglutamine protein. A similar consideration could be applied, where appropriate, in the design of therapeutic approaches in these other dominant disorders. Critical challenges have to be addressed before applying CRISPR strategies to humans. Nonetheless, the promise of this strategy to completely inactivate only the mutant gene dictates that it now be tested in animal model systems to define the safety of the approach and to establish optimal mechanisms for delivery and efficiency in order to support eventual clinical trials.
NGG PAM-altering SNPs in selected genes associated with dominant disease. We searched for NGG PAM-altering SNPs in a limited selection of genes responsible for dominant disease outcomes in order to evaluate the breadth of applicability of our approach. For each locus, the longest transcript from the RefSeq database was chosen to represent the gene. Subsequently, a flanking region of 10 kb was added at both ends of the transcript region to search for NGG PAM-altering SNPs in 1000 Genomes Project data (phase 1, release 3; minor allele frequency > 1% in all populations). The numbers of NGG PAM sites generated by either reference alleles or alternative alleles in the region, representing a repertoire of allele-specific CRISPR/Cas9 target sites across each gene, were counted.
Disease | Gene | Region (hg19) | Number of SNPs with MAF > 1% | NGG PAM sites generated by reference alleles | NGG PAM sites generated by alternative alleles |
---|---|---|---|---|---|
Alzheimer disease | APP | chr21:27242860-27553446 | 1329 | 288 | 196 |
Amyotrophic lateral sclerosis | SOD1 | chr21:33021934-33051243 | 84 | 24 | 18 |
Frontotemporal dementia | MAPT | chr17:43961747-44115699 | 972 | 214 | 207 |
Frontotemporal dementia | C9orf72 | chr9:27536542-27583864 | 266 | 47 | 45 |
Parkinson disease | LRRK2 | chr12:40608812-40773086 | 754 | 138 | 105 |
Dentatorubral-pallidoluysian atrophy | ATN1 | chr12:7023625-7061484 | 133 | 42 | 24 |
Myotonic dystrophy 1 | DMPK | chr19:46262966-46295815 | 94 | 38 | 10 |
Myotonic dystrophy 2 | CNBP | chr3:128876657-128912810 | 96 | 26 | 9 |
Spinocerebellar ataxia Type 1 | ATXN1 | chr6:16289342-16771721 | 1884 | 426 | 276 |
Spinocerebellar ataxia Type 2 | ATXN2 | chr12:111880017-112047480 | 370 | 78 | 53 |
Spinocerebellar ataxia Type 3 | ATXN3 | chr14:92514895-92582965 | 416 | 94 | 61 |
Spinocerebellar ataxia Type 6 | CACNA1A | chr19:13307255-13627274 | 1543 | 408 | 263 |
Spinocerebellar ataxia Type 7 | ATXN7 | chr3:63840232-63999136 | 419 | 75 | 67 |
Spinocerebellar ataxia Type 17 | TBP | chr6:170853420-170891958 | 147 | 33 | 21 |
Materials and Methods
Identification of PAM-altering variations and mapping PAM sites to HTT haplotypes
DNA variations reported by the 1000 Genomes Project (phase 1, release 3) were analysed to identify 1) SNPs and indels that either create or eliminate PAM sequences, and 2) indel polymorphisms flanked by PAM sequences for various bacterial strains. This analysis was focused on HTT (chromosome 4: 3042474-3315873, hg19 assembly) to avoid targeting of regulatory or coding regions of other genes. Subsequently, 1000 Genomes Project chromosomes were classified into the HTT haplotypes that we have developed previously (6,21). For each haplotype and PAM site, the percentage of PAM-containing chromosomes out of all chromosomes was calculated to generate a comprehensive map of PAM sites on each HTT haplotype. PAM sites were compared in a pair-wise manner to reveal haplotype-specific PAM sites for personalized allele-selective targeting in any given diplotype.
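The first step above (deciding whether a SNP creates or eliminates a PAM) can be sketched for the NGG motif as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy sequences and the 0-based indexing are assumptions made for the example. A SNP is treated as PAM-altering only if it falls on one of the two G bases of NGG on the forward strand, or on one of the two C bases of CCN, which is how a reverse-strand NGG appears on the forward strand.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]  # (strand, 0-based start of the 3-nt motif on the forward strand)


def pam_sites(seq: str, pos: int) -> Set[Site]:
    """Return NGG PAM sites whose GG (or CC, for the reverse strand) covers index `pos`."""
    sites: Set[Site] = set()
    # Only the three 3-nt windows that contain `pos` need to be inspected.
    for start in range(max(0, pos - 2), min(pos, len(seq) - 3) + 1):
        tri = seq[start:start + 3]
        # Forward-strand PAM: N-G-G, with the SNP on one of the two G bases.
        if tri[1:] == "GG" and pos in (start + 1, start + 2):
            sites.add(("+", start))
        # Reverse-strand PAM reads C-C-N on the forward strand.
        if tri[:2] == "CC" and pos in (start, start + 1):
            sites.add(("-", start))
    return sites


def classify_snp(context: str, pos: int, ref: str, alt: str) -> Dict[str, Set[Site]]:
    """Compare PAM sites at a SNP under the reference and alternative alleles."""
    if context[pos] != ref:
        raise ValueError("reference allele does not match the context sequence")
    alt_context = context[:pos] + alt + context[pos + 1:]
    ref_sites = pam_sites(context, pos)
    alt_sites = pam_sites(alt_context, pos)
    return {"ref_only": ref_sites - alt_sites, "alt_only": alt_sites - ref_sites}


# Toy example: an A>G change turns AGA into AGG, creating a forward-strand PAM.
created = classify_snp("ACGTAGATCA", 6, "A", "G")
# Toy example: a C>T change destroys a CCA motif, i.e. a reverse-strand PAM.
removed = classify_snp("TTCCATT", 2, "C", "T")
print(created, removed)
```

Sites in `alt_only` exist only on chromosomes carrying the alternative allele, and sites in `ref_only` only on chromosomes carrying the reference allele; these correspond to the two counts reported per gene in the table of NGG PAM-altering SNPs.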
Selection of haplotype-specific PAM sites to specifically inactivate mutant HTT
Since hap.01 and hap.08 are the most common HTT haplotypes on HD disease and normal chromosomes, respectively (21), we tested our strategy in the cells of an HD patient with the hap.01/hap.08 diplotype. Sites of the most commonly used PAM sequence (NGG from Streptococcus pyogenes Cas9) were compared between hap.01 and hap.08 chromosomes in the 1000 Genomes Project data. We aimed at completely inactivating the mutant allele using two mutant allele-specific CRISPR/Cas9 vectors. Therefore, we identified two NGG PAM sites that are: 1) present on hap.01 (representing the mutant chromosome), but absent from hap.08 (representing the normal chromosome), 2) flanking the promoter region and the HD CAG expansion mutation in the first exon and 3) flanked by nearby SNPs to monitor allele specificity (monitoring SNPs). Genotypes of PAM sites, monitoring SNPs and other haplotype-informative SNPs were confirmed by Sanger sequencing.
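The selection criteria above can be illustrated with a short sketch. It assumes a hypothetical map from PAM position to the fraction of chromosomes of each haplotype that carry the PAM; the positions, the frequency cut-offs and the function names are invented for illustration and are not taken from the study. The monitoring-SNP criterion is not modelled here.

```python
from typing import Dict, List, Tuple

# Hypothetical input: PAM position -> {haplotype: fraction of chromosomes with the PAM}
PamFreq = Dict[int, Dict[str, float]]


def haplotype_specific_sites(pam_freq: PamFreq, target_hap: str, other_hap: str,
                             present: float = 0.95, absent: float = 0.05) -> List[int]:
    """PAM positions present on `target_hap` but absent from `other_hap`.

    The `present` and `absent` cut-offs are illustrative assumptions.
    """
    return sorted(
        pos for pos, freq in pam_freq.items()
        if freq.get(target_hap, 0.0) >= present and freq.get(other_hap, 0.0) <= absent
    )


def flanking_pairs(sites: List[int], region_start: int,
                   region_end: int) -> List[Tuple[int, int]]:
    """Pairs (upstream, downstream) of sites that flank the region to be excised."""
    upstream = [s for s in sites if s < region_start]
    downstream = [s for s in sites if s > region_end]
    return [(u, d) for u in upstream for d in downstream]


# Toy data with made-up coordinates; the region 300-800 stands in for the
# promoter plus the CAG repeat in exon 1.
toy = {
    100: {"hap.01": 1.00, "hap.08": 0.00},   # mutant-specific, upstream
    250: {"hap.01": 1.00, "hap.08": 1.00},   # shared, not allele-specific
    900: {"hap.01": 0.98, "hap.08": 0.02},   # mutant-specific, downstream
    1200: {"hap.01": 0.00, "hap.08": 1.00},  # normal-specific
}
specific = haplotype_specific_sites(toy, "hap.01", "hap.08")
pairs = flanking_pairs(specific, 300, 800)
print(specific, pairs)
```

Swapping the two haplotype arguments gives the sites specific to the other chromosome, which is the pair-wise comparison needed for an arbitrary diplotype.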
Generation of vector co-expressing gRNA and Cas9
CRISPR/Cas9 vectors were prepared specifying two different gRNAs at regions requiring the chosen PAM sites: gRNA 1 (chr4:3059651-3059670) targets upstream of the promoter region and gRNA 2 (chr4:3104392-3104411) targets an intron 3 region. The lentiCRISPRv2 plasmid was purchased from Addgene (plasmid # 52961). gRNA sequences for the target sites were designed and cloned into the vector according to the protocol on the GeCKO website (http://genome-engineering.org/gecko/). These vectors produce both crRNA and tracrRNA as a contiguous molecule. Sequences of crRNAs are provided in Supplementary Material, Figure S6. Quality and potential off-target sites of gRNAs were evaluated using the Optimized CRISPR Design website (http://crispr.mit.edu/).
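As a quick arithmetic check on the two target coordinates given above, the following sketch computes the length of each target and the distance between them. It assumes the coordinates are 1-based and inclusive (an assumption; the text does not state the convention), and the class and function names are illustrative.

```python
from typing import NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# gRNA target coordinates as given in the text (hg19).
GRNA1 = Interval("chr4", 3059651, 3059670)
GRNA2 = Interval("chr4", 3104392, 3104411)


def span_bounds(a: Interval, b: Interval) -> Tuple[int, int]:
    """Return (inner, outer) distances in bp between two non-overlapping targets."""
    first, second = sorted((a, b), key=lambda iv: iv.start)
    inner = second.start - first.end - 1   # bases strictly between the targets
    outer = second.end - first.start + 1   # bases from first start to second end
    return inner, outer


inner, outer = span_bounds(GRNA1, GRNA2)
print(GRNA1.length, GRNA2.length, inner, outer)
```

Both targets are 20 nt long, and the segment between the two cut sites must lie between the inner and outer bounds (44,721 and 44,761 bp under the stated assumption), in line with the excision of roughly 44 kb described in the abstract. The exact deletion size depends on where Cas9 cuts within each target, which this sketch does not model.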
Cell culture, transfection and establishment of targeted single cell clones
Untransformed primary fibroblast cells derived from a male Caucasian HD patient were obtained from the Coriell Cell Repositories. Cells (GM01169, https://catalog.coriell.org/0/Sections/Search/Sample_Detail.aspx?Ref=GM01169) carry hap.01 and hap.08 haplotypes based on phasing and haplotype analysis (6). Fibroblast cells were cultured in Dulbecco’s Modified Eagle Medium (DMEM, Gibco) supplemented with 20% FBS (Gibco) and incubated at 37 °C in a humidified chamber with 5% CO2. Cells were transfected with both plasmid vectors simultaneously or singly. Approximately 1.2 million cells were transfected with 4 µg of vector by electroporation (Amaxa Nucleofector I, Lonza) using the Basic Primary Fibroblasts Nucleofector Kit (Lonza). Forty-eight hours after transfection, cells were treated with 1.5 µg/ml of puromycin (Invitrogen) for 72 h to enrich for transfected cells. To obtain single cell clones, a small fraction of cells was then transferred to new plates; ∼1000 single cells were seeded onto a 10 of 100 mm culture dishes by limiting dilution and incubated for 2 weeks. Subsequently, each well-isolated visible colony was picked and independently cultured for validation analysis. Four independent targeted clones whose mutant HTT alleles were selectively inactivated were obtained from 4 independent transfection experiments. One representative clone was then fully characterized.
Extraction of genomic DNA, total RNA and cDNA synthesis
Genomic DNA was isolated from fibroblast cells using DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer’s instructions. Total RNA was extracted from cells using RNeasy Plus Mini Kits (Qiagen). The quality and quantity of extracted RNA were measured using NanoDrop (Thermo Scientific). cDNA was synthesized from 50 ng of total RNA using SuperScript III Reverse Transcriptase (Invitrogen) according to the manufacturer’s protocol.
Polymerase chain reaction (PCR)
PCR reactions were performed by using Q5 High-Fidelity DNA Polymerase (NEB) or AccuPrime GC-Rich DNA Polymerase (Invitrogen) for high GC content PCR. The primer sets used in PCR experiments are provided in Supplementary Material, Table S5. The reaction for Q5 High-Fidelity DNA Polymerase was performed as follows: initial denaturation for 3 min at 98 °C, then 35 cycles of denaturation for 30 s at 98 °C, annealing for 30 s at 64 °C and extension for 30 s at 72 °C, then the final extension for 2 min at 72 °C. The reaction for AccuPrime GC-Rich DNA Polymerase was performed as follows: initial denaturation for 3 min at 95 °C, then 30 cycles of denaturation for 30 s at 95 °C, annealing for 30 s at 59 °C and extension for 30 s at 72 °C, then the final extension for 10 min at 72 °C. PCR amplicons were resolved in 2% agarose gel (Invitrogen), and extracted from the gel using QIAquick Gel Extraction Kit (Qiagen) for subsequent analysis.
Deep sequencing of PCR products by MiSeq and western blot analysis
Deep sequencing of PCR products was performed by the CCIB DNA Core at the Massachusetts General Hospital (Boston, MA). Briefly, upon ligation of Illumina adaptors and a unique identifier to the amplicons, paired-end sequencing (2 × 150 bp) was performed on the Illumina MiSeq system. Analysis-ready data were provided by the core facility and further summarized. Mutant huntingtin protein levels were determined by western blot analysis. Fifteen micrograms of whole cell lysate was resolved on a 4% Tris-Glycine gel. The transferred membrane was probed with the mutant huntingtin-specific 1F8 antibody or a pan-huntingtin antibody (MAB2166 from EMD Millipore). Equal loading was confirmed by beta-actin (ACTB) protein.
Application of allele-specific dual gRNA-mediated CRISPR/Cas9 to an NPC and an iPS cell line from independent HD subjects
An NPC cell line was generated from an iPS cell line (CS97iHD-180n1) by a STEMdiff protocol using Neural Induction Medium (STEMCELL Technologies). The parental iPS cell line CS97iHD-180n1 was derived from Coriell fibroblast GM09197 (https://catalog.coriell.org/0/Sections/Search/Sample_Detail.aspx?Ref=GM09197&Product=CC) (28). This NPC cell line was maintained on poly-L-ornithine/laminin coated plates in media (70% DMEM, 30% Ham's F12, 1X B27 Supplement, 20 ng/ml FGF, 20 ng/ml EGF and 5 μg/ml Heparin) with 5% CO2 at 37 °C. An iPS cell line was generated from Coriell fibroblast GM04723 (https://catalog.coriell.org/0/Sections/Search/Sample_Detail.aspx?Ref=GM04723&Product=CC) by the Harvard Stem Cell Institute iPS Core Facility (http://ipscore.hsci.harvard.edu/) (39). iPS cell lines were cultured on matrigel-coated plates with mTeSR™1 (STEMCELL Technologies) with 5% CO2 at 37 °C. The HTT gene haplotypes of these cell lines were determined by haplotype phasing as described elsewhere (6). The same gRNAs (i.e., gRNA 1 and gRNA 2; Fig. 2) were tested on the NPC line. In addition, two further gRNAs were tested on an NPC and an iPS cell line. gRNA 3 targets chr4:3075669-3075688 (hg19 assembly), and depends on the PAM site generated by rs2857935. This gRNA was used in combinations to treat both NPC and iPS lines. gRNA 4 targets chr4:3098324-3098343, and depends on the PAM site generated by rs7659144. This gRNA was used in combinations to test the NPC line. In summary, the NPC line was treated with either 1) gRNA 1 + gRNA 2, or 2) gRNA 3 + gRNA 4, and the iPS line was treated with gRNA 3 + gRNA 2 (Fig. 7). NPC and iPS cell lines were transfected with 2 μg of CRISPR vectors containing gRNAs by electroporation (Amaxa Nucleofector I, Lonza) using Human Stem Cell Nucleofector Kit 1 (Lonza). Twenty-four hours post-transfection, iPSCs and NPCs were treated with 0.5 μg/ml of puromycin for 48 h or 2 μg/ml of puromycin for 72 h, respectively.
Subsequently, genomic DNA was extracted (Qiagen) for PCR assays to confirm the large deletion. Excision of the expanded CAG repeat by gRNA 1 + gRNA 2 would generate an approximately 400 bp PCR product when the following primers were used: forward primer, GTATTTTTAGTAGAGACGGGGTTTC and reverse primer, ATTCCTCACAGCACATCTCT. Excision of the expanded CAG repeat by gRNA 3 + gRNA 4 was predicted to generate an approximately 400 bp PCR product when amplified by the following two primers: forward primer, TGAGTATGGCTCTGGCCACG and reverse primer, AAATGGAATCCAGAGTTACCAGAAGGG. The excision of the CAG expansion mutation by application of gRNA 3 and gRNA 2 to the iPS cell line was expected to generate an approximately 700 bp PCR product when the following primers were used: forward primer, CTTCTCGCTGCACTAATCAC and reverse primer, ATTCCTCACAGCACATCTCT. Amplified DNA from an NPC cell line/gRNA 1 + gRNA 2 experiment was cleaned up using ExoSAP-IT (Affymetrix) (Fig. 7A), and was subjected to MiSeq analysis to detect modified alleles as described in the Methods section.
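Because primer sequences are easily corrupted by stray whitespace when copied from a web page or PDF, a short normalization and sanity check can be useful before ordering or aligning them. The sketch below is illustrative only: the helper names are hypothetical, and the three sequences are the reverse primers listed above, two of them entered with embedded spaces to show the clean-up step.

```python
VALID_BASES = set("ACGT")


def clean_primer(raw: str) -> str:
    """Remove whitespace, upper-case, and verify that only A/C/G/T remain."""
    seq = "".join(raw.split()).upper()
    unexpected = set(seq) - VALID_BASES
    if unexpected:
        raise ValueError(f"unexpected characters in primer: {sorted(unexpected)}")
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Reverse primers from the text; the first and third are typed with stray spaces.
rev_grna1_grna2 = clean_primer("ATTCC T C ACA G C A C ATCTCT")
rev_grna3_grna4 = clean_primer("AAATGGAATCCAGAGTTACCAGAAGGG")
rev_grna3_grna2 = clean_primer("ATTCC TC A C AGCACATCTCT")

for name, seq in [("gRNA 1 + gRNA 2 reverse", rev_grna1_grna2),
                  ("gRNA 3 + gRNA 4 reverse", rev_grna3_grna4),
                  ("gRNA 3 + gRNA 2 reverse", rev_grna3_grna2)]:
    print(name, seq, len(seq), round(gc_fraction(seq), 2))
```

After clean-up, the reverse primers for the gRNA 1 + gRNA 2 and gRNA 3 + gRNA 2 assays are identical, as expected since both deletions end at the gRNA 2 target site.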
Software package for data analysis
All data analyses were performed using R (version 3.0.2).
Supplementary Material
Supplementary Material is available at HMG online.
Acknowledgements
We also thank Jayla Ruliera (Massachusetts General Hospital), Dr. Feng Zhang (Broad Institute), and Dr. Clive D. Svendson (The Board of Governors Regenerative Medicine Institute) for technical assistance, helpful discussion and the CS97iHD-180n1 cell line.
Conflict of Interest statement. None declared.
Funding
This research was supported by Harvard NeuroDiscovery Center, CHDI Foundation and grants from the National Institute of Neurological Disorders and Stroke (U01NS082079 and R01NS091161).
References