The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system provides a robust and multiplexable genome editing tool, enabling researchers to precisely manipulate specific genomic elements and facilitating the elucidation of target gene function in biology and disease. CRISPR/Cas9 comprises a nonspecific Cas9 nuclease and a set of programmable sequence-specific CRISPR RNAs (crRNAs), which guide Cas9 to cleave DNA and generate double-strand breaks at target sites. The subsequent cellular DNA repair process leads to the desired insertions, deletions or substitutions at target sites. The specificity of CRISPR/Cas9-mediated DNA cleavage requires a target sequence matching the crRNA and a protospacer adjacent motif located downstream of the target sequence. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the future clinical therapeutic potential of CRISPR/Cas9.
Benefiting from the rapid development of high-throughput sequencing technology and bioinformatics, researchers have made great progress in gene mapping in a short time. Currently, a major challenge is to reveal the molecular mechanisms by which genes influence individual phenotypes. A good way to elucidate the function of a gene is to shut it down or overexpress it in living organisms, which was previously complicated and time consuming (1–4). A new approach named ‘genome editing’ has emerged and has been widely used in studies of functional genomics, transgenic organisms and gene therapy during the past decades. Genome editing is built on engineered, programmable and highly specific nucleases, which induce site-specific changes in the genomes of cellular organisms through a sequence-specific DNA-binding domain and a nonspecific DNA cleavage domain. The subsequent cellular DNA repair process generates the desired insertions, deletions or substitutions at the loci of interest.
Multiple artificial nuclease systems have been developed for genome editing. Zinc-finger nucleases (ZFNs) are one class of widely applied engineered nucleases (5–11). ZFNs contain a common Cys2-His2 DNA-binding domain and the DNA cleavage domain of the FokI restriction endonuclease (8). Another popular genome editing platform is transcription activator-like effector nucleases (TALENs) (12–19), which are derived from natural proteins of the plant pathogenic bacteria Xanthomonas. The DNA-binding domain of TALENs is composed of tandem repeats of 33–35 conserved amino acids, each of which recognizes a specific nucleotide. By shuffling these repeats, TALENs can be programmed to target specific DNA sequences. Recently, the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system has provided an alternative to ZFNs and TALENs for genome editing (20). Distinct from the protein-guided DNA cleavage of ZFNs and TALENs, CRISPR/Cas9 depends on small RNAs for sequence-specific cleavage (21). Because only a programmable RNA is required to generate sequence specificity, CRISPR/Cas9 is easy to apply and has developed very rapidly over the past year. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the future clinical therapeutic potential of CRISPR/Cas9.
CRISPR/CAS9-MEDIATED GENOME MODIFICATION
In bacteria and archaea, CRISPR/Cas was discovered as an acquired immune system against viruses and phages, acting through CRISPR RNA (crRNA)-based DNA recognition and Cas nuclease-mediated DNA cleavage (21,22). CRISPR/Cas is observed in nearly 40% of sequenced bacterial genomes and nearly 90% of sequenced archaeal genomes (23). A CRISPR locus consists of a series of conserved repeated sequences interspaced by distinct nonrepetitive sequences named spacers (Fig. 1A). In the CRISPR/Cas system, invading foreign DNA is processed by Cas nucleases into small DNA fragments, which are then incorporated into the CRISPR locus of the host genome as spacers. In response to viral and phage infections, the spacers are used as transcriptional templates for producing crRNAs, which guide Cas to cleave target DNA sequences of the invading viruses and phages (Fig. 1B). More than 40 different Cas protein families have been reported (24), playing important roles in crRNA biogenesis, spacer incorporation and cleavage of invading DNA. Based on the sequences and structures of Cas proteins, CRISPR/Cas systems are primarily classified into three types, I, II and III (25). The type II CRISPR/Cas system needs only a single Cas protein, Cas9, which contains an HNH nuclease domain and a RuvC-like nuclease domain (21). CRISPR/Cas9 has been demonstrated to be a simple and efficient tool for genome editing.
CRISPR/Cas9-mediated genome editing depends on the generation of a double-strand break (DSB) and the subsequent cellular DNA repair process. In the endogenous CRISPR/Cas9 system, mature crRNA combines with transactivating crRNA (tracrRNA) to form a tracrRNA:crRNA complex that guides Cas9 to a target site. TracrRNA is partially complementary to crRNA and contributes to crRNA maturation. At the target site, CRISPR/Cas9-mediated sequence-specific cleavage requires a DNA sequence (the protospacer) matching the crRNA and a short protospacer adjacent motif (PAM). After Cas9 binds to the target site, the DNA strand complementary to the crRNA and the opposite strand are cleaved by the HNH nuclease domain and the RuvC-like nuclease domain of Cas9, respectively, generating a DSB at the target site (Fig. 2). For ease of use in genome editing, researchers designed a guide RNA (gRNA), a single chimeric RNA containing all essential crRNA and tracrRNA components (21). Multiple CRISPR/Cas9 variants have been developed, recognizing 20 or 24 nt sequences matching the engineered gRNA and 2–5 nt PAM sequences at target sites. Therefore, CRISPR/Cas9 can theoretically target a specific DNA sequence of 22–29 nt, which is unique in most genomes. However, recent studies observed that CRISPR/Cas9 has a high tolerance for base pair mismatches between the gRNA and its complementary target sequence, and that this tolerance depends on the number, position and distribution of the mismatches (21,26–29). For instance, the CRISPR/Cas9 of Streptococcus pyogenes appeared to tolerate up to six base pair mismatches at target sites (21).
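The targeting rule described above (a protospacer followed directly by a PAM) can be illustrated with a short script. This is only a minimal sketch under stated assumptions: the function names, the 20 nt protospacer default and the NGG default PAM are illustrative choices, the input is assumed to be plain A/C/G/T DNA, and mismatch tolerance is ignored entirely.

```python
import re

# IUPAC codes needed for the PAMs mentioned in this review (assumption: DNA only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(seq, pam="NGG", protospacer_len=20):
    """List candidate sites as (strand, start, protospacer, pam) tuples.

    A site is a protospacer of the given length immediately followed by a
    sequence matching the PAM. Both strands are scanned; for the "-" strand,
    start is the offset within the reverse-complemented sequence.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[code] for code in pam)
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_regex))
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    for site in find_target_sites("A" * 20 + "TGG"):
        print(site)
```

Passing a different `pam` value (for example "NNAGAAW") scans for the longer motifs of other Cas9 orthologs, provided every code in the motif is present in the `IUPAC` table.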
The DSB generated by CRISPR/Cas9 triggers cellular DNA repair processes, including error-prone nonhomologous end-joining (NHEJ) and error-free homology-directed repair (HDR). NHEJ-mediated DNA repair rapidly ligates the DSB but generates small insertion and deletion mutations (indels) at target sites. These mutations can be used to disrupt or abolish the function of target genes or genomic elements. For instance, Gratz et al. generated frame-shifting indels at the yellow locus of the Drosophila genome through CRISPR/Cas9-induced DNA cleavage followed by NHEJ-mediated DNA repair (30). A DSB can also initiate HDR-mediated DNA repair, which is more complicated than NHEJ-mediated DNA repair. HDR-mediated error-free DNA repair requires a homology-containing donor DNA sequence as the repair template. Through co-injection of Cas9, two gRNAs targeting the 5′ and 3′ sequences of the yellow locus, respectively, and a single-strand oligodeoxynucleotide template, Gratz et al. successfully replaced the yellow locus with a 50 nt attP recombination site in the Drosophila genome (30).
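Why a small NHEJ-induced indel can abolish gene function comes down to simple arithmetic: a net length change that is not a multiple of 3 shifts the downstream reading frame. The sketch below illustrates only that rule; the function name and the example lengths are hypothetical and are not taken from the cited studies.

```python
def classify_length_change(ref_len, edited_len):
    """Describe the net length change of a coding sequence after repair."""
    delta = edited_len - ref_len
    if delta == 0:
        return "no net length change"
    kind = "insertion" if delta > 0 else "deletion"
    # A net change that is not a multiple of 3 shifts the downstream reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frame-shifting"
    return f"{frame} {kind} of {abs(delta)} nt"


if __name__ == "__main__":
    # Hypothetical 300 nt coding sequence and four possible repair outcomes.
    for edited_len in (300, 297, 296, 301):
        print(edited_len, classify_length_change(300, edited_len))
```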
Compared with ZFNs and TALENs, CRISPR/Cas9 has several advantages. ZFNs and TALENs are built on protein-guided DNA cleavage, which requires complex and time-consuming protein engineering, selection and validation. In contrast, CRISPR/Cas9 needs only a short programmable gRNA for DNA targeting, which is relatively cheap and easy to design and produce. By using Cas9 with several gRNAs directed at different target sites, CRISPR/Cas9 can simultaneously induce genomic modifications at multiple independent sites (26). This capability can accelerate the generation of transgenic animals with multiple gene mutations (31,32), and allows multiple genes or a whole gene family to be disrupted to investigate gene function and epistatic relationships.
CRISPR/Cas9 provides a robust and multiplexable genome editing tool, enabling researchers to precisely manipulate specific genomic elements and facilitating the elucidation of target gene function in biology and disease. Through co-delivery of plasmids expressing Cas9 and crRNA, CRISPR/Cas9 has been used to induce specific genomic modifications in human cells (26,33–36). By combining Cas9 with multiple distinct gRNAs encoded in a CRISPR array, CRISPR/Cas9 can simultaneously induce multiple mutations in mammalian genomes (26). Beyond human cells, CRISPR/Cas9 has also shown its potential in the genome editing of zebrafish (37–41), mice (31,42), Drosophila (30,43,44), Caenorhabditis elegans (45), Bombyx mori (46) and bacteria (47,48). For instance, Bassett et al. provided an improved RNA injection-based CRISPR/Cas9 system, which was highly efficient for creating desired mutations in the Drosophila genome (43). By directly injecting Cas9 mRNA and gRNA into embryos, they induced mutagenesis at target sites in up to 88% of injected flies. The generated mutations were stably transmitted through the germline to 33% of total offspring (43).
CRISPR/Cas9 has also been used to induce desired genomic alterations in plants for generating specific traits, such as valuable phenotypes or disease resistance (49–57). To validate the application of CRISPR/Cas9 in plants, Jiang et al. used it to target a green fluorescent protein gene in Arabidopsis and tobacco and bacterial blight susceptibility genes in rice (57). Miao et al. illustrated the robustness and efficiency of CRISPR/Cas9 in the genome editing of rice (56). Through modification of crop genomes, CRISPR/Cas9 may serve as a new breeding technique for improving crop quality in the future.
Regulating gene transcription in living organisms is very useful for studying gene function and transcriptional networks. By disrupting transcription-related functional sites, CRISPR/Cas9 can regulate the transcription of specific genes. However, this process is irreversible because the DNA modifications are permanent. Recently, a modified CRISPR/Cas9 system named CRISPR interference (CRISPRi) was developed for RNA-guided transcription regulation (58–60). Qi et al. generated a catalytically defective Cas9 (dCas9) mutant without nuclease activity. dCas9 was co-expressed with gRNA to form a recognition complex, which could interfere with transcriptional elongation, RNA polymerase binding and transcription factor binding (60). With two gRNAs targeting a red fluorescent protein (RFP) gene and a green fluorescent protein (GFP) gene, respectively, Qi et al. observed that CRISPRi could simultaneously repress the expression of RFP and GFP without crosstalk in Escherichia coli (60). However, the degree of repression achieved by CRISPRi was modest in mammalian cells (60). Gilbert et al. fused repressive or activating effector domains to dCas9, which together with gRNA could implement precise and stable transcriptional control of target genes, including both repression and activation (59). Chen et al. illustrated the performance of CRISPRi for individually or simultaneously regulating the transcription of multiple genes (58). CRISPRi provides a novel, highly specific tool for switching gene expression without genetically altering the target DNA sequence.
Precise genome editing has the potential to permanently cure diseases by disrupting endogenous disease-causing genes, correcting disease-causing mutations or inserting new protective genes (61–66). Using ZFN-induced HDR, Urnov et al. corrected a disease-causing gene mutation in human cells for the first time (61). Subsequently, ZFNs were used to correct the gene mutations causing sickle-cell disease (63) and hemophilia B (62). By disabling virulence genes or inserting protective genes, ZFNs have been used to induce resistance to virus infection in human cells (67–69) and to enhance the efficiency of immunotherapies (70,71). As the newest class of engineered nucleases, CRISPR/Cas9 provides a novel, highly efficient genome editing tool for gene therapy studies. For instance, Ebina et al. disrupted the long-terminal repeat promoter of the HIV-1 genome using CRISPR/Cas9, which significantly decreased HIV-1 expression in infected human cells (72). Integrated proviral genes in host cell genomes can also be removed by CRISPR/Cas9 (72).
With the rapid development of induced pluripotent stem (iPS) cell technology, engineered nucleases have been applied to the genome manipulation of iPS cells (73,74). The unlimited self-renewal and pluripotent differentiation capacity of iPS cells makes them very useful in disease modeling and gene therapy. Using CRISPR/Cas9, Horii et al. created an iPS cell model for immunodeficiency, centromeric region instability, facial anomalies (ICF) syndrome caused by DNMT3B gene mutation (75). In this study, iPS cells were transfected with plasmids expressing Cas9 and gRNA, which disrupted the function of DNMT3B in the transfected cells (75). Using the same human pluripotent stem cell (hPSC) lines and delivery method, Ding et al. compared the efficiencies of CRISPR/Cas9 and TALENs for genome editing (76). They observed that CRISPR/Cas9 was more efficient than TALENs (76). However, there is still a long way to go before CRISPR/Cas9 can be applied clinically for gene therapy. We must ensure the high specificity of CRISPR/Cas9 for target sites and eliminate possible off-target mutations with negative effects. Careful selection of target sites, careful gRNA design and genome-wide searches for potential off-target sites are all required.
Despite the great potential of CRISPR/Cas9 in genome editing, there are some important issues that need to be addressed, such as off-target mutations, PAM dependence, gRNA production and delivery methods of CRISPR/Cas9.
Off-target mutations are a major concern in CRISPR/Cas9-mediated genome editing. Compared with ZFNs and TALENs, CRISPR/Cas9 presents a relatively high risk of off-target mutations in human cells (27). Large genomes often contain multiple DNA sequences that are identical or highly homologous to the target DNA sequence. Besides the target DNA sequence, CRISPR/Cas9 also cleaves these identical or highly homologous sequences, which leads to mutations at undesired sites, called off-target mutations. Off-target mutations can result in cell death or transformation. To reduce the cellular toxicity of CRISPR/Cas9, increasing efforts are being made to eliminate its off-target mutations (26,27,29,77). To ensure the specificity of CRISPR/Cas9, it is better to select target sites with the fewest off-target sites and the fewest mismatches between the gRNA and its complementary sequence. Xiao et al. recently developed a flexible search tool, CasOT, which identifies potential off-target sites across whole genomes (77). The dosage of CRISPR/Cas9 is another factor affecting off-target mutations and should be carefully controlled (29,78). Methylation of target DNA sequences appeared not to affect the specificity of CRISPR/Cas9 (29). Additionally, converting Cas9 into a nickase can help to reduce off-target mutations while maintaining the efficiency of on-target cleavage (26).
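The idea of a genome-wide search for sequences that are identical or highly homologous to the target can be sketched as a naive mismatch scan. This is not the CasOT algorithm; it is a simplified illustration that assumes an NGG PAM, scans the forward strand only, treats all mismatch positions equally, and uses hypothetical function names and a hypothetical default threshold of three mismatches.

```python
def count_mismatches(site, guide):
    """Count positions at which two equal-length sequences differ."""
    if len(site) != len(guide):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(site, guide))


def candidate_off_targets(genome, guide, max_mismatches=3):
    """Return (position, site, pam, mismatches) for NGG-adjacent sites
    on the forward strand that differ from the guide by at most
    max_mismatches bases. A result with 0 mismatches is a perfect match,
    which includes the intended target itself.
    """
    genome = genome.upper()
    guide = guide.upper()
    n = len(guide)
    hits = []
    # The last usable start leaves room for the site plus a 3 nt PAM.
    for i in range(len(genome) - n - 2):
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":  # assumption: NGG PAM only
            continue
        site = genome[i:i + n]
        mismatches = count_mismatches(site, guide)
        if mismatches <= max_mismatches:
            hits.append((i, site, pam, mismatches))
    return hits


if __name__ == "__main__":
    guide = "ACGT" * 5
    near_match = "TCGT" + "ACGT" * 3 + "ACGA"  # differs from guide at 2 positions
    genome = guide + "AGG" + "TTTT" + near_match + "CGG"
    for hit in candidate_off_targets(genome, guide):
        print(hit)
```

A practical search would also scan the reverse strand and weight mismatches by position, since the text notes that tolerance depends on the number, position and distribution of mismatches.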
Theoretically, CRISPR/Cas9 can be applied to any DNA sequence through an engineered programmable gRNA. However, in addition to gRNA/target sequence complementarity, the specificity of CRISPR/Cas9 requires a 2–5 nt PAM sequence located immediately downstream of the target sequence (21). The identified PAM sequences vary among Cas9 orthologs, such as the NGG PAM from Streptococcus pyogenes (21,79), the NGGNG and NNAGAAW PAMs from Streptococcus thermophilus (22,80,81) and the NNNNGATT PAM from Neisseria meningitidis (36,82). Recently, Hsu et al. reported an NAG PAM, which had only ∼20% of the efficiency of the NGG PAM for guiding DNA cleavage (29). On the one hand, the PAM dependence of CRISPR/Cas9-mediated DNA cleavage constrains the frequency of targetable sites in genomes. For instance, a target site is expected on average every 8 nt for the NGG PAM or the NAG PAM, but only every 32 and 256 nt for the NGGNG PAM and the NNAGAAW PAM, respectively. On the other hand, PAM dependence also increases the specificity of CRISPR/Cas9. CRISPR/Cas9 systems requiring a long PAM should produce fewer off-target mutations than those requiring a short PAM.
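The spacing figures quoted above (8, 32 and 256 nt) can be reproduced with a short calculation. The sketch assumes random DNA with equal and independent base frequencies and counts PAM occurrences on both strands; the function name and the lookup table are illustrative.

```python
from fractions import Fraction

# Number of bases (out of 4) that each IUPAC code matches.
IUPAC_COUNTS = {"A": 1, "C": 1, "G": 1, "T": 1, "W": 2, "N": 4}


def expected_spacing(pam):
    """Expected number of nt between PAM occurrences in random DNA.

    Assumes each base occurs independently with probability 1/4 and that
    a PAM on either strand makes a site targetable (hence the factor 2).
    """
    probability = Fraction(1)
    for code in pam:
        probability *= Fraction(IUPAC_COUNTS[code], 4)
    return 1 / (2 * probability)


if __name__ == "__main__":
    for pam in ("NGG", "NAG", "NGGNG", "NNAGAAW"):
        print(pam, expected_spacing(pam))
```

Real genomes deviate from this uniform model (for example through GC content bias), so these values are rough expectations rather than measured site densities.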
gRNA production is another important issue for CRISPR/Cas9-mediated genome editing. Because mRNA transcribed by RNA polymerase II undergoes extensive posttranscriptional processing and modification, it is currently difficult to use RNA polymerase II for gRNA production. The RNA polymerase III-dependent U3 and U6 snRNA promoters are currently used to produce gRNA in vivo. However, U3 and U6 snRNA genes are ubiquitously expressed housekeeping genes, so their promoters cannot be used to generate tissue- and cell-specific gRNA (83). The lack of commercially available RNA polymerase III also limits the application of U3- and U6-based gRNA production. Gao et al. designed an artificial gene, RGR, whose transcript contained the desired gRNA flanked by ribozyme sequences at both ends (83). After self-catalyzed cleavage, mature gRNAs were released and successfully induced sequence-specific cleavage in vitro and in yeast (83).
Questions also remain regarding how CRISPR/Cas9 is delivered into organisms. DNA and RNA injection-based techniques are used for CRISPR/Cas9 delivery, such as injection of plasmids expressing Cas9 and gRNA (30) and injection of CRISPR components as RNA (43,44). The efficiency of a delivery method depends on the types of target cells and tissues. More attention should be paid to developing novel, robust delivery methods for CRISPR/Cas9.
Genome editing was initially applied to Drosophila melanogaster (84,85) and rapidly extended to a broad range of organisms. An ideal genome editing tool should offer simple, efficient and low-cost assembly of nucleases that can target any site in a genome without off-target mutations. CRISPR/Cas9 has the potential to become a reliable and facile genome editing tool once these issues are addressed. Thanks to its simplicity and adaptability, CRISPR/Cas9 opens the door to revealing gene function in biology and correcting gene defects in diseases. Further studies are necessary to explore the characteristics and improve the performance of CRISPR/Cas9, especially its specificity, off-target effects and delivery methods. For instance, recent genome-wide deep sequencing results will be helpful for selecting suitable target sites and designing highly specific gRNAs.
Conflict of Interest statement. None declared.
The study was supported by the National Natural Scientific Fund of China (81102086), the Science and Technology Research and Development Program of Shaanxi Province of China (2013KJXX-51) and the Fundamental Research Funds for the Central Universities.