CRISPR/Cas9 for genome editing: progress, implications and challenges

The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system provides a robust and multiplexable genome editing tool, enabling researchers to precisely manipulate specific genomic elements and facilitating the elucidation of target gene function in biology and disease. CRISPR/Cas9 comprises a nonspecific Cas9 nuclease and a set of programmable sequence-specific CRISPR RNAs (crRNAs), which guide Cas9 to cleave DNA and generate double-strand breaks at target sites. The subsequent cellular DNA repair process leads to the desired insertions, deletions or substitutions at target sites. The specificity of CRISPR/Cas9-mediated DNA cleavage requires a target sequence matching the crRNA and a protospacer adjacent motif located downstream of the target sequence. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the future clinical therapeutic potential of CRISPR/Cas9.


INTRODUCTION
Benefiting from the rapid development of high-throughput sequencing technology and bioinformatics, researchers have made great progress in gene mapping in a short time. Currently, a major challenge faced by researchers is to reveal the molecular mechanisms by which genes influence individual phenotypes. A good way to elucidate the function of a gene is to shut it down or overexpress it in living organisms, which was previously complicated and time consuming (1-4). A new approach named 'genome editing' has emerged and been widely used in studies of functional genomics, transgenic organisms and gene therapy over the past decades. Genome editing is built on engineered, programmable and highly specific nucleases, which can induce site-specific changes in the genomes of cellular organisms through a sequence-specific DNA-binding domain and a nonspecific DNA cleavage domain. The subsequent cellular DNA repair process generates the desired insertions, deletions or substitutions at the loci of interest.
Multiple artificial nuclease systems have been developed for genome editing. Zinc-finger nucleases (ZFNs) are among the most widely applied engineered nucleases (5-11). ZFNs contain a common Cys2-His2 DNA-binding domain and the DNA cleavage domain of the FokI restriction endonuclease (8). Another popular genome editing platform is transcription activator-like effector nucleases (TALENs) (12-19), which are derived from a natural protein of the plant pathogenic bacterium Xanthomonas. The DNA-binding domain of TALENs is composed of conserved repeat motifs of 33-35 amino acids, each of which recognizes a specific nucleotide. By shuffling these repeat recognition motifs, TALENs can be programmed to target a specific DNA sequence. Recently, the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system has provided an alternative to ZFNs and TALENs for genome editing (20). Distinct from the protein-guided DNA cleavage of ZFNs and TALENs, CRISPR/Cas9 depends on small RNA for sequence-specific cleavage (21). Because only a programmable RNA is required to generate sequence specificity, CRISPR/Cas9 is easy to apply and has developed very rapidly over the past year. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the future clinical therapeutic potential of CRISPR/Cas9.

CRISPR/CAS9-MEDIATED GENOME MODIFICATION
In bacteria and archaea, CRISPR/Cas was discovered as an acquired immune system against viruses and phages, acting through CRISPR RNA (crRNA)-based DNA recognition and Cas nuclease-mediated DNA cleavage (21,22). CRISPR/Cas is observed in nearly 40% of sequenced bacterial genomes and nearly 90% of sequenced archaeal genomes (23). A CRISPR locus consists of a series of conserved repeated sequences interspaced by distinct nonrepetitive sequences named spacers (Fig. 1A). In the CRISPR/Cas system, invading foreign DNA is processed by Cas nuclease into small DNA fragments, which are then incorporated into the CRISPR locus of the host genome as spacers. In response to virus and phage infections, the spacers are used as transcriptional templates for producing crRNA, which guides Cas to cleave the target DNA sequences of invading viruses and phages (Fig. 1B). More than 40 different Cas protein families have been reported (24), playing important roles in crRNA biogenesis, spacer incorporation and cleavage of invading DNA. Based on the sequences and structures of the Cas proteins, CRISPR/Cas systems are primarily classified into three types, I, II and III (25). The type II CRISPR/Cas system needs only a single Cas protein, Cas9, which contains an HNH nuclease domain and a RuvC-like nuclease domain (21). CRISPR/Cas9 has been demonstrated to be a simple and efficient tool for genome editing.
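The repeat-spacer architecture of a CRISPR locus described above can be illustrated with a short Python sketch. It is only an illustration, not a CRISPR-detection method: it assumes the repeat sequence is already known and perfectly conserved, and the function name `extract_spacers` and the toy sequences are hypothetical.

```python
def extract_spacers(locus: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of `repeat`.

    Assumption: the repeat is known and exactly conserved. Real repeats can
    carry mutations, so this is an illustration only.
    """
    locus, repeat = locus.upper(), repeat.upper()
    parts = locus.split(repeat)
    # parts[0] is the leader before the first repeat and parts[-1] is the
    # trailer after the last repeat; everything in between is a spacer.
    return [p for p in parts[1:-1] if p]


# Toy locus (made-up sequences): leader + (repeat + spacer) x 3 + repeat + trailer
REPEAT = "GTTTTAGAGCTA"
SPACERS = ["ACGTACGTAC", "TTGACCGGTA", "CCATGGATCC"]
toy_locus = "AAAA" + "".join(REPEAT + s for s in SPACERS) + REPEAT + "TTTT"

found = extract_spacers(toy_locus, REPEAT)
print(found)
```

In the toy locus each spacer stands in for a fragment of foreign DNA captured during an earlier infection.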
CRISPR/Cas9-mediated genome editing depends on the generation of a double-strand break (DSB) and the subsequent cellular DNA repair process. In the endogenous CRISPR/Cas9 system, mature crRNA combines with trans-activating crRNA (tracrRNA) to form a tracrRNA:crRNA complex that guides Cas9 to a target site. TracrRNA is partially complementary to crRNA and contributes to crRNA maturation. At the target site, CRISPR/Cas9-mediated sequence-specific cleavage requires a DNA sequence (the protospacer) matching the crRNA and a short protospacer adjacent motif (PAM). After Cas9 binds to the target site, the DNA strand complementary to the crRNA and the opposite strand are cleaved by the HNH nuclease domain and the RuvC-like nuclease domain of Cas9, respectively, generating a DSB at the target site (Fig. 2). For easy application in genome editing, researchers designed a guide RNA (gRNA), a chimeric RNA containing all essential crRNA and tracrRNA components (21). Multiple CRISPR/Cas9 variants have been developed, recognizing 20 or 24 nt sequences matching the engineered gRNA and 2-4 nt PAM sequences at target sites. Therefore, CRISPR/Cas9 can theoretically target a specific DNA sequence of 22-29 nt, which is unique in most genomes. However, recent studies observed that CRISPR/Cas9 tolerates base pair mismatches between the gRNA and its complementary target sequence to a high degree, and that this tolerance depends on the number, positions and distribution of the mismatches (21,26-29). For instance, the CRISPR/Cas9 of Streptococcus pyogenes appeared to tolerate up to six base pair mismatches at target sites (21).
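The targeting rule described above (a protospacer followed by a PAM) can be sketched as a simple sequence scan. The Python below is a minimal illustration under stated assumptions: it scans only the given strand, treats the PAM as lying immediately 3' of the protospacer, and defaults to a 20 nt protospacer with the NGG PAM of Streptococcus pyogenes. The function name `find_target_sites` and the toy sequence are hypothetical, and only the IUPAC codes N and W are supported.

```python
import re

# IUPAC nucleotide codes needed for the PAMs discussed in this review.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def find_target_sites(seq: str, pam: str = "NGG", protospacer_len: int = 20):
    """Yield (start, protospacer, pam_found) for candidate sites in `seq`.

    A candidate site is `protospacer_len` nucleotides immediately followed
    by a sequence matching `pam`. Only the given strand is scanned.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))")
    for match in pattern.finditer(seq):
        yield match.start(), match.group(1), match.group(2)


# Toy sequence (made up): 2 nt of flank, a 20 nt protospacer, a TGG PAM, 4 nt of flank.
demo = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
sites = list(find_target_sites(demo))
for start, protospacer, pam_found in sites:
    print(start, protospacer, pam_found)
```

A practical design tool would also scan the reverse complement strand and then rank candidates by their predicted off-target sites; neither step is attempted here.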
The DSB generated by CRISPR/Cas9 triggers cellular DNA repair processes, including nonhomologous end-joining (NHEJ)-mediated error-prone DNA repair and homology-directed repair (HDR)-mediated error-free DNA repair. NHEJ-mediated DNA repair can rapidly ligate the DSB but generates small insertion and deletion mutations (indels) at target sites. These mutations can be used to disrupt or abolish the function of target genes or genomic elements. For instance, Gratz et al. generated frame-shifting indels at the yellow locus of the Drosophila genome through CRISPR/Cas9-induced DNA cleavage followed by NHEJ-mediated DNA repair (30). A DSB can also initiate HDR-mediated DNA repair, which is more complicated than NHEJ-mediated DNA repair. HDR-mediated error-free DNA repair requires a homology-containing donor DNA sequence as the repair template. Through co-injection of Cas9, two gRNAs targeting the 5′ and 3′ sequences of the yellow locus, respectively, and a single-strand oligodeoxynucleotide template, Gratz et al. successfully replaced the yellow locus with a 50 nt attP recombination site in the Drosophila genome (30).
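Why small NHEJ-induced indels can abolish gene function follows from reading-frame arithmetic: an insertion or deletion whose net length is not a multiple of three shifts every downstream codon. The Python sketch below illustrates this with a made-up coding sequence; the function names `is_frameshift` and `codons` are hypothetical helpers, not part of any published pipeline.

```python
def is_frameshift(ref_allele: str, alt_allele: str) -> bool:
    """True if the net length change between alleles is not a multiple of 3."""
    net_change = len(alt_allele) - len(ref_allele)
    return net_change % 3 != 0


def codons(cds: str) -> list[str]:
    """Split a coding sequence into complete codons, dropping any remainder."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


# Toy coding sequence (made up) and a 1 nt deletion, as NHEJ repair might leave.
wild_type = "ATGGCTGAAACC"
mutant = wild_type[:4] + wild_type[5:]  # delete the base at index 4

print(codons(wild_type))
print(codons(mutant))
print(is_frameshift(wild_type, mutant))
```

A 3 nt deletion, by contrast, would remove one codon but leave the downstream reading frame intact.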
Compared with ZFNs and TALENs, CRISPR/Cas9 has several advantages. ZFNs and TALENs are built on protein-guided DNA cleavage, which requires complex and time-consuming protein engineering, selection and validation. In contrast, CRISPR/Cas9 needs only a short programmable gRNA for DNA targeting, which is relatively cheap and easy to design and produce. By using Cas9 and several gRNAs with different target sites, CRISPR/Cas9 can simultaneously induce genomic modifications at multiple independent sites (26). This technology can accelerate the generation of transgenic animals with multiple gene mutations (31,32), and disrupt multiple genes or

Transcription regulation
Gene transcription regulation in living organisms is very useful for gene function and transcriptional network studies. Through

(58). CRISPRi provides a novel, highly specific tool for switching gene expression without genetically altering the target DNA sequence.

Gene therapy
Precise genome editing has the potential to permanently cure diseases by disrupting endogenous disease-causing genes, correcting disease-causing mutations or inserting new protective genes (61-66). Using ZFN-induced HDR, Urnov et al. corrected a disease-causing gene mutation in human cells for the first time (61). Subsequently, ZFNs were used to correct the gene mutations causing sickle-cell disease (63) and hemophilia B (62). By disabling virulence genes or inserting protective genes, ZFNs have been used to induce resistance to virus infection in human cells (67-69) and to enhance the efficiency of immunotherapies (70,71). As the newest engineered nuclease system, CRISPR/Cas9 provides a novel and highly efficient genome editing tool for gene therapy studies. For instance, Ebina et al. disrupted the long-terminal repeat promoter of the HIV-1 genome using CRISPR/Cas9, which significantly decreased HIV-1 expression in infected human cells (72). Proviral genes integrated into host cell genomes can also be removed by CRISPR/Cas9 (72).
With the rapid development of induced pluripotent stem (iPS) cell technology, engineered nucleases have been applied to genome manipulation of iPS cells (73,74). The unlimited self-renewal and pluripotent differentiation capacity of iPS cells makes them very useful in disease modeling and gene therapy. Using CRISPR/Cas9, Horii et al. created an iPS cell model for immunodeficiency, centromeric region instability, facial anomalies syndrome (ICF), which is caused by DNMT3B gene mutation (75). In this study, iPS cells were transfected with plasmids expressing Cas9 and gRNA, which disrupted the function of DNMT3B in the transfected iPS cells (75). Using the same human pluripotent stem cell (hPSC) lines and delivery method, Ding et al. compared the efficiencies of CRISPR/Cas9 and TALENs for genome editing of iPS cells (76). They observed that CRISPR/Cas9 was more efficient than TALENs (76). However, there is still a long way to go before CRISPR/Cas9 can be applied clinically for gene therapy. We must ensure the high specificity of CRISPR/Cas9 for target sites and eliminate possible off-target mutations with negative effects. Careful selection of target sites, careful gRNA design and genome-wide searches for potential off-target sites are all required.

CHALLENGES
Despite the great potential of CRISPR/Cas9 in genome editing, there are some important issues that need to be addressed, such as off-target mutations, PAM dependence, gRNA production and delivery methods of CRISPR/Cas9.

Off-target mutations
Off-target mutations are a major concern in CRISPR/Cas9-mediated genome editing. Compared with ZFNs and TALENs, CRISPR/Cas9 presents a relatively high risk of off-target mutations in human cells (27). Large genomes often contain multiple DNA sequences that are identical or highly homologous to the target DNA sequence. Besides the target DNA sequence, CRISPR/Cas9 also cleaves these identical or highly homologous DNA sequences, which leads to mutations at undesired sites, called off-target mutations. Off-target mutations can result in cell death or transformation. To reduce the cellular toxicity of CRISPR/Cas9, increasing efforts are being made to eliminate its off-target mutations (26,27,29,77). To ensure the specificity of CRISPR/Cas9, it is better to select target sites that have the fewest potential off-target sites and whose gRNA has the fewest mismatches with its complementary sequence. Xiao et al. recently developed a flexible search tool, CasOT, which can identify potential off-target sites across whole genomes (77). The dosage of CRISPR/Cas9 is another factor affecting off-target mutations and should be carefully controlled (29,78). Methylation of target DNA sequences appeared not to affect the specificity of CRISPR/Cas9 (29). Additionally, converting Cas9 into a nickase can help to reduce off-target mutations while maintaining the efficiency of on-target cleavage by CRISPR/Cas9 (26).
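The idea of searching a genome for sequences that are identical or highly homologous to a gRNA target can be sketched as a brute-force mismatch scan. The Python below is a minimal illustration and is not the CasOT algorithm: it counts mismatches only, ignores their positions, the PAM and the reverse strand, and uses a toy 10 nt guide. The function names and sequences are hypothetical.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_offtarget_candidates(genome: str, guide: str, max_mismatches: int = 3):
    """Return (start, site, mismatches) for every window within the threshold.

    Brute-force scan of the given strand only; a perfect match (0 mismatches)
    is reported as well, so the intended on-target site appears in the output.
    """
    genome, guide = genome.upper(), guide.upper()
    n = len(guide)
    hits = []
    for start in range(len(genome) - n + 1):
        window = genome[start:start + n]
        mismatches = count_mismatches(window, guide)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Toy example (made-up sequences): one perfect site and one site with 1 mismatch.
guide = "GATTACAGGC"
genome = "TTTT" + "GATTACAGGC" + "AAAA" + "GATTACTGGC" + "TTTT"
hits = find_offtarget_candidates(genome, guide, max_mismatches=1)
for hit in hits:
    print(hit)
```

Because mismatch tolerance depends on the positions and distribution of mismatches as well as their number, a realistic scoring scheme would weight mismatches by position rather than simply count them.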

PAM dependence
Theoretically, CRISPR/Cas9 can be applied to any DNA sequence through an engineered programmable gRNA. However, in addition to gRNA/target sequence complementarity, the specificity of CRISPR/Cas9 requires a 2-5 nt PAM sequence located immediately downstream of the target sequence (21). The identified PAM sequences vary among Cas9 orthologs, such as the NGG PAM from Streptococcus pyogenes (21,79), the NGGNG and NNAGAAW PAMs from Streptococcus thermophilus (22,80,81) and the NNNNGATT PAM from Neisseria meningitidis (36,82).

gRNA production
gRNA production is another important issue for CRISPR/Cas9-mediated genome editing. Because mRNA transcribed by RNA polymerase II undergoes extensive posttranscriptional processing and modification, it is currently difficult to use RNA polymerase II for gRNA production. RNA polymerase III-dependent U3 and U6 snRNA promoters are currently used to produce gRNA in vivo. However, U3 and U6 snRNA genes are ubiquitously expressed housekeeping genes, so their promoters cannot be used to generate tissue- and cell-specific gRNA (83). The lack of commercially available RNA polymerase III also limits the application of U3- and U6-based gRNA production. Gao et al. designed an artificial gene, RGR, whose transcript contains the desired gRNA flanked by ribozyme sequences at both ends (83). After self-catalyzed cleavage, mature gRNAs were produced and successfully induced sequence-specific cleavage in vitro and in yeast (83).

Delivery methods
Questions also remain regarding the methods for delivering CRISPR/Cas9 into organisms. DNA and RNA injection-based techniques are used for CRISPR/Cas9 delivery, such as injection of plasmids expressing Cas9 and gRNA (30) and injection of CRISPR components as RNA (43,44). The efficiency of a delivery method depends on the types of target cells and tissues. More attention should be paid to developing novel and robust delivery methods for CRISPR/Cas9.

CONCLUSION
Genome editing was initially applied to Drosophila melanogaster (84,85) and rapidly extended to a broad range of organisms. An ideal genome editing tool should allow simple, efficient and low-cost assembly of nucleases that can target any genomic site without off-target mutations. CRISPR/Cas9 has the potential to become a reliable and facile genome editing tool once some issues have been addressed. Owing to its simplicity and adaptability, CRISPR/Cas9 opens the door to revealing gene function in biology and correcting gene defects in disease. Further studies are necessary to explore the characteristics and improve the performance of CRISPR/Cas9, especially its specificity, off-target effects and delivery methods. For instance, results from recent genome-wide deep sequencing will be helpful for selecting suitable target sites and designing highly specific gRNAs.
Conflict of Interest statement. None declared.