DANGER analysis: risk-averse on/off-target assessment for CRISPR editing without a reference genome

Abstract Motivation The CRISPR-Cas9 system has successfully achieved site-specific gene editing in organisms ranging from humans to bacteria. The technology efficiently generates mutants, allowing for phenotypic analysis of the on-target gene. However, some conventional studies did not investigate whether deleterious off-target effects partially affect the phenotype. Results Herein, we present a novel phenotypic assessment of CRISPR-mediated gene editing: Deleterious and ANticipatable Guides Evaluated by RNA-sequencing (DANGER) analysis. Using RNA-seq data, this bioinformatics pipeline can elucidate genomic on/off-target sites on mRNA-transcribed regions related to expression changes and then quantify phenotypic risk at the gene ontology term level. We demonstrated the risk-averse on/off-target assessment in RNA-seq data from gene-edited samples of human cells and zebrafish brains. Our DANGER analysis successfully detected off-target sites, and it quantitatively evaluated the potential contribution of deleterious off-targets to the transcriptome phenotypes of the edited mutants. Notably, DANGER analysis harnessed de novo transcriptome assembly to perform risk-averse on/off-target assessments without a reference genome. Thus, our resources would help assess genome editing in non-model organisms, individual human genomes, and atypical genomes from diseases and viruses. In conclusion, DANGER analysis facilitates the safer design of genome editing in all organisms with a transcriptome. Availability and implementation The Script for the DANGER analysis pipeline is available at https://github.com/KazukiNakamae/DANGER_analysis. In addition, the software provides a tutorial on reproducing the results presented in this article on the Readme page. The Docker image of DANGER_analysis is also available at https://hub.docker.com/repository/docker/kazukinakamae/dangeranalysis/general.

However, genome editing using CRISPR technology presents two challenges that have not been addressed in previous studies.First, phenotypic effects caused by unexpected CRISPR dynamics are not quantitatively monitored.CRISPR-Cas9 is well known for unexpected sequence editing (off-target site) with mismatches (MMs) when compared to protospacers and PAM (Hsu et al. 2013).Off-target gene editing results in incorrectly edited mRNA, unexpected phenotypes, and decreased expression of unrelated genes.Some reports predicted and detected off-target editing using genomic PCR and DNA sequencing analysis (Fu et al. 2014, Kim et al. 2015, Tsai et al. 2015, Luo et al. 2019), but most studies have not assessed the phenotypic effect of the detected off-targets.Second, CRISPR technology generally depends on basic genomic data, including the reference genome.CRISPR technology has potential applications in organisms with incompletely characterized genomes.However, the design of site-specific sgRNAs requires the factual genomic sequence of materials to be treated with CRISPR technology.This hindrance also emerges in the human genome, particularly in the genomes of patients and cancer genomes.These genomes are assumed to be completely distinct from the reference genome (Levy et al. 2007, Vogelstein et al. 2013).The off-target is always "unexpected."Thus, we need a method to observe factual genomic sequences and reduce potential off-target effects.
We devised a method to overcome the two challenges above: phenotypic risk and dependence on a reference genome.Phenotypic risk can be assessed by phenotype analysis using gene ontology (GO) annotation (Ashburner et al. 2000, Gene Ontology Consortium 2021).GO has been widely used for several decades (Bell et al. 2018, Hughes et al. 2020, Bono 2021, Suzuki et al. 2021, Tamura and Bono 2022, Toga et al. 2022).Recently, many RNA-sequencing (RNA-seq) data and mapped genes have been annotated with GO terms to characterize the transcriptome phenotype under a specific condition of the organism.This process is known as enrichment analysis (Subramanian et al. 2005).We expected that we could quantitatively assess the phenotypic risk of off-target genes if each off-target gene with decreased expression was annotated with GO terms.Moreover, de novo transcriptome assembly technology can address the dependency problem of reference genomes.The de novo transcriptome assembly can generate transcriptome sequences without the reference genome using RNA-seq data (Grabherr et al. 2011, Khudyakov et al. 2017, Nespolo et al. 2018, Ho ¨lzer and Marz 2019, Lipka et al. 2019).We identified factual genomic sequences in mRNAtranscribed regions using de novo transcriptome assembly from gene-edited organisms and cells.
In this study, we combined de novo transcriptome assembly and GO annotation analysis in CRISPR editing to establish a DNA on/off-target assessment, including phenotype risk analysis without a reference genome.We named it Deleterious and ANticipatable Guides Evaluated by RNA-sequencing (DANGER) analysis (Fig. 1).This bioinformatics pipeline can elucidate genomic on/off-target sites based on de novo transcriptome assembly using RNA-seq data.Then, it identifies the "deleterious off-targets," defined as off-targets on the mRNA-transcribed regions that represent the downregulation of expression in edited samples compared to wild-type (WT) ones.Furthermore, our pipeline can quantify phenotypic risk at the GO term level by calculating a newly defined indicator of phenotypic risk by the deleterious off-targets, named the D-index.

Implementation of DANGER analysis
The Our pipeline of DANGER analysis is composed of several processes: "Quality control & Adapter trimming," "rRNA Removal," "de novo transcriptome assembly," " Removal of redundancy," "Detection of on-target and potential off-target sites," "Expression quantification," "Search for deleterious off-target sites," "Identification of ORFs and Genes," "GO analysis," and "Validation for phenotypic risk" (Fig. 2A).
DANGER analysis examines paired-end RNA-seq data derived from WT and edited samples using the processes depicted in Fig. 1.The pipeline generates a de novo transcriptome assembly, an expression profile of transcripts belonging to on/off-target sites, and an estimation of the phenotypic risk for off-target sites.The script has been uploaded to our GitHub repository (https://github.com/KazukiNakamae/DANGER_analysis).The analyses were performed on Docker with Ubuntu v. 22.04.1,LTS, and 235 GB of memory.The scripts for this processing pipeline were released as a Docker image (https://hub.docker.com/r/kazukinakamae/dangeranalysis), enabling operation on various operating systems beyond Linux using this Docker image.Each process is explained in detail below.

Quality control and adapter trimming
Quality control and adapter trimming were performed using Cutadapt v. 1.18 (Martin 2011), which also removed low-quality reads.The adapter sequences used were "AGATCGGAAGAG."

De novo transcriptome assembly
The de novo transcriptome assembly was performed using Trinity v. 2.12.0 (Grabherr et al. 2011).The merged read files were composed of RNA-seq data derived from WT samples.Transcriptome completeness was assessed using BUSCO v. 5.2.2_cv1 (Manni et al. 2021).The benchmarking universal single-copy orthologs (BUSCO) evaluated the competence of assembly using estimation of similarity to gene database (BUSCO genes) and classified hit sequence into "complete" (including "single-copy" and "duplicated"), "fragmented," and "missing."The databases used for conserved mammalian BUSCO genes were "mammalia_odb10" and "actinopterygii_odb10" for human and zebrafish assemblies, respectively.

Removal of redundancy
The expression of RNA-seq data derived from WT samples was quantified in advance to remove transcripts with low expression levels.Quantification was performed with align_and_estimate_abundance.pl of Trinity v. 2.12.0.The removal of transcripts was performed with filter_low_expr_ transcripts.pl of Trinity v. 2.12.0.

Detection of on-target and potential off-target sites
Detection of on/off-target sites in de novo transcriptome assembly was performed with Crisflash v.1.2.0 (Jacquin et al. 2019).Our off-target detection focused on up to 8 or up to 11 nt MMs and NGG or NRR PAM because the previous offtarget reports indicated that off-target sites with !5 nt MMs and NGG/NAG/NGA/NAA PAM exist (Tsai et al. 2015, Corsi et al. 2022), although not with a high frequency.
Specifically, we searched for potential off-target sites by executing the following command: crisflash -g Result_denovo_transcriptome_ fde novo transcriptome assembly generated from WT RNA-seq datag -s fsequence of protospacer and PAMg -o foutput file of Crisflashg -m fthe maximum number of mismaches users considerg -p fPAMg -t fthe number of threadg -C.
The output file of Crisflash has on-/off-target locations, off-target sequences, and MM numbers included in the tab-delimited format (Cas-OFFinder format).The on-target locations were extracted using perfect matching with the expected genomic sequence of Cas9-sgRNA binding.The potential off-target sites were classified by the MM number in each text file.
The detection of on/off-target sites uses one de novo transcriptome assembly generated by merging all replicates of RNA-seq data from WT samples to ensure the best quality of assembly.Thus, we did not obtain replicates of numerical data related to off-target sites or conduct statistical evaluations, such as P-values using the replicated data.The analysis design is due to concerns that utilizing de novo transcriptome assembly generated from individual replicates might diminish the accuracy of the analysis.

Expression quantification
Using filtered de novo transcriptome assembly, we quantified the expression of RNA-seq data derived from each sample.Quantification was performed using align_and_estimate_abundance.pl of Trinity v. 2.12.0 (Grabherr et al. 2011).The transcripts per million (TPM) dataset was constructed from "RSEM.isoforms.results" of each output directory and was saved as a single CSV file.
The ratio of TPM values on a transcript was calculated with the following formula: A transcript with a TPM ratio of less than the user-defined value (t) was determined as downregulated TPM (dTPM).Furthermore, in place of TPM, it is compatible with profiling based on Differentially Expressed Genes (DEG).The DEG analysis was performed to compare different analysis methods.The read count data were extracted from the "RSEM.isoforms.results" of each output directory in Section 2.1.6.The raw count data were normalized by the Tag Count Comparison (TCC) R package (Sun et al. 2013) with the parameter "norm.method¼'tmm',test.method¼'edger',iteration ¼ 30, FDR ¼ 0.1, floorPDEG ¼ 0.05" to detect DEG between RNA-seq data derived from WT and edited samples.The MA plot was constructed using an in-house R script.If a DEG transcript had a negative log-ratio of normalized counts (M-value) and its P-value fell below the user-defined value (a), it was determined to be downregulated DE (dDE).It was saved as a single CSV file.

Search for deleterious off-target sites
Our pipeline defined an off-target site where the transcript was annotated with dTPM or dDE as a deleterious off-target site, which could be also paraphrased as "actual off-target site."We counted the number of deleterious off-target sites using an in-house Python script, which required the offtarget site profile and the TPM ratio or DEG described above.

Identification of ORFs and genes
Our pipeline identifies open reading frames (ORFs) in the filtered de novo transcript sequences and predicts the corresponding amino acid sequences.If these predicted sequences exhibit significant homology with protein sequences in a database, we assign them as genes.The ORFs and genes of the filtered de novo transcript were estimated with TransDecoder v. 5.5.0 and ggsearch v. 36.3.8 g using a protein database.The process was based on the Systematic Analysis for Quantification of Everything pipeline (https://github.com/bonohu/SAQE)(Bono 2021).In particular, "11TransDecoder.sh,""12GetRefProts.sh,""15ggsearch.sh,""15parseggsearch.sh,"and "15mkannotbl.pl"were used with the supplemental script "00_prepare_faa_4Fanflow.sh" in the GitHub repository (https://github.com/RyoNozu/Sequence_editor).The referred protein databases were the Ensembl databases of all translations resulting from Ensembl genes in humans and zebrafish.The DANGER analysis database can be manually customized with the organism from which the analyzed RNA-seq data were extracted.
The above identification was based on one de novo transcriptome assembly generated by merging all replicates of RNA-seq data from WT samples to ensure the best quality of assembly.Thus, we did not obtain replicates of numerical data related to the identifications or conduct statistical evaluations, such as P-values using the replicated data.The analysis design is due to concerns that utilizing de novo transcriptome assembly generated from individual replicates might diminish the accuracy of the analysis.
2.1.9GO enrichment analysis GO annotations of gene ontologies were performed using an in-house R script using the org.Hs.eg.db and org.Dr.eg.dbR packages for humans and zebrafish, respectively.Enrichment analysis followed by GO annotations was performed using the topGO R package against genes whose off-target sites were determined to be deleterious off-target sites.Enrichment analyses were performed per off-target MM number.Finally, the enrichment tables were merged into a single table, named the DANGER table, with the MM number annotated.

Validation for phenotypic risk
We defined the following value (D-index) to evaluate the phenotypic risk posed by deleterious off-target effects per GO term.Our analysis requires RNA-seq data derived from WT and edited (each n !3).DANGER analysis has two steps in the workflow: (i) de novo transcriptome assembly (upper background box) and (ii) annotation analysis (lower background box).The de novo transcriptome assembly step is processed with Trinity and preprocessing tools, such as cutadapt and bbduk.sh.Crisflash performs the search of on/off-target sequences.The RSEM quantifies gene expression in edited RNA-seq samples in comparison to the WT de novo transcriptome (dot allow).The step of annotation analysis was involved processing with TransDedoder, ggsearch, org.XX.eg.db (e.g.org.Hs.eg.db in the transcriptome related to humans), and topGO.We implemented specific modules, colored in pink, for considering the phenotypic effect of deleterious off-targets.(B) Comparison between the hg38 reference genome and transcript sequence constructed by de novo assembly of RNA-seq samples derived from WT iPSC-derived cortical neurons on the GRIN2B on-target region.The on-target region of the hg38 reference genome is illustrated with annotations of the GRIN2B CDS, the protospacer, and the NGG PAM sequence of SpCas9.The detected GRIN2B isoforms (1-5) are lined up in the box.The Cas9-sgRNA binding sites are highlighted.(C) Genome completeness of de novo transcriptome assembly RNA-seq data derived from WT iPSC-derived cortical neurons was assessed using conserved mammal BUSCO genes (mammalia_odb10).The result was 79.1% of "complete," 20.7% of "single-copy," 58.4% of "duplicated," 3.2% of "fragmented," and 17.7% of "missing" (n ¼ 9226).

4
Nakamae and Bono where m indicates the number of MMs.The N(m) represents the total number of genes, which have m bases MMs, included in a specific GO term.The m max is the maximum number of MMs that a user considers.The D-index considers both the phenotypic risk and the frequency of off-target effects.Based on previous reports, we used the exponential function to express the frequency of off-targets because the frequency of off-targets tends to decrease exponentially as the MM number of off-target sites increases (Tsai et al. 2015, Fu et al. 2022).Based on a GUIDE-seq study (Tsai et al. 2015), most reported off-targets possess MMs of four or fewer nucleotides (Supplementary Table S1).Therefore, the exponent value is represented as a decreasing function by subtracting the number of MMs from four, and by minimizing the impact of MMs with five or more nucleotides, we have enabled the probabilistic risk assessment at an appropriate level.Calculations were performed using an in-house Python script.

Validation of D-index
We established a validation methodology of statistical significance for the D-index, determined as described above, for each set of GO terms using a permutation test.The permutation test involved random shuffling of the expression profile and off-target profile based on a given seed value, and the D-index was computed based on these randomly shuffled profile data.We will refer to this D-index as a pseudo-D-index.
After repeating this process of creating pseudo-D-indexes 100 times, we made a distribution of pseudo-D-indices for each set of GO terms.A D-index outside the (1ÀL)Â100% confidence interval of this distribution and higher than the mean value was defined as a "significant D-index."Additionally, we implemented a script to measure the falsepositive rate of the permutation test.In false-positive detection, 10 additional shuffled datasets were generated using seed values different from the ones used to create the shuffled expression and MM profiles, and the newly calculated pseudo-D-indices from these data were calculated.If they met the criteria for a significant D-index in the above distribution, the pseudo-D-indices were defined as "significant pseudo-Dindex" and counted.The ratio of the significant pseudo-Dindices to the total number of new pseudo-D-indices was defined as the false-positive rate.Multiplying this false-positive rate by the total number of original D-indices allows us to estimate the number of expected false D-indices, and subtracting this from the total number of original D-indices enables us to count the number of expected true D-indices.The analyses were performed on Docker with Ubuntu v. 22.04.1,LTS, and 235 GB of memory.The scripts for this processing pipeline were released as a Docker image (https:// hub.docker.com/r/kazukinakamae/dangertest),enabling operation on various operating systems beyond Linux by using this Docker image.

Datasets
Two RNA-seq datasets were analyzed to evaluate our pipeline.First, we collected paired-end RNA-seq datasets, which had a total of >100 M reads for all WT samples and an average length of >100 nt, to ensure good quality, which indicated a percentage of complete BUSCO genes of >70%, of transcriptome completeness after de novo transcriptome assembly.The datasets were downloaded from the Sequence Read Archive (Table 1).

GRIN2B
The dataset was extracted from human iPSC-derived cortical neurons with or without indels generated by paired Cas9n-sgRNA (GRIN2B-FW and GRIN2B-REV sgRNAs) on the GRIN2B locus (Bell et al. 2018).Previous studies have established clones with indels resulting in loss-of-function (LOF) and reduced dosage RD.However, we focused only on LOF samples that had been edited and analyzed using our pipeline.Gorodkin and Seemann have previously reported that offtarget sites affect the expression profile of LOF samples using reference-based RNA-seq analysis.Our study used the GRIN2B dataset to benchmark de novo transcriptome assembly-and reference-based RNA-seq analyses (Corsi et al. 2022).Moreover, we profiled the on/off-target assessment of GRIN2B-REV sgRNA.

Park7
The dataset was extracted from the zebrafish brain with or without biallelic indels generated by a single Cas9-sgRNA at the park7 locus (which encodes DJ-1) (Hughes et al. 2020).The analyzed F2 mutants were generated from a cross between two heterozygous F1 mutants.We used the dataset as an in vivo example of the DANGER analysis pipeline in a simple CRISPR-Cas9-mediated knock-out experiment.

Statistical analysis
Plots were made using Microsoft Office and housemade Python scripts.The Exact Fisher's test was performed for the P-value was calculated accordingly using fisher_exact() of the scipy package in Python.The 2-tailed Welch's t-test was performed for the P-value was calculated accordingly using ttest_ind(equal_var¼False) of the scipy package in Python.We used G power software for the statistical power (1-ß) calculation.The Venn diagrams were generated using web software ("Calculate and draw custom Venn diagrams": https:// bioinformatics.psb.ugent.be/webtools/Venn/).

Results and Discussions
3.1 Assessment of CRISPR-Cas9 off-targets using DANGER analysis for RNA-seq data from in vitro differentiated human iPSC We investigated whether our DANGER analysis pipelines could detect deleterious off-target sites without information on the reference genome.First, we applied DANGER analysis to the GRIN2B dataset, which was extracted from human iPSC-derived cortical neurons with or without in-frame deletions at the GRIN2B locus (Bell et al. 2018).The obtained de novo transcriptome assembly contained five isoforms in which on-target sequences of sgRNA (GRIN2B-REV) (Fig. 2B) were located.The assembly comprised 342 910 contigs and exhibited BUSCO transcriptome completeness of 79.1% (Fig. 2C).Previous studies on de novo transcriptome assembly using Trinity reported 64.7%, 77.1%, 80%, and 87% complete BUSCO genes in higher animals, such as Homo sapiens (Ho ¨lzer and Marz 2019), Castor fiber L. (Lipka et al. 2019), Mirounga angustirostris (Khudyakov et al. 2017), and Dromiciops gliroides (Nespolo et al. 2018), respectively.We successfully obtained a de novo transcriptome with standard quality.Consequently, the pipeline DANGER analysis performed an exhaustive search using Crisflash, yielding 33 878 potential off-target sites with up to eight bases MM and NGG PAM, using transcriptome assembly.
Next, we quantified the transcriptome-wide expression of each of the four RNA-seq samples from the WT and edited GRIN2B loci.Our pipeline examines whether the potential off-target site, which is output of Crisflash, is in the transcript and whether it also reduces the expression value.The pipeline considers those potential off-targets that have confirmed down-expression as deleterious off-targets, in other words, actual off-targets.The reduction of expression may occur because nonsense-mediated mRNA decay (Kurosaki et al. 2019) destroys incomplete transcript sequences resulting from offtargeting, or alignment software fails to map the incomplete transcript sequences to the untreated transcript sequence (Srivastava et al. 2020).The DANGER analysis screens transcripts with lower expression levels in edited samples compared to WT samples.In general, there are several criteria for estimating expression using RNA-seq.We implemented two criteria, "downregulated different expression (dDE)" and "dTPM," for the detection of downregulated transcripts (Fig. 3A; two callouts).Our dDE criterion uses DEG analysis with TCC normalization (see Section 2).In the case of DEs in a transcript with a negative M-value and a P-value that was less than the threshold value (a) in the MA plot (Fig. 3A; right callout), we defined the transcript as "dDE."On the other hand, the "TPM" criterion uses the normalized value, named TPM, as first defined by (Wagner et al. 2012) and calculates the ratio of TPM between WT and the edited samples.When the ratio of a transcript is less than the threshold value (t), we defined the transcript as "dTPM" (Fig. 3A; left callout).We confirmed the number of transcripts with dDE (a ¼ 0.001) or dTPM (t ¼ 0.4) annotation that contained off-target sites (Fig. 3A; Venn diagram).There were 730 transcripts with dDE off-targets.A total of 12 747 transcripts with dTPM offtargets were detected, which was $17-fold more than those with dDE off-targets.Our DANGER analysis aims to serve as a screening tool that emphasizes maximizing the estimation of potential risks by capturing as many sites suspected of phenomena as knock-out or knockdown of off-target genes.From this perspective, the dTPM approach can estimate the deleterious off-target effects to the greatest extent, more so than the dDE approach.Furthermore, we focused on the offtargets detected in common by dTPM and dDE.We observed their MM count (Supplementary Fig. S1A).As a result, we confirmed that the off-targets seen in common were distributed at an even ratio for each MM count compared to the offtargets detected only by dTPM (Supplementary Fig. S1B).Given that the number of MMs affects the likelihood of offtarget occurrences, there is no correlation between the common off-targets and their occurrence rate.While there are differences in the number and types of transcripts detected by dTPM and dDE, no evidence suggests that these differences result from a flaw in either approach.Therefore, our pipeline has adopted both methods.
The Gorodkin and Seemann group previously established the on/off-target assessment pipeline (CRISPRroots) (Corsi et al. 2022).They then used STAR (Dobin et al. 2013) to perform expression analysis on the same RNA-seq data of GRIN2B using reference-based mapping.They suggested that two off-target sites (ALK: gcAgaCTGGtTGGAAGCaC CNGG, GBA2: cccTTCcGGccGGAAGCGCCNGG) were binding with the Cas9-GRIN2B-REV sgRNA (AGATTCTG GGTGGAAGCGCCNGG)-DNA seed and were linked to downregulated expression (namely, "RISK: CRITICAL").The definition of off-targets in their study was similar to our idea of deleterious off-targets.Thus, we compared our list of deleterious off-targets in dTPM and dDE with the two offtargets detected using CRISPRroots.As a result, in the DANGER analysis, the off-target of CRISPRroots in the ALK locus was detected by both dTPM (t ¼ 0.4) and dDE (a ¼ 0.001) criteria considering up to eight base MMs with NGG PAM, whereas the off-target of CRISPRroots in GBA2 was not detected by any criteria (Fig. 3B).The absence of GBA2 was understandable because the off-target site on the GBA2 locus was in the promoter region rather than the GBA2 transcript.Additionally, more stringent expression analyses were conducted using dTPM (t ¼ 0.2) and dDE (a ¼ 0.0001).As a result, although 125 off-target genes were detected, neither dTPM nor dDE could identify ALK as an off-target gene (Supplementary Fig. S2).This result suggested the possibility of an increase in false negatives when the threshold for detection of expression decreases resulting from off-target effects is overly strict.It is therefore considered appropriate to set t ¼ 0.4 in dTPM and a ¼ 0.001 in dDE.Our de novo transcriptome approach focuses solely on the potential off-target genes in the transcribed region of the genome.Additionally, unlike traditional reference-based RNA-seq analysis, this approach can provide novel insights.For example, the offtarget search of DANGER analysis detected a transcript with an off-target site downstream GALR2 locus (Fig. 3C).The genome database annotation was not the off-target site of the transcribed region (XM_047436984.1).However, the de novo transcriptome assembly included the site in the transcript (TRINITY_DN86617_c0_g1_i1).This result indicates that our DANGER analysis pipeline can detect bona fide transcripts, which have never been annotated in the reference genome database because of cell-specific transcription, personal genomic variants, and inadequate genomic locus study.Additionally, the de novo transcriptome assembly had 260 770 transcript annotations (contigs), which may include transcript variants that partially came from allelic heterogeneity.The annotation size of DANGER analysis is about 10 times larger than that of CRISPRroots, whose gene annotations were about 25 k.DANGER Analysis was expected to make a larger off-target dataset of transcribed regions using the transcript-aware annotations compared to referencebased analysis.Although the detection range of our DANGER analysis is limited to the transcribed region, our pipeline using dTPM and dDE detected 13 237 and 813 offtarget sites with 0-8 MMs in identified and unidentified transcripts, respectively.Two thousand two hundred thirty-six and 407 gene-annotated off-target sites with four to eight MMs, respectively.In contrast, genome-wide and reference-based RNA-seq analysis (CRISPRroots) features only two offtarget sites in genes with six MMs.There was a large discrepancy in the detection number of off-targets between DANGER Analysis and CRISPRroots.We demonstrated that the DANGER analysis could generate a comprehensive and factual record of off-target sites (Fig. 3D).Finally, our pipeline evaluated the phenotypic risk of deleterious off-targets.Various studies have used GO analysis to assess phenotypic effects (Bell et al. 2018, Hughes et al. 2020, Bono 2021, Suzuki et al. 2021, Tamura and Bono 2022, Toga et al. 2022).However, this study also considered off-target frequency because deleterious off-targets with many MMs were expected to occur (Hsu et al. 2013, Fu et al. 2022).Thus, our pipeline counted off-target genes associated with specific GO terms per MM number (Fig. 4A) and then combined the results with the MM effect by calculating the D-index per GO term (Fig. 4B; see Section 2).The D-index is calculated by multiplying the number of genes containing a GO term for each MM number by a decreasing exponential function with the MM number as the exponent.This approach allows us to consider the number of genes hit by the GO terms and the number of MMs.Moreover, it can suppress the influence of the number of genes hit when the MM number is large.The formula can prioritize evaluating the number of genes hit when the MM number is small.With these two characteristics, the D-index represents a unique off-target metric that considers the impact on phenotype.The sum of the D-index value (total Dindex) in detected GO terms was 6228 (N ¼ 9896) (Fig. 4C and Supplementary Table S2) in the GRIN2B dataset.The DANGER analysis was used to evaluate the phenotypic risk at the GO level using RNA-seq of the human GRIN2B dataset data without any reference genomes.

Evaluation of D-index and optimization of DANGER analysis
Using the D-index, we quantified the phenotypic impact from off-targets at the GO term level.However, a different threshold must be set for each GO term when evaluating the statistical significance of the D-index values because GO is a loosely hierarchical annotation concept, and "parent" terms appear as annotations of various genes even if there is less relationship with off-targets.To address this issue, we implemented a permutation test system to estimate the significance threshold for each GO term (Fig. 5).In this system, after randomly shuffling the expression profile and off-target profile, we repeat the process of calculating a meaningless D-index (named pseudo-D-index) by applying the D-index formula 100 times.We then create a null distribution from the 100 pseudo-Dindex values for each GO term and define a significant D-index as an originally obtained D-index value that exceeds the (1ÀL)Â100% confidence interval of the null distribution.The methodology and the threshold allow us to extract only the D-indices of the GO terms suspected of having an offtarget-associated impact on the phenotype.
Moreover, we devised a scheme to assess the validity of the permutation test system (Supplementary Fig. S3).In the validation scheme, followed by generating several different pseudo-D-indices, the number of the newly generated pseudo-D-indices exceeding the threshold of the previous null distribution is counted.We defined the ratio of the count to the total number of D-indices as the false detection rate in the permutation test.We used the evaluation scheme to verify how the false detection rate changes under various conditions in DANGER Analysis.No significant influence was observed from the threshold of the expression analysis (t, a) or the conditions of off-target search (Supplementary Fig. S4A-C).We confirmed L ¼ 1E-15 confidence interval threshold reduced the false detection rate by more than half in the case of L ¼ 5E-1.Here, we named the condition of dTPM with high false positives (up to 11-MM NRR PAM, t ¼ 0.4, L ¼ 5E-1) as "Approximate dTPM," the condition of dTPM with low false positives (up to 8-MM NGG PAM, t ¼ 0.4, L ¼ 1E-15) as "Optimized dTPM," and the condition of dDE with low false positives (up to 8-MM NGG PAM, a ¼ 0.001, L ¼ 1E-15) as "Optimized dDE."When comparing the three conditions, the optimized dDE showed the lowest false detection rate (Fig. 6A).Next, we calculated the total number of D-indices, significant D-indices, and true significant D-indices (Expected True D-indices) estimated from the False Detection Rate for these three conditions.The total number of D-indices was more than twice as high for dTPM as for dDE, while the number of Expected True D-indices was less than half for dTPM compared to dDE (Fig. 6B).The result indicates that the dTPM criterion is a "sharply-narrowing-down" approach used for initial screening, while dDE is a "meticulously trimming" approach used for a rigorous selection process.Generally, the dDE approach is recommended for use in human RNA-seq data because the criterion is expected to present fewer false detections in significant D-indices.However, in model organisms and non-model organisms with less comprehensive GO annotations than humans, the dDE approach may not yield a sufficient D-index list for the evaluation.Nakamae and Bono  Thus, the dTPM approach, which can obtain more D-indices, is expected to be more effective in RNA-seq data from less characterized organisms than humans.We evaluated the consistency of D-indices and significant D-indices detected by both optimized dTPM and optimized dDE.The concordance of each D-index and significant D-index was $52% and 1.3%, respectively (Fig. 6C), which can be attributed to the fact that optimized dTPM considers more than five times the gene-annotated off-target sites compared to optimized dDE (Fig. 3D).However, the significant D-indices commonly detected by optimized dTPM and optimized dDE corresponded to the top 16 significant D-indices in dTPM (Fig. 3C  and D).The result suggested that the value of the D-index not only served as an indicator of the phenotypic impact from offtargets but could also be an indicator of the strength of its consistency.Thus, it is recommended to conduct follow-up analyses focusing mainly on the top-ranking D-indices in optimized dTPM.
3.3 Assessment of CRISPR-Cas9 on/off-target using DANGER analysis for RNA-seq data from in vivo tissue of zebrafish We performed DANGER analysis using the RNA-seq data from human cells edited by one of the Cas9n-sgRNAs for benchmarking with the previous method (CRISPRroots) and optimized parameters for the DANGER analysis in the previous sections.Next, we investigated whether DANGER analysis could be used to analyze the RNA-seq data from the nonhuman tissue that had been in vivo edited with a single Cas9 nuclease-sgRNA, a more common experimental design for genome editing.Thus, we downloaded and analyzed RNA-seq data from the park7 dataset derived from the zebrafish brain, with and without indels at the park7 locus (Hughes et al. 2020).Our DANGER analysis successfully built a de novo transcriptome assembly with 90.9% complete BUSCO genes and detected on-target sequences in the two transcripts (Fig. 7A).Moreover, DANGER analysis revealed a significant downregulation of the transcript with Cas9-sgRNA on-target sequence in the expression quantification (Fig. 7B).The original report on the park7 dataset reported downregulated park7 mRNA in the park7 mutant using RNA-seq analysis (Hughes et al. 2020).This result implied that de novo transcriptome assembly and the following expression quantification of our DANGER analysis could generate reliable data consistent with the outcome of standard RNA-seq analysis using a reference genome.Consequently, 19 314 potential offtarget sites in all transcripts were detected by optimized dTPM, which were then defined as deleterious off-targets considering the expression profile.There were 4668 and 70 deleterious off-target sites on all and gene-annotated transcripts, respectively (Fig. 7C).The park7 result had no deleterious off-target effects with 4 MMs, which meant that frequent off-targeting was not expected in mRNA-transcribed regions.
The detected rate of gene-annotated transcripts to all transcripts was more than 10 times less than that of optimized dTPM using human GRIN2B (Figs 3D and 7C).The fewer annotations resulted from the poor gene database of zebrafish in comparison with that of humans.Next, our pipeline estimated the D-index per GO term to quantify phenotypic risk.The total D-index was 51 (N ¼ 636) (Fig. 7D and Supplementary Table S5), which was less than that of human GRIN2B due to fewer annotations and off-target genes.
Finally, we validated the D-indices using the permutation test as the same procedure in the last section.Only five significant D-indices were detected (Fig. 7E) because the analysis considered only 70 deleterious off-target sites in gene-annotated transcripts.As discussed in the previous section, the park7 analysis has empirically demonstrated that the initially considered number of genes and the total number of D-indices can be small in organisms other than humans.The screening approach of optimized dTPM allows for the acquisition of significant D-indices in poorly annotated data sets.

Comparison of phenotypic risks in the GRIN2B and Park7 datasets
In this study, we evaluated the phenotypic risks associated with off-target transcripts using the D-index.The number of significant D-indices of the GRIN2B result was $32-fold larger than that of the park7 result in the optimized dTPM.Furthermore, the DANGER analysis found off-target genes with four MMs in the GRIN2B dataset, which is common in genome-wide off-target studies, such as GUIDE-seq and Digenome-seq (Kim et al. 2015, Tsai et al. 2015) and numerous deleterious off-target sites on transcript sequences annotated with genes.Thus, Cas9n-GRIN2B-REV sgRNA may have side effects on the phenotype of differentiated human iPSC.A follow-up study is required to assess the edited GRIN2B LOF clones using WGS or alternative genome-wide methods, such as GUIDE-seq (Tsai et al. 2015).Researchers can identify some clarified points in the future using the result table of the significant D-index (Supplementary Tables S3 and  S4).For example, the GO term "nucleoside monophosphate metabolic process" (GO ID: GO: 0009123) recorded a top significant D-index for the GRIN2B result of optimized dTPM (Supplementary Table S3).The GO terms of "cell differentiation" (GO ID: GO: 0030154), "cell population proliferation" (GO ID: GO: 0008283), and "cell cycle" (GO ID: GO: 0007049) were considered as the phenotype of GRIN2B knock-out in the previous study (Bell et al. 2018).However, these GO terms were listed in the significant D-indices of optimized dDE (Supplementary Table S4 and Table 2).The previous study showed that the gene expression changes of these GO terms resulted from on-target editing of the GRIN2B locus (Bell et al. 2018).However, the results of the DANGER analysis suggested that off-target editing of additional genes belonging to GO: 0030154, GO: 0008283, and GO: 0007049 partially contributed to expression changes.Followup studies should include off-target gene analysis of the associated off-targets with the GO terms.On the other hand, The GO terms of "central nervous system development" (GO ID: GO: 0007417), "brain development" (GO ID: GO: 0007420), "cell division" (GO ID: GO: 0051301), and "chromosome segregation" (GO ID: GO: 0007059) were not listed in the significant D-indices of optimized dTPM and optimized dDE, which suggested the GO terms were obvious phenotypes in the GRIN2B knock-out cells.Therefore, DANGER analysis would help reach a reasonable conclusion in genome editing studies.

Limitations
In this section, we discuss the major limitations of the proposed pipeline.First, our method depends on the quality of de novo transcriptome assembly using Trinity.Pair-end RNAseq data with sufficient length and read number must be used to guarantee high-quality assembly (see Section 2).If we fail to build an exemplary assembly, producing reliable data for the following analyses, such as on/off-target analysis and expression profiles, becomes difficult.Second, the annotation analysis step in our pipeline may fail to annotate transcripts adequately due to limited information from databases on genes, transcripts, proteins, and gene function.When researchers apply our pipeline to RNA-seq data from organisms with limited genomic knowledge and evidence, we recommend using a database of a model organism with a strong genomic relationship with the organism being analyzed.DANGER analysis can analyze genome-edited samples without a reference genome, but studies of a related model organism with a well-annotated genome are still required.As a third limitation, DANGER analysis cannot strictly distinguish the effects of modifications to on-target genes on other genes from the impacts of off-target gene modifications.Of course, an on-target gene is excluded from the DANGER analysis.
Still, it is difficult to distinguish the influence by considering off-target genes whose expression is controlled by the ontarget genes.Such an evaluation can only be elucidated through protein interaction and genome analysis conducted using more specialized knowledge by researchers in each field.Therefore, it is appropriate to complete the follow-up studies mentioned in the previous section using a comprehensive analysis.Traditional studies have discussed results based on RNA-seq analysis under the premise that they are solely derived from the effects of on-target gene modifications.However, our DANGER analysis contradicts this assumption, sounding an alarm about the necessity for more specialized investigations and providing the off-target gene information needed for such follow-up analyses.Additionally, gene network analyses using found off-target genes can help users exclude false detection if the target organism of DANGER Analysis is a model organism whose gene database is well established.

Possibility of DANGER analysis as a simplified screening tool, contributing to more rigorous reference-based phenotypic risk assessment
In this study, we developed DANGER analysis as an initial screening tool for maximizing the evaluation of risks to phenotypes.Meanwhile, it is conceivable that we will also need a more rigorous evaluation system for assessing risks to phenotypes.In constructing such an evaluation system, it is believed that a system utilizing information other than RNA-seq data, such as reanalysis of sample genomes by resequencing the genomic DNA rather than de novo transcriptome assembly, would be appropriate.Although our DANGER analysis is a tool that only takes RNA-seq data as input to ensure convenience, there is room to apply the partial algorithm (association between off-target genes and GO terms, phenotype risk calculations using the D-index) into such a rigorous referencebased evaluation system for phenotype risks.We believe there is a high possibility that DANGER analysis could become a foundational presence in this new field of phenotype risk assessment.

DANGER analysis provides a new perception of the conventional genome editing process in medicine, agriculture, and biological research
We demonstrated our DANGER analysis pipeline, as it allows for (i) the detection of potential DNA on/off-target sites in the mRNA-transcribed region on the genome using RNA-seq data, (ii) evaluation of phenotypic effects by deleterious offtarget sites based on the evidence provided by gene expression changes, and (iii) quantification of the phenotypic risk at the GO term level, without a reference genome.Thus, DANGER analysis can be performed on various organisms, personal human genomes, and atypical genomes created by diseases and viruses (Vogelstein et al. 2013).The CRISPRroots is expected to be only effective in samples with high similarity to the wellcharacterized reference genome.In general, DANGER analysis holds superiority over CRISPRroots in terms of versatility.We believe that the perception resulting from our DANGER analysis has not been observed in the conventional scheme for genome editing.We illustrate a new scheme using DANGER analysis in organisms (i) with and (ii) without a reference genome (Fig. 8).For example, (i) model organisms, such as humans have a reference genome whose information has been generally used for potential on/off-target searches and postanalysis.However, the reference genome is a representative genome and not the personal genome of the patient or cell lines.Reference-based genome editing does not consider unique single nucleotide polymorphisms or spontaneous genomic rearrangements.DANGER analysis can supply a personal transcriptome-based on/off-target profile to ensure the phenotypic risk of unexpected off-target mutations.The new workflow of genome editing would be helpful for ex vivo gene therapy and cancer research because the genome of a cancer cell is generally characterized by widespread somatic genomic rearrangements (Vogelstein et al. 2013).(ii) An organism whose genome has never been comprehensively sequenced and well characterized is not considered a reasonable subject for genome editing, as site-specific genome editing is guaranteed without a reference genome.Some groups, however, have used incomplete genomic information to construct mutants of non-model organisms (Sun et al. 2018, Chang et al. 2020) that have never been well characterized in genomics.Such a conventional scheme is hit-or-miss due to the risk of erroneous knock-out phenotypes in combination with off-targeting of other genes.DANGER analysis can provide transcriptome phenotype-aware on-/off-target profiles as well as sequence information of the expressed genes.This information can be used for the safer design of genome editing, which enables the optimized design of CRISPR editing by repeatedly looping back through genome-editing experiments without a reference genome.Furthermore, the DANGER analysis devised in this study employs a simple algorithm based on MM count for identifying off-target candidates.The type of algorithm has also been utilized in off-target investigations for other programmable nucleases, such as CRISPR-Cas12a, TALEN, and ZFN.DANGER analysis is open-source and freely adjustable.Thus, the algorithm of this pipeline could be repurposed for the analysis of various genome editing systems beyond the CRISPR-Cas9 system.Moreover, it is also possible to enhance the specificity of DANGER Analysis for CRISPR-Cas9 by incorporating a CRISPR-Cas9-specific offtarget scoring algorithms.We believe that the DANGER analysis pipeline will expand the scope of genomic studies and industrial applications using genome editing.

Figure 2 .
Figure2.Overview of DANGER analysis and on-target region constructed by de novo transcriptome assembly.(A) Bioinformatic workflow of DANGER analysis.Our analysis requires RNA-seq data derived from WT and edited (each n !3).DANGER analysis has two steps in the workflow: (i) de novo transcriptome assembly (upper background box) and (ii) annotation analysis (lower background box).The de novo transcriptome assembly step is processed with Trinity and preprocessing tools, such as cutadapt and bbduk.sh.Crisflash performs the search of on/off-target sequences.The RSEM quantifies gene expression in edited RNA-seq samples in comparison to the WT de novo transcriptome (dot allow).The step of annotation analysis was involved processing with TransDedoder, ggsearch, org.XX.eg.db (e.g.org.Hs.eg.db in the transcriptome related to humans), and topGO.We implemented specific modules, colored in pink, for considering the phenotypic effect of deleterious off-targets.(B) Comparison between the hg38 reference genome and transcript sequence constructed by de novo assembly of RNA-seq samples derived from WT iPSC-derived cortical neurons on the GRIN2B on-target region.The on-target region of the hg38 reference genome is illustrated with annotations of the GRIN2B CDS, the protospacer, and the NGG PAM sequence of SpCas9.The detected GRIN2B isoforms(1)(2)(3)(4)(5) are lined up in the box.The Cas9-sgRNA binding sites are highlighted.(C) Genome completeness of de novo transcriptome assembly RNA-seq data derived from WT iPSC-derived cortical neurons was assessed using conserved mammal BUSCO genes (mammalia_odb10).The result was 79.1% of "complete," 20.7% of "single-copy," 58.4% of "duplicated," 3.2% of "fragmented," and 17.7% of "missing" (n ¼ 9226).

Figure 3 .
Figure 3.The benchmark for expression analysis methods compared with reference-based RNA-seq analysis using RNA-seq data derived from WT and GRIN2B edited iPSC-derived cortical neurons.(A) Comparison of different expression analyses.A Venn diagram comparing the de novo transcripts (duplicate counts on a predicted ORF basis), which had potential off-target sites with up to 8 nt MMs, was detected by the dTPM and dDE approaches.dTPM indicates that the expression is decreased based on the ratio of TPM counts between WT and edited samples (left callout).dDE means the expression is reduced based on DEG analysis between WT and edited samples (right callout).(B) Comparison of de novo transcriptome assembly-and reference-based analysis on the deleterious off-target detection.A Venn diagram comparing the off-target genes identified from de novo transcriptome analysis [dTPM (t ¼ 0.4) and dDE (a ¼ 0.001) approaches] and reference-based RNA-seq analysis (CRISPRroots, "RISK: CRITICAL").(C) Genomic sequence map of off-target located outside of GALR2 mRNA.The sequence is a part of the hg38 reference genome with annotations of GALR2 mRNA (XM_047436984.1) and the de novo transcript (TRINITY_DN86617_c0_g1_i1) and an off-target site with three MMs compared to the on-target sequence.(D) Summary of deleterious off-target sites detected by dTPM and reference-based RNA-seq analysis (CRISPRroots, "RISK: CRITICAL").The counts of off-target sites are annotated with genes and classified by MM number related to the on-target sequence.The brackets indicate the number of transcripts, including those with and without identified gene annotations.

Figure 4 .
Figure 4.The result of risk assessment in DANGER analysis using RNA-seq data derived from WT and GRIN2B edited iPSC-derived cortical neurons.(A) An example of the annotation table for DANGER analysis.The table includes GO ID, GO term, number of MMs (n), and the counts of n-MM off-target genes belonging to a specific GO term.(B) The formula for phenotypic off-target risk (D-index).An example of the calculation is shown on a lower box.(C) Distribution of the D-index of each GO term.The sum of all D-indexes and the number of D-indices (N) were labeled on the top right. 8

Figure 5 .
Figure 5.A scheme for permutation testing to evaluate the validity of the D-index.The thin arrow indicates the manipulation of rearranging values from the original expression and off-target profile to the permutation data.The cross represents the computation for applying the D-index formula to the above expression profile and the below off-target profile data.The workflow is shown as the bold arrows.

Figure 6 .
Figure 6.Evaluation of permutation test for DANGER analysis and comparison between dTPM and dDE.(A) Comparison of false detection rates among approximate dTPM (up to 11-MM NRR PAM, t ¼ 0.4, L ¼ 5E-1), optimized dTPM (up to 8-MM NGG PAM, t ¼ 0.4, L ¼ 1E-15), and optimized dDE (up to 8-MM NGG PAM, a ¼ 0.001, L ¼ 1E-15) in GO categories.BP, CC, and MF indicate GO categories of Biological Process, Cellular Component, and Molecular Function, respectively.Error bars represent SEM; asterisk indicates the statistical significance of two-sided Welch's t-test; cross indicates statistical power (1-ß)>0.8.Mean6SD of n ¼ 10 permutation data set.(B) Comparison of amount of GO terms of all D-index, significant D-index, and expected true D-index among approximate dTPM (up to 11-MM NRR PAM, t ¼ 0.4, L ¼ 5E-1), optimized dTPM (up to 8-MM NGG PAM, t ¼ 0.4, L ¼ 1E-15), and optimized dDE (up to 8-MM NGG PAM, a ¼ 0.001, L ¼ 1E-15), respectively.(C) Comparison of D-index and significant D-index between optimized dTPM and optimized dDE.A Venn diagram comparing the counts of D-index and significant D-index between optimized dTPM and optimized dDE.(D) The list of the top 16 significant D-indices in the optimized dTPM.The D-index values are indicated by bar graphs adjacent to the GO terms.

Figure 7 .
Figure 7. DANGER analysis result using RNA-seq data derived from WT and park7 (dj1) edited brains of Danio rerio.(A) Comparison between the GRCz11 reference genome and transcript sequence constructed by de novo assembly of RNA-seq samples derived from WT brain on park7 on-target region.The on-target region of the GRCz11 reference genome is illustrated with annotations of the park7 CDS, the protospacer, and the NGG PAM sequence of SpCas9.The detected park7 isoforms (1-2) are lined up in the box.The Cas9-sgRNA binding sites are highlighted.(B) Comparison of TPM values of park7.The TPM was measured from WT and edited RNA-seq samples (each n ¼ 3); data were expressed as the means6SEM.***P-value <.001 of twosided Welch's t-test.(C) The gene counts are classified by MM number related to the on-target sequence.The brackets indicate the number of transcripts, including those with and without identified gene annotations.(D) Distribution of the D-index of each GO term associated with Biological Process.The sum of all D-indices and the number of D-indices (N) is labeled on the top right.(E) The list of all significant D-indices in the optimized dTPM.The D-index values are indicated by bar graphs adjacent to the GO terms.

Figure 8 .
Figure 8.Our proposal for the usage of DANGER analysis in organisms with and without a reference genome.The workflow is shown as black arrows.The dotted black arrows indicate the front of the arrow and refer to the arrow base information.The image of the book is from TogoTV (V C 2016 DBCLS TogoTV, CC-BY-4.0, https://creativecommons.org/licenses/by/4.0/).

Table 1 .
RNA-seq datasets evaluated for testing the DANGER analysis pipeline.

Table 2 .
D-index summary of GO terms related to the enrichment analysis of previous GRIN2B research.