Fusion-Bloom: fusion detection in assembled transcriptomes

Abstract Summary Presence or absence of gene fusions is one of the most important diagnostic markers in many cancer types. Consequently, fusion detection methods using various genomics data types, such as RNA sequencing (RNA-seq) are valuable tools for research and clinical applications. While information-rich RNA-seq data have proven to be instrumental in discovery of a number of hallmark fusion events, bioinformatics tools to detect fusions still have room for improvement. Here, we present Fusion-Bloom, a fusion detection method that leverages recent developments in de novo transcriptome assembly and assembly-based structural variant calling technologies (RNA-Bloom and PAVFinder, respectively). We benchmarked Fusion-Bloom against the performance of five other state-of-the-art fusion detection tools using multiple datasets. Overall, we observed Fusion-Bloom to display a good balance between detection sensitivity and specificity. We expect the tool to find applications in translational research and clinical genomics pipelines. Availability and implementation Fusion-Bloom is implemented as a UNIX Make utility, available at https://github.com/bcgsc/pavfinder and released under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/. Supplementary information Supplementary data are available at Bioinformatics online.


Introduction
Gene fusions have long been known as drivers for both development and progression in various tumour. Over the years, a number of software tools have been developed to detect gene fusions from RNA sequencing (RNA-seq) data (Kumar et al., 2016). The algorithms of many of these tools typically involve detection of clusters of split single-read and discordant read-pair alignments against the reference genome or transcriptome. Alternatively, other tools use de novo assembly methods to produce sequences longer than the raw reads for more accurate sequence mapping before fusion detection. Although developments in long read technologies may alter this assessment in the future, the current cost-benefit-value balance still favours short reads for many applications.
Here, we describe a pipeline called Fusion-Bloom, which combines the use of a new de novo transcriptome assembler, RNA-Bloom (Nip et al., 2019) and a versatile assembly-based structural variant caller, PAVFinder (Chiu et al., 2018), for fusion detection. We demonstrate the performance of Fusion-Bloom on simulated and experimental RNA-seq datasets. We benchmarked its estimation accuracy and computational resource requirements in comparison to those of six RNA-seq fusion detection tools.

Materials and methods
Fusion-Bloom is implemented as a UNIX Make utility, which automates three analysis stages: assembly, alignment and analysis ( Supplementary Fig. S1). In the first stage, paired RNA-seq reads are assembled by RNA-Bloom with the option '-chimera-extend-stratum 01' to improve its reconstruction of full-length chimeric transcripts in low abundance. To expedite processing, Fusion-Bloom only retains RNA-Bloom contigs longer than the first quartile length of the entire assembly for downstream analysis. Contigs are then aligned against both the reference genome and annotated transcripts. Reference transcript alignment provides a computationally inexpensive yet useful complement to the genome alignment for chimera identification. Raw RNA-seq reads are also aligned to the contigs for: (i) filtering of mis-assembled chimeric junctions and (ii) estimating the expression levels of putative fusions. Based on these alignments, PAVFinder detects potential fusions and reports its results in BEDPE format (Supplementary Table S1).

Results
We took a commonly-used benchmarking dataset consisting of 50 fusions to compare the performance of Fusion-Bloom against 5 other fusion detection tools: deFuse (McPherson et al., 2011), STAR-Fusion (Haas et al., 2017), JAFFA (Davidson et al., 2015), pizzly (Melsted et al., 2017), SQUID (Ma et al., 2018) and EricScript (Benelli et al., 2012) (Supplementary Table S2). Fusion-Bloom out-performed other tools by detecting the largest number of fusions (48) with zero false-positive ( Supplementary Fig. S2). To better mimic data from tumour transcriptomes, we repeated the benchmarking experiment by combining the fusion-only dataset with an additional dataset comprising similar number of reads simulated from the ENSEMBLE annotation. We generated sensitivity-versusprecision plots of the tools (Fig. 1A) by filtering reported events with different read support levels represented by breakpoint-spanning reads and flanking read pairs (Supplementary Table S3). Fusion-Bloom was the best performer in this test; it does not produce any false-positives within the entire range of support levels (hence a vertical line). At the other end of the spectrum, pizzly's false positives remained high at all minimum support levels evaluated and thus produced a consistently high false discovery rate (FDR). The other tools displayed a more gradual linear relationship between true positive rate and FDR in response to the range of minimum support levels tested.
A publicly available dataset consisting of synthetic fusion transcripts spiked in at a wide range of molarity levels to total RNA provides another useful benchmarking test of sensitivity (Tembe et al., 2014). The dataset is composed of 20 samples, each harbouring 9 fusions spiked in at 10 different molarities to total RNA in duplicate. Fusion-Bloom and STAR-Fusion were the most sensitive tools as they were capable of detecting all fusions at all molarities in both replicates (Fig. 1B, Supplementary Table S4).
To assess the tools' specificity in experimental data, we analyzed three RNA-seq samples that are technical replicates of a wholeblood sample pooled from five healthy donors (Zhao et al., 2015). While we cannot assume all fusions detected in healthy individuals are false positives without validation, we expect the majority of reported events are likely false-positives. We made a plot of the total number of fusions at different levels of minimum support to determine an optimal cutoff for comparison (Fig. 1C). Using a minimum of 4 spanning reads as the cutoff, JAFFA consistently reports the fewest number of fusions (5), whereas EricScript (301) and deFuse (227) report the most. SQUID, STAR-Fusion, Fusion-Bloom and pizzly report an average number of 16, 20, 26 and 46 fusions, respectively.
We benchmarked the computing performance of the tools using the 20 spike-in samples which contained 73 to 180 million read pairs (Fig. 1D). On average, Fusion-Bloom requires 10-12 h to process one hundred million read pairs. Although this is slower than alignment-based methods such as pizzly and STAR-Fusion, we think that de novo assembly is a valuable approach in that it provides base-pair precision of fusion breakpoints, and can also be used for detecting other long-range transcriptome rearrangement such as tandem-duplications and splice variants (Chiu et al., 2018).