Abstract

Motivation

The discontinuous transcription mechanism of coronaviruses contributes to their adaptation to different host environments and plays a critical role in their lifecycle. Accurate assembly of coronavirus transcripts is vital for understanding the virus’s biological traits and developing precise prevention and treatment strategies. However, existing de novo assembly algorithms are primarily designed for alternative splicing events in eukaryotes and are not suitable for assembling the coronavirus transcriptome, which consists of both genomic RNA and subgenomic mRNAs. Reconstructing the coronavirus transcriptome from short reads therefore remains a challenging problem.

Results

In this work, we present VirDiG, a de novo transcriptome assembler specifically designed for coronaviruses. VirDiG uses a discontinuous graph to facilitate accurate transcript assembly by incorporating information from paired-end reads, sequencing depth, and start and stop codons. Experimental results on both simulated and real datasets show that VirDiG has significant advantages in reconstructing coronavirus transcriptomes over traditional de novo assemblers designed for classical eukaryotic transcriptome assembly.

Availability and implementation

VirDiG is freely available at https://github.com/Limh616/VirDiG.git.

Supplementary information

https://github.com/Limh616/data.git.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Anna-Sophie Fiston-Lavier
