Divergent hepaciviruses, delta-like viruses, and a chu-like virus in Australian marsupial carnivores (dasyurids)

Abstract Although Australian marsupials are characterised by unique biology and geographic isolation, little is known about the viruses present in these iconic wildlife species. The Dasyuromorphia are an order of marsupial carnivores found only in Australia that include both the extinct Tasmanian tiger (thylacine) and the highly threatened Tasmanian devil. Several other members of the order are similarly under threat of extinction due to habitat loss, hunting, disease, and competition and predation by introduced species such as feral cats. We utilised publicly available RNA-seq data from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database to document the viral diversity within four Dasyuromorph species. Accordingly, we identified fifteen novel virus sequences from five DNA virus families (Adenoviridae, Anelloviridae, Gammaherpesvirinae, Papillomaviridae, and Polyomaviridae) and three RNA virus taxa: the order Jingchuvirales, the genus Hepacivirus, and the delta-like virus group. Of particular note was the identification of a marsupial-specific clade of delta-like viruses that may indicate an association of deltaviruses with marsupial species. In addition, we identified a highly divergent hepacivirus in a numbat liver transcriptome that falls outside of the larger mammalian clade. We also detect what may be the first Jingchuvirales virus in a mammalian host—a chu-like virus in Tasmanian devils—thereby expanding the host range beyond invertebrates and ectothermic vertebrates. As many of these Dasyuromorphia species are currently being used in translocation efforts to reseed populations across Australia, understanding their virome is of key importance to prevent the spread of viruses to naive populations.


Introduction
Australian wildlife has evolved in isolation for approximately 45 million years, resulting in a unique mammalian fauna, of which 87 per cent are endemic (Chapman 2009).This includes carnivorous marsupials of the order Dasyuromorphia (Kealy and Beck 2017).Dasyuromorphs have experienced extinction as a direct result of human activity, with the Tasmanian tiger (Thylacinus cynocephalus) an iconic symbol of human-mediated mammalian extinction (Feigin, Frankenberg, and Pask 2022).Indeed, Australia is experiencing one of the highest rates of mammalian extinction globally due to changing land use, disease, and competition and predation from introduced species (Woinarski et al. 2011).Despite these threats, we know little about the viruses that infect these unique and at-risk species and how these viruses might impact population health.
Marsupials are an infraclass of mammals, characterised by their distinctive pouch in which they carry live young.The order Dasyuromorphia represent all carnivorous marsupials in Australia apart from the omnivorous bandicoots (Peramelemorphia) (Zemann et al. 2013) and are found only on the mainland of Australia and its surrounding islands, such as the Australian island state of Tasmania and Papua New Guinea (Kealy and Beck 2017).The largest marsupial carnivore was the extinct thylacine, followed by the now-endangered Tasmanian devil (Sarcophilus harrisii) that has replaced the tiger as the apex predator in Tasmania.Through a combination of hunting, habitat loss, competition for resources with introduced species and, potentially, the introduction of exotic pathogens, the last known thylacine died in 1936 and the species was declared officially extinct in 1986 (Paddle 2000).The Tasmanian devil is also threatened by a contagious cancer that has contributed to an average population decline of 77 per cent in the past 20 years (Lazenby et al. 2018).
Beyond these two iconic species, lesser-known marsupial carnivores are also at risk of extinction.The numbat (Myrmecobius fasciatus) is the last remaining member of the family Myrmecobiidae, one of the few diurnal marsupials, and the only one that feeds exclusively on termites (Zemann et al. 2013).Due to habitat loss and predation by introduced species such as cats and foxes, the numbat only persists in two natural locations in southwestern Australia and it is estimated that there are less than 1,000 numbats remaining in the wild (Peel et al. 2022).In addition, other species that had been believed to be of little concern, such as the fat-tailed dunnart (Sminthopsis crassicaudata), have recently been listed as threatened by the state of Victoria due to population decline (Scicluna, Gill, and Robert 2021).To combat these dramatic declines in populations of marsupial carnivores across Australia, programmes have been developed to relocate individuals from thriving populations in protected locations, such as the quoll populations in Tasmania, as well as from captive breeding populations (Portas et al. 2020).Similarly, the yellow-footed antechinus (Antechinus flavipes) has been proposed as a candidate species for translocation as part of a rewilding project in an area near Canberra, Australia (Manning 2021).These programmes have experienced varied success, and an often-overlooked risk with the translocation of wildlife is the introduction of pathogens into naive populations already struggling under the burden of anthropogenic activities and introduced species (French et al. 2022).
We know little about the viruses circulating within dasyuromorphs.To date, only a single virome project has been performed on these animals-in this case, the faecal virome of Tasmanian devils (Chong et al. 2019).There have been a small number of studies of marsupial carnivores experiencing overt signs of disease, including the characterisation of Dasyurid herpesvirus 1 in Antechinus (Barker, Carbonell, and Bradley 1981), and a chimeric papilloma-polyomavirus has been identified in bandicoots (Woolford et al. 2007).There have also been a small number of serological and bioinformatic screening studies targeting or including marsupial carnivores.For example, the molecular and serological study of marsupial tissues for the identification of herpesviruses led to the identification of a novel herpesvirus in a Tasmanian devil (Stalder et al. 2015), while an analysis of available transcriptomic data and reference host genomes characterised the endogenous viral elements (EVEs) of marsupial carnivores (Harding et al. 2021).
Herein, we present the first unbiased virus discovery analysis of marsupial carnivores.Because of the inherent challenges in acquiring samples from dasyuromorphs that are generally protected species, we instead mined the available transcriptomes present on the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA), followed by genomic and phylogenetic analysis.
De novo assembly was conducted using MEGAHIT with default parameters (v1.2.9) (Li et al. 2015).The assembled contigs were then compared to the RdRp-scan RNA-dependent RNA polymerase (RdRp) core protein sequence database (v0.90) (Charon et al. 2022) and the protein version of the Reference Viral Databases (v23.0)(Goodacre et al. 2018;Bigot et al. 2019) using Diamond BlastX (v2.0.9) with an e-value cut-off of 1 × 10 −5 (Buchfink, Reuter, and Drost 2021).To remove potential false positives, contigs with hits to virus sequences were used as a query against the NCBI nucleotide database (as of May 2023) using Blastn, and all contigs with sequence identity to non-virus nucleotide sequences were removed from the query set (Camacho et al. 2009).The remaining contigs were then aligned against the NCBI non-redundant protein database (as of March 2023) using Diamond BlastX, and contigs with hits to virus proteins were further examined.The de novo assembler and FindORFs tool, available in Geneious (Kearse et al. 2012), were used to further assemble genomes where necessary and to identify ORFs in potential virus sequences, respectively.EVEs were identified by referencing Harding et al. (2021) and removed manually.NCBI Web BLAST (https:// www.ncbi.nlm.nih.gov/BLAST) was then used to check for false positives, disrupted open reading frames (ORFs), and to manually assess alignment to virus motifs.

Virus abundance
The abundance of virus contigs was measured using the RNAseq by Expectation Maximisation (v1.3.0)program (Li and Dewey 2011).The expected count of viral contigs was used to calculate viral abundance as a percentage of total reads.All calculations and graphing were performed using R (R Core Team 2021).

Phylogenetic analysis
Sequences representing novel virus species, defined as those with <95 per cent nucleotide sequence identity to their closest relative, were assigned to a taxonomic family based on their identity to previously characterised virus species.For each family, a reference dataset was downloaded from NCBI Virus (Hatcher et al. 2017) and the completeness of this dataset was assessed by comparing it to the International Committee on Virus Taxonomy (ICTV)-recognised species for each family.Unclassified species related to the viruses discovered here were added using a Web BLAST search using the novel virus sequence as the query.Due to the high level of divergence within the reference dataset, amino acid alignments were estimated for all families except the Kolmioviridae (Deltavirus), in which case a nucleotide alignment was used.Multiple sequence alignments were inferred using MAFFT (v 7.402) (Katoh and Standley 2013) with local pair alignment for amino acid sequences and global pair alignment for the deltavirus nucleotide sequences.All alignments were then trimmed to remove ambiguous regions using trimAl (v1.4.1) (Capella-Gutierrez, Silla-Martinez, and Gabaldon 2009) with a gap threshold of 0.8 and a similarity threshold of 0.005 and then manually assessed using AliView (Larsson 2014).Phylogenetic trees were estimated using IQ-TREE 2 (v2.2.2) (Minh et al. 2020) with the appropriate substitution model selected using ModelFinder (Kalyaanamoorthy et al. 2017).Branch support was assessed with 1,000 bootstrap replicates using ultrafast bootstrapping (Hoang et al. 2017).

Mapping sequences to herpesvirus
As only one reference sequence for a single gene (DNA-dependent DNA polymerase) was available for the species of Antechinus herpesvirus previously identified, Bowtie2 (v2.2.5) (Langmead and Salzberg 2012) was used to align the non-host reads to the  2017).Species with SRA RNA-seq data available are indicated by a blue circle, species in which viruses were identified in this study are indicated with dark blue circles, while species in which no viruses were found are indicated with a light blue circle.Animal silhouettes are sourced from PhyloPic (https://www.phylopic.org/)produced by Sarah Werning, Gabriela Palomo-Munoz.Creative Commons licence is found at https://creativecommons.org/licenses/by/3.0/.
Dasyurid herpesvirus 1 DNA-dependent DNA polymerase reference sequence (MF576269.1).The resulting alignment was viewed in Geneious (Kearse et al. 2012) and used to assess the percentage sequence identity between the novel sequence and the available reference.

Library composition assessment
To assess the taxonomic composition of each library, contigs were aligned to a custom NCBI nucleotide database without environmental and artificial sequences (https://researchdata.edu.au/indexed-reference-databases-kma-ccmetagen/1371207) using the KMA aligner (v1.3.9a) and the CCMetagen program (v1.1.3)(Clausen et al. 2018;Marcelino et al. 2020).The abundance of each taxonomic group was determined by counting the number of nucleotides that matched the reference sequence, with an additional correction for template length using the default parameter in KMA.For data visualisation, CCMetagen was used to generate Krona graphs, which were subsequently edited in Adobe Illustrator (https://www.adobe.com).

Screening Dasyuromorphia for viruses
As of March 2023, the NCBI SRA database contained 446 RNAseq libraries from Dasyuromorphia (Supplementary Table S1 Virus sequence similarity screening of the available libraries resulted in potential positive hits in 206 of the screened libraries, including Tasmanian devil, yellow-footed antechinus, numbat, and fat-tailed dunnart.Following secondary assembly and removal of contigs with hits to endogenous viruses, twenty-two partial or full virus genomes were identified in forty-three libraries including four host species (Fig. 1).These virus sequences were taxonomically assigned to five DNA virus families: Adenoviridae, Anelloviridae, Herpesviridae, Papillomaviridae, and Polyomaviridae, and three RNA virus taxa: the order Jingchuvirales, the genus Hepacivirus, and the delta-like virus group (Table 1).Of these, two were partial sequences of novel herpesviruses in the Tasmanian devil and the fat-tailed dunnart for which no gene that could be used for phylogenetic confirmation could be assembled, and were thus excluded from further study.There was also a fragment of the ORF2 of an anellovirus present in a Tasmanian devil library, but as the ORF1 fragment that is used for phylogenetic classification of these viruses was absent, and ORF2 is rarely published for this virus species, it was similarly excluded.
Within the virus-positive dataset, further analysis revealed that four Antechinus libraries from the same study (SRR11306636, SRR11306642, SRR11306648, and SRR11306672) appeared to contain viruses identified as likely contaminants.Specifically, the human-associated viruses Gammapapillomavirus 19 and Betapapillomavirus 1 were identified with over 99 per cent nucleotide sequence identity, and human reads were found in three of these libraries (Supplementary Fig. S1).These viruses were excluded  from further analysis leaving fifteen novel virus species identified in forty libraries (Fig. 2).No other libraries contained contaminating reads (Supplementary Fig. S1).

Marsupial delta-like viruses
Of particular note, we identified the first marsupial-associated delta-like viruses.These were detected in a fat-tailed dunnart and a Tasmanian devil, provisionally named fat-tailed dunnart deltavirus and Tasmanian devil deltavirus, respectively.These viruses shared 71 per cent amino acid sequence identity with their nearest relative, Rodent deltavirus, but were also distinct from each other with 84 per cent nucleotide identity across the entire assembled contigs.Notably, these viruses formed a distinct clade within the Deltavirus small delta antigen-like protein phylogeny (Fig. 3), clustering most closely with viruses sampled from rodents.
The fat-tailed dunnart deltavirus was identified in a library of eye tissue from a study of gene expression in mammalian embryos (Royall et al. 2019).The assembled contig was 643 nucleotides in length, thereby representing only a fragment of the expected 1.6kb viral genome, and was found at remarkably low abundance, with only 4.06e-7 per cent of total reads (expected count: 272) aligning to the contig (Fig. 2).Tasmanian devil deltavirus was identified in a sample of Tasmanian devil facial tumour disease (DFTD) taken from a female Tasmanian devil.The assembled contig was again fragmentary, at 741 nucleotides in length, and the abundance was also low with 3.98e-7 per cent of reads (expected count: twenty-seven) aligning to the contig.In both cases, no other viruses were identified in the sequencing libraries.

RNA viruses
We identified three species of novel RNA viruses in this study: two hepaciviruses (positive-sense RNA, family Flaviviridae), one in antechinus and one in a numbat, and a chu-like virus (negativesense RNA virus, order Jingchuvirales) here named Tasmanian devil chu-like virus.The chu-like virus, which has an unsegmented, non-circular genome, was identified in a DFTD cell line that was cultured from primary tissue (Kozakiewicz et al. 2021).The genome contained four predicted ORFs, one of these being the RdRp, while the remaining three had no sequence identity to any reference sequence.This was at the highest abundance of all viruses identified in this study, representing 0.2 per cent of total reads (expected read count: 253,672) aligning to the viral genome (Fig. 2) of 12,118 nucleotides.A phylogenetic analysis of the RdRp revealed that Tasmanian devil chu-like virus fell outside of the Chuviridae and was highly divergent from known viruses including the only known vertebrate clade of Chuviridae, the genus Piscichuvirus, although it did cluster within the order Jingchuvirales (Fig. 4A).The high abundance in the cultured Tasmanian devil tissues and the absence of any sequencing reads of a non-target source in this library (BioProject SRR6380970) tentatively suggest that it does replicate in these cells and was not the result of contamination (Supplementary Fig. S1).
Antechinus hepacivirus was detected in multiple libraries from a single BioProject (PRJNA565840) in which multiple tissues of thirteen antechinus individuals were sequenced.The assembled virus genome was 9,094 nucleotides in length and was detected in five of the thirteen individuals and in at least two tissue types from each positive individual, with libraries of liver tissue consistently having the highest abundance (1.18e-04-6.44e-04per cent of total reads).The genome structure was a single polyprotein, consistent with the genus Hepacivirus.The full virus genome was assembled from library SRR11306639, a library of male liver tissue and the sequence identity across the other five individuals varied between 93 and 98 per cent.The virus was also identified in spleen, kidney, stomach, and cerebrum.Based on the NS5 (RdRp) protein, this virus was most closely related to a hepacivirus identified in an Ixodes holocyclus tick engorged with the blood of a long-nosed bandicoot in Australia (Harvey et al. 2019;Porter et al. 2020).In turn, these two viruses were related to rodent hepacivirus (Fig. 4B).
Finally, a highly divergent hepacivirus was identified in a transcriptome of numbat liver tissue (Peel et al. 2022) with an abundance of 9.9e-5 per cent of total reads and a length of 10,486 nucleotides with a single polyprotein consistent.This virus, here named numbat hepacivirus, had 37 per cent amino acid sequence identity to the closest blast hit (Norway rat hepacivirus 2, YP_009325411.1)over the NS5 protein and 29 per cent amino acid sequence identity across the whole genome to the closest blast hit (Duck hepacivirus, QKT21547.1).Numbat hepacivirus clustered phylogenetically with a clade of avian-and reptileassociated hepaciviruses based on the NS5 protein (Fig. 4B), but on a long branch characterised by low bootstrap support (63 per cent), suggesting that its phylogenetic position is uncertain due to high-sequence divergence.

DNA viruses
Although RNA-seq data were analysed here, our dataset contained a surprising diversity of novel DNA virus transcripts.Herpesviruses were present in three species-antechinus, Tasmanian devil, and fat-tailed dunnart-but only two antechinus libraries contained sufficient gene sequences for phylogenetic analysis and therefore only these species were analysed further.A very small fifty-five nucleotide fragment of glycoprotein H of the herpesvirus identified in Tasmanian devils exhibited 100 per cent nucleotide sequence identity to Dasyurid herpesvirus 3 (Chong et al. 2019).
Orthoherpesvirus sequences were identified in four Antechinus libraries-two from female spleen libraries containing two contigs each-although none were appropriate for phylogenetic analysis and did not overlap with the genes present in the other two libraries.These sequences were not investigated further.Within the remaining two libraries, one library from male stomach tissue (SRR11306624) contained two short contigs that included the glycoprotein B gene (as well as one hypothetical protein gene sequence), which could be utilised in phylogenetic analysis, here named Dasyurid herpesvirus 5.The other, a library of prostate tissue, contained twenty-four contigs with sequence identity to gammaherpesvirus genes, also including the glycoprotein B sequence, here named Dasyurid herpesvirus 4. A comparison of these two glycoprotein B sequences revealed that they shared only 25 per cent amino acid sequence identity.We used Bowtie2 to align reads from the herpesvirus-positive libraries to the Dasyurid herpesvirus 1 reference sequence (MF576269.1),for which only the DNA-dependent DNA polymerase is currently available.A single read from SRR11306664 (containing Dasyurid herpesvirus 4) aligned with 92 per cent nucleotide sequence identity to the Dasyurid herpesvirus 1 reference, while no reads aligned from the other three Antechinus libraries containing herpesvirus contigs.Based on phylogenetic analysis of glycoprotein B, Dasyurid herpesvirus 4 was most closely related to an elephant herpesvirus (Fig. 5A).Dasyurid herpesvirus 5 fell as a sister lineage to this group, although the sequence is very short and the node is characterised by low bootstrap support.Without the DNA polymerase, the most commonly sequenced orthoherpesvirus gene used for taxonomic demarcation, the taxonomic relationship of these viruses was uncertain.
The most diverse viral family identified was the family Papillomaviridae.Specifically, we discovered transcripts from five species of novel papillomavirus in six libraries of Tasmanian devil lip tissue from a study of DFTD (Kozakiewicz et al. 2021).These viruses were named Tasmanian devil papillomavirus 3-7 to remain consistent with previously identified Tasmanian devil-associated papillomaviruses (Chong et al. 2019).Partial gene transcripts were identified in all six libraries, although they were not consistent across all libraries.The L2 gene (minor capsid protein) was found in all six libraries and thus used for comparison.Five of these were distinct papillomavirus L2 genes, with amino acid sequence identity ranging from 36 to 59 per cent.As the ICTV species demarcation for papillomavirus genera is <70 per cent, these were deemed to represent five distinct virus species.As the L1 (major capsid) protein was present in five libraries with distinct L2 transcripts, L1 and L2 were used for phylogenetic analysis.These sequences formed a distinct clade of marsupial papillomaviruses with a bettong papillomavirus and the bandicoot papillomavirus that is the sole member of the genus Dyolambdapapillomavirus (Fig. 5B).The presence of papillomaviruses did not correlate with the DFTD status of the animal as both DFTD positive and negative samples contained papillomaviruses and there was no phylogenetic distinction based on the DFTD status.
As the study for which these data were generated was designed to determine geographical patterns in DFTD expression profiles (Kozakiewicz et al. 2021), we assessed whether geographic location affected the phylogenetic clustering by manual inspection of the tree.No obvious pattern was observed.Two papillomavirus species were previously identified in Tasmanian devil faecal samples-Tasmanian devil-associated papillomavirus 1 and Tasmanian devil-associated papillomavirus 2-although only the E1 protein was published for these viruses.Of the virus sequences discovered here, only a small fragment (317 nt) of the E1 region could be recovered for Tasmanian devil papillomavirus 6.This exhibited 48 per cent amino acid (aa) identity to Tasmanian devil-associated papillomavirus 1, 55 per cent aa identity to Tasmanian devil-associated papillomavirus 2, and 28 per cent aa identity to Bettongia penicillata papillomavirus 1.
Of interest, a library that contained the E2 protein of Tasmanian devil papillomavirus 5 (SRR13765816) also contained a contig showing sequence identity to the large T antigen of a novel polyomavirus.A second library, SRR13765794, contained an identical polyomavirus large T antigen but no other virus sequences.No other polyomavirus genes were detected in these libraries.As this large T antigen sequence was related to but distinct from Tasmanian devil polyomavirus 1 and 2 (Fig. 5C), we named this novel virus Tasmanian devil polyomavirus 3.
Transcripts of two novel anelloviruses were detected, although only one of these-here named Antechinus anellovirus-contained the ORF1 sequence and hence was analysed further.Phylogenetically, Antechinus anellovirus was extremely divergent in the ORF1 protein from the current diversity of Anelloviridae and was closely related to a group of anelloviruses detected in seals, cats, and a giant panda (Fig. 6A).Antechinus anellovirus sequences with 100 per cent sequence identity were detected in two libraries of spleen, co-occurring with Antechinus hepacivirus in one of these libraries.
Similarly, the adenovirus transcripts identified in this study were also found in the same samples as Antechinus hepacivirus: in the kidney tissue of two individuals, as well as in eight other libraries (seven individuals) where it was the only virus identified.Across libraries, the virus sequences were 99 per cent similar at the nucleotide level.The virus was mainly found in libraries of kidney tissue (8/10), as well as in one library of spleen and one of prostate tissue.Antechinus adenovirus was most closely related to members of the genus Mastadenovirus, but due to the high degree of divergence, the phylogenetic position of this virus is still unclear (Fig. 6B).

Discussion
We identified twenty-two full or partial virus genomes from eight virus taxa within 446 SRA transcriptome libraries of tissue samples from species within the order Dasyuromorphia.Of these virus genomes, fifteen were subjected to phylogenetic analysis.The remaining seven virus contigs either were viruses identified as contaminants (in four cases discussed later) or were too short for meaningful analysis; this was the case with two herpesviruses identified in a fat-tailed dunnart library and a Tasmanian devil library, as well as the ORF2 segment of a novel anellovirus in a Tasmanian devil.Human papillomavirus sequences were identified in four antechinus libraries, three of which also contained contaminating human sequences.Although this is a risk when analysing SRA data, the use of taxonomic classification software such as CCMetagen (Marcelino et al. 2020) can be used to identify contaminating host sequences.We can also be confident that the viruses identified here are exogenous, rather than endogenous, as high-quality host genomes are available for three of the four Dasyurid species studied here, and a comprehensive analysis of EVEs in marsupials has previously been undertaken (Harding et al. 2021).
Until 2018, deltaviruses were believed to be exclusive to humans.However, this was disproven by the identification of delta-like viruses in birds and snakes (Wille et al. 2018;Hetzel et al. 2019).Since this time, additional deltaviruses have been identified through SRA screening (Bergner et al. 2021).Of note, we describe the first marsupial deltaviruses that also form what appears to be a marsupial-specific clade.Specifically, Tasmanian devil deltavirus and fat-tailed dunnart deltavirus cluster in a sister clade to a group of rodent-associated deltaviruses also identified in an SRAscreening study (Bergner et al. 2021).Further exploration of the marsupial virome is needed to determine if this putative marsupial cluster of deltaviruses holds true, such that these viruses have associated with marsupial hosts for their entire 160-million-year history (Luo et al. 2011), or if there has been frequent cross-species transmission on more recent timescales as in other deltaviruses (Bergner et al. 2021).
Two distinct marsupial carnivore hepaciviruses were also identified in this study.Hepaciviruses are associated with liver disease in humans, although in most animals their pathogenicity is unknown.Three marsupial hepaciviruses have previously been identified: Koala hepacivirus, Possum hepacivirus, and Collins beach virus, all of which were identified metagenomically (Chang et al. 2019;Harvey et al. 2019;Porter et al. 2020).Antechinus hepacivirus was most closely related to a suspected bandicoot-associated hepacivirus, denoted Collins beach virus, while Koala hepacivirus and Possum hepacivirus cluster together.These two marsupial clades are paraphyletic, with the Antechinus hepacivirus and Collins beach virus grouping with rodent hepaciviruses and the Possum and Koala hepaciviruses clustering with a broader range of mammals.This phylogenetic pattern may be a result of bandicoots and antechinus sharing a similar ecological niche-ground-dwelling foragers (along with rodents)-thereby enabling cross-species transmission, while Koala and Possum hepaciviruses are mainly tree-dwelling herbivores.The lack of a marsupial-specific clade, as well as the grouping with rodent-associated hepaciviruses, is also indicative of multiple introductions of hepaciviruses to marsupials, perhaps via rodent species, although this will again need to be resolved through more extensive sampling.Of note, Antechinus hepacivirus was identified in a relatively large proportion of individuals (5/13), which could suggest a high prevalence in the population and hence be of particular concern if populations are being used for translocation in population reseeding programmes (Manning 2021).In addition, we identified a highly divergent hepacivirus in a numbat liver transcriptome.This virus shared only 29 per cent amino acid sequence identity to the closest blast hit and did not fall within the larger mammalian/marsupial clade, suggesting the presence of a third, highly divergent lineage of marsupial hepaciviruses.Further study of marsupial-associated hepaciviruses is needed to shed light on the emergence and evolution of these viruses in Australian marsupials and to determine what effect these infections may have on their hosts.
A highly divergent chu-like virus was identified in a transcriptome of a DFTD cell line (SRR6380970) grown from primary tissues (Kosack et al. 2019).The Chuviridae and chu-like viruses belong to the order Jingchuvirales, which was until recently believed to be invertebrate specific (Di Paola et al. 2022).The discovery of a chulike virus in the brain of a snake with neurological disease (Argenta et al. 2020) and in a meta-transcriptomic study of reptiles (Shi et al. 2018) led to the identification of a fish-reptile-associated genus (Piscichuvirus) within the Chuviridae.More recently, three novel Piscichuvirus species were identified and associated with encephalitis in turtles (Laovechprasit et al. 2023), suggesting that this genus could be associated with neurological disease.Tasmanian devil chu-like virus is the sister group to the Chuviridae within the order Jingchuvirales and is highly divergent from the characterised species.Interestingly, it has been suggested that DFTD is of Schwann cell origin (Murchison et al. 2010), a cell type associated with the peripheral nervous system, such that the novel chu-like virus identified here could similarly be associated with the nervous system.However, as this virus was identified from SRA data and from a library of cultured cells, we cannot be certain of the true host of this virus.Further work is needed to determine if this virus is capable of infecting DFTD cells and what effect this infection has on the cancer cells.
Transcripts from five families of DNA viruses were identified in these samples.Papilloma-, polyoma-, and herpesviruses are often associated with disease and, in some cases cancer, while anellovirus and adenovirus have less definitive links to disease.We identified a novel anellovirus in antechinus spleen transcriptomes, which co-occurred with Antechinus hepacivirus in one library.These viruses are ubiquitous and their disease association is still debated, although it has been suggested that they may play a role as co-infecting viruses (Webb, Rakibuzzaman, and Ramamoorthy 2020).Similarly, Antechinus adenovirus was identified in kidney, prostate, and spleen tissue transcriptomes and co-occurred with hepacivirus in two of ten libraries.This virus clustered with a group of mastadenoviruses identified in mice, cows, pigs, bats, and a polar bear.Mastadenoviruses are in some cases associated with diseases including encephalitis, respiratory disease (Chen and Tian 2018), and gastrointestinal symptoms (Dayaram et al. 2018).In this case, no disease was reported in the individuals sequenced, but this is difficult to assess in SRA-screening studies.
Sequences from two distinct species of the family Gammaherpesvirinae were identified in antechinus transcriptomes of stomach and prostate tissue, along with partial herpesvirus sequences in Tasmanian devil peripheral blood mononuclear cells and fattailed dunnart mitogen-stimulated splenocytes.Herpesviruses are relatively well characterised in marsupials, with Dasyurid herpesvirus 1 identified in Antechinus in 2014 (Amery-Gale et al. 2014) and subsequent studies revealing two Tasmanian devilassociated herpesviruses (Dasyurid herpesvirus 2 and Dasyurid herpesvirus 3) (Stalder et al. 2015;Chong et al. 2019).However, the species demarcation of viruses within the family now termed Orthoherpesviridae is unclear, with no defined genetic distance or biological features used for classification (Gatherer et al. 2021).Given that very few genomic sequences are available for the related viruses, we suggest 92 per cent nucleotide identity across the single read of the DNA-dependent DNA polymerase (the only gene available for Dasyurid herpesvirus 1) as sufficient genetic distance to represent a novel species.DNA sequencing would likely lead to the recovery of the full genome, including gene sequences necessary to determine the relatedness of these species to each other and to other Dasyurid herpesviruses, but this is beyond the scope of this study.
We identified five distinct but related species of papillomavirus in Tasmanian devil lip tissue transcriptomes within the libraries of a single SRA BioProject.The DFTD status of the sampled individual did not appear to be associated with the presence of papillomavirus, nor did the DFTD status correlate with any clustering in the phylogeny of these species.Two Tasmanian devil-associated papillomaviruses were previously identified in a faecal meta-transcriptome, but as these animals are carnivores, it was impossible to determine their true host (Chong et al. 2019).Phylogenetic analysis revealed that all these sequences were most closely related to each other and more broadly to Bettongia penicillata papillomavirus 1, isolated from papillomatous lesions on a brush-tailed bettong in Australia (Bennett et al. 2010), and more distantly with Bandicoot papillomatosis carcinomatosis virus types 1 and 2. Bandicoot papillomatosis carcinomatosis virus uniquely exhibits genomic features of both papillomavirus and polyomavirus proteins with the L1 and L2 proteins of a papillomavirus and the large and small T antigen proteins of a polyomavirus (Woolford et al. 2007).Interestingly, the large T antigen protein sequence of a novel polyomavirus was identified in one library in this study, which also contained the E2 protein of Tasmanian devil papillomavirus 5, although this is not consistent with the genomic structure of the previously identified hybrid virus.Phylogenetically, the large T antigen sequence of Tasmanian devil polyomavirus 3 was most closely related to the large T antigen sequence of the hybrid Bandicoot papillomatosis carcinomatosis virus, with Tasmanian devil-associated polyomavirus 1 and Tasmanian devil-associated polyoma-like virus 2 being more divergent.Interestingly, Tasmanian devil-associated polyomavirus 1 and Tasmanian devil-associated polyoma-like virus 2 were identified in the same faecal meta-transcriptome study as Tasmanian devil-associated papillomaviruses 1 and 2. DNA sequencing will be required to confirm the genomic structure and phylogenetic relationships of these viruses.
Although our study identified evidence of a relatively high diversity of DNA viruses within transcriptomic data, the lack of complete genomes limits the value of these sequences in phylogenetic comparison and genomic characterisation of these species.In contrast, there was a relatively limited diversity of RNA viruses in these data, which may be a product of the sequencing methodology used.For example, poly-A selection during library preparation will remove many virus sequences (Visser et al. 2016).However, despite being prepared with a poly-A selection, we identified a high diversity of viruses within the yellow-footed Antechinus libraries, with 10/13 sequenced individuals being associated with at least one virus.Antechinus hepacivirus was particularly prevalent, and as these species are being proposed for species reintroduction programmes, it is of particular importance that the viromes of these individuals are investigated to prevent the spread of these viruses to naive populations.From the data provided here, PCR primers could be designed to screen for these viruses in individuals being used in translocation programmes and in cases of undiagnosed disease in marsupial species to determine if there is some disease association.
In light of the diversity of marsupial carnivore viruses identified in this small study of transcriptomic data not generated for the purposes of virus discovery, it is evident that considerable viral diversity exists in these species and that further study is required to understand the viromes and viral evolutionary history of these unique species.Given the threat that these species face as a result of human activity, it is imperative that we understand the risk that viral disease poses to what are often small, isolated populations.

Figure 1 .
Figure 1.A cladogram depicting the phylogenetic relationships of marsupials.Adapted from Duchêne et al. (2017).Species with SRA RNA-seq data available are indicated by a blue circle, species in which viruses were identified in this study are indicated with dark blue circles, while species in which no viruses were found are indicated with a light blue circle.Animal silhouettes are sourced from PhyloPic (https://www.phylopic.org/)produced by Sarah Werning, Gabriela Palomo-Munoz.Creative Commons licence is found at https://creativecommons.org/licenses/by/3.0/.

Figure 2 .
Figure 2. Virus abundance in marsupial carnivores as a percentage of total reads.The animal from which libraries were obtained is indicated by an animal silhouette below the library ID.

Figure 3 .
Figure 3. Phylogeny of mammalian deltavirus small delta antigen-like protein nucleotide sequence.The phylogeny was estimated using the GTR+F+I+4 nucleotide substitution model.The nucleotide alignment used to generate this tree was 679 positions in length.Sequences identified in this study are shown in red text and the library ID is indicated in the taxa name.Branch support >90 per cent is indicated with a black dot at the node.The scale bar indicates the number of nucleotide substitutions per site and the tree is midpoint rooted for clarity.This alignment is based on that provided by Bergner et al. (2021).Animal silhouettes indicate the host species.

Figure 4 .
Figure 4. Phylogenetic trees of the RNA viruses identified in this study.(A) Phylogeny of the order Jingchuvirales based on the RdRp amino acid sequence using the LG+F+I+4 substitution model.The alignment used to generate this tree was 1,160 positions in length.(B) Phylogeny of the hepacivirus NS5 protein sequence estimated using the LG+F+I+4 substitution model.The alignment used to generate this tree was 594 positions in length.Scale bars indicate the number of amino acid substitutions per site.Trees are midpoint rooted for clarity.Branch support values >90 per cent are indicated with a black dot at the node.Viruses identified in this study are highlighted with red taxa labels.Animal silhouettes indicate the host species of the viruses identified here and tip branches are coloured according to broader host species.

Figure 5 .
Figure 5. Phylogenies of Papillomaviridae, Polyomaviridae, and Gammaherpesvirinae.(A) Phylogeny of the family Gammaherpesvirinae estimated using the LG+I+4 amino acid substitution model.The alignment used to generate this tree was 318 positions in length.(B) Phylogeny of the Papillomaviridae L1 and L2 proteins estimated using the LG+F+I+4 substitution model.The alignment used to generate this tree was 809 positions in length.(C) Phylogeny of the Polyomaviridae large T antigen protein estimated using the LG+I+4 substitution model.The alignment used to generate this tree was 446 positions in length.Trees are midpoint rooted for clarity.Scale bars indicate the number of amino acid substitutions per site.Viruses identified in this study are indicated with red text and animal silhouettes indicate the host species.Branch support values >90 per cent are indicated with a black dot at the node.

Figure 6 .
Figure 6.Phylogenies of Adenoviridae and Anelloviridae.(A) Phylogeny of the Anelloviridae hexon protein estimated using the LG+F+4 substitution model.The alignment used to generate this tree was 440 positions in length.(B) Phylogeny of the Adenoviridae hexon protein estimated using the LG + I + 4 substitution model.The alignment used to generate this tree was 545 positions in length.Viruses identified in this study are indicated with red text and animal silhouettes indicate the host species.Trees are midpoint rooted for clarity.Branch support values >90 per cent are indicated with a black dot at the node.Scale bars indicate the number of amino acid substitutions per site.