Hepatitis delta virus-like circular RNAs from diverse metazoans encode conserved hammerhead ribozymes

Abstract Human hepatitis delta virus (HDV) is a unique infectious agent whose genome is composed of a small circular RNA. Recent data, however, have reported the existence of highly divergent HDV-like circRNAs in the transcriptomes of diverse vertebrate and invertebrate species. The HDV-like genomes described in amniotes such as birds and reptiles encode self-cleaving RNA motifs or ribozymes similar to the ones present in the human HDV, whereas no catalytic RNA domains have been reported for the HDV-like genomes detected in metagenomic data from some amphibians, fish, and invertebrates. Herein, we describe the self-cleaving motifs of the HDV-like genomes reported in newts and fish, which belong to the characteristic class of HDV ribozymes. Surprisingly, HDV-like genomes from a toad and a termite show conserved type III hammerhead ribozymes, which belong to an unrelated class of catalytic RNAs characteristic of plant genomes and plant subviral circRNAs, such as some viral satellites and viroids. Sequence analyses revealed the presence of similar HDV-like hammerhead ribozymes encoded in two termite genomes, but also in the genomes of several dipteran species. In vitro transcriptions confirmed the cleaving activity for these motifs, with moderate rates of self-cleavage. These data indicate that all described HDV-like agents contain self-cleaving motifs from either the HDV or the hammerhead class. Autocatalytic ribozymes in HDV-like genomes could be regarded as interchangeable domains and may have arisen from cellular transcriptomes, although we still cannot rule out some other evolutionary explanations.


Introduction
The hepatitis delta or D virus (HDV) is a unique infectious agent whose genome is a small (~1,700 nt) single-stranded circRNA with high self-complementarity (Botelho-Souza et al. 2017). This satellite virus encodes only two proteins, the small and large hepatitis delta antigens (S- and L-HDAg), both derived from a single open reading frame. In addition, HDV contains a characteristic self-cleaving RNA motif, the hepatitis delta virus ribozyme (HDVR), in both the genomic and antigenomic polarities (Kuo et al. 1988; Wu et al. 1989; Riccitelli and Lupták 2013). HDV depends on the hepatitis B virus (HBV) for assembly of viral particles, release from the host cell, and entry into new cells (Botelho-Souza et al. 2017).
Until recently, the HDV circRNA had only been detected in humans, and it is the sole representative of the Deltavirus genus. However, new data have revealed the presence of highly divergent HDV-like circRNAs in samples from diverse metazoan species, ranging from amniotes (reptiles, birds, and mammals) to amphibians (a newt and a toad) and invertebrates (a termite) (Wille et al. 2018; Chang et al. 2019; Hetzel et al. 2019; Paraskevopoulou et al. 2020; Bergner et al. 2021), indicating that this atypical virus has a longer and more complex evolutionary history than previously thought. Unlike the human HDV, none of the newly described deltavirus-like genomes has been found associated with a coinfecting hepadnavirus. Accordingly, it has been proposed that HDV and metazoan HDV-like agents could use diverse helper viruses, such as arenavirus, vesiculovirus, and other HBV-unrelated viruses, to acquire the envelope proteins necessary for transmission (Hetzel et al. 2019; Perez-Vargas et al. 2019). The newly described HDV-like agents share many characteristics with their human counterpart. They all have single-stranded circular RNA genomes of approximately 1,500-1,700 nt that fold into rod-like structures, and code for a putative delta antigen with 13-55 per cent amino acid identity to the human one. However, the presence of the HDVR motif has been reported for the HDV-like agents from amniotes (Wille et al. 2018; Hetzel et al. 2019; Paraskevopoulou et al. 2020), but not for those of amphibians, fish, and invertebrates (Chang et al. 2019). This fact is striking and worth further examination, since ribozymes are essential domains for processing and replication of the HDV circRNA through a rolling-circle mechanism (Flores et al. 2012; Harichandran et al. 2019).
The ribozyme of the human hepatitis delta virus (HDV) or HDVR belongs to the family of small self-cleaving ribozymes, a group of RNA motifs that catalyse an intramolecular transesterification reaction in a sequence-specific manner (Ferré-D'Amaré and Scott 2010; Jimenez et al. 2015). Despite their very different structures, all nine classes of this family of ribozymes produce RNA fragments with 2′-3′-cyclic phosphate and 5′-hydroxyl ends after a nucleophilic attack by a 2′-oxygen group. In the case of the HDVR, both the genomic and antigenomic ribozymes show a characteristic nested double pseudoknot structure with five helical regions (Fig. 1A). A paradigmatic and well-studied member of the family of small self-cleaving RNAs is the hammerhead ribozyme (HHR) (Hutchins et al. 1986; Prody et al. 1986; de la Peña et al. 2017). The HHR is composed of three double helixes (I to III) that surround a core of 15 conserved nucleotides, and folds into a γ-shaped structure where the loops of helix I and II interact (de la Peña et al. 2003; Martick and Scott 2006). Depending on the open-ended helix, three circularly permuted topologies are possible for the HHR (type I, II, or III) (Fig. 1B).
Initially described in subviral agents with circRNA genomes (the HDV and some plant viral satellites and viroids), both HDVR and HHR motifs are now known to be pervasive in genomes throughout the tree of life (Webb et al. 2009; de la Peña and García-Robles 2010). In many instances, these ribozymes are associated with retrotransposable elements, and are thought to play essential roles in their life cycle. Diverse non-LTR transposons, such as R2, RTEs, and other LINEs, possess active HDVR motifs encoded in their 5′ termini (Eickbush and Eickbush 2010; Ruminski et al. 2011; Sánchez-Luque et al. 2011). HHR motifs, on the other hand, have been found embedded within the terminal repeats of Penelope-like elements (Cervera and de la Peña 2014), Terminon giant retrotransposons (Arkhipova et al. 2017), and non-autonomous LTR and non-LTR retrozymes of plants and animals, respectively (Cervera et al. 2016; Cervera and de la Peña 2020). Interestingly, the retrotransposition intermediates of both LTR and non-LTR retrozymes are small circular RNAs that accumulate at high levels in different organisms, and share many characteristics with infectious HHR-containing circRNAs of plants (de la Peña and Cervera 2017). The conserved presence of diverse self-cleaving ribozymes in infectious agents as well as in retrotransposons hints at a complex evolutionary relationship among all these mobile elements.
In this paper, we analyse the metagenomic HDV-like sequences of newts, toads, fish, and termites reported by Chang et al. (2019) for the presence of small self-cleaving ribozymes.

Bioinformatics and sequence analysis
Sequence homology searches through BLAT (Kent 2002), BLASTX, and BLAST (Altschul et al. 1990) tools were carried out in individual or grouped genomes, and in specific BioSample accessions (SAMN11445145 and SAMN11445146). Sequence alignments were performed with ClustalX (Larkin et al. 2007), MUSCLE (Edgar 2004), and Jalview (Waterhouse et al. 2009) software. The RNAmotif (Macke et al. 2001) and InfeRNAl (Nawrocki and Eddy 2013) software was used for the detection of HDVR and HHR motifs with an extended type III architecture. Secondary RNA structures of minimum free energy were

Sequence cloning and transcription
Sequences corresponding to the newt and fish HDV-like regions of interest (Fig. 2) and to the toad and termite HDV-like antigenomic HHRs preceded by the T7 RNA polymerase promoter (Fig. 6) were purchased as gBlock Gene Fragments (Integrated DNA Technologies). DNA fragments were cloned into properly linearized pBlueScript KS or pUC18 vectors by cohesive-end ligations. RNAs of the cloned sequences were synthesized by in vitro run-off transcription of the linearized plasmids containing either the HDV-like fragment or the HHR motifs. The transcription reactions contained 40 mM Tris-HCl, pH 8, 6 mM MgCl2, 2 mM spermidine, 0.5 mg/ml RNase-free bovine serum albumin, 0.1% Triton X-100, 10 mM dithiothreitol, 1 mM each of ATP, CTP, GTP, and UTP, 2 U/ml of Ribonuclease Inhibitor (Takara Inc), 20 ng/ml of plasmid DNA, and 4 U/ml of T7 or T3 RNA polymerases. After incubation at 37 °C for the indicated time, the products were fractionated by polyacrylamide gel electrophoresis (PAGE) in 5 per cent (HDV-like fragments) or 15 per cent (HHRs) gels with 8 M urea.

Self-cleavage analyses
Analyses of HHR self-cleavage activity under co-transcriptional conditions were performed as previously described (Long and Uhlenbeck 1994). Appropriate aliquots of the transcription reactions (smaller volumes were taken at longer incubation times) were removed at different time intervals, quenched with a fivefold excess of stop solution at 0 °C, and analysed as previously described (Long and Uhlenbeck 1994). Briefly, the uncleaved and cleaved transcripts were separated by PAGE in 15 per cent denaturing gels and detected by Sybr Gold staining (Thermo Fisher Scientific). The product fraction at different times, $F_t$, was determined by quantitative scanning of the corresponding gel bands and fitted to the equation $F_t = F_\infty (1 - e^{-kt})$, where $F_\infty$ is the product fraction at the reaction endpoint, and $k$ is the first-order rate constant of cleavage ($k_{obs}$).
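The fitting step above can be illustrated with a minimal sketch. It is not the procedure or software used in this work, which is not specified beyond the equation; it is a plain least-squares fit in Python, where the function name `fit_first_order`, the search bounds, and the synthetic time course are all assumptions made for illustration. For a fixed $k$ the best endpoint $F_\infty$ has a closed form, so only $k$ is searched numerically.

```python
import math


def fit_first_order(times, fractions, k_lo=1e-4, k_hi=100.0, iters=100):
    """Least-squares fit of F_t = F_inf * (1 - exp(-k * t)).

    Assumes at least one time point is > 0 and that the error surface has a
    single minimum in k between k_lo and k_hi (hypothetical bounds, min^-1).
    Returns (F_inf, k).
    """
    def best_f_inf(k):
        # For fixed k the model is linear in F_inf, so solve it directly.
        g = [1.0 - math.exp(-k * t) for t in times]
        return sum(f * gi for f, gi in zip(fractions, g)) / sum(gi * gi for gi in g)

    def sse(log_k):
        # Sum of squared residuals at this k, using the best F_inf for it.
        k = 10.0 ** log_k
        f_inf = best_f_inf(k)
        return sum((f - f_inf * (1.0 - math.exp(-k * t))) ** 2
                   for t, f in zip(times, fractions))

    # Golden-section search on log10(k).
    a, b = math.log10(k_lo), math.log10(k_hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    k = 10.0 ** ((a + b) / 2.0)
    return best_f_inf(k), k


# Synthetic, noise-free time course (minutes); NOT measured data.
times = [0.5, 1.0, 2.0, 5.0, 10.0, 30.0]
fractions = [0.55 * (1.0 - math.exp(-0.96 * t)) for t in times]
f_inf, k_obs = fit_first_order(times, fractions)
print(f"F_inf = {f_inf:.2f}, k_obs = {k_obs:.2f} min^-1")
```

With real gel quantifications the residuals would not vanish, and a standard error on $k_{obs}$ would normally be estimated from replicate experiments.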

Bioinformatic detection of self-cleaving motifs in HDV-like circRNAs
The existence of divergent HDV-like agents has been previously reported in metagenomic data obtained from disparate metazoans such as a newt (the Chinese fire belly newt Cynops orientalis), a toad (the Asiatic toad Bufo gargarizans), fish (a mixture from classes Actinopterygii, Chondrichthyes, and Agnatha), and a termite (Schedorhinotermes intermedius) (Chang et al. 2019). The sequences obtained corresponded to circRNAs with sizes similar to that of the human HDV (~1,500-1,700 nt) and are also predicted to adopt classical rod-like secondary structures. Moreover, the four metazoan HDV-like genomes encode highly divergent copies of the delta antigen ORF. However, the presence of the HDV self-cleaving domains or HDVRs was not reported for any of these genomes, and it has been suggested that these novel HDV-like circRNAs may lack the ribozyme motifs (Paraskevopoulou et al. 2020).
To confirm the presence or absence of self-cleaving motifs in these HDV-like genomes, homology-based searches were initially performed. Based on the typical location of genomic and antigenomic ribozymes present in other HDV agents from humans and amniotes (Fig. 2A), we focused on the regions (~400 bp) immediately following the predicted delta antigen coding sequences (Fig. 2B). BLAST searches showed that a portion at the 5′ region of the newt HDV-like sequence clearly mapped to the 5′ end of the antigenomic human HDVR motif, strongly suggesting the presence of HDVR-like motifs in the newt sequence (Supplementary Fig. S1A). Sequence alignments followed by manual inspection of the regions of interest from the newt HDV-like sequence confirmed the presence of HDVRs in both the genomic and antigenomic strands (Supplementary Fig. S1B and Fig. 3B). Similar sequence analysis revealed the presence of analogous HDVR motifs in the fish HDV-like circRNA (Supplementary Fig. S1C and Fig. 3C). Apart from the antigenomic newt and human HDVR motifs (Fig. 3), the rest of the detected ribozymes did not show sequence homology with other sequences in the GenBank database. However, newt and fish HDVRs show similar architecture and helix sizes to those described for the human HDVR motifs (Figs 1A and 3): five paired regions forming two coaxial stacks, P1(~7 bp)/P4(~12 bp) and P2(~8 bp)/P3(3 bp), linked by short single-stranded joining strands J1/2(2 nt) and J4/2(4-6 nt).
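Because these genomes are circular, a region "immediately following" an ORF can run past the end of the linearized database entry and wrap around to its start. The sketch below shows one way such a window, and its antigenomic counterpart, could be extracted in Python. The function names, the toy sequence, and the coordinates are hypothetical and are not taken from the analysed genomes; in practice, which side of the ORF counts as downstream depends on the polarity of the deposited sequence.

```python
# RNA complement table (upper case only, for simplicity).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def circular_window(genome, start, length):
    """Return `length` nt beginning at 0-based `start`, wrapping around the end."""
    n = len(genome)
    return "".join(genome[(start + i) % n] for i in range(length))


def reverse_complement(rna):
    """Sequence of the opposite polarity, written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


# Toy 14-nt "circular genome": a mock ORF occupies positions 0-8 (AUGGCCUAA).
toy_genome = "AUGGCCUAAGGCUU"
orf_end = 9  # first position after the mock stop codon

region = circular_window(toy_genome, orf_end, 8)
print(region)                      # GGCUUAUG (wraps past the end of the entry)
print(reverse_complement(region))  # CAUAAGCC
```

For a real ~1,500-1,700 nt genome the window length would be about 400 nt, and both the extracted region and its reverse complement would then be passed to the homology and motif searches.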
Analysis of the 400 nt regions from the toad and termite HDV-like circRNAs did not reveal any HDVR motif. However, alignments of the genomic and antigenomic regions from toad and termite surprisingly showed the presence of short conserved sequences compatible with the well-known HHR motif (Supplementary Fig. S1D and Fig. 1B). The secondary structures of the genomic and antigenomic toad and termite HHRs have a very similar architecture, which corresponds to an atypical class of type III HHR characterized by larger than usual helixes I and II (Fig. 4A and B). Most type III HHRs, such as those present in plant retrozymes and in subviral agents, have a helix I with a 6-bp stem capped by a 6-8 nt loop, which interacts with a 4 nt loop of helix II (typically with a 4-bp stem) through a set of conserved non-Watson-Crick tertiary interactions (Fig. 1B) (de la Peña et al. 2017). However, the atypical type III HHRs detected in toad and termite HDV-like circRNAs show a much longer helix I with a first shorter stem (5 bp instead of 6 bp), followed by an internal loop (6-10 nt), and capped by an extra RNA hairpin of about 20 nt. Interestingly, the internal loop in all four hammerheads shows a conserved 5′ sequence (consensus sequence GCCR, where R stands for a purine), which can interact through Watson-Crick/wobble base pairs with the loop of helix II (a 6-bp stem capped by a loop carrying the consensus sequence YGGC). At least two examples of similar type III HHRs with longer helixes and stable pseudoknots between loops 1 and 2 have been previously described in the literature (Fig. 4D). Interestingly, one of these motifs corresponds to a genomic HHR detected in Drosophila pseudoobscura (Perreault et al. 2011), which shows a highly similar architecture to the toad and termite HDV-like HHRs, despite the low sequence homology among them.
Another feature observed in these HDV-like HHR sequences is the atypical cleavage sites of both the genomic and antigenomic ribozyme motifs of the toad HDV-like agent. Most HHRs described in the literature have a cleavage site with the consensus sequence RUH (where R stands for a purine and H for any nucleotide except G), whereas the toad HDV-like HHRs show the unusual 'cUC' and 'cUA' cleavage sites, which are, in any case, properly closed with a perfect stem III ('GAAAg' box in the toad HDV-like HHRs, instead of the consensus 'GAAAY' box sequence conserved in most HHRs).
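The consensus notation used here follows the IUPAC nucleotide codes (R = A or G; Y = C or U; H = A, C, or U). As a small illustration, the Python sketch below checks a short site against such a consensus; the helper name `matches_consensus` is hypothetical, and the check is purely a string comparison that says nothing about catalytic competence.

```python
# Subset of the IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "H": "ACU",   # any nucleotide except G
    "N": "ACGU",  # any nucleotide
}


def matches_consensus(site, consensus):
    """True if every base of `site` is allowed by the IUPAC `consensus`."""
    site = site.upper()
    return len(site) == len(consensus) and all(
        base in IUPAC[code] for base, code in zip(site, consensus)
    )


# A canonical site (GUC) versus the two toad HDV-like HHR sites.
for site in ("GUC", "cUC", "cUA"):
    print(site, matches_consensus(site, "RUH"))

# The stem III box: the toad 'GAAAg' departs from the 'GAAAY' consensus.
print(matches_consensus("GAAAg", "GAAAY"))
```

Both toad sites fail the check at the first position, where a pyrimidine (C) replaces the consensus purine, which is the deviation described above.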

Phylogenetically related HDV-like HHRs encoded in the DNA genomes of diverse termite and dipteran species
The sequences of the HHR motifs detected in the toad and termite HDV-like genomes were used as seed queries for homology-based searches in genomic databases. No clear hits were obtained using any of the toad HHR motifs. However, BLAST searches with the antigenomic HHR of the termite HDV-like genome revealed the presence of two highly similar HHR sequences in the genome of the termite Coptotermes formosanus (Fig. 4C, left). Similarly, BLAST searches with the genomic HDV-like HHR of the termite identified another two HHR motifs in the genome of the termite Cryptotermes secundus (Fig. 4C, right). These genomic HHRs from both Coptotermes and Cryptotermes spp. were found as isolated motifs within different large contigs (~20 kb to 2 Mb). In the surrounding 5′ and 3′ regions (2 kb) of the genomic HHRs of termites we did not detect any sequences homologous to HDV-like sequences nor even any recognizable ORFs, but rather only similarities to genomic sequences from other termite species. Structure-based searches based on this atypical type III architecture did not reveal additional examples of similar HHRs in termite genomes, nor in some other related insects such as hymenopterans, hemipterans, and a cockroach. However, more extensive bioinformatic searches among distant invertebrates unveiled the occurrence of different examples of type III HHRs structurally similar to the ones present in HDV-like agents and the D. pseudoobscura genome. These new motifs were mostly found in the genomes of dipterans, such as the common housefly or diverse fruit fly spp., among others (Fig. 4E; de la Peña et al., in preparation).

Cloning and self-cleavage analysis
Regions of about 400 bp corresponding to the newt and fish HDV-like genomes harboring both genomic and antigenomic HDV ribozymes (Fig. 2B) were cloned under the control of T7/T3 transcriptional promoters (Fig. 5A and B). RNA products obtained after run-off transcriptions (1 h) of each polarity and animal species showed fragments of the sizes expected from ribozyme cleavage (Fig. 5C). Both newt HDVRs reached a final ~40 per cent self-cleavage of the primary transcript, whereas fish HDVRs showed lower levels of cleavage (15 and 8 per cent cleavage of the primary genomic and antigenomic transcript sequences, respectively). The observed self-cleavage products confirm the presence of the HDVRs in both polarities of newt and fish HDV-like circRNAs, whereas the differences in the cleavage ratios may indicate distinct ribozyme efficiencies, but also differences in the effect of 3′ competing sequences (see below and Supplementary Fig. S2) (Diegelman-Parente and Bevilacqua 2002).
Minimal constructs carrying the antigenomic HHR sequence of toad and termite HDV-like genomes were cloned under the control of the T7 RNA polymerase promoter. As shown in Figure 6, both HHR motifs show clear self-cleaving activity during transcription. However, the analysis of co-transcriptional cleavage of the HHR motif from the toad HDV-like RNA indicates a much lower self-cleavage efficiency ($k_{obs}$ = 0.05 ± 0.01 min$^{-1}$) compared with the value obtained for the termite HHR ($k_{obs}$ = 0.96 ± 0.09 min$^{-1}$). This difference in the catalytic efficiency of toad and termite HHRs (about 20-fold) is in agreement with the observed sequence differences in the site of self-cleavage of the toad motif (see above), which are known to strongly reduce the self-cleaving capability of the HHR (Ruffner et al. 1990). Moreover, in both cases we observed that HHR cleavage reached lower completion values (50-60%) compared with the levels observed for typical type III HHRs (above 90%) (de la Peña et al. 2003).
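The fold difference quoted above follows directly from the two rate constants. As a quick arithmetic check, the Python sketch below computes the ratio and, under the simplifying assumption of plain first-order kinetics, the time needed to reach half of the endpoint cleavage ($\ln 2 / k$). The function names are illustrative only, and co-transcriptional cleavage is more complex than this idealized model, so the half-times are indicative at best.

```python
import math

# Rate constants reported in the text (min^-1).
K_OBS_TOAD = 0.05
K_OBS_TERMITE = 0.96


def fold_difference(k_fast, k_slow):
    """Ratio between two first-order rate constants."""
    return k_fast / k_slow


def half_time(k):
    """Time (min) to reach half of the endpoint for F_t = F_inf * (1 - exp(-k t))."""
    return math.log(2.0) / k


print(f"fold difference: {fold_difference(K_OBS_TERMITE, K_OBS_TOAD):.1f}")
print(f"termite half-time: {half_time(K_OBS_TERMITE):.2f} min")
print(f"toad half-time: {half_time(K_OBS_TOAD):.1f} min")
```

The ratio comes out at roughly 19, consistent with the approximately 20-fold difference stated in the text once the reported uncertainties on both constants are taken into account.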

Discussion
In contrast with the ubiquitous circular DNAs such as most prokaryotic, plastid, and bacteriophage genomes, covalently closed circular RNAs had been regarded as exceptional nucleic acids until very recently. We now know of many types of circRNAs, from the splicing-derived circRNAs (Li et al. 2018) to the eukaryotic retrozymes (Cervera et al. 2016; Cervera and de la Peña 2020), or the heterogeneous family of infectious circRNA genomes (Flores et al. 2016; de la Peña et al. 2020). Diverse members of this latter group of infectious agents are characterized by the presence of autocatalytic ribozymes, which are required for the replication of their circular RNA genomes through a rolling-circle mechanism. In plant viral satellites and some viroids, a common ribozyme is the HHR, although three viral satellites also show the presence of a rare class of small catalytic RNA, the hairpin ribozyme (Fedor 2000). On the other hand, human HDV and novel HDV-like agents from amniotes, fish, and newts contain a third class of small ribozyme, the HDVR. These small ribozyme motifs do not seem to show any clear sequence or structural relationship among them, and for that reason, the presence of conserved type III HHRs in the HDV-like agents of a toad and a termite is surprising, and draws a somewhat closer connection between the infectious circRNAs of plants (viroidal agents) and animals (Deltavirus agents). However, although HHRs in HDV-like circRNAs can be regarded as analogous motifs to type III HHRs of plant subviral agents and retrozymes, their different topologies and tertiary interactions do not allow us to confirm a direct evolutionary relationship among them. Alternatively, these exceptional HDV-like HHRs may represent a case of convergence and, as proposed for some subviral circRNA agents of plants, the origin of these self-cleaving motifs could be in the cell transcriptomes (de la Peña and Cervera 2017).
Previous data have confirmed that both HDVR and HHR are ubiquitous ribozymes encoded in DNA genomes from prokaryotes to humans. These motifs are commonly involved in diverse families of mobile genetic retroelements, which in some cases are expressed as abundant circRNAs. We can envisage that this population of ribozyme-containing RNAs could be a suitable source for the appearance of novel infectious agents of circRNA, either as viral satellites (through virus encapsidation of the ribozyme-containing RNA during cell infection) or as autonomous agents. In this line, other widespread self-cleaving ribozymes, such as the twister ribozyme (Roth et al. 2014) among others, may also occur in new HDV-like or any similar infectious circRNAs to be discovered. The presence of phylogenetically related HHRs in both HDV-like agents and invertebrate genomes (such as termites and dipterans) could support these hypotheses. Nevertheless, and in the absence of any clear evolutionary origin for the small ribozymes present in eukaryotes, we cannot rule out the opposite possibility where infectious circRNAs from unknown natural reservoirs would be the source of the self-cleaving ribozymes widespread among eukaryotic genomes.
[Figure caption fragment: PAGE with a kinetic analysis and quantification graph of each HHR co-transcriptional cleavage is shown on the right.]
Our data indicate that, from an evolutionary point of view, both HHR and HDVR can provide the cleavage function essential for double rolling-circle replication of HDV-like circRNAs and could be regarded as interchangeable motifs among members of this family. Previous data reported that human HDV can be circularized in host cells after being processed by any self-cleaving motif capable of producing 5′-OH and 2′-3′-cyclic phosphate ends, either HDVR or HHR (Reid and Lazinski 2000). Moreover, our co-transcriptional self-cleaving analyses suggest that the intrinsic catalytic activity of the HHR ribozymes would not be crucial for the agent viability, and that a wide range of values can be tolerated depending on the HDV-like genome. Co-transcriptional self-cleavage analysis of the human HDVR performed under similar conditions resulted in a $k_{obs}$ of ~0.4 min$^{-1}$ (Diegelman-Parente and Bevilacqua 2002), which is of the same order of magnitude as that of the termite HDV-like HHR and one order of magnitude faster than that of the toad HDV-like HHR. It has to be pointed out, however, that both HDVRs and HHRs occur in very similar genomic regions of the circRNAs (Fig. 2). This conserved arrangement, together with the sequence homology between genomic and antigenomic ribozymes for each HDV-like genome, indicates that self-cleavage activity could be somehow regulated by the 3′-competing structures, as previously proposed for the human HDV (Diegelman-Parente and Bevilacqua 2002) (Supplementary Fig. S2).
The newly discovered HDV-like circRNAs in vertebrate and invertebrate species are also opening an intriguing door regarding whether these novel elements are either new virus satellites or viroid-like agents, and their possible origins. Future research will allow us to understand the biology and evolutionary relationships of this fascinating family of minimal infectious agents with circular RNA genomes.

Data availability
All the sequences analyzed in this study were downloaded from the NCBI GenBank.

Funding
Funding for this work was provided by the Ministerio de Economía y Competitividad of Spain and FEDER funds (BFU2017-87370-P) to M.d.l.P., and by NIH NIAID R21 AI149049 grant to J.L.C.