A Chromosome-Level Genome Assembly of the Reef Stonefish (Synanceia verrucosa) Provides Novel Insights into Stonustoxin (sntx) Genes

Abstract Reef stonefish (Synanceia verrucosa) is one of the most venomous fishes, but its biomedical study has been restricted to molecular cloning and purification of its toxins, instead of high-throughput genetic research on related toxin genes. In this study, we constructed a chromosome-level haplotypic genome assembly for the reef stonefish. The genome was assembled into 24 pseudo-chromosomes, and the length totaled 689.74 Mb, reaching a contig N50 of 11.97 Mb and containing 97.8% of complete BUSCOs. A total of 24,050 protein-coding genes were annotated, of which metalloproteinases, C-type lectins, and stonustoxins (sntx) were the most abundant putative toxin genes. Multitissue transcriptomic and venom proteomic data showed that sntx genes, especially those clustered within a 50-kb region on the chromosome 2, had higher transcription levels than other types of toxins as well as those sntx genes scatteringly distributed on other chromosomes. Further comparative genomic analysis predicted an expansion of sntx-like genes in the Percomorpha lineage including nonvenomous fishes, but Scorpaenoidei species experienced extra independent sntx duplication events, marking the clear-cut origin of authentic toxic stonustoxins. In summary, this high-quality genome assembly and related comparative analysis of toxin genes highlight valuable genetic differences for potential involvement in the evolution of venoms among Scorpaeniformes fishes.


Introduction
It is estimated that there are over 2,900 species of venomous fish distributed throughout the world, and some of them such as stingrays, catfishes, and stonefishes have evolved venom glands in fins (Smith et al. 2016).Order Scorpaeniformes contains the most toxic species including stonefishes, scorpionfishes, and rockfishes.Among them, stonefishes, usually referring to the genus Synanceia with five extant species, are of highest venomous, and the representative reef stonefish (Synanceia verrucosa) is the most widely distributed species (Saggiomo et al. 2021).It has a complex venom composition with hemolytic (Ueda et al. 2006), coagulotoxic, and neurotoxic effects (Harris et al. 2021); thus the fish sting may result in a potentially lethal envenomation in human beings (Maillaud et al. 2020).Many experiments have also proved the toxicity of the fish venom in mice, rats, and dogs (Gail and Rageau 1956;Garnier et al. 1995;Khalil et al. 2018).
The 2 subunits of each SNTX toxin share sequence identify of ∼50%, but they both contain four domains, namely, membrane attack complex-perforin/cholesterol-dependent cytolysin (MACPF/CDC), focal adhesion-targeting (FAT), thioredoxin (THX), and PRY SPla and the Ryanodine Receptor (PRYSPRY).The αand β-MACPF/CDC domains at the N-terminus interact to form a prepore-like complex, indicating that SNTXs may form pores in cells, and such a structure is essential for SNTX-induced pathologies and physiological effects (Ellisdon et al. 2015).SNTXs usually have hemolytic, myotoxic, vasorelaxant, and muscarinic activities in vitro and in vivo (Saggiomo et al. 2021), and inactivation studies show that these multiple activities can be inhibited by modification of anionic lipids, thiol groups, and some amino acid residues (such as lysine, arginine, and tryptophan) within the protein sequences (Saggiomo et al. 2021).
Besides the SNTX and its analogues, other toxins such as natterins (Hatakeyama et al. 2021), metalloproteases (Ziegman et al. 2019), and C-type lectins (Andrich et al. 2015) have also been identified in stonefishes and others in the same order, although most studies focused on only 1 or a few toxin(s) (Saggiomo et al. 2021).Recently, many omic approaches have been applied to comprehensively examine whole venom components of these species.For example, the venom composition of estuarine stonefish (S. horrida) has been investigated using an integrated transcriptomic and proteomic method (Ziegman et al. 2019).Xie et al. (2019) performed transcriptome analysis of venom glands from three scorpionfishes.However, none of these reports have investigated the venom component from a whole-genome view.Although some genomes of Scorpaeniformes species are available (He et al. 2019;Kolora et al. 2021), all of them are rockfishes (Scorpaeniformes: Sebastidae).
Our present study sequenced and assembled a chromosome (Chr)-level genome of the reef stonefish (Fig. 1), which is the first stonefish sequenced so far.This high-quality genome assembly makes inferring of the phylogeny of reef stonefish and its relatives available.Most importantly, combined with additional multitissue transcriptome and venom proteome data, this genomic resource prompts us to study the detailed repertoires, gene structures, and evolution of sntx/ sntx-like toxins throughout the fish tree of life.

Summary of the Genome Assembly and Annotations
After integrating 57-Gb Illumina short paired-end reads (Table S1) and 128-Gb Nanopore long reads (Table S2), we obtained a 689.52-Mb genome assembly of the reef stonefish containing 2,799 contigs with an N50 of 11.97 Mb (Table S3).This primary contig-level assembly was then improved with 131-Gb Hi-C reads (Table S4), resulting in 1,437 scaffolds with an N50 of 27.11 Mb (Table S5).
The final Chr-level genome assembly reaches 689.74 Mb in length (Table 1), consistent with the 682.53-Mb genome size estimated by a k-mer analysis (Fig. S1).
A total of 628.82 Mb of the assembled sequences were clustered into 24 pseudo-Chr (Fig. 2A), ranging from 11.35 Mb to 35.82 Mb in length (Table S6).The GC (G for guanine and C for cytosine) content of the assembled reef stonefish genome was 40.8% (Table 1) (Kolora et al. 2021;Liu et al. 2019).Genome completeness analysis identified 97.8% complete BUSCOs, consisting of 94.9% single-copy orthologs and 2.9% duplicates (Table S7).This high complete BUSCO score and the large contig N50 value indicate that the final genome assembly of the reef stonefish is of high quality, thereby qualifying for downstream analyses.

MBE
Subsequent repeats annotation identified 211 Mb of repetitive sequences, accounting for 30.59% of the reef stonefish genome (Table S8).The combined homology (using eight genomes in Table S9) and de novo annotation pipeline generates a total of 24,050 protein-coding genes, with an average mRNA length of 10,929 bp and nine exons for each gene (Table 1, Fig. S2).After blasting against various public databases (including SwissProt, TrEMBL, KEGG,  A Chromosome-Level Genome Assembly of the Reef Stonefish (S. verrucosa) • https://doi.org/10.1093/molbev/msad215MBE and InterPro), 98.87% of the annotated genes have been predicted with at least 1 match in these databases (Table S10).Details of the Chr, gene density, GC content, repeat distribution, and internal synteny blocks are summarized in Fig. 2B to provide an overview of the reef stonefish genome.

Chromosomal Evolution and Rearrangements
Most Scorpaeniformes species, including the reef stonefish are diploids with 24 pairs (2n = 48) of Chr; therefore, only few internal synteny links were observed in the assembled haploid genome of the reef stonefish (Fig. 2B).However, some related fishes have more or fewer Chr according to previous reports.For example, the long-spined sea scorpion

Construction of the Species Tree
Using the single-copy genes (Tables S11 to S12) identified from the whole genomes of reef stonefish and other 17 ray-finned fishes (Table S9) including spotted gar (as the outgroup), known venomous fishes, model species, and species in Scorpaeniformes and sister groups, we constructed a robust phylogenetic tree (Fig. 3A).Both the maximum likelihood (ML) and Bayesian inference (BI) methods established the same topology (Figs.S4 and S5), which is consistent with previous reports based on mitogenomes (Cui et al. 2019;Jia et al. 2020).Reef stonefish and honeycomb rockfish were grouped together, making up the Scorpaenoidei clade that is a sister group to the Cottoidei branch.The fossil-calibrated tree shows that the divergence time of reef stonefish and honeycomb rockfish was estimated to be around 84.5 million years ago (mya) in the Cretaceous period, which is similar to a previous estimate (Rabosky et al. 2018).
Interestingly, 8 of the 28 sntx genes were clustered within a 50-kb region on Chr2 (Fig. 4B).They were similar in gene length, gene structure (with three exons), and encoded proteins with an amino acid length of around 700 (694 to 704), which are consistent with the length of known SNTX toxins (Yoshinaga-Kiriake et al. 2020).
However, the other annotated 20 sntx-like genes, scattered in different Chr, varied in both length and sequence.Most of them encoded only some of the 4 domains of a complete SNTX; thus these sntx-like genes were usually shorter than the normal one.However, there were few genes encoding proteins longer than a normal SNTX.For example, on the Chr6, there were 4 neighboring sntx-like genes, but all of them lost the third exon, with the second exon extended till the stop codon (Fig. 4C).In case they were false genes caused by incorrect annotation, we further checked the Iso-seq transcriptome and determined that the full-length transcripts of these genes existed.

Expression of Toxin Genes in Various Tissues
Among the entire list of 294 putative toxin genes, 247 were detected by transcriptome sequencing (RNA-seq; see Table S13) with transcriptive evidence in at least one of the examined tissues (Fig. 4E), along with 126 novel toxin transcripts, which were found in the RNA-seq data, but did not exist in the gene annotation of the assembled genome.In detail, a total of 221 transcripts corresponding to 150 toxin genes were identified in the venom.Some putative toxin genes were also transcribed in nonvenom tissues, with 155, 192, and 180 toxin genes expressed in muscle, skin, and visceral mixture, respectively (Fig. 4G).
As expected, most of the detected sntx genes had high transcription levels in the venom gland (Fig. 4F), whereas other toxin gene families did not present such a pattern (Fig. S6).Interestingly, it is notable that a few sntx genes were also highly expressed in other tissues especially in the skin and visceral mixture, most likely because that spine-associated venom is hypothesized to have originated from the skin toxins (Harris and Jenner 2019) and that the liver and kidney in the visceral mixture are important organs for toxification and detoxification of xenobiotic compounds (Wolf and Wolfe 2005).It also suggests that these toxins may have other roles outside of envenomation.In addition, 6 of the 8 sntx genes clustered on the Chr2 generally had higher transcription levels in comparison with those on other Chr (in a scattered way), suggesting that these toxins encoded by sntx genes clustered on the Chr2 might be the core components of the reef stonefish venom.
This hypothesis was further supported by a proteomic analysis of the venom gland.A total of 3,587 peptides, corresponding to 1,076 proteins, were identified using the deduced protein sequences from the assembled genome as the reference.Among these detected peptides, 194 sequences were toxin fragments, including 127, 17, 11, and 39 peptides from SNTXs, lectins, metalloproteinases, and other toxins, respectively.As expected, the majority (22 peptides) A Chromosome-Level Genome Assembly of the Reef Stonefish (S. verrucosa) • https://doi.org/10.1093/molbev/msad215MBE of the top 30 peptides with the highest intensity were digested from the SNTX proteins (Table S14).In addition, up to 191 peptides (including multiple mapped peptides) were mapped to the eight SNTXs encoded from the sntx genes located on the Chr2.Similar to the transcription pattern in the venom transcriptome (Fig. 4F), almost all of these peptides biasedly hit the 6 genes with high expression levels (Fig. 4H), with only 1 peptide mapped to each of the 2 SNTX-likes.Such finding not only proves that SNTXs coming from the Chr2 probably serve as the major bioactive toxins in the venom but also illuminates that some of them might be active while others are not for the toxic effects.

Stonustoxin Evolution Throughout the Fish Tree of Life
To investigate sntx evolution throughout the fish tree of life, we also predicted the sntx repertoires from other 17 representative species in addition to the reef stonefish, using the reviewed SNTX subunits as references to search against the whole genomes of those corresponding fishes.
The retrieved protein sequences that were too long (>120%) or too short (<80%) were eliminated.The results show that, with a sntx cluster of 8 genes on the Chr2, the reef stonefish encodes the largest repertoire of SNTXs as expected, followed by the honeycomb rockfish (encoding 8 SNTXs in total, 5 of which were clustered together as well).Such clusters could only be identified in the 2 Scorpaenoidei species, while the sntx genes were tandemly located in other fish genomes (Fig. 3B).It seems that there might be an expansion of sntx genes in the Scorpaenoidei fishes, which has been discussed in previous reports (such as Ellisdon et al. 2015).
The phylogeny of SNTX proteins split them into 2 major clades, using the SNTX of spotted gar as the outgroup.A small clade (the αβ clade) includes the 2 reported SNTX subunits, the 6 highly expressed stonefish SNTXs from the Chr2, and all 5 clustered SNTXs in the rockfish.It was further divided into 2 groups, 1 represents the α subunit, and the other denotes the β subunit.As a sister to the small clade, the big clade consists of all the SNTXs from other non-Scorpaenoidei species, the 2 lowly expressed stonefish SNTXs from the Chr2, as well as the 3three unclustered SNTXs in the rockfish (Fig. 3C).This interesting phylogeny implies that although SNTXs exist in both venomous and nonvenomous fishes (Ellisdon et al. 2015), the SNTX-α and SNTX-β subunits could only be well defined in Scorpaenoidei fishes (Ellisdon et al. 2015;Hatakeyama et al. 2021;Yoshinaga-Kiriake et al. 2020), suggesting that the neofunction of duplicated SNTXs as toxins might have occurred after the divergence of Scorpaenoidei and Cottoidei.

Valuable Genome Assembly of the Venomous Reef Stonefish
Rapid development of sequencing technologies and bioinformatics facilitates whole-genome assembling and downstream analysis, and there are more than 900 fish records deposited in the NCBI Genome database so far.However, venomous fishes have been given little attention, with only a few species such as yellow catfish (Zhang et al. 2018) and prehistoric monster fish were sequenced till now.Scorpaenoidei contains the most venomous fishes, but only the mildly toxic rockfish genomes are available (He et al. 2019;Kolora et al. 2021), while their venoms are scarcely studied (Francisco et al. 2021).Our present study constructed a Chr-level genome assembly of the reef stonefish, representative of the venomous stonefishes whose venom components have been widely studied (Ellisdon et al. 2015).This high-quality genome assembly not only makes the extensive genetic research of stonefish toxin genes available but also provides a fundamental genetic resource for indepth biomedical study of this venomous species.

Unique Venom Components of the Reef Stonefish
Fish venom systems are supposed to be evolved convergently 19 times (Harris and Jenner 2019), while bioactive compounds in venom glands are different.For instance, the yellow catfish venom contains veficolin, ink toxin, adamalysin, Za2G, and CRISP toxin (Xie et al. 2016;Zhang et al. 2018), but the major components of the toadfish venom are natterin (Lima et al. 2021) and tetrodotoxin, a wellknown toxin found in pufferfish (Katikou et al. 2022).None of these venomous fishes have been reported to contain any SNTX toxin, which is consistent with our current analysis of the SNTX repertoires throughout fish tree of life (Fig. 3B).
In addition to SNTXs, there are hundreds of other toxins such as metalloproteinases and C-type lectins in the reef stonefish genome (Fig. 4), similar to those reports in estuarine stonefish using both transcriptomic and proteomic methods (Ziegman et al. 2019).However, sntx genes, especially those clustered on the Chr2, generally had higher expression levels in the venom gland, compared to other types of toxin genes (Fig. 4E and S6), as well as the rest of the sntx genes scattered on other Chr (Fig. 4F), both at the transcriptional and translational levels (Table S14).Taken together, this special toxin gene distribution and expression pattern suggest that SNTXs are potentially the pivotal bioactive compounds in the venom of reef stonefish and even of all the venomous Scorpaenoidei species when combined with previous reports on the venoms of other stonefishes (Ziegman et al. 2019), scorpionfish (Xie et al. 2019), waspfish (Kiriake et al. 2013), and rabbitfish (Kiriake et al. 2017).

Multiple Copies of sntx Genes in the Reef Stonefish Genome
Unexpectedly, there were more than 1 pair of SNTX subunits (α and β) in the reef stonefish genome.The number totaled 28 (Fig. 4A), of which 8 genes were clustered within a 50 kb-region on the Chr2 (Fig. 4B).In contrast to the rest of these SNTXs with variable lengths, all the 8 SNTXs were composed of about 700 amino acid residues, resembling those reported SNTXs (Ziegman et al. 2019).

MBE
Both phylogeny (Fig. 3C, Table S12) and protein sequences (Fig. S7, Table S15) classified the 8 SNTXs into 3 SNTX-α subunits, 3 SNTX-β subunits, and 2 SNTX-like proteins.The SNTX-α and -β copies had higher expressions in the venom than the SNTX-like proteins (Fig. 4F to H).Although SNTX-like proteins had similar gene structures and the same 4 domains as normal SNTXs (Ziegman et al. 2019), some conserved sites mutated into other amino acids, such as the SNTX-α/β-interacting residues in the focal adhesion-targeting domain (Fig. S7).In summary, our results imply that the 3 pairs of SNTX-α/β might function as bioactive toxins in the venom gland, while the SNTX-like proteins probably do not; this potential is similar to those SNTXs identified in other nonvenomous species (Fig. 3C; Ellisdon et al. 2015).
The 3D structure of SNTX from stonefish S. horrida was described before (Ellisdon et al. 2015) using X-ray crystallography (as shown in Fig. S8A).The 8 SNTXs found in our present study have similar 3D structural characteristics with SNTX-α and SNTX-β from the stonefish S. horrida using homologous modeling (Fig. S8B).In particular, the sequence identity of 2 SNTX-like proteins is only about 48.16% and 46.80%, respectively (Table S15), but they still present similar 3D structures, suggesting that SNTX-like and SNTX-αβ proteins have the same origin.

Interesting SNTX Evolution Throughout the Fish Tree of Life
It is estimated that SNTXs were generated by gene duplication multiple times throughout the evolutionary history of Percomorpha fish (Ellisdon et al. 2015).Similarly, we detected an expansion of sntx genes in the Percomorpha lineage, while with gene loss evidence in several other branches (Fig. 3B).In addition, extra gene duplications occurred in the Scorpaenoidei fish, and these events may have led to the generation of authentic toxic SNTXs by neofunctionalization (Magadum et al. 2013).Our phylogenetic analysis also implies that these extra gene duplication events happened after the divergence of the 2 genera of Synanceia and Sebastes, since the subunits from the same genus grouped together to form sister groups within each clade of both subunits (Fig. 3C).However, it is unclear whether these SNTX duplicates including those less conserved SNTX-like proteins form right dimers for function or not, and the detailed process of the duplication events is still unknown yet.More experiments and additional genomic studies are required to address these unresolved questions.

Sample Collection, Preparation, and Species Identification
A wild reef stonefish was collected from Xisha Islands in the South China Sea with official permission.The fish was temporarily cultured in a seawater tank and then anesthetized before sample preparation.Genomic DNA was extracted from the muscle for whole-genome sequencing, and total RNA was extracted from muscle, venom gland, skin, and liver for transcriptome sequencing.DNA and RNA quantity, purity, and integrity were assessed by qubit fluorometer, NanoDrop spectrophotometer and gel electrophoresis, respectively.
Species identification was carried out by both morphologic determination and complete mitochondrial genome assembled from raw genomic reads using SPAdes v3.13.0 with parameters -k 71,73,75,77 (Prjibelski et al. 2020).Animal experiments in this study were reviewed and approved by the Institutional Review Board on Bioethics and Biosafety of BGI (Shenzhen, China).

Genome and Transcriptome Sequencing
We integrated Illumina short reads, Nanopore long reads, and Hi-C data to establish a high-quality genome assembly of the reef stonefish.For the short-read sequencing, a library with an insert size of 350 bp was constructed and sequenced on an Illumina Hiseq X-Ten platform (Illumina Inc., San Diego, CA, USA) to generate 2 × 150-bp paired-end raw reads, which were further filtered by SOAPnuke v2.1.0(−z -p -M 2 -trim [5,0,5,0]) (Chen et al. 2018).For the long-read sequencing, a library with a 20-kb insert size was prepared and sequenced by MinION of Oxford Nanopore Technologies (ONT; Nanopore Technologies, Oxford, UK).For the Hi-C sequencing, a library with a 400-bp insert size was constructed using cross-linked chromatin fragments (Ghurye et al. 2019) and sequenced on an Illumina Hiseq X-Ten platform.
For transcriptome sequencing, RNA-seq libraries were constructed separately for each of the 4 different samples including venom gland, muscle, skin, and visceral mixture (refers to the mixture of internal tissues, including the liver, spleen, kidney, and heart) and sequenced on an Illumina Hiseq X-Ten platform.A full-length transcriptome sequencing of the venom gland was parallelly conducted on an Iso-seq PacBio platform (Pacific Biosciences, Menlo Park, CA, USA).

Genome Size Estimation and De Novo Genome Assembly
To estimate the genome size of the reef stonefish, the k-mer frequencies were calculated for those genomic short reads with GCE v1.0.2 (Liu et al. 2013).To obtain a highquality Chr-level genome assembly, we first assembled clean short reads into contigs using Platanus v1.2.4 (-k 29 -d 0.3 -t 16 -m 300) (Kajitani et al. 2019) and then aligned them against the ONT long reads with DBG2OLC (Ye et al. 2016) to generate a consensus assembly, which was subjected to 3 iterations of polishing by NextPolish v1.4.0 (Hu et al. 2020).Subsequently, HiC-Pro v2.2 (Servant et al. 2015) was employed for quality control of the Hi-C raw data, and those valid reads were mapped to the polished assembly for scaffolding contigs by using Juicer v1.5 (Durand, Shamim, et al. 2016) plus the 3D-DNA v180922 pipeline (Dudchenko et al. 2017).Finally, the pseudo-chromosomal assembly was visualized and polished manually with Juicebox (Durand, Robinson, A Chromosome-Level Genome Assembly of the Reef Stonefish (S. verrucosa) • https://doi.org/10.1093/molbev/msad215MBEet al. 2016) according to the contact maps and boundaries.Completeness of the final genome assembly was assessed by BUSCO v5.2.2 with genome mode and the database ac-tinopterygii_odb9 (Simão et al. 2015).

Genome Annotation and Gene Prediction
Repeats in the genome were annotated by both de novo and homology methods.For the de novo prediction, RepeatModeler v1.0.11(Flynn et al. 2020) and LTR-FINDER v4.09.1 (Xu and Wang 2007) were applied to construct 2 independent repeat libraries, which were subsequently combined and aligned to the final genome assembly with RepeatMasker v1.331 (Tarailo-Graovac and Chen 2009).For the homology prediction, RepeatMasker and RepeatProteinMask were employed with the known Repbase library (Jurka et al. 2005).In addition, Tandem Repeats Finder (TRF) v4.07b (Benson 1999) was employed to detect tandem repeats.Finally, overlapping repeats from different methods were integrated to generate the final nonredundant repeat repertoire.
To annotate gene structures, we applied the following three approaches.First, for homology search, protein sequences of 8 representative bony fishes were downloaded from the NCBI database and then used as references to get the reef stonefish's gene structures using TBLASTn (Gertz et al. 2006) followed by GeneWise (Birney et al. 2004).Second, for transcriptome-based prediction, the RNA-seq short reads were mapped onto the assembled genome by using HISAT2 v2.0.4 (Kim et al. 2019), and then the alignments were used to predict gene structures with Cufflinks v2.2.1 (Trapnell et al. 2010).Third, for ab initio prediction, a transcriptome assembly was de novo assembled by Trinity v2.13.2 (Grabherr et al. 2011), which was used as the training set for gene prediction by Augustus v3.3.1 (Stanke et al. 2006).Finally, MAKER 3.01.01(Cantarel et al. 2008) was employed to integrate all data generated from the abovementioned 3 methods so as to obtain the final consistent gene set.Gene function and protein domain/motif information were then obtained by blasting the deduced protein sequences against several public databases including NCBI nonredundant protein (nr), SwissProt (Boeckmann et al. 2003), Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa 2000), gene ontology (GO) (Harris et al., 2004), andInterPro (Jones et al. 2014).

Intraspecific and Interspecific Chromosomal Comparisons
We performed both intraspecific and interspecific chromosomal comparisons between the reef stonefish and other related fish species.For the intraspecific comparison, the protein set of reef stonefish was used as both the subject and the query for BLASTp search, and the alignments were fed into MCscan (Wang et al. 2012) with parameters "-a -e 1e-5 -u 1 -s 8" to construct genome internal syntenic blocks.For the interspecific comparison, the assembled genomes of reef stonefish and other bony fishes were aligned by using Lastz v1.04.15 with parameters "T = 2 C = 2 H = 2,000 Y = 3,400 L = 6,000 K = 2,200."The Circos plots were drawn by Circos v0.69.3 (Krzywinski et al. 2009), and the dot plots were prepared by the LAST v 973 package (Kiełbasa et al. 2011).

Annotation of sntx/sntx-Like Genes and Other Toxin Genes
Toxin annotation was based on the above-described functional gene annotation procedure.We retrieved those genes annotated as "toxin" or "venom" but without "receptor."Subsequently, the most predominant venom proteins (including metalloproteinases, C-type lectins, and stonustoxins) reported in stonefish before (Ziegman et al. 2019) were also selected.Meanwhile, genes enriched into the GO term "toxin activity" (GO:0090729) were picked out.Finally, all results were merged without duplicates to generate the final nonredundant toxin gene set.In this study, putative toxin genes refer to those identified in the assembled genome, no matter they existed in the transcriptome or proteome of the venom or not, since some toxin transcripts or peptides will certainly be lost during sample and data processing or due to low sensitivity of detection machines.
Gene structures of the SNTX-like toxins were further verified by Exonerate v2.2.0 (Slater and Birney 2005).A synteny analysis of these sntx-like genes and their neighboring genes (both upstream and downstream) was first performed by blasting against each target genome independently with the stonefish proteins as the references.The best hit for each blast search was selected, and then the gene arrangement and location were plotted by using the SVG module in Perl.

Evolutionary Analysis
To construct the phylogenetic tree, we firstly applied OrthoFinder v2.2.7 (Emms and Kelly 2019) to identify single-copy gene families from the assembled genomes of reef stonefish and 17 other fishes (including spotted gar as the outgroup), which were downloaded from the NCBI Genome database.Subsequently, coding sequence (CDS) of each gene family was aligned with MUSCLE v5.1 (Edgar 2004) and concatenated, and then the 1st codon site was selected and the final alignment was trimmed with Gblocks (Castresana 2000).A ML tree was constructed using PhyML v3.0 (Guindon et al. 2010) with the GTR substitution model selected by ModelFinder implemented in IQ-TREE v1.6.12 (Nguyen et al. 2015).Branch supports were evaluated by the approximate likelihood ratio test (aLRT) (Anisimova and Gascuel 2006).We also built a BI tree using MrBayes v3.2.2 (Ronquist et al. 2012) with the GTR + I + G model selected by MrModeltest v2.4 implemented in PAUP v4 (www.paup.phylosolutions.com),by running 100,000 generations with sampling every 100 generations.The initial 20% runs were discarded as burn-in.Finally, the 2 species trees were compared, and divergence times were estimated by the Bayesian method implemented by MCMCtree in PAML v4.9 (Yang 2007), using a Tang et al. • https://doi.org/10.1093/molbev/msad215MBE relaxed-clock model with the following 2 fossil times (may), including (i) Otophysi-Batrachoidiformes: 180.0 to 264.0, and (ii) Scorpaenoidei-Cottoidei: 64.7 to 110.2, from TimeTree (Kumar et al. 2017) as constraints.We ran a total of 100,000 samples for the MCMC analysis, of which the first 20% were discarded as burn-in.
Evolution analysis of the sntx-like toxins was performed using their protein sequences.Those SNTX-like toxin protein sequences were aligned using MAFFT v7.490 (Rozewicki et al. 2019). ModelFinder (Kalyaanamoorthy et al. 2017) was applied to determine the best-fit substitution model (JTT + F + R4), and then a ML tree was constructed using IQ-TREE v1.6.12 (Nguyen et al. 2015) with node support assessments by running 1,000 bootstrap replications.

Gene Transcriptional Quantification and Differential Expression Analysis
To calculate gene transcription levels, bowtie2 v2.4.0 (Langmead and Salzberg 2012) was applied to map the quality-controlled RNA-seq clean reads to the assembled reef stonefish genome.Then the aligned reads were counted, and the gene transcription was quantified by RSEM v1.3.3 (Li and Dewey 2011).Subsequently, differently expressed genes (DEGs) were identified by DESeq2 v1.26.0 (Love et al. 2014) with an adjusted P < 0.05 and a log-fold change ≥ 2. Finally, the toxin DEGs across all samples were visualized by the pheatmap package v1.0.12 in R with normalized values via Z score.

Proteome Profiling of the Venom
Crude venom (about 300 µL) was collected from all the dorsal spines of the same sequenced fish by squeezing the spines against the cling wrap sealed over a 1.5 mL tube.For proteome profiling, the sample was first prepared according to the filter aided sample preparation (FASP) procedure (Wiśniewski 2019).In brief, we used 10 µL of the crude venom and added dithiothreitol to dilute the solution to a final concentration of 100 mM.It was then centrifugated, and the pellet was washed twice with 8 M urea in 0.1 M Tris/HCl pH 8.5 (UA buffer), incubated with 100 mM iodoacetamide (IAA buffer) at room temperature for 30 min in dark, and then washed twice by UA buffer again.Afterward, the pelleted was resolved in 100 µL of 25 mM ammonium bicarbonate for centrifugation, followed by trypsinolysis at 37 °C for 18 h with 40 µL of trypsin buffer (4 µg trypsin in 40 µL of 100 mM ammonium bicarbonate).Finally, 10 µL of acetic acid was added to stop the reaction, and the solution was further submitted to solid-phase extraction on C18 cartridges to obtain the final desalted peptide fragments.
LC-MS/MS detection was performed on a prominence nano-HPLC system (Shimadzu, Tokyo, Japan).Peptides were initially trapped on a column (100 µm × 2 cm, 5 µm-C18), and then isolated on a column (75 µm × 100 mm, 3 µm-C18) at a flow rate of 300 nL/min.Next, spectral data were analyzed by using Q-Exactive (Thermo Fisher Scientific, Waltham, MA, USA), with settings of analyzing time (120 min), parent ion scan range (300 to 1800 m/z), ms1 resolving power (70,000 at m/z 200), AGC target (3e6), and ms1 maximum IT (50 ms).Identified ions were further fragmented and analyzed with settings of ms2 resolving power (17,500 at m/z 200), isolation window (2 m/z), ms2 maximum IT (60 ms), ms2 activation type (HCD), normalized collision energy (27 eV), dynamic exclusion (60.0 s), and underfill ratio (0.1%).Raw files were fed into MaxQuant (Tyanova et al. 2016) for searching the protein library predicted from the assembled genome and transcriptomes of reef stonefish with parameters (enzyme: trypsin, missed cleavage sites: 2, fixed modification: C, variable modification: M and protein N-term acetyl), and those peptides with false discovery rates (FDR) ≤ 0.01 were considered reliable.Peptide intensities were calculated as the sums of extracted ion current (XIC) from all isotopic clusters associated with the identified peptide; in other words, they show the sums of all identified individual (both unique and razor) peptide intensities.

3D-homology Model of SNTX-Like Toxins
The 3D structure models of SNTX-like toxins were developed from primary amino acid sequences via homology modeling on a SWISS-MODEL server (Waterhouse et al. 2018).Models of the SNTX-like toxins were built using Stonustoxin (PDB:4WVM) including subunit alpha and beta Stonustoxin structures as templates in the SWISS-MODEL server, and related QMEAN scores were applied to evaluate the model quality.
FIG. 1.The sequenced reef stonefish and its venom gland.a) Lateral view of the fish.b) Head view of the fish.c) A representative venom gland dissected from the dorsal spine.
FIG. 2.A Chr-level genome assembly of the reef stonefish with comparison to other related species.a) 24 distinct blocks were visualized in the Hi-C contact matrixes.b) The Chr-level genome assembly of the reef stonefish.From outside to inside: a) the 24 pseudo-chromosomes, b) gene density, c) repeats, and d) GC content.The links inside refer to internal syntenic blocks within the genome.c) Synteny between the Chr of the reef stonefish (2n = 48) and rockfish (2n = 48).d) Synteny between the Chr of the reef stonefish and sea scorpion (2n = 42).e) Synteny between the Chr of the reef stonefish and lumpfish (2n = 50).

FIG. 3 .
FIG. 3. Phylogeny of 18 fish species and their stonustoxin family repertoires.a) Species tree with venomous species (highlighted in red) and their venomous spines.Each suborder branch in Perciformes is set to a different color background.b) Stonustoxin repertoires in various fishes.Each filled pentagon refers to a complete gene, while blank ones represent pseudogenes due to frameshifts (#) or pre-stop (*), and the dotted-lined ones denote partial genes.The 2 boxes refer to gene clusters.c) Phylogeny of the stonustoxins.Toxins from the reef stonefish were highlighted.The number at each node denotes the clade support, and those nodes without numbers have 100% bootstrap support.
FIG. 4.Chromosomal location and expression levels of the putative toxin genes of the reef stonefish.a) Localization of the putative toxin genes on the 24 Chr.Each dash represents a toxin gene.b) A cluster of 8 sntx genes located within a 50-kb region of the Chr2.Gene structure of each sntx is presented.c) Four sntx-like genes located on the Chr6.d) Classification of protein families for other 103 toxins.e) A heatmap of gene transcription with hierarchical clustering for all the 294 putative toxin genes.Three asterisks mark the highly expressed sntx genes in the venom.f) A heatmap of gene transcription with hierarchical clustering for the 28 sntx genes.Eight sntx genes clustered on the Chr2 were highlighted.The shared color key is provided, which shows low to high transcription levels from left to right.g) A Venn plot showing toxin genes transcribed in different tissues.h) Mapping of peptides from the venom proteome to the stonustoxins encoded from those sntx genes clustered on the Chr2.Long thick lines represent the stonustoxin proteins, and short thin lines stand for those mapped peptides in the proteome data.

Table 1
Summary of the genome assembly and gene annotation of the reef stonefish