Abstract

It is known that some endangered species have persisted for thousands of years despite their very small effective population sizes and low levels of genetic polymorphisms. To understand the genetic mechanisms of long-term persistence in threatened species, we determined the whole genome sequences of akame (Lates japonicus), which has survived for a long time with extremely low genetic variations. Genome-wide heterozygosity in akame was estimated to be 3.3 to 3.4 × 10−4/bp, one of the smallest values in teleost fishes. Analysis of demographic history revealed that the effective population size in akame was around 1,000 from 30,000 years ago to the recent past. The relatively high ratio of nonsynonymous to synonymous heterozygosity in akame indicated an increased genetic load. However, a detailed analysis of genetic diversity in the akame genome revealed that multiple genomic regions, including genes involved in immunity, synaptic development, and olfactory sensory systems, have retained relatively high nucleotide polymorphisms. This implies that the akame genome has preserved the functional genetic variations by balancing selection, to avoid a reduction in viability and loss of adaptive potential. Analysis of synonymous and nonsynonymous nucleotide substitution rates has detected signs of positive selection in many akame genes, suggesting adaptive evolution to temperate waters after the speciation of akame and its close relative, barramundi (Lates calcarifer). Our results indicate that the functional genetic diversity likely contributed to the long-term persistence of this species by avoiding the harmful effects of the population size reduction.

Significance

Population size decline and loss of genetic diversity in threatened species have been considered to increase the risk of extinction. However, akame is known to have persisted for thousands of years despite low genetic diversity. In this study, we sequenced the draft genome of akame and analyzed its genomic diversity. In akame, heterozygosity of single nucleotide variants was one of the lowest in teleost fishes. However, several genomic regions in the akame genome showed high levels of polymorphism. Our results indicate that, in akame, some functional genetic variations have been preserved by selective forces, to avoid a reduction in viability and loss of adaptive potential.

Introduction

In naturally outbreeding organisms, the decrease in genetic diversity owing to a reduction in population size can result in a decline of mean phenotypic values across various fitness-related traits, thereby limiting the evolutionary potential of the population. Population size reduction also results in the accumulation of deleterious genetic variants (i.e. genetic load) due to the dominance of genetic drift over purifying selection and the elevated level of inbreeding. Consequently, population decline in threatened species has been considered to increase the risk of extinction in the wild (Frankham et al. 2010; Allendorf et al. 2013). Indeed, the negative effects of population size reduction such as inbreeding depression, loss of adaptive potential, and elevated genetic load, have been documented in numerous endangered wild species that have experienced severe population decline and low genetic diversity (Frankham et al. 2010; Allendorf et al. 2013; de Villemereuil et al. 2019; Khan et al. 2021).

However, it is also known that certain endangered species are able to sustain their populations for extended periods of time, despite experiencing population declines and reduction in genetic diversity. For instance, the eastern subspecies of mountain gorilla Gorilla beringei beringei has undergone a prolonged population decline over 100,000 years (Xue et al. 2015). Similarly, certain island populations of channel island fox Urocyon littoralis, have persisted for thousands of years despite extremely small population size (Robinson et al. 2016, 2018). Conversely, some species, such as the western lowland gorilla Gorilla gorilla and Sumatran orangutan Pongo abelii, are now highly endangered despite maintaining relatively high genetic diversity (Robinson et al. 2016; Teixeira and Huber 2021). Thus, a traditionally accepted concept in conservation genetics that the extinction risk of a species can be assessed in terms of genetic diversity or effective population size (Ne), has become controversial (Teixeira and Huber 2021; Mathur et al. 2023; Wilder et al. 2023).

Akame, also known as the Japanese lates (Lates japonicus), is a rare and endangered teleost fish species distributed on the Pacific coast of Western Japan, primarily in Kochi and Miyazaki Prefectures (Iwatsuki et al. 1993). Akame is considered to be a notable example of a species that has persisted for an extended periods with a very small population size. Our previous analyses utilizing mtDNA, AFLP, and ddRAD-Seq markers have indicated that the level of genetic diversity in akame is comparable to that observed in some seriously endangered freshwater fishes that exhibit the lowest genetic diversity among vertebrates (Takahashi et al. 2015; Naito et al. 2023). If the long-term persistence in akame is not merely a result of chance, it is likely that akame possesses genetic mechanisms to avoid adverse effects resulting from increased genetic drift and inbreeding, such as (i) purging deleterious recessive alleles by natural selection (Crnokrak and Barrett 2002; Larsen et al. 2011; Robinson et al. 2018; Khan et al. 2021); (ii) maintaining genetic diversity in specific genomic regions that require high polymorphism, such as those containing genes associated with the immune system, through strong selective pressure (Hedrick 2002; Alonso et al. 2008); and (iii) utilizing phenotypic plasticity through epigenetic regulation (Vergeer et al. 2012). Thus, conducting a genome-wide analysis of the genetic diversity in akame would yield valuable insights into the genetic mechanisms, preventing the harmful effects of inbreeding and the loss of adaptive potential in species with low genetic diversity. Moreover, by providing empirical data on both neutral and functional genetic variations, studies on the genetic mechanisms of long-term persistence in akame are likely to contribute to a recent debate on the relative importance of neutral and functional genetic variations in the assessment of extinction risk (Kardos et al. 2021; Teixeira and Huber 2021).

Akame has a wide-ranging and abundant counterpart, the barramundi or Asian seabass (Lates calcarifer). There are notable biological and ecological differences between akame and barramundi. Akame is the only species of the genus Lates in temperate regions (Katayama and Taki 1984; Iwatsuki et al. 1993) and does not enter freshwater environments throughout its life history (Iwatsuki et al. 1993; Gonzalvo et al. 2021). By contrast, barramundi is a tropical marine species that is widely distributed in the Asia-Pacific region, inhabiting coastal marine and estuarine to freshwater habitats (Larson 1999). Genome-wide genetic diversity in barramundi is known to be relatively high (Vij et al. 2016), similar to other marine teleost fishes (Jones et al. 2012; Wu et al. 2014; Barry et al. 2021). A chromosome-level genome assembly has been published for barramundi (Vij et al. 2016) and thus, comparative genomics analysis of akame and barramundi may elucidate the impact of population size reduction on the genetic diversity, genetic load, and adaptive potential in akame populations.

In the present study, we sequenced the draft genomes of two akame individuals collected from the Kochi and Miyazaki regions in southwestern Japan (Fig. 1), to understand the genetic mechanisms involved in avoiding the adverse effects of low genetic diversity in wild organisms. Analysis of genetic diversity and historical population demography showed that akame has persisted from 30,000 years ago to the recent past with a very small effective population size. We also analyzed the genomic distribution of the neutral and functional genetic variations in akame, to assess the relative importance of the genome-wide genetic diversity for the long-term persistence of this species. In addition, we compared the evolutionary rates of akame and barramundi genes to infer the efficacy of selection in akame after the speciation of the two species. Our results indicate that wild populations of akame have persisted for extended periods under selective pressure to retain functional genetic variations required for resistance to pathogens and adaptive responses to changing environments.

Sampling locations of the two akame individuals (Kochi and Miyazaki) used in this study.
Fig. 1.

Sampling locations of the two akame individuals (Kochi and Miyazaki) used in this study.

Results

Draft Genome Assemblies of the Kochi and Miyazaki Individuals of Akame

De novo assembly of the akame genome (Kochi individual) was generated by “hybrid” genome assembly approach that combined data from Chromium-linked Illumina reads and Nanopore long reads. In the Kochi individuals, 134.3 Gb raw reads were generated by sequencing the Chromium linked-read library using Illumina HiSeq X Ten, and 33.44 Gb raw reads were generated using Nanopore PromethION (number of reads: 3,708,639 and mean read length: 12.78 kb). The haploid genome size was estimated to be 632.9 Mb by the k-mer frequency analysis under the best k-mer length of 91 (supplementary fig. S1, Supplementary Material online). Assembly statistics of the Kochi akame genome are summarized in Table 1. Our hybrid assembly approach generated a genome sequence with high contiguity for akame (10,039 scaffolds, scaffold N50: 3.71 Mb, and assembly size: 837.9 Mb). The genome effective length (=total genome length − number of N sites) of the scaffold-level assembly was 790.1 Mb, which was substantially larger than the genome size estimated by k-mer frequency analysis. BUSCO completeness score (=percentage of the presence of core genes in the genome assembly) for the scaffold-level assembly was 93.4% (89.5% single and 3.9% duplicated copies). Contamination of the sequences derived from the microbiome or other organisms in the scaffold-level assembly was virtually undetected by GC content and BLASTX-based genome screening (supplementary fig. S2, Supplementary Material online).

Table 1

Statistics of the akame reference genome assembly (Kochi individual)

StatisticsPrimary genome assembly (contig)Primary genome assembly (scaffold)Scaffold-level assembly (scaffolding with Nanopore long reads and RNA-Seq data)
Number of contigs/scaffolds55,86338,16510,039
Contig/scaffold N5026,2461,090,7863,710,658
Contig/scaffold L505,82417252
Largest contig/scaffold size297,6856,847,19718,195,970
Total size of the assembly598,661,182706,963,282837,949,969
StatisticsPrimary genome assembly (contig)Primary genome assembly (scaffold)Scaffold-level assembly (scaffolding with Nanopore long reads and RNA-Seq data)
Number of contigs/scaffolds55,86338,16510,039
Contig/scaffold N5026,2461,090,7863,710,658
Contig/scaffold L505,82417252
Largest contig/scaffold size297,6856,847,19718,195,970
Total size of the assembly598,661,182706,963,282837,949,969

“Primary genome assembly” indicates the assembly generated using only chromium-linked illumina reads. More detailed statistics of the genome assemblies are summarized in supplementary table S1, Supplementary Material online.

Table 1

Statistics of the akame reference genome assembly (Kochi individual)

StatisticsPrimary genome assembly (contig)Primary genome assembly (scaffold)Scaffold-level assembly (scaffolding with Nanopore long reads and RNA-Seq data)
Number of contigs/scaffolds55,86338,16510,039
Contig/scaffold N5026,2461,090,7863,710,658
Contig/scaffold L505,82417252
Largest contig/scaffold size297,6856,847,19718,195,970
Total size of the assembly598,661,182706,963,282837,949,969
StatisticsPrimary genome assembly (contig)Primary genome assembly (scaffold)Scaffold-level assembly (scaffolding with Nanopore long reads and RNA-Seq data)
Number of contigs/scaffolds55,86338,16510,039
Contig/scaffold N5026,2461,090,7863,710,658
Contig/scaffold L505,82417252
Largest contig/scaffold size297,6856,847,19718,195,970
Total size of the assembly598,661,182706,963,282837,949,969

“Primary genome assembly” indicates the assembly generated using only chromium-linked illumina reads. More detailed statistics of the genome assemblies are summarized in supplementary table S1, Supplementary Material online.

Chromosome-level assembly was also generated by ordering the scaffolds along the L. calcarifer genome assembly. Chromosome-level assembly comprised 24 chromosomes (the mitochondrial genome was not included), and the total assembly size was 778.6 Mb. BUSCO completeness score for the chromosome-level assembly was 91.9% (90.3% single and 1.6% duplicated copies). In the remainder of this paper, the scaffold- and chromosome-level assemblies of the Kochi individual are termed “akame (Kochi) genome” and used as references for subsequent analyses. Assembly statistics of the akame (Kochi) genome at each analysis step are shown in supplementary table S1, Supplementary Material online.

We obtained the de novo assembly for the Miyazaki individual of akame by Chromium-linked Illumina reads. In the Miyazaki individual, 125.3 Gb of raw reads were generated by sequencing the Chromium-linked read library using Illumina HiSeq X Ten. The genome of the Miyazaki individual was assembled into 17,019 scaffolds totaling 657.1 Mb in size. The N50 of the assembly was 1,526,648 bp. BUSCO completeness score for the Miyazaki akame assembly was estimated as 95.8% (91.6% single and 4.2% duplicated copies). Similar to the akame (Kochi) genome, sequences derived from potential contaminant organisms were almost negligible in the Miyazaki akame genome assembly (supplementary fig. S3, Supplementary Material online). For the Miyazaki akame individual, details of assembly statistics are shown in supplementary table S2, Supplementary Material online. In the remainder of this paper, this assembly has been termed “akame (Miyazaki) genome.”

Genome Annotation

In the scaffold-level assembly of the akame (Kochi) genome, we predicted 48,291 protein-coding genes based on transcriptome evidence. DIAMOND-BLASTP searches revealed that 61% of the predicted genes (29,610 genes) had homologs in the NCBI L. calcarifer and RefSeq databases (vertebrates: Excluding mammals). This low percentage may be attributed to the fact that the akame gene set contains many short sequences of <300 bp. These were likely partial gene sequences caused by incompleteness of the genome assembly and/or insufficient transcriptome evidence in functional annotation. After the short sequences were excluded, 36,907 predicted genes remained, of which 74% of them (27,175 genes) possessed homologs in the databases. BLASTN searches against the ENSEMBL barramundi transcriptomes showed that 56% (27,333/48,291) of the akame genes had corresponding barramundi homologs. Similarly, the proportion increased to 69% (25,581/36,907) after excluding sequences of <300 bp from the gene set. BUSCO completeness of the annotated protein-coding gene set was 79.1% (75.5% single and 3.6% duplicated copies).

In repeat sequences, transposable elements (TEs) comprised ∼14% (118 Mb) of the akame (Kochi) genome, with DNA transposons being the most abundant (4.12%, 34.5 Mb). Comparison of repeat contents between the akame (Kochi) and barramundi genomes (Vij et al. 2016), which were estimated by the same method (see Materials and Methods), revealed no clear differences. The numbers of TEs, satellites, and simple repeats in the genomes of the two species are shown in supplementary table S3, Supplementary Material online.

Demographic History of Akame

The inferred Ne history by pairwise sequentially Markov coalescent (PSMC) analysis revealed that the Ne of the two akame populations has been maintained around 1,000 from 30,000 years ago to the recent past (Fig. 2), indicating their long-time persistence despite a small population size. By contrast, the Ne of barramundi populations in Thailand and Indonesia were estimated to be 3 to 10 times larger than that of akame, indicating that the population size expansion and decline had occurred 103 to 104 years ago (Fig. 2). It is interesting to note that the Ne of the West India barramundi population was constantly low and the population size expansion in the Holocene was not observed (Fig. 2).

Historical Nes inferred by PSMC analysis for akame (Kochi and Miyazaki) and barramundi (India, Thailand, and Indonesia). Fine lines denote bootstrap replicates. The green and white bars below indicate the last three interglacial and glacial periods, respectively. LGP, last glacial period.
Fig. 2.

Historical Nes inferred by PSMC analysis for akame (Kochi and Miyazaki) and barramundi (India, Thailand, and Indonesia). Fine lines denote bootstrap replicates. The green and white bars below indicate the last three interglacial and glacial periods, respectively. LGP, last glacial period.

Heterozygosity Levels, Genetic Load, and Runs of Homozygosity of the Akame Genome

The numbers of single nucleotide variant (SNV) sites in the two individuals of akame were counted by mapping the genomic reads to the chromosome-level assembly. In the Kochi individual, 251,220 heterozygous single nucleotide variants (SNVs) were identified and genome-wide heterozygosity was estimated as 3.4 × 10−4/bp. In the Miyazaki individual, 245,311 heterozygous SNVs were identified, and genome-wide heterozygosity was 3.3 × 10−4/bp, suggesting that the proportions of SNVs were not different between the Kochi and Miyazaki populations. In the barramundi individual from Singapore (SG), 1,100,370 heterozygous SNVs were identified and genome-wide heterozygosity was 16.0 × 10−4/bp. Heterozygosity in the akame genomes was not >20% of the SG individual of barramundi.

To assess the genetic load in akame, we estimated the relationship of the genomic heterozygosity (a proxy for Ne) and the ratio of heterozygosity in 0-fold relative to 4-fold degenerate sites within protein-coding regions (a proxy for the efficacy of selection) in akame and compared these to those in barramundi. Genomic heterozygosity was much lower and the ratio of 0-fold/4-fold heterozygosity was much higher in akame than in barramundi (Fig. 3a). Also, the proportions of heterozygous SNVs with “High” and “Moderate” functional effects (see Materials and Methods for detail) were higher in the akame genomes than in the barramundi genomes (Fig. 3b). Similarly, the SIFT missense predictions showed that the ratio of “deleterious” and “tolerated” variants was higher in akame than in barramundi (supplementary fig. S4, Supplementary Material online). The increased proportions of putatively deleterious mutations and the reduction in neutral heterozygosity implied the genomic consequences of enhanced drift relative to selection by long-term small Ne in akame.

a) A relationship between genomic heterozygosity and ratio of heterozygosity at 0-fold relative to 4-fold degenerate sites in the individuals of akame and barramundi. b) Ratios of functional effect categories of SNVs in the individuals of akame and barramundi. Functional effect categories in each SNV were identified using the SnpEff program (Cingolani et al. 2012). Functional effect categories: High, nonsense and splicing donor/acceptor site variants; Moderate, missense variants; Low, synonymous and stop retained variants. c) Boxplot of the ROH lengths in Kochi and Miyazaki akame individuals. d) Cumulative fraction of the genome made up of ROHs at least 100 kb long.
Fig. 3.

a) A relationship between genomic heterozygosity and ratio of heterozygosity at 0-fold relative to 4-fold degenerate sites in the individuals of akame and barramundi. b) Ratios of functional effect categories of SNVs in the individuals of akame and barramundi. Functional effect categories in each SNV were identified using the SnpEff program (Cingolani et al. 2012). Functional effect categories: High, nonsense and splicing donor/acceptor site variants; Moderate, missense variants; Low, synonymous and stop retained variants. c) Boxplot of the ROH lengths in Kochi and Miyazaki akame individuals. d) Cumulative fraction of the genome made up of ROHs at least 100 kb long.

The numbers of runs of homozygosity (ROH) >100 kb were 251 in the Kochi and 616 in Miyazaki individuals. Mean length of the ROH (>100 kb) was 150.8 kb in the Kochi individual and 176.8 kb in the Miyazaki individual (Fig. 3c). In the akame genome, almost all ROHs were <1 Mb in length. Two ROHs > 1 Mb were identified in the Miyazaki individual and not found in the Kochi individual (Fig. 3c). The FROH of the Kochi and Miyazaki individuals was estimated as 0.05 and 0.14, respectively (Fig. 3d). In barramundi (SG individual), ROHs > 100 kb were not found.

Genome-Wide Distribution of Genetic Variations in the Akame Genomes

After stringent window filtration by mean mappability and depth of coverage (supplementary figs. S5 and S6, Supplementary Material online; see Materials and Methods), multiple peaks of nucleotide diversity that exceeded the mean + four standard deviation (4 SD) were identified. These peaks were scattered across the multiple genomic regions (Fig. 4a). The number of windows of nucleotide diversity-peaks was 231, and 271 protein-coding genes were detected within them (listed in supplementary table S4, Supplementary Material online).

a) A sliding-window plot of nucleotide diversity (red: left y axis) and mean FST (light blue: right y axis) for 10-kb nonoverlapping windows in the akame genome. The mitochondrial genome was excluded from the analysis. Dashed line indicates the cutoff value of nucleotide diversity (mean + 4 SD: 0.0020). Abbreviations of the representative polymorphic genes in the nucleotide diversity peak regions (nucleotide diversity within the coding region >0.01) are shown in the figure (see Table 2 for details). In addition, the genomic locations of the MHC, OR, and two protocadherin gene clusters within the nucleotide diversity-peak regions are indicated by blue arrows. b) A violin plot of FST values in the background and nucleotide diversity-peak regions. Distribution of FST values in the nucleotide diversity-peak regions was significantly shifted toward lower values compared with background regions (P < 2.2 × 10−16, two-tailed Mann–Whitney test). Numbers of windows in each category are shown in parentheses. c) Summary of the functional annotation clustering of the 221 genes within the nucleotide diversity-peak regions. The number of genes in each cluster is shown in the graph. Numbers on the right of the bars indicate fold enrichment. The names of significantly enriched (FDR-adjusted P-value < 0.05) clusters are indicated in bold. A full list of the annotation clusters is provided in supplementary table S6, Supplementary Material online.
Fig. 4.

a) A sliding-window plot of nucleotide diversity (red: left y axis) and mean FST (light blue: right y axis) for 10-kb nonoverlapping windows in the akame genome. The mitochondrial genome was excluded from the analysis. Dashed line indicates the cutoff value of nucleotide diversity (mean + 4 SD: 0.0020). Abbreviations of the representative polymorphic genes in the nucleotide diversity peak regions (nucleotide diversity within the coding region >0.01) are shown in the figure (see Table 2 for details). In addition, the genomic locations of the MHC, OR, and two protocadherin gene clusters within the nucleotide diversity-peak regions are indicated by blue arrows. b) A violin plot of FST values in the background and nucleotide diversity-peak regions. Distribution of FST values in the nucleotide diversity-peak regions was significantly shifted toward lower values compared with background regions (P < 2.2 × 10−16, two-tailed Mann–Whitney test). Numbers of windows in each category are shown in parentheses. c) Summary of the functional annotation clustering of the 221 genes within the nucleotide diversity-peak regions. The number of genes in each cluster is shown in the graph. Numbers on the right of the bars indicate fold enrichment. The names of significantly enriched (FDR-adjusted P-value < 0.05) clusters are indicated in bold. A full list of the annotation clusters is provided in supplementary table S6, Supplementary Material online.

In many cases, balancing selection can reduce population differentiation (measured by FST) around the selected genes or genomic regions (Brandt et al. 2018). If the high polymorphisms in the nucleotide diversity-peak regions in akame are maintained by balancing selection, thus, these regions are likely less divergent to the barramundi genome than the background. To test this, mean FST values between akame (Kochi and Miyazaki) and barramundi (SG, India, Thailand, and Indonesia) were estimated across the 10 kb nonoverlapping window in the chromosome-level assembly. FST values in the nucleotide diversity-peak regions appear to much lower than those in the background regions (Fig. 4a). Indeed, the distribution of FSTs in the nucleotide diversity-peak regions (median FST = 0.632) was significantly shifted toward lower values, compared to the background regions (median FST = 0.831) (Fig. 4b). To examine whether the nucleotide diversity-peak regions in akame were common in barramundi, nucleotide diversity per 10 kb nonoverlapping window was also calculated using the four barramundi individuals, and chromosomal positions of the nucleotide diversity-peaks were compared between the two species (supplementary fig. S7, Supplementary Material online). In barramundi, 131 windows of nucleotide diversity-peaks were identified. Of these, 44 were common to akame, and 58 genes were contained in these windows. The number of windows with high nucleotide diversity common to the two species was significantly larger than that of randomly expected (P < 1 × 10−6, Monte Carlo simulation; see Materials and Methods for detail).

Enrichment analysis of the 221 genes within the nucleotide diversity-peak regions that had corresponding ENSEMBL barramundi transcripts (>90% nucleotide sequence identity) showed significant enrichment of the GO terms “translation elongation factor activity” (P = 0.0287, GO:0003746), “Cell adhesion” (P = 0.0417, GO:0007155), “Membrane” (P = 6.0 × 10−4, GO:0016020), “MHC protein complex” (P = 0.0137, GO:0042611), and “MHC Class II protein complex” (P = 0.0137, GO:0042613). Detailed results of the enrichment analysis are shown in supplementary table S5, Supplementary Material online. In 98 of the 271 genes located within the nucleotide diversity-peak regions, the proportion of SNV sites within the protein-coding region was >0.01. In this paper, we refer to these genes as “polymorphic genes.” The top 30 of the polymorphic genes are listed in Table 2.

Table 2

A list of the top 30 polymorphic genes located within the nucleotide diversity-peak regions

Linkage groupStartEndOri.Length of_CDSTranscript ID (ENSEMBL barramundi)Number of variants (LOW)Number of variants (MODERATE)Number of variants (HIGH)Proportion of variants within CDSGene name (abbreviation)Description (BLASTP-DIAMOND best hit)
LG032644945326465165456ENSLCAT0001004926282930.0877CD209ECD209 antigen-like protein E isoform X1 [L. calcarifer]
LG14127991611280000030916900.0809DAN4Cell wall protein DAN4-like isoform X2 [L. calcarifer]
LG242515306525153736+672ENSLCAT0001005564284100.0729GIMAP7GTPase IMAP family member 7-like [L. calcarifer]
LG1529521662954872+885ENSLCAT0001001979185310.0701CLEC12BC-type lectin domain family 12 member B-like isoform X3 [L. calcarifer]
LG022274186422742424561ENSLCAT00010041310181800.0642UGT2C1UDP-glucuronosyltransferase 2C1-like isoform X2 [L. calcarifer]
LG1243469314348396+474ENSLCAT0001000246191700.0549OR4K1Olfactory receptor 4K1-like [L. calcarifer]
LG1418061006180631562151259200.0544SAMD9Sterile alpha motif domain-containing protein 9-like [Epinephelus lanceolatus]
LG032629462526296711+456ENSLCAT0001002765491400.0504eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG07_112374191240000666ENSLCAT0001004007562700.0495IFI44Interferon-induced protein 44-like [L. calcarifer]
LG2094717149473135+1227213810.0489TCHHTrichohyalin-like [L. calcarifer]
LG032646601326468750309ENSLCAT0001001196411310.0485CD209ECD209 antigen-like protein E Isoform X2 [L. calcarifer]
LG1517396770173977861017ENSLCAT00010017757153100.0452C3AR1C3a anaphylatoxin chemotactic receptor-like [L. calcarifer]
LG0428235232833145+954ENSLCAT00010002680142510.0419DMBT1Deleted in malignant brain tumors 1 protein-like [L. calcarifer]
LG141733000217334406+1986ENSLCAT00010036890215020.0368OBSCNObscurin-like [L. calcarifer]
LG0585971568600000+1293ENSLCAT00010025942102810.0302PRF1Perforin-1-like [L. calcarifer]
LG04562611570777+834ENSLCAT0001001064222300.0300ART1T-cell ecto-ADP-ribosyltransferase 1-like isoform X1 [L. calcarifer]
LG101378212713786224+2142ENSLCAT00010023093263600.0289MUC2Mucin-2-like isoform X1 [L. calcarifer]
LG141738069717383457834ENSLCAT0001003689091500.0288FCRL5Fc receptor-like protein 5 [L. calcarifer]
LG202364852400001557ENSLCAT0001003870443810.0276LAMNLamin-L(II)-like [L. calcarifer]
LG191063251410633673297ENSLCAT000100607603500.0269H2EB1H-2 Class II histocompatibility antigen, E-S beta chain-like [L. calcarifer]
LG1243521444359193+747ENSLCAT00010002441101000.0268OR52K1Olfactory receptor 52K1-like [L. calcarifer]
LG171784615817849546+936ENSLCAT0001000613541830.0267POLRNA-directed DNA polymerase from mobile element jockey-like [Austrofundulus limnaeus]
LG151739000217391523+618ENSLCAT000100177816710.0227LIN37Protein lin-37 homolog isoform X2 [L. calcarifer]
LG142137100921373848810ENSLCAT0001006084881000.0222BTN2A2Butyrophilin subfamily 2 member A2-like isoform X1 [L. calcarifer]
LG091282229412823545+273ENSLCAT000100368142310.0220AMBPProtein AMBP-like [L. calcarifer]
LG2419030002190322432886ENSLCAT00010001000243810.0218TLR13Toll-like receptor 13 [L. calcarifer]
LG1910650959106519865076500.0217CTC1CST complex subunit CTC1-like [L. calcarifer]
LG032629150226292790+753ENSLCAT000100279407900.0212eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG1823174362231771077146900.0210FCGR2CLow affinity immunoglobulin gamma Fc region receptor II-a-like isoform X2 [L. calcarifer]
LG032629679226297830+336ENSLCAT000100284291600.0208eEF1a1Elongation factor 1-alpha [L. calcarifer]
Linkage groupStartEndOri.Length of_CDSTranscript ID (ENSEMBL barramundi)Number of variants (LOW)Number of variants (MODERATE)Number of variants (HIGH)Proportion of variants within CDSGene name (abbreviation)Description (BLASTP-DIAMOND best hit)
LG032644945326465165456ENSLCAT0001004926282930.0877CD209ECD209 antigen-like protein E isoform X1 [L. calcarifer]
LG14127991611280000030916900.0809DAN4Cell wall protein DAN4-like isoform X2 [L. calcarifer]
LG242515306525153736+672ENSLCAT0001005564284100.0729GIMAP7GTPase IMAP family member 7-like [L. calcarifer]
LG1529521662954872+885ENSLCAT0001001979185310.0701CLEC12BC-type lectin domain family 12 member B-like isoform X3 [L. calcarifer]
LG022274186422742424561ENSLCAT00010041310181800.0642UGT2C1UDP-glucuronosyltransferase 2C1-like isoform X2 [L. calcarifer]
LG1243469314348396+474ENSLCAT0001000246191700.0549OR4K1Olfactory receptor 4K1-like [L. calcarifer]
LG1418061006180631562151259200.0544SAMD9Sterile alpha motif domain-containing protein 9-like [Epinephelus lanceolatus]
LG032629462526296711+456ENSLCAT0001002765491400.0504eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG07_112374191240000666ENSLCAT0001004007562700.0495IFI44Interferon-induced protein 44-like [L. calcarifer]
LG2094717149473135+1227213810.0489TCHHTrichohyalin-like [L. calcarifer]
LG032646601326468750309ENSLCAT0001001196411310.0485CD209ECD209 antigen-like protein E Isoform X2 [L. calcarifer]
LG1517396770173977861017ENSLCAT00010017757153100.0452C3AR1C3a anaphylatoxin chemotactic receptor-like [L. calcarifer]
LG0428235232833145+954ENSLCAT00010002680142510.0419DMBT1Deleted in malignant brain tumors 1 protein-like [L. calcarifer]
LG141733000217334406+1986ENSLCAT00010036890215020.0368OBSCNObscurin-like [L. calcarifer]
LG0585971568600000+1293ENSLCAT00010025942102810.0302PRF1Perforin-1-like [L. calcarifer]
LG04562611570777+834ENSLCAT0001001064222300.0300ART1T-cell ecto-ADP-ribosyltransferase 1-like isoform X1 [L. calcarifer]
LG101378212713786224+2142ENSLCAT00010023093263600.0289MUC2Mucin-2-like isoform X1 [L. calcarifer]
LG141738069717383457834ENSLCAT0001003689091500.0288FCRL5Fc receptor-like protein 5 [L. calcarifer]
LG202364852400001557ENSLCAT0001003870443810.0276LAMNLamin-L(II)-like [L. calcarifer]
LG191063251410633673297ENSLCAT000100607603500.0269H2EB1H-2 Class II histocompatibility antigen, E-S beta chain-like [L. calcarifer]
LG1243521444359193+747ENSLCAT00010002441101000.0268OR52K1Olfactory receptor 52K1-like [L. calcarifer]
LG171784615817849546+936ENSLCAT0001000613541830.0267POLRNA-directed DNA polymerase from mobile element jockey-like [Austrofundulus limnaeus]
LG151739000217391523+618ENSLCAT000100177816710.0227LIN37Protein lin-37 homolog isoform X2 [L. calcarifer]
LG142137100921373848810ENSLCAT0001006084881000.0222BTN2A2Butyrophilin subfamily 2 member A2-like isoform X1 [L. calcarifer]
LG091282229412823545+273ENSLCAT000100368142310.0220AMBPProtein AMBP-like [L. calcarifer]
LG2419030002190322432886ENSLCAT00010001000243810.0218TLR13Toll-like receptor 13 [L. calcarifer]
LG1910650959106519865076500.0217CTC1CST complex subunit CTC1-like [L. calcarifer]
LG032629150226292790+753ENSLCAT000100279407900.0212eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG1823174362231771077146900.0210FCGR2CLow affinity immunoglobulin gamma Fc region receptor II-a-like isoform X2 [L. calcarifer]
LG032629679226297830+336ENSLCAT000100284291600.0208eEF1a1Elongation factor 1-alpha [L. calcarifer]

Genes described as “uncharacterized protein” are not included in this table. A full list of the genes within the nucleotide diversity-peak regions (272 genes) is shown in supplementary table S4, Supplementary Material online. The numbers of SNVs are shown by annotation impacts (by SnpEff) as follows: LOW, synonymous or stop retained variants; MODERATE, missense variants; HIGH, nonsense, start lost, or splicing donor/acceptor variants.

Table 2

A list of the top 30 polymorphic genes located within the nucleotide diversity-peak regions

Linkage groupStartEndOri.Length of_CDSTranscript ID (ENSEMBL barramundi)Number of variants (LOW)Number of variants (MODERATE)Number of variants (HIGH)Proportion of variants within CDSGene name (abbreviation)Description (BLASTP-DIAMOND best hit)
LG032644945326465165456ENSLCAT0001004926282930.0877CD209ECD209 antigen-like protein E isoform X1 [L. calcarifer]
LG14127991611280000030916900.0809DAN4Cell wall protein DAN4-like isoform X2 [L. calcarifer]
LG242515306525153736+672ENSLCAT0001005564284100.0729GIMAP7GTPase IMAP family member 7-like [L. calcarifer]
LG1529521662954872+885ENSLCAT0001001979185310.0701CLEC12BC-type lectin domain family 12 member B-like isoform X3 [L. calcarifer]
LG022274186422742424561ENSLCAT00010041310181800.0642UGT2C1UDP-glucuronosyltransferase 2C1-like isoform X2 [L. calcarifer]
LG1243469314348396+474ENSLCAT0001000246191700.0549OR4K1Olfactory receptor 4K1-like [L. calcarifer]
LG1418061006180631562151259200.0544SAMD9Sterile alpha motif domain-containing protein 9-like [Epinephelus lanceolatus]
LG032629462526296711+456ENSLCAT0001002765491400.0504eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG07_112374191240000666ENSLCAT0001004007562700.0495IFI44Interferon-induced protein 44-like [L. calcarifer]
LG2094717149473135+1227213810.0489TCHHTrichohyalin-like [L. calcarifer]
LG032646601326468750309ENSLCAT0001001196411310.0485CD209ECD209 antigen-like protein E Isoform X2 [L. calcarifer]
LG1517396770173977861017ENSLCAT00010017757153100.0452C3AR1C3a anaphylatoxin chemotactic receptor-like [L. calcarifer]
LG0428235232833145+954ENSLCAT00010002680142510.0419DMBT1Deleted in malignant brain tumors 1 protein-like [L. calcarifer]
LG141733000217334406+1986ENSLCAT00010036890215020.0368OBSCNObscurin-like [L. calcarifer]
LG0585971568600000+1293ENSLCAT00010025942102810.0302PRF1Perforin-1-like [L. calcarifer]
LG04562611570777+834ENSLCAT0001001064222300.0300ART1T-cell ecto-ADP-ribosyltransferase 1-like isoform X1 [L. calcarifer]
LG101378212713786224+2142ENSLCAT00010023093263600.0289MUC2Mucin-2-like isoform X1 [L. calcarifer]
LG141738069717383457834ENSLCAT0001003689091500.0288FCRL5Fc receptor-like protein 5 [L. calcarifer]
LG202364852400001557ENSLCAT0001003870443810.0276LAMNLamin-L(II)-like [L. calcarifer]
LG191063251410633673297ENSLCAT000100607603500.0269H2EB1H-2 Class II histocompatibility antigen, E-S beta chain-like [L. calcarifer]
LG1243521444359193+747ENSLCAT00010002441101000.0268OR52K1Olfactory receptor 52K1-like [L. calcarifer]
LG171784615817849546+936ENSLCAT0001000613541830.0267POLRNA-directed DNA polymerase from mobile element jockey-like [Austrofundulus limnaeus]
LG151739000217391523+618ENSLCAT000100177816710.0227LIN37Protein lin-37 homolog isoform X2 [L. calcarifer]
LG142137100921373848810ENSLCAT0001006084881000.0222BTN2A2Butyrophilin subfamily 2 member A2-like isoform X1 [L. calcarifer]
LG091282229412823545+273ENSLCAT000100368142310.0220AMBPProtein AMBP-like [L. calcarifer]
LG2419030002190322432886ENSLCAT00010001000243810.0218TLR13Toll-like receptor 13 [L. calcarifer]
LG1910650959106519865076500.0217CTC1CST complex subunit CTC1-like [L. calcarifer]
LG032629150226292790+753ENSLCAT000100279407900.0212eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG1823174362231771077146900.0210FCGR2CLow affinity immunoglobulin gamma Fc region receptor II-a-like isoform X2 [L. calcarifer]
LG032629679226297830+336ENSLCAT000100284291600.0208eEF1a1Elongation factor 1-alpha [L. calcarifer]
Linkage groupStartEndOri.Length of_CDSTranscript ID (ENSEMBL barramundi)Number of variants (LOW)Number of variants (MODERATE)Number of variants (HIGH)Proportion of variants within CDSGene name (abbreviation)Description (BLASTP-DIAMOND best hit)
LG032644945326465165456ENSLCAT0001004926282930.0877CD209ECD209 antigen-like protein E isoform X1 [L. calcarifer]
LG14127991611280000030916900.0809DAN4Cell wall protein DAN4-like isoform X2 [L. calcarifer]
LG242515306525153736+672ENSLCAT0001005564284100.0729GIMAP7GTPase IMAP family member 7-like [L. calcarifer]
LG1529521662954872+885ENSLCAT0001001979185310.0701CLEC12BC-type lectin domain family 12 member B-like isoform X3 [L. calcarifer]
LG022274186422742424561ENSLCAT00010041310181800.0642UGT2C1UDP-glucuronosyltransferase 2C1-like isoform X2 [L. calcarifer]
LG1243469314348396+474ENSLCAT0001000246191700.0549OR4K1Olfactory receptor 4K1-like [L. calcarifer]
LG1418061006180631562151259200.0544SAMD9Sterile alpha motif domain-containing protein 9-like [Epinephelus lanceolatus]
LG032629462526296711+456ENSLCAT0001002765491400.0504eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG07_112374191240000666ENSLCAT0001004007562700.0495IFI44Interferon-induced protein 44-like [L. calcarifer]
LG2094717149473135+1227213810.0489TCHHTrichohyalin-like [L. calcarifer]
LG032646601326468750309ENSLCAT0001001196411310.0485CD209ECD209 antigen-like protein E Isoform X2 [L. calcarifer]
LG1517396770173977861017ENSLCAT00010017757153100.0452C3AR1C3a anaphylatoxin chemotactic receptor-like [L. calcarifer]
LG0428235232833145+954ENSLCAT00010002680142510.0419DMBT1Deleted in malignant brain tumors 1 protein-like [L. calcarifer]
LG141733000217334406+1986ENSLCAT00010036890215020.0368OBSCNObscurin-like [L. calcarifer]
LG0585971568600000+1293ENSLCAT00010025942102810.0302PRF1Perforin-1-like [L. calcarifer]
LG04562611570777+834ENSLCAT0001001064222300.0300ART1T-cell ecto-ADP-ribosyltransferase 1-like isoform X1 [L. calcarifer]
LG101378212713786224+2142ENSLCAT00010023093263600.0289MUC2Mucin-2-like isoform X1 [L. calcarifer]
LG141738069717383457834ENSLCAT0001003689091500.0288FCRL5Fc receptor-like protein 5 [L. calcarifer]
LG202364852400001557ENSLCAT0001003870443810.0276LAMNLamin-L(II)-like [L. calcarifer]
LG191063251410633673297ENSLCAT000100607603500.0269H2EB1H-2 Class II histocompatibility antigen, E-S beta chain-like [L. calcarifer]
LG1243521444359193+747ENSLCAT00010002441101000.0268OR52K1Olfactory receptor 52K1-like [L. calcarifer]
LG171784615817849546+936ENSLCAT0001000613541830.0267POLRNA-directed DNA polymerase from mobile element jockey-like [Austrofundulus limnaeus]
LG151739000217391523+618ENSLCAT000100177816710.0227LIN37Protein lin-37 homolog isoform X2 [L. calcarifer]
LG142137100921373848810ENSLCAT0001006084881000.0222BTN2A2Butyrophilin subfamily 2 member A2-like isoform X1 [L. calcarifer]
LG091282229412823545+273ENSLCAT000100368142310.0220AMBPProtein AMBP-like [L. calcarifer]
LG2419030002190322432886ENSLCAT00010001000243810.0218TLR13Toll-like receptor 13 [L. calcarifer]
LG1910650959106519865076500.0217CTC1CST complex subunit CTC1-like [L. calcarifer]
LG032629150226292790+753ENSLCAT000100279407900.0212eEF1a1Elongation factor 1-alpha-like [L. calcarifer]
LG1823174362231771077146900.0210FCGR2CLow affinity immunoglobulin gamma Fc region receptor II-a-like isoform X2 [L. calcarifer]
LG032629679226297830+336ENSLCAT000100284291600.0208eEF1a1Elongation factor 1-alpha [L. calcarifer]

Genes described as “uncharacterized protein” are not included in this table. A full list of the genes within the nucleotide diversity-peak regions (272 genes) is shown in supplementary table S4, Supplementary Material online. The numbers of SNVs are shown by annotation impacts (by SnpEff) as follows: LOW, synonymous or stop retained variants; MODERATE, missense variants; HIGH, nonsense, start lost, or splicing donor/acceptor variants.

Functional annotation clustering using the DAVID web tools identified 19 clusters from the 221 genes (Fig. 4c). Similar to the GO enrichment analysis, the DAVID analysis also suggested that the genes involved in innate and adaptive immune systems (Ig-like domain, MHC Class I and II, C1q domain, C-type lectin-like, etc.) and cell adhesion (Cadherin-like) were significantly enriched in the nucleotide diversity-peak regions (see supplementary table S6, Supplementary Material online).

Selective Pressure and Evolutionary Rates in the Akame Genes

Conducting the ortholog-detection method by Yang and Smith (2014) (see Materials and Methods for detail), we identified 8,825 gene sets consisting of 6,847 one-to-one and 1,978 maximum inclusion (MI) orthologs in the five species examined (akame, barramundi, perch, medaka, and seahorse). A robust phylogenetic relationship of the five fishes was reconstructed using all one-to-one ortholog gene set (supplementary fig. S8, Supplementary Material online), to apply the following evolutionary analyses.

The level of the Ne determines the effectiveness of selection relative to drift (Charlesworth 2009). Thus, a reduction in Ne indicates the efficacy of selection and can cause fixations of mildly deleterious mutations. To investigate whether the relaxation of selective pressure has occurred in akame after speciation to barramundi, a selection intensity in akame and barramundi was inferred using the RELAX descriptive models (Wertheim et al. 2015) in all one-to-one and MI ortholog gene sets (see Materials and Methods for details of this analysis). P-value distributions of the RELAX tests for akame and barramundi are shown in Fig. 5a. In akame, contrary to our expectation, the number of genes under intensified selective pressures (1,754 genes, FDR-adjusted P < 0.05) was much larger than that under relaxation of selection (223 genes, FDR-adjusted P < 0.05). By contrast, in barramundi, the number of genes under intensified selective pressures (462 genes, FDR-adjusted P < 0.05) was comparable to that under relaxed selective pressures (417 genes, FDR-adjusted P < 0.05). A scatter plot of k in akame (Kochi) versus barramundi also indicated that the number of genes under intensified selective pressures was much larger in akame than in barramundi (supplementary fig. S9, Supplementary Material online).

Analysis of selection in akame genes. a) FDR-adjusted P-value distributions of the RELAX tests of akame (above) and barramundi (below). Red and blue bars indicate the numbers of genes with intensified (k > 1) and relaxed (k < 1) selective pressures, respectively. For visualization, the numbers of genes with a P-value of >0.95 were excluded from the histograms. b) Two branch-specific models to test differences in selective pressures between akame and barramundi. Phylogenetic relationship of the five fish species was reconstructed by the maximum-likelihood methods using concatenated one-to-one orthologous gene set (supplementary fig. S8, Supplementary Material online; see Materials and Methods). c) Scatter plot of the synonymous–nonsynonymous nucleotide substitution rate ratio (ω = dN/dS) in the branches of akame (ω2, x axis) and barramundi (ω1, y axis) estimated by Model 2 in each orthologous gene set. For visualization, genes with ω1 and/or ω2 values estimated as infinity were excluded from the plot. Genes with ω1 and/or ω2 estimated as 0 were included in the plot by adjusting them to replace 1 × 10−4. Diagonal line indicates ω1 = ω2.
Fig. 5.

Analysis of selection in akame genes. a) FDR-adjusted P-value distributions of the RELAX tests of akame (above) and barramundi (below). Red and blue bars indicate the numbers of genes with intensified (k > 1) and relaxed (k < 1) selective pressures, respectively. For visualization, the numbers of genes with a P-value of >0.95 were excluded from the histograms. b) Two branch-specific models to test differences in selective pressures between akame and barramundi. Phylogenetic relationship of the five fish species was reconstructed by the maximum-likelihood methods using concatenated one-to-one orthologous gene set (supplementary fig. S8, Supplementary Material online; see Materials and Methods). c) Scatter plot of the synonymous–nonsynonymous nucleotide substitution rate ratio (ω = dN/dS) in the branches of akame (ω2, x axis) and barramundi (ω1, y axis) estimated by Model 2 in each orthologous gene set. For visualization, genes with ω1 and/or ω2 values estimated as infinity were excluded from the plot. Genes with ω1 and/or ω2 estimated as 0 were included in the plot by adjusting them to replace 1 × 10−4. Diagonal line indicates ω1 = ω2.

We tested two branch-specific models to examine the evolutionary rate differences between akame and barramundi (Fig. 5b). In Model 1 (two-ratio), two ω values are defined in the branches connecting to akame and barramundi (ω1) and in other “background” branches (ω0). In Model 2 (three-ratio), three ω values are defined, in the branch connecting to akame (ω2), in the branch connecting to barramundi (ω1), and in other background branches (ω0). In each of the orthologous gene set, parameters and log-likelihood values were estimated, and the validity of Model 2 was tested by the likelihood ratio test (LRT). The branch-specific tests showed that selective pressure was significantly different between akame and barramundi in 238 genes (FDR-adjusted P-value < 0.05). In most of these genes (222 genes), ω values in the branch connecting to akame were larger than those in the branch connecting to barramundi (ω2 > ω1), and 131 and 2 of these genes were suggested to have evolved under intensified and relaxed selective pressures by RELAX analysis, respectively (k < 1 and FDR-adjusted P < 0.05; see supplementary table S8-(a), Supplementary Material online). On the other hand, genes that ω values in the barramundi branch were greater than those in the akame branch were only 16 (ω2 <ω1), and 4 and 1 of them were suggested to have evolved under intensified and relaxed selective pressures, respectively (k < 1 and FDR-adjusted P < 0.05; see supplementary table S9-(a), Supplementary Material online). The analysis also suggested that the ω2 values exceeded 1 in the 15 akame-specific fast-evolving genes, and 7 of them were suggested to have evolved under intensified selective pressure (supplementary table S10, Supplementary Material online). Distribution of ω1 (barramundi) versus ω2 (akame) of Model 2 in every orthologous gene pair clearly showed that a relatively larger number of genes were plotted below the diagonal, implying an elevated evolutionary rate in akame (Fig. 5c). The number of genes that showed higher ω values in akame was significantly greater than that of barramundi for all genes analyzed (P < 2.2 × 10−16, exact binomial test) and in the genes that Model 2 was supported statistically (P < 2.2 × 10−16, exact binomial test). GO enrichment analysis of genes with higher ω values in akame (222 genes: Model 2 was supported by LRT) showed that the terms “calcium ion transmembrane transporter activity” (P = 3.2 × 10−4, GO:0015085), “divalent inorganic cation transmembrane transporter activity” (P = 3.5 × 10−4, GO:0072509), “calcium channel activity” (P = 3.4 × 10−3, GO:0005262), “metal ion transmembrane transporter activity” (P = 0.037, GO:0046873), “calcium ion transmembrane transport” (P = 2.0 × 10−4, GO:0070588), “calcium ion transport” (P = 2.6 × 10−3, GO:0006816), “divalent metal ion transport” (P = 8.3 × 10−3, GO:0070838), “divalent inorganic cation transport” (P = 8.8 × 10−3, GO:0072511), and “regulation of localization” (P = 0.025, GO:0032879) were significantly enriched in the akame-specific rapidly evolving genes. By contrast, the terms “eye-pigmentation genes” (P = 1.1 × 10−3, GO:0048069) and “vacuolar proton-transporting V-type ATPase, V1 domain” (P = 0.050, GO:0000221) were enriched in the barramundi-specific rapidly evolving genes (16 genes). Detailed results of the enrichment analysis are described in supplementary tables S8-(b) and S9-(b), Supplementary Material online.

Discussion

Extremely Low Level of Heterozygosity in the Akame Genomes

In this study, the genome-wide single nucleotide heterozygosity levels of the Kochi and Miyazaki akame genomes were estimated to be 3.4 × 10−4/bp and 3.3 × 10−4/bp, respectively. These were consistent with the genetic diversity estimates based on AFLP (Takahashi et al. 2015) and ddRAD-Seq (Naito et al. 2023). The level of heterozygosity in akame is comparable to that in some critically endangered vertebrate species (Table 3). The heterozygosity of akame is substantially lower than those of brackish (three-spined stickleback) and marine (yellow croaker) fish species, slightly lower than that of sunfish Mola mola, and similar to that of the long-lived Pacific Ocean rockfish Sebastes levis, which exhibits the lowest genomic heterozygosity in teleost fishes (Kolora et al. 2021). Thus, akame is considered to be one of the marine teleost fishes with the lowest genetic diversity. The level of heterozygosity in akame is <20% of that of the barramundi (SG individual; see Results). The unusually low heterozygosity in akame is likely caused by the long-term suppression of Ne in this species. Estimation of historical population size using the PSMC model suggested that the small Ne of akame (Ne = 500 to 1,500) has been sustained from ∼3 × 104 years ago to the recent past (Fig. 2). By contrast, the Nes of the Thailand and Indonesia populations of barramundi have been maintained at an order of 103 from the last glacial period (LGP) to Holocene, and they have experienced a rapid population expansion, following a bottleneck in 103 to 104 years ago (Fig. 2).

Table 3

Levels of genome-wide SNP heterozygosity in akame, barramundi, four marine teleost fishes, and six endangered birds and mammals

Common nameScientific nameGroupSNP heterozygosity of whole genome (10−4 /bp)IUCN red list categoryaReferences
Akame (Kochi)L. japonicusTeleost fish3.4VUThis study
Akame (Miyazaki)L. japonicusTeleost fish3.3VUThis study
Barramundi (SG, Singapore)L. calcariferTeleost fish19.0LCEstimated from the data presented in Vij et al. (2016)
Three-spined sticklebackG. aculeatusTeleost fish14.3LCJones et al. (2012)
Yellow croakerLarimichthys croceaTeleost fish35.8CRWu et al. (2014)
Ocean sunfishM. molaTeleost fish7.8VUPan et al. (2016)
Ocean rockfishS. levisTeleost fish2.9Kolora et al. (2021)
Crested ibisN. nipponBird4.3ENLi et al. (2014)
White-tailed eagleHaliaeetus albicillaBird4LCLi et al. (2014)
Bald eagleHaliaeetus leucocephalusBird4.3LCLi et al. (2014)
Island foxU. littoralisMammal0.142 to 4.08NTRobinson et al. (2016)
Mountain gorillaG. gorillaMammal6.5ENXue et al. (2015)
Greater bamboo lemurP. simusMammal0.37CRHawkins et al. (2018)
Common nameScientific nameGroupSNP heterozygosity of whole genome (10−4 /bp)IUCN red list categoryaReferences
Akame (Kochi)L. japonicusTeleost fish3.4VUThis study
Akame (Miyazaki)L. japonicusTeleost fish3.3VUThis study
Barramundi (SG, Singapore)L. calcariferTeleost fish19.0LCEstimated from the data presented in Vij et al. (2016)
Three-spined sticklebackG. aculeatusTeleost fish14.3LCJones et al. (2012)
Yellow croakerLarimichthys croceaTeleost fish35.8CRWu et al. (2014)
Ocean sunfishM. molaTeleost fish7.8VUPan et al. (2016)
Ocean rockfishS. levisTeleost fish2.9Kolora et al. (2021)
Crested ibisN. nipponBird4.3ENLi et al. (2014)
White-tailed eagleHaliaeetus albicillaBird4LCLi et al. (2014)
Bald eagleHaliaeetus leucocephalusBird4.3LCLi et al. (2014)
Island foxU. littoralisMammal0.142 to 4.08NTRobinson et al. (2016)
Mountain gorillaG. gorillaMammal6.5ENXue et al. (2015)
Greater bamboo lemurP. simusMammal0.37CRHawkins et al. (2018)

aIUCN red list categories: CR, critically endangered; EN, endangered; VU, vulnerable; NT, near endangered; and LC, least concern.

Table 3

Levels of genome-wide SNP heterozygosity in akame, barramundi, four marine teleost fishes, and six endangered birds and mammals

Common nameScientific nameGroupSNP heterozygosity of whole genome (10−4 /bp)IUCN red list categoryaReferences
Akame (Kochi)L. japonicusTeleost fish3.4VUThis study
Akame (Miyazaki)L. japonicusTeleost fish3.3VUThis study
Barramundi (SG, Singapore)L. calcariferTeleost fish19.0LCEstimated from the data presented in Vij et al. (2016)
Three-spined sticklebackG. aculeatusTeleost fish14.3LCJones et al. (2012)
Yellow croakerLarimichthys croceaTeleost fish35.8CRWu et al. (2014)
Ocean sunfishM. molaTeleost fish7.8VUPan et al. (2016)
Ocean rockfishS. levisTeleost fish2.9Kolora et al. (2021)
Crested ibisN. nipponBird4.3ENLi et al. (2014)
White-tailed eagleHaliaeetus albicillaBird4LCLi et al. (2014)
Bald eagleHaliaeetus leucocephalusBird4.3LCLi et al. (2014)
Island foxU. littoralisMammal0.142 to 4.08NTRobinson et al. (2016)
Mountain gorillaG. gorillaMammal6.5ENXue et al. (2015)
Greater bamboo lemurP. simusMammal0.37CRHawkins et al. (2018)
Common nameScientific nameGroupSNP heterozygosity of whole genome (10−4 /bp)IUCN red list categoryaReferences
Akame (Kochi)L. japonicusTeleost fish3.4VUThis study
Akame (Miyazaki)L. japonicusTeleost fish3.3VUThis study
Barramundi (SG, Singapore)L. calcariferTeleost fish19.0LCEstimated from the data presented in Vij et al. (2016)
Three-spined sticklebackG. aculeatusTeleost fish14.3LCJones et al. (2012)
Yellow croakerLarimichthys croceaTeleost fish35.8CRWu et al. (2014)
Ocean sunfishM. molaTeleost fish7.8VUPan et al. (2016)
Ocean rockfishS. levisTeleost fish2.9Kolora et al. (2021)
Crested ibisN. nipponBird4.3ENLi et al. (2014)
White-tailed eagleHaliaeetus albicillaBird4LCLi et al. (2014)
Bald eagleHaliaeetus leucocephalusBird4.3LCLi et al. (2014)
Island foxU. littoralisMammal0.142 to 4.08NTRobinson et al. (2016)
Mountain gorillaG. gorillaMammal6.5ENXue et al. (2015)
Greater bamboo lemurP. simusMammal0.37CRHawkins et al. (2018)

aIUCN red list categories: CR, critically endangered; EN, endangered; VU, vulnerable; NT, near endangered; and LC, least concern.

The PSMC analysis also suggested that the historical Ne of akame was much smaller than those reported for other critically endangered vertebrate species. In crested ibis Nipponia nippon, Ne had been maintained at 2 to 6 × 104 from 1 MYA to the LGP and then fell to ∼ 1 × 104 at the end of the LGP. One wild population of crested ibis declined to only seven individuals from two breeding pairs in the middle of the 20th century (Li et al. 2014). The Ne of the mainland population of gray fox U. littoralis had been kept ∼ 1 × 104 from 105 to 103 years ago and decreased to ∼ 2 × 103 at 100 years ago (Robinson et al. 2016). Historical Ne of greater bamboo lemur Prolemur simus was estimated between 9 × 105 to 1 million individuals in 6 to 9 × 104 years ago and declined to 1 to 4 × 103 at present (Hawkins et al. 2018). It is important to note that these endangered species commonly experienced severe population decline during the LGP and the recent past (19th to 20th centuries). By contrast, the Ne of akame was small but stable for long periods and does not appear to have undergone a rapid decline during the LGP, although the recent population size changes (from the 19th to 20th century to the present) in this species are unknown. The dynamics of the historical population size in akame may be more similar to that of the mountain gorilla G. beringei beringei, which is thought to have maintained a small population size (Ne < 103) from 105 years ago to the recent past (Xue et al. 2015). It is revealed that the population of the mountain gorilla has remained relatively stable (Xue et al. 2015; van der Valk et al. 2019). In the mountain gorilla, a decrease in genetic diversity and an increase in genetic load have not been observed in recent centuries (van der Valk et al. 2019), which can be attributed to the purging of deleterious recessive alleles as a result of the constant small population size in this subspecies. On the other hand, the genetic load was suggested to be substantially higher in akame than in barramundi (Fig. 3a and b, and supplementary fig. S4, Supplementary Material online). Also, ROH length and FROH were relatively smaller in akame (Fig. 3c and d) than in species experiencing severe inbreeding (e.g. Xie et al. 2022). Taken together, in contrast to the case of the mountain gorilla, purging deleterious alleles is unlikely to occur in akame, despite the long-term small Ne in this species.

It must be emphasized that the relatively stable historical Ne in akame does not necessarily mean its robustness to rapid population size decline due to human activities. Recently, there has been increasing concern over the akame population decline due to strong fishing pressure and loss of eelgrass (Zostera japonica) bed habitats, which are essential for the larvae and juveniles of this species (Kinoshita et al. 1988). Population size changes in akame in recent centuries are not well understood because census population size has not been adequately estimated. Close-kin mark–recapture (CKMR) analysis is a new genetic marker-based method for assessing population abundance, by estimating census population size directly from population genomic data (Bravington et al. 2016). Recent studies have successfully applied CKMR to estimate population abundances in marine and freshwater fishes (Bravington et al. 2016; Ruzzante et al. 2019). Our results will provide basic genomic information for the application of CKMR in akame.

Genes in the Nucleotide Diversity-Peak Regions of the Akame Genomes

The sliding-window analysis revealed extremely low nucleotide diversity in most chromosomal regions in the akame genome (Fig. 4a). However, the analysis identified multiple genomic regions with significantly high nucleotide diversity, within which 271 genes were identified (supplementary table S4, Supplementary Material online). Analysis of window-FST between akame and barramundi genomes showed that the nucleotide diversity-peak regions were significantly less divergent than the background regions (Fig. 4b), suggesting that the polymorphisms in akame genome were retained by balancing selection. Enrichment analysis of the genes within the nucleotide diversity-peak regions indicated that genes involved in the immune systems were significantly enriched (Fig. 4c; supplementary table S5, Supplementary Material online). In many organisms including fishes, polymorphisms of immune genes are retained by balancing selection in order to keep resistance against diverse pathogens (Croze et al. 2016; Yamaguchi and Dijkstra 2019). For example, a gene cluster of the major histocompatibility complex (MHC) is a prime example of immune genes of which polymorphisms were retained by balancing selection (Spurgin and Richardson 2010), was found in the nucleotide diversity-peak region in LG03 (Fig. 4a). One nonsense variant was identified in CD209 antigen-like protein E (CD209E) gene in LG03 (Fig. 4a). CD209E is one of the homologs of C-type lectin 4 (CTL4) proteins that are used for recognizing pathogens in the innate immune system of fishes and mammals (Xue et al. 2018). Enrichment analysis also showed that genes involved in cell–cell adhesion are significantly enriched in the nucleotide diversity-peak regions. This is mainly caused by two protocadherin gene clusters located in LG08 and LG21 (Fig. 4a, supplementary table S4, Supplementary Material online). Protocadherins are members of the cadherin superfamily and are associated with synaptic development in the vertebrate brain (Morishita and Yagi 2007). In humans, signatures of balancing selection were detected in the promoter regions of protocadherin α gene clusters (Noonan et al. 2003). In addition, four olfactory receptor (OR) genes and one trace amine-associated receptor (TAAR) gene were included in the nucleotide diversity-peak regions (supplementary table S4, Supplementary Material online). In humans and teleost fishes, the diversity of OR genes is suggested to be maintained by balancing selection (Olender et al. 2012; Liu et al. 2021). In akame, it is likely that a part of the genetic polymorphisms of the genes and/or gene regulatory regions within the nucleotide diversity-peak regions are maintained by balancing selection, despite the low level of genetic diversity. Comparison of the sliding windows of nucleotide diversity between akame and barramundi revealed that 44 windows of the nucleotide diversity-peak regions in akame were common to barramundi (supplementary fig. S7, Supplementary Material online), suggesting that polymorphisms in these peaks were retained from the common ancestor of the two species. Genomic regions containing protocadherin (LG08) and OR (LG12) gene clusters were commonly polymorphic both in akame and barramundi (supplementary fig. S7, Supplementary Material online). Interestingly, akame-specific polymorphic genomic regions were also identified. For example, some of the polymorphic genes involved in immunity, such as CD209E in LG03, interferon-induced protein 44-like (IF44) in LG07_1, and butyrophilin subfamily 2 member A2-like (BTN2A2) in LG14, were located in these akame-specific polymorphic regions (supplementary fig. S7, Supplementary Material online). It seems that high polymorphisms in these regions have occurred after the speciation of akame and barramundi. Thus, both long- and short-term balancing selections likely contribute to the maintenance of functional genetic variations and the long-term persistence of the natural populations of akame. Our results also emphasize the importance of functional genetic variation for the conservation of endangered species (Teixeira and Huber 2021).

Selective Pressures for the Akame Genes

The RELAX analysis showed that the proportion of genes under intensified selective pressures was much larger in akame than in barramundi (Fig. 5a and supplementary fig. S9, Supplementary Material online), suggesting that selective pressures have intensified on many akame genes following the speciation of akame and barramundi. This is an unexpected result because the extremely small Ne of akame likely weakens purifying selection and could lead to the fixation of deleterious mutations in the akame genome. The branch-specific tests indicated that ω values tended to be larger in akame genes than in barramundi genes (Fig. 5c). Many of the genes that show higher ω values in akame supported by the branch-specific test (131/222 genes) were assigned to the genes under intensified selective pressure by the RELAX analysis, whereas only two genes were assigned to the genes under relaxed selective pressure. Given the increasing selective pressures on these akame genes, this trend, at least partially, could be explained by positive selection likely caused by some akame-specific rapid adaptations that occurred after the speciation of akame and barramundi. It is important to note that, however, intensified selective pressures in these genes may reflect past adaptive evolution in this species, not ongoing ones. Moreover, it seems very unlikely that most of the akame genes under intensified selective pressure (1,754 genes; Fig. 5a) have evolved under positive selection. At present, it is difficult to provide a comprehensive explanation for the trend of the intensified selective pressure on akame genes, and further studies, such as the population genetics-based approach, are required.

The genus Lates contains 11 extant species mainly distributed in tropical waters, 7 of which are restricted to freshwater environments (Nelson et al. 2016). Akame is the only latid species that is distributed in temperate waters (Katayama and Taki 1984; Iwatsuki et al. 1993). Following the speciation of akame and barramundi, akame likely underwent adaptation to estuarine conditions in temperate waters. Many akame genes may have evolved under positive selection during the speciation process. Genes responsible for cold tolerance and thermal sensing emerge as primary candidates for adaptive evolution, given the necessity for akame to adapt to low-temperature conditions in the winter months of temperate waters. GO analysis of the akame-specific fast-evolving genes (supplementary table S8-(b), Supplementary Material online) revealed significant enrichment of the genes involved in cation transport. Multiple voltage-dependent potassium and calcium channels and transient receptor potential melastatin (TRPM) channels were included within them. A potassium channel expressed in the cardiac muscle was shown to be activated in cold-acclimated goldfish, suggesting that potassium channels play important roles in cold tolerance (Ganim et al. 1998). In three-spined stickleback Gasterosteus aculeatus, the epithelial Ca2+ channel is expressed substantially in the freshwater ecotype, which exhibits higher cold tolerance than its marine counterpart (Gibbons et al. 2016). In addition, TRPM channels are also well-known as cold stimuli and menthol sensors (McKemy et al. 2002; Peier et al. 2002; Himmel et al. 2019). Thus, the observed accelerated evolution of cation channels in akame is plausibly linked to enhancing cold tolerance and/or sensory mechanisms in the context of temperate waters.

The synonymous–nonsynonymous substitution rate ratios in the akame branch (ω2) exceeded 1 in the 15 akame-specific fast-evolving genes and intensified selective pressure was suggested in 7 of these genes (supplementary table S10, Supplementary Material online). In the akame lineage, the seven genes might be involved in the adaptation of temperate waters. For example, Fatty acid desaturase 2 (Fads2) encodes an enzyme catalyzing desaturation in DHA biosynthesis, and is known to be a key metabolic gene for freshwater colonization of G. aculeatus (Ishikawa et al. 2019). Fads2 is also suggested to be involved in the freshwater adaptation of other euryhaline fishes (Ishikawa et al. 2019, 2022). In akame, the adaptive evolution of Fads2 might have played a role in the colonization of brackish waters in the southern part of Japan. However, the biological functions of fast-evolving genes in akame are largely unknown, and further studies are needed to understand the adaptive changes that occurred during the speciation of akame and barramundi.

Conclusion

In this study, we present the draft genome sequences of two individuals of akame. These draft genomes provide essential basic information for future conservation studies of this endangered marine fish species. We show that the akame genomes exhibit extremely low genetic diversity, and the population has persisted over a long period (ca. 30,000 years) with small Ne. Similar to the case of mountain gorilla (Xue et al. 2015; van der Valk et al. 2019), the low genetic diversity of akame is likely caused by fixation due to random genetic drift occurring in long-term stable small populations. However, in contrast to the mountain gorilla, purging of deleterious variants was unlikely in akame, because of the elevated levels of genetic load inferred in this species. Analysis of nucleotide diversity-peak regions revealed that high genetic polymorphisms of some immune, synaptic development, and sensory-related genes were maintained within the population, possibly by balancing selection. This implies that akame has a genetic mechanism to maintain functional genetic variation in specific genomic regions that are required for resistance against pathogens and adaptive response to changing environments. This maintenance of functional genetic diversity despite the stable small Ne possibly contributed to the long-term persistence of this species.

Materials and Methods

Sample Collection

For de novo genome sequencing, a blood sample was collected from one juvenile akame individual at the Shimanto river, Kochi Prefecture, Japan (2017 October 20). A blood sample was also collected from one individual at the Shiomi river in Miyazaki Prefecture, Japan (2017 October 4) (Fig. 1). The blood samples were preserved in 99% ethanol until used. In Kochi individual, brain, muscle, and liver samples were also collected from the same individual for transcriptome analysis. In both localities, the wild individuals of akame were collected under special permission from the Divisions of inland water fisheries in Kochi (No. 863) and Miyazaki (No. 21) Prefectures, respectively.

DNA Extraction, Library Preparation, and Genome Sequencing

Genomic DNA samples were extracted from the red blood cells using a phenol–chloroform method. The quality of genomic DNA was assessed by automated gel electrophoresis using 2200 TapeStation (Agilent Technologies, Santa Clara, CA, USA) and average DNA fragment length was confirmed to be above 60 kb. The concentration of the genomic DNA samples was quantified using microfluorimetry (Qubit Broad range dsDNA Kit, Invitrogen, Carlsbad, CA, USA). From the genomic DNA of the Kochi individual, 10× Genomics Chromium Library was constructed using the Chromium Genome Library Kit and Gel Bead Kit v2, and Chromium Controller and Next Gen Accessory Kit (10× Genomics, Pleasanton, CA, USA) following the manufacturer's protocols. The library was sequenced on a paired-end 2 × 150 nt lane on a HiSeq X Ten (Illumina, San Diego, CA, USA) for a total of 895,181,812 sequence reads. To obtain long DNA reads for scaffolding and gap closing in the assembly, the genome of the Kochi individual was also sequenced by Nanopore PromethION (Oxford Nanopore Technologies, Oxford, UK) following the manufacturer's protocols. For the Miyazaki individual, 10× Genomics Chromium library was also constructed and sequenced on a paired-end 2 × 150 nt lane on a HiSeq X Ten for a total of 835,621,234 sequence reads.

RNA-Sequencing

To improve genome assembly and gene prediction, transcriptome sequences in akame were obtained by RNA-Seq. Total RNAs of brain, muscle, and liver from the Kochi individual used for genome sequencing were extracted using a PureLink RNA extraction kit (Thermo Fisher Scientific, Waltham, MA, USA) with TRIZOL reagent (Thermo Fisher Scientific) following the manufacturer's protocol. The concentration and RNA integrity number (RIN) of total RNA were measured by Agilent Bioanalyzer 2100. High-quality total RNA (RIN > 7.0; Bioanalyzer 2100 RNA nano kit, Agilent Technologies) was stored at −80 °C until use. Reverse-transcribed cDNA libraries were prepared using TruSeq Stranded mRNA LT Sample Prep Kit (Illumina) and sequenced by paired-end 2 × 100 nt on Illumina platform, following the manufacturer's protocol.

Genome Size Estimation

To estimate the genome size of akame, a k-mer frequency analysis was performed using KmerGenie (Chikhi and Medvedev 2013). First, in the Kochi individual, 10× barcode sequences of the Chromium linked-reads were trimmed by Long Ranger basic pipeline (10× Genomics). Second, the best k-mer length was estimated from the barcode-trimmed reads. Third, genome size was estimated from the frequency distribution of the best k-mer length.

Genome Assembly

The procedure used for the hybrid genome assembly approach of the Kochi individual is shown in supplementary fig. S10, Supplementary Material online. First, the Chromium linked-reads of the Kochi individual were assembled using Supernova version 2.1.1 (Weisenfeld et al. 2017) with default settings. Second, the Nanopore long reads were used for scaffolding the linked-reads assembly using SSPACE-LongRead (Boetzer and Pirovano 2014). Third, scaffolding with RNA-Seq reads (see next section) was performed using Rascaf (Song et al. 2016). Gaps in the scaffolds were filled by Nanopore long reads using LR_GapCloser (Xu et al. 2019). Finally, the assembly was polished automatically with the Pilon software tool (Walker et al. 2014) using Illumina linked-reads mapped to the scaffold sequences by BWA (Li and Durbin 2009). Assembly statistics in each analysis step were estimated by assemblathon_stats.pl script (Bradnam et al. 2013). To analyze the chromosomal distribution of polymorphic sites, a chromosome-level assembly was also generated by ordering the scaffolds using mScaffolder (Chakraborty et al. 2018) guided by their alignments to the chromosome-level genome assembly of L. calcarifer (Vij et al. 2016) using Nucmer implemented in MUMmer (Kurtz et al. 2004). The Chromium reads of the Miyazaki individual were also assembled by using Supernova version 2.1.1. The quality of the genome assemblies was evaluated by calculating the completeness in BUSCO version 5.1.2 (Simão et al. 2015) with a database of ray-finned fishes (Actinopterygii odb10). To assess the levels of contamination in genome assembly, the proportions of potential contaminants in the akame (Kochi and Miyazaki) genome assemblies were calculated, and visualized by generating BlobPlot, implemented in BlobTools (Laetsch and Blaxter 2017).

Annotation of Genes and Repeats

Both homology-based and de novo prediction methods were applied to identify repeat sequences in the akame (Kochi) genome assembly. For the homology-based analysis, Repbase (version 20170127) was used to perform searches for TE with RepeatMasker program version 4.1.0 (http://www.repeatmasker.org). For de novo predictions, TEs were identified using RepeatMasker with a de novo repeat library constructed using RepeatModeler version 2.0.1 (Flynn et al. 2020).

Genes in the akame (Kochi) genome (in both scaffold- and chromosome-level assemblies) were predicted using transcript-based methods. First, RNA-Seq reads from the brain, muscle, and liver were cleaned and trimmed by FaQCs (Lo and Chain 2014) and aligned to the soft-masked genome assembly using HISAT2 version 2.1.0 (Kim et al. 2019), and a sorted binary format alignment/map (BAM) file was generated by SAMtools version 1.7 (Li et al. 2009). Second, gene predictions were carried out using the masked genome and RNA-Seq spliced alignment information by GeneMark-ET (Lomsadze et al. 2014) and AUGUSTUS (Keller et al. 2011) implemented in the BRAKER1 pipeline (Hoff et al. 2019). Third, functional information for all putative genes was retrieved by aligning the predicted peptide sequences to the NCBI L. calcarifer transcripts (https://www.ncbi.nlm.nih.gov/genome/14180?genome_assembly_id=275271) using DIAMOND-BLASTP searches (E-value < 1e−10). Functional information was also retrieved from NCBI RefSeq database (https://www.ncbi.nlm.nih.gov/refseq/) using DIAMOND-BLASTP searches (E-value < 1e−10). In addition, to identify corresponding barramundi ENSEMBL transcript IDs, BLASTN searches (E-value < 1e−10 and percent_ID >90%) were conducted against ENSEMBL barramundi cDNAs (https://asia.ensembl.org/Lates_calcarifer/Info/Index) using nucleotide sequences of the predicted akame genes as queries. Completeness of the annotated protein-coding gene set was estimated by BUSCO version 5.1.2 (Simão et al. 2015) with a database of ray-finned fishes (Actinopterygii odb10).

Reconstruction of the Demographic History of Akame and Barramundi

We used PSMC v0.6.5 software (Li and Durbin 2011) to infer the historical Nes of akame and barramundi. For akame, we used the 10× barcode trimmed Illumina reads of both individuals (Kochi and Miyazaki) sampled in this study. For barramundi, we retrieved whole genome sequencing (WGS) data of individuals of all the three major populations of barramundi collected from the Western coast of India (SRR3183258), Thailand (SRR3183264), and Indonesia (SRR3183270); these individuals were deeply sequenced in a previous low coverage WGS project (Vij et al. 2016). These Illumina reads were aligned to the corresponding repeat masked chromosome-level assembly of akame or barramundi using BWA-MEM v0.7.17 (Li and Durbin 2009) with the default parameters. We subsampled 20% of the mapped reads of akame individuals to equalize coverage across samples, resulting in mean depths of 24 to 33. Variant calling was performed using SAMtools v1.10 mpileup utility (Li 2011) (-q 20, -Q 20, and -C 50), and the FASTQ transformed consensus sequences were generated with vcfutils.pl (vcf2fq -d 10 -D 60). The input file for the PSMC modeling was created using fq2psmcfa tools (-q 20, -g 10000, and -s 10) and processed in psmc program with 25 iterations using the following parameters: -N 25, -t 10, -r 5, -p “4 + 25 * 2 + 4 + 6.” In order to prevent overfitting, we confirmed the inference of at least 10 recombinations in each time interval after 20 rounds of iterations with these parameter settings. Bootstrapping (100 replicates) was performed by random sampling with the replacement of 50-kb sequence segments generated by splitfa scripts. For scaling the inferred population history, we used the same parameters as used in the previous demographic analysis of the barramundi (Wang et al. 2016), namely a generation time of 4 years (Jerry 2013) and a mutation rate of 1.0 × 10−8 substitutions per year (Brumfield et al. 2003).

Genotyping and Functional Annotation of SNVs

To assess the genetic diversity in the akame genome, 10× barcode-trimmed Illumina reads of Kochi and Miyazaki individuals were mapped to the chromosome-level assembly (Kochi individual). The main steps for SNV calling and genotyping are given in supplementary fig. S11, Supplementary Material online. First, paired-end Illumina reads were cleaned by filtering adaptor sequences and trimming low-quality bases using FaQCs (Lo and Chain 2014). Second, in each individual, the cleaned reads were mapped to the reference genome by NextGenMap (Sedlazeck et al. 2013), and a BAM file was generated. The BAM file was sorted by SAMtools and mapping quality was checked by QualiMap program package (García-Alcalde et al. 2012). In the sorted BAM file, PCR duplicates were marked using the Picard toolkit (http://broadinstitute.github.io/picard/) and local realignments of INDELs were conducted by GATK v3.8.1 (McKenna et al. 2010). In the BAM file for each individual, GVCF files were generated using GATK HaplotypeCaller tool with options -hets 0.001 and -indelHeterozygosity 0.001. Finally, GVCF files for the two individuals were jointly genotyped and an output VCF file was generated using GATK GenotypeGVCFs tool. For genotyped SNVs, variant filtering was applied using GATK VariantFilteration tool with the following cutoff values: MQ > 30.00, SOR < 4.000, QD > 2.00, FS < 60.000, MQRankSum > −20.000, ReadPosRankSum > −10.000, and ReadPosRankSum < 10.000.

Genome-wide empirical single nucleotide polymorphism (SNP) heterozygosity in the two akame genomes calculated the fraction of heterozygous SNVs in all effective nucleotide sites in the draft genome (=genome effective length: Total genome length − number of N sites). To assess the difference in genetic diversity between akame and barramundi, the genome-wide SNP heterozygosity in the individual of barramundi from SG, the genome of which has been deeply sequenced (ca. 80× coverage; Vij et al. 2016), was also estimated in the same way. Illumina paired-end reads (accession nos. SRR3140997 and SRR3140998) of the SG individual were mapped to the barramundi reference genome assembly, and the number of heterozygote SNV sites was counted.

To assess the genetic load in akame and barramundi, the ratio of the heterozygosity of 0-fold (the proxy for nonsynonymous sites) and 4-fold (the proxy for synonymous sites) degenerate sites in protein-coding regions were calculated in the two akame (Kochi and Miyazaki) and four barramundi (SG, India, Thailand, and Indonesia) individuals, using the similar method as Robinson et al. (2016) and Robinson et al. (2018). SNVs in barramundi individuals were identified by mapping their Illumina reads to the chromosome-level assembly of the akame (Kochi) genome, applying the same filtering criteria in the SNV calling of the two akame individuals. Total numbers and positions of 0- and 4-fold degenerate sites were counted using DEGENOTATE program (Mirchandani et al. 2024). In each individual of akame and barramundi, heterozygous 0- and 4-fold degenerate sites were retrieved using BEDTools (Quinlan and Hall 2010). In addition, functional effect categories of SNVs (High: Start lost, stop gained, and splice acceptor/donor variants; Moderate: Missense variants; Low: Synonymous, stop retained, etc.) in the protein-coding regions were also estimated in these individuals. Variant annotation and functional effect categorization of SNVs in the akame and barramundi genomes were performed using the SnpEff program (Cingolani et al. 2012), based on the predicted gene set of the Akame (Kochi) genome. In this analysis, only heterozygous SNVs were counted and classified into the functional categories. The protein-coding SNVs in akame and barramundi were also classified into “deleterious” and “tolerated” functional categories by SIFT missense prediction method (Vaser et al. 2016). Initially, a custom SIFT database of the akame genes was generated using Swiss-Prot protein sequences downloaded by UniProt database (https://www.uniprot.org). Next, functional categories of SNVs in each individual of akame and barramundi were inferred using the SIFT 4G program. In SIFT missense prediction, both homozygote and heterozygote SNVs were considered and counted separately.

Runs of Homozygosity

In Kochi and Miyazaki individuals of akame and the SG individual of barramundi, ROHs were called using the “bcftools roh” command in BCFTools version 1.7 (Li 2011), with the default allele frequency value set at 0.4 (-AF-dflt 0.4). Indels were not included for the detection of ROH. FROH was calculated as the length of the genome within ROH of at least 100 kb divided by the total length of the genome (chromosome-level assembly).

Nucleotide Diversity-Peak Analysis

A sliding-window analysis of nucleotide diversity was carried out to examine the genome-wide distribution of polymorphic sites in the akame genome. The nucleotide diversity of the akame genome was calculated by passing filters in nonoverlapping 10-kb windows, using the SNVs obtained from the genome sequences of the Kochi and Miyazaki individuals. Nucleotide diversity in each window was calculated using the window-pi program implemented in the VCFtools software package (Danecek et al. 2011). Some genomic regions with unusually high nucleotide diversity may be associated with unannotated repeats or segmental duplications. Thus, on the same sliding-window interval used for nucleotide diversity, estimation of the mappability scores and the depth of coverages were also conducted, and windows with unusual values were excluded from the analysis as possible artifacts. The mappability scores were estimated by GenMap (Pockrandt et al. 2020). Windows with mappability of <0.8 were excluded from the nucleotide diversity-peak analysis. The depth of coverage in each window was estimated by TinyCov (https://github.com/cmdoret/tinycov). Windows with unusually low (<40×) and high (>400×) mean coverages were also excluded from the analysis. Peaks were identified as windows with nucleotide diversity in excess of 4 SDs above the mean across the genome. Enrichment analysis was performed on genes within the nucleotide diversity peak regions using g:Profiler (Raudvere et al. 2019) with the ENSEMBL barramundi perch annotation (Vij et al. 2016). We defined the genes with SNP heterozygosity >0.01 in the protein-coding region located within the nucleotide diversity peak regions as “polymorphic genes” and indicated in the window plot (see Fig. 4a and Table 2). Genes in the nucleotide diversity-peak regions were also classified by functional annotation clustering using DAVID web server (https://david.ncifcrf.gov/gene2gene.jsp; Sherman et al. 2022).

To quantify the genome-wide distribution of nucleotide divergence between akame and barramundi, a sliding-window analysis of FST between akame and barramundi was carried out. Mean FST values between akame (Kochi and Miyazaki) and barramundi (SG, India, Thailand, and Indonesia) were estimated by passing filter in nonoverlapping 10-kb windows, similar to the estimation of nucleotide diversity. Initially, in each of the akame and barramundi individuals, Illumina reads were mapped to the chromosome-level assembly of the akame (Kochi) genome. After indel realignments, GVCF files were generated using GATK HaplotypeCaller tool with options -hets 0.001 and -indelHeterozygosity 0.001. GVCF files for the six individuals were jointly genotyped and an output VCF file was generated using GATK GenotypeGVCFs tool. Finally, variant filtering was applied using the GATK VariantFilteration tool, using the same criteria in the SNV calling of the two akame individuals. The mean FSTs in each window were calculated using the weir-fst-pop program implemented in the VCFtools software package (Danecek et al. 2011). Nucleotide diversity per 10 kb nonoverlapping window in barramundi was also calculated from the four individuals using the window-pi program implemented in the VCFtools software package (Danecek et al. 2011). Nucleotide diversity-peaks were identified as windows with nucleotide diversity in excess of 4 SDs above the mean across the genome. Windows of the nucleotide diversity-peaks common in akame and barramundi were detected by comparing the results of the sliding-window analysis between the two species. To test whether the number of the observed common nucleotide diversity-peak windows was significantly larger than that expected by chance, the reference distribution of the numbers of windows common to the two species under the random expectation was generated using the Monte Carlo method of random sampling with 1,000,000 replicates. P-value was calculated as the proportion of trials that show a larger number of common windows than observed. The R source code of the Monte Carlo simulation was provided in supplementary appendix, Supplementary Material online.

Analysis of Selective Forces on Akame and Barramundi Genes

To assess selective forces on akame genes by molecular evolution-based analysis, orthologous relationships between genes of akame (Kochi individual) and other fish species were inferred based on homology and species phylogeny. First, transcript sequences of barramundi, Oryzias latipes (Japanese medaka), Perca fluviatilis (European perch), and Hippocampus comes (tiger tail seahorse) were downloaded from the NCBI database. Second, protein-coding sequences of these transcripts were predicted by TransDecoder (https://github.com/TransDecoder/TransDecoder/wiki). Third, orthology relationships were inferred by all-by-all BLAST, Markov clustering, and phylogenetic tree inference implemented in the Python programs by Yang and Smith (2014). Codon-based nucleotide sequence alignment was constructed in each of the orthologous gene datasets obtained from the one-to-one (homologs without any taxon repeat) and the MI (an orthology inference method that iteratively cuts out the subtree with the highest number of taxa without taxon duplication) algorithms. Initially, protein sequences for each of the ortholog sequences were aligned by MAFFT version 7 (Katoh and Standley 2013). Next, a codon-based nucleotide sequence alignment was generated using a custom Perl script, by referring to the protein alignment. Then, the phylogenetic relationship of the five species was reconstructed by using the concatenated dataset of all one-to-one orthologs, generated by Phyutility software (Smith and Dunn 2008). A maximum-likelihood tree (supplementary fig. S8, Supplementary Material online) was reconstructed by the multithreading version of RAxML program (Stamataxis 2014) and the reliability of each tree node was assessed by 1,000 replications of the rapid-bootstrap method implemented in RAxML. Topology of the maximum-likelihood tree was used for the evolutionary analyses described below.

To infer the changes in selective pressure to the genes in akame or barramundi, a selection intensity parameter (k) was inferred using RELAX descriptive models (Wertheim et al. 2015) in all one-to-one and MI ortholog gene sets in the five species. In RELAX analysis, a branch connecting to akame or barramundi was set to “test branch” and other branches were considered to the reference branches. Model parameters (three categories of ω = dN/dS values and their proportions in test and reference branches) were estimated in null (k was constrained to 1) and alternative (k was a free parameter, k ≥ 0) models, and a hypothesis for relaxation or intensification of selection is tested by an LRT using the standard χ2 distribution with one degree of freedom. In each of the two species, the numbers of genes under relaxed functional constraints (k < 1) and intensified selective pressure (k > 1; i.e. positive or purifying selection) were counted. The analysis was conducted using the RELAX program implemented in the HYPHY-MPI software package (Pond et al. 2005) and the results were retrieved using a custom Python script.

In addition, differences in the evolutionary rates between genes in akame and barramundi were tested by estimating nonsynonymous–synonymous substitution rate ratios (ω = dN/dS) in all one-to-one and MI ortholog gene sets. Nonsynonymous–synonymous substitution rate ratios of the genes along the tree for these species were estimated by the maximum likelihood-based method developed by Yang (1998), using the CODEML program in the PAML 4.7a software package (Yang 2007). Nonsynonymous–synonymous substitution rate ratios for each tree branch were estimated under the branch-specific models of codon evolution (Yang 1998). Details of the models for ω estimation used in this study are described in the “Results” section. GO enrichment analysis was performed on genes that showed ω > 1 with FDR-adjusted P < 0.05 in the log-likelihood test in akame or barramundi using g:Profiler (Raudvere et al. 2019) with the ENSEMBL barramundi annotation.

Supplementary Material

Supplementary material is available at Genome Biology and Evolution online.

Acknowledgments

We would like to thank the River Fisheries Cooperative Association of Tomishima; the Fisheries Cooperative Association of Down­stream Shimanto River; and the Federation of Fisheries Cooperative Associations of Shimanto River for information on sampling sites and help with fieldwork. We also thank Chikako Tanaka (Department of Biology, Osaka Medical and Pharmaceutical University) for technical assistance. We are grateful to three anonymous referees for helpful comments that improved the manuscript substantially.

Funding

This work was supported by JSPS KAKENHI (grant numbers 17H03629, 18K06368, 18J00928, 20H03012, and 23H02233). This work was also partially supported by the Osaka Medical and Pharmaceutical University (OMPU) Joint Research Grant.

Data Availability

The scaffold-level genome assemblies in akame (Kochi: BRZM01000000, Miyazaki: BAABXA010000000) have been deposited in the DNA Data Bank of Japan (DDBJ) genome database. The Chromium-linked Illumina reads, RNA-Seq reads (Illumina HiSeq), and Nanopore reads utilized for the genome assembly have been deposited to the DDBJ sequence read archive (DRA) under the BioProject accession number PRJDB13763. The chromosome-level assembly and the GFF3 annotation files of akame (Kochi) were deposited in Dryad (URL: https://doi.org/10.5061/dryad.m37pvmdb6).

Literature Cited

Allendorf
FW
,
Luikart
G
,
Aitken
SN
.
Conservation and the genetics of populations
. 2nd ed.
UK
:
John Wiley & Sons, Ltd.
;
2013
.

Alonso
S
,
López
S
,
Izagirre
N
,
de la Rúa
C
.
Overdominance in the human genome and olfactory receptor activity
.
Mol Biol Evol
.
2008
:
25
(
5
):
997
1001
. https://doi.org/10.1093/molbev/msn049.

Barry
P
,
Broquet
T
,
Gagnaire
P-A
.
Age-specific survivorship and fecundity shape genetic diversity in marine fishes
.
Evol Lett
.
2021
:
6
(
1
):
46
62
. https://doi.org/10.1002/evl3.265.

Boetzer
M
,
Pirovano
W
.
SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information
.
BMC Bioinformatics
.
2014
:
15
(
1
):
211
. https://doi.org/10.1186/1471-2105-15-211.

Bradnam
KR
,
Fass
JN
,
Alexandrov
A
,
Baranay
P
,
Bechner
M
,
Birol
I
,
Boisvert
S
,
Chapman
JA
,
Chapuis
G
,
Chikhi
R
, et al.
Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species
.
Gigascience
.
2013
:
2
(
1
):
10
. https://doi.org/10.1186/2047-217X-2-10.

Brandt
DYC
,
César
J
,
Goudet
J
,
Meyer
D
.
The effect of balancing selection on population differentiation: a study with HLA genes
.
G3
.
2018
:
8
(
8
):
2805
2815
. https://doi.org/10.1534/g3.118.200367.

Bravington
MV
,
Grewe
PM
,
Davies
CR
.
Absolute abundance of southern bluefin tuna estimated by close-kin mark-recapture
.
Nat Commun
.
2016
:
7
(
1
):
13162
. https://doi.org/10.1038/ncomms13162.

Brumfield
RT
,
Beerli
P
,
Nickerson
DA
,
Edwards
SV
.
The utility of single nucleotide polymorphisms in inferences of population history
.
Trend Ecol Evol
.
2003
:
18
(
5
):
249
256
. https://doi.org/10.1016/S0169-5347(03)00018-1.

Chakraborty
M
,
VanKuren
NW
,
Zhao
R
,
Zhang
X
,
Kalsow
S
,
Emerson
JJ
.
Hidden genetic variation shapes the structure of functional elements in Drosophila
.
Nat Genet
.
2018
:
50
(
1
):
20
25
. https://doi.org/10.1038/s41588-017-0010-y.

Charlesworth
B
.
Effective population size and patterns of molecular evolution and variation
.
Nat Rev Genet
.
2009
:
10
(
3
):
195
205
. https://doi.org/10.1038/nrg2526.

Chikhi
R
,
Medvedev
P
.
Informed and automated k-mer size selection for genome assembly
.
Bioinformatics
.
2013
:
30
(
1
):
31
37
. https://doi.org/10.1093/bioinformatics/btt310.

Cingolani
P
,
Platts
A
,
Wang
LL
,
Coon
M
,
Nguyen
T
,
Wang
L
,
Land
SJ
,
Lu
X
,
Ruden
DM
.
A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3
.
Fly (Austin)
.
2012
:
6
(
2
):
80
92
. https://doi.org/10.4161/fly.19695.

Crnokrak
P
,
Barrett
SCH
.
Purging the genetic load: a review of the experimental evidence
.
Evolution
.
2002
:
56
(
12
):
2347
2358
. https://doi.org/10.1111/j.0014-3820.2002.tb00160.x.

Croze
M
,
Zivkovic
D
,
Stephan
W
,
Hutter
S
.
Balancing selection on immunity genes: review of the current literature and new analysis in Drosophila melanogaster
.
Zoology
.
2016
:
119
(
4
):
322
329
. https://doi.org/10.1016/j.zool.2016.03.004.

Danecek
P
,
Auton
A
,
Abecasis
G
,
Albers
CA
,
Banks
E
,
DePristo
MA
,
Handsaker
RE
,
Lunter
G
,
Marth
GT
,
Sherry
ST
, et al.
The variant call format and VCFtools
.
Bioinformatics
.
2011
:
27
(
15
):
2156
2158
. https://doi.org/10.1093/bioinformatics/btr330.

de Villemereuil
P
,
Rutschmann
A
,
Lee
KD
,
Ewen
JG
,
Brekke
P
,
Santure
AW
.
Little adaptive potential in a threatened passerine bird
.
Curr Biol
.
2019
:
29
(
5
):
889
894.e3
. https://doi.org/10.1016/j.cub.2019.01.072.

Flynn
JM
,
Hubley
R
,
Goubert
C
,
Rosen
J
,
Clark
AG
,
Feschotte
C
,
Smit
AF
.
RepeatModeler2 for automated genomic discovery of transposable element families
.
Proc Natl Acad Sci U S A
.
2020
:
117
(
17
):
9451
9457
. https://doi.org/10.1073/pnas.1921046117.

Frankham
R
,
Ballou
JD
,
Briscoe
DA
.
Introduction to conservation genetics
.
UK
:
Cambridge University Press
;
2010
.

Ganim
RB
,
Peckol
EL
,
Larkin
J
,
Ruchhoeft
ML
,
Cameron
JS
.
ATP-sensitive K+ channels in cardiac muscle from cold-acclimated goldfish: characterization and altered response to ATP
.
Comp Biochem Physiol
.
1998
:
119
(
1
):
395
401
. https://doi.org/10.1016/S1095-6433(97)00443-1.

García-Alcalde
F
,
Okonechnikov
K
,
Carbonell
J
,
Cruz
LM
,
Götz
S
,
Tarazona
S
,
Dopazo
J
,
Meyer
TF
,
Conesa
A
.
Qualimap: evaluating next-generation sequencing alignment data
.
Bioinformatics
.
2012
:
28
(
20
):
2678
2679
. https://doi.org/10.1093/bioinformatics/bts503.

Gibbons
TC
,
Rudman
SM
,
Schulte
PM
.
Responses to simulated winter conditions differ between threespine stickleback ecotypes
.
Mol Ecol
.
2016
:
25
(
3
):
764
775
. https://doi.org/10.1111/mec.13507.

Gonzalvo
S
,
Kawakami
T
,
Tanoue
H
,
Komatsu
T
.
Estuarine dependency of Lates japonicus in Shimanto Estuary, Japan, inferred from otolith Sr:Ca
.
Estuar Coast Shelf Sci
.
2021
:
252
(
5
):
107269
. https://doi.org/10.1016/j.ecss.2021.107269.

Hawkins
MTR
,
Culligan
RR
,
Frasier
CL
,
Dikow
RB
,
Hagenson
R
,
Lei
R
,
Louis
EE
.
Genome sequence and population declines in the critically endangered greater bamboo lemur (Prolemur simus) and implications for conservation
.
BMC Genomics
.
2018
:
19
(
1
):
445
. https://doi.org/10.1186/s12864-018-4841-4.

Hedrick
PW
.
Pathogen resistance and genetic variation at MHC loci
.
Evolution
.
2002
:
56
(
10
):
1902
1908
. https://doi.org/10.1111/j.0014-3820.2002.tb00116.x.

Himmel
NJ
,
Letcher
JM
,
Sakurai
A
,
Gray
TR
,
Benson
MN
,
Cox
DN
.
Drosophila menthol sensitivity and the Precambrian origins of transient receptor potential-dependent chemosensation
.
Philos Trans R Soc Lond Ser B Biol Sci
.
2019
:
374
(
1785
):
20190369
. https://doi.org/10.1098/rstb.2019.0369.

Hoff
KJ
,
Lomsadze
A
,
Borodovsky
M
,
Stanke
M
.
Whole-genome annotation with BRAKER
.
Methods Mol Biol
.
2019
:
1962
:
65
95
. https://doi.org/10.1007/978-1-4939-9173-0_5.

Ishikawa
A
,
Kabeya
N
,
Ikeya
K
,
Kakioka
R
,
Cech
JN
,
Osada
N
,
Leal
MC
,
Inoue
J
,
Kume
M
,
Toyoda
A
, et al.
A key metabolic gene for recurrent freshwater colonization and radiation in fishes
.
Science
.
2019
:
364
(
6443
):
886
889
. https://doi.org/10.1126/science.aau5656.

Ishikawa
A
,
Yamanouchi
S
,
Iwasaki
W
,
Kitano
J
.
Convergent copy number increase of genes associated with freshwater colonization in fishes
.
R Soc Lond B Biol Sci
.
2022
:
377
(
1855
):
20200509
. https://doi.org/10.1098/rstb.2020.0509.

Iwatsuki
Y
,
Tashiro
K
,
Hamasaki
T
.
Distribution and fluctuations in occurrence of the Japanese centropimid fish, Lates japonicus
.
Jpn J Ichthyol
.
1993
:
40
(
3
):
327
332
.

Jerry
DR
(Ed.).
2013
.
Biology and culture of Asian seabass Lates calcarifer
.
Boca Raton, FL
:
CRC Press
.

Jones
FC
,
Baldwin
J
,
Bloom
T
,
Jaffe
DB
,
Nicol
R
,
Wilkinson
J
,
Lander
ES
,
Palma
FD
,
Lindblad-Toh
K
,
Kingsley
DM
.
The genomic basis of adaptive evolution in threespine sticklebacks
.
Nature
.
2012
:
484
(
7392
):
55
61
. https://doi.org/10.1038/nature10944.

Kardos
M
,
Armstrong
EE
,
Fitzpatrick
SW
,
Hauser
S
,
Hedrick
PW
,
Miller
JM
,
Tallmon
DA
,
Funk
WC
.
The crucial role of genome-wide genetic variation in conservation
.
Proc Natl Acad Sci U S A
.
2021
:
118
(
48
):
e2104642118
. https://doi.org/10.1073/pnas.2104642118.

Katayama
M
,
Taki
Y
.
Lates japonicus, a new centropomid fish from Japan
.
Jpn J Ichthyol
.
1984
:
30
(
4
):
361
367
.

Katoh
K
,
Standley
DM
.
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
.
Mol Biol Evol
.
2013
:
30
(
4
):
772
780
. https://doi.org/10.1093/molbev/mst010.

Keller
O
,
Kollmar
M
,
Stanke
M
,
Waack
S
.
A novel hybrid gene prediction method employing protein multiple sequence alignments
.
Bioinformatics
.
2011
:
27
(
6
):
757
763
. https://doi.org/10.1093/bioinformatics/btr010.

Khan
A
,
Patel
K
,
Shukla
H
,
Viswanathan
A
,
van der Valk
T
,
Borthakur
U
,
Nigam
P
,
Zachariah
A
,
Jhala
YV
,
Kardos
M
, et al.
Genomic evidence for inbreeding depression and purging of deleterious genetic variation in Indian tigers
.
Proc Natl Acad Sci U S A
.
2021
:
118
(
49
):
e2023018118
. https://doi.org/10.1073/pnas.2023018118.

Kim
D
,
Paggi
JM
,
Park
C
,
Bennett
C
,
Salzberg
SL
.
Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype
.
Nat Biotechnol
.
2019
:
37
(
8
):
907
915
. https://doi.org/10.1038/s41587-019-0201-4.

Kinoshita
I
,
Fujita
S
,
Takahashi
I
,
Azuma
K
.
Occurrence of larval and juvenile Japanese snook, Lates japonicus, in the Shimanto estuary, Japan
.
Jpn J Ichthyol
.
1988
:
34
(
4
):
462
467
. https://doi.org/10.1007/BF02905651.

Kolora
SRR
,
Owens
GL
,
Vazquez
JM
,
Stubbs
A
,
Chatla
K
,
Jainese
C
,
Seeto
K
,
McCrea
M
,
Sandel
MW
,
Vianna
JA
, et al.
Origins and evolution of extreme life span in Pacific Ocean rockfishes
.
Science
.
2021
:
374
(
6569
):
842
847
. https://doi.org/10.1126/science.abg5332.

Kurtz
S
,
Phillippy
A
,
Delcher
AL
,
Smoot
M
,
Shumway
M
,
Antonescu
C
,
Salzberg
SL
.
Versatile and open software for comparing large genomes
.
Genome Biol
.
2004
:
5
(
2
):
R12
. https://doi.org/10.1186/gb-2004-5-2-r12.

Laetsch
DR
,
Blaxter
ML
.
BlobTools: interrogation of genome assemblies. [version 1; peer review: 2 approved with reservations]
.
F1000Res
.
2017
:
6
:
1287
. https://doi.org/10.12688/f1000research.12232.1.

Larsen
L-K
,
Pélabon
C
,
Bolstad
GH
,
Viken
Å
,
Fleming
IA
,
Rosenqvist
G
.
Temporal change in inbreeding depression in life-history traits in captive populations of guppy (Poecilia reticulata): evidence for purging?
J Evol Biol
.
2011
:
24
(
4
):
823
834
. https://doi.org/10.1111/j.1420-9101.2010.02224.x.

Larson
H
. Order Perciformes. Suborder Percoidei. Centropomidae. Sea perches. In:
Carpenter
KE
,
Niem
VH
, editors.
FAO species identification guide for fishery purposes. The living marine resources of the Western Central Pacific. Bony fishes part 2 (Mugilidae to Carangidae)
. Vol.
4
.
Rome
:
FAO
;
1999
. p.
2429
2432
.

Li
H
.
A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data
.
Bioinformatics
.
2011
:
27
(
21
):
2987
2993
. https://doi.org/10.1093/bioinformatics/btr509.

Li
H
,
Durbin
R
.
Fast and accurate short read alignment with Burrows–Wheeler transform
.
Bioinformatics
.
2009
:
25
(
14
):
1754
1760
. https://doi.org/10.1093/bioinformatics/btp324.

Li
H
,
Durbin
R
.
Inference of human population history from individual whole-genome sequences
.
Nature
.
2011
:
475
(
7357
):
493
496
. https://doi.org/10.1038/nature10231.

Li
H
,
Handsaker
B
,
Wysoker
A
,
Fennell
T
,
Ruan
J
,
Homer
N
,
Marth
G
,
Abecasis
G
,
Durbin
R
.
The sequence alignment/map (SAM) format and SAMtools
.
Bioinformatics
.
2009
:
25
(
16
):
2078
2079
. https://doi.org/10.1093/bioinformatics/btp352.

Li
S
,
Li
B
,
Cheng
C
,
Xiong
Z
,
Liu
Q
,
Lai
J
,
Carey
HV
,
Zhang
Q
,
Zheng
H
,
Wei
S
, et al.
Genomic signatures of near-extinction and rebirth of the crested ibis and other endangered bird species
.
Genome Biol
.
2014
:
15
(
12
):
557
. https://doi.org/10.1186/s13059-014-0557-1.

Liu
Q
,
Mishra
M
,
Saxena
AS
,
Wu
H
,
Qiu
Y
,
Zhang
X
,
You
X
,
Ding
S
,
Miyamoto
MM
.
Balancing selection maintains ancient polymorphisms at conserved enhancers for the olfactory receptor genes of a Chinese marine fish
.
Mol Ecol
.
2021
:
30
(
16
):
4023
4038
. https://doi.org/10.1111/mec.16016.

Lo
C
,
Chain
PSG
.
Rapid evaluation and quality control of next generation sequencing data with FaQCs
.
BMC Bioinformatics
.
2014
:
15
(
1
):
366
. https://doi.org/10.1186/s12859-014-0366-2.

Lomsadze
A
,
Burns
PD
,
Borodovsky
M
.
Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm
.
Nucleic Acid Res
.
2014
:
42
(
15
):
e119
. https://doi.org/10.1093/nar/gku557.

Mathur
S
,
Mason
AJ
,
Bradburd
GS
,
Gibbs
HL
.
Functional genomic diversity is correlated with neutral genomic diversity in populations of an endangered rattlesnake
.
Proc Natl Acad Sci U S A
.
2023
:
120
(
43
):
e2303043120
. https://doi.org/10.1073/pnas.2303043120.

McKemy
DD
,
Neuhausser
WM
,
David Julius
D
.
Identification of a cold receptor reveals a general role for TRP channels in thermosensation
.
Nature
.
2002
:
416
(
6876
):
52
58
. https://doi.org/10.1038/nature719.

McKenna
A
,
Hanna
M
,
Banks
E
,
Sivachenko
A
,
Cibulskis
K
,
Kernytsky
A
,
Garimella
K
,
Altshuler
D
,
Gabriel
S
,
Daly
M
, et al.
The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data
.
Genome Res
.
2010
:
20
(
9
):
1297
1303
. https://doi.org/10.1101/gr.107524.110.

Mirchandani
CD
,
Shultz
AJ
,
Thomas
GWC
,
Smith
SJ
,
Baylis
M
,
Arnold
B
,
Corbett-Detig
R
,
Enbody
E
,
Sackton
TB
.
A fast, reproducible, high-throughput variant calling workflow for population genomics
.
Mol Biol Evol
.
2024
:
41
(
1
):
msad270
. https://doi.org/10.1093/molbev/msad270.

Morishita
H
,
Yagi
T
.
Protocadherin family: diversity, structure, and function
.
Curr Opin Cell Biol
.
2007
:
19
(
5
):
584
592
. https://doi.org/10.1016/j.ceb.2007.09.006.

Naito
T
,
Nakayama
K
,
Takeshima
H
,
Hashiguchi
Y
,
Akita
T
,
Yamasaki
YY
,
Mishina
T
,
Takeshita
N
,
Nagano
AJ
,
Takahashi
H
.
The detailed population genetic structure of the rare endangered latid fish akame Lates japonicus with extremely low genetic diversity revealed from single-nucleotide polymorphisms
.
Conserv Genet
.
2023
:
24
(
4
):
523
535
. https://doi.org/10.1007/s10592-023-01517-2.

Nelson
JS
,
Grande
TC
,
Wilson
MVH
.
Fishes of the world
. 5th ed.
Hoboken, NJ
:
John Wiley & Sons Inc.
;
2016
.

Noonan
JP
,
Li
J
,
Nguyen
L
,
Caoile
C
,
Dickson
M
,
Grimwood
J
,
Schmutz
J
,
Feldman
MW
,
Myers
RM
.
Extensive linkage disequilibrium, a common 16.7-kilobase deletion, and evidence of balancing selection in the human protocadherin α cluster
.
Am J Hum Genet
.
2003
:
72
(
3
):
621
635
. https://doi.org/10.1086/368060.

Olender
T
,
Waszak
SM
,
Viavant
M
,
Khen
M
,
Ben-Asher
E
,
Reyes
A
,
Nativ
N
,
Wysocki
CJ
,
Ge
D
,
Lancet
D
.
Personal receptor repertoires: olfaction as a model
.
BMC Genomics
.
2012
:
13
(
1
):
414
. https://doi.org/10.1186/1471-2164-13-414.

Pan
H
,
Yu
H
,
Ravi
V
,
Li
C
,
Lee
AP
,
Lian
MM
,
Tay
B-H
,
Brenner
S
,
Wang
J
,
Yang
H
, et al.
The genome of the largest bony fish, ocean sunfish (Mola mola), provides insights into its fast growth rate
.
GigaScience
.
2016
:
5
(
1
):
36
. https://doi.org/10.1186/s13742-016-0144-3.

Peier
AM
,
Moqrich
A
,
Hergarden
AC
,
Reeve
AJ
,
Andersson
DA
,
Story
GM
,
Earley
TJ
,
Dragoni
I
,
McIntyre
P
,
Bevan
S
, et al.
A TRP channel that senses cold stimuli and menthol
.
Cell
.
2002
:
108
(
5
):
705
715
. https://doi.org/10.1016/S0092-8674(02)00652-9.

Pockrandt
C
,
Alzamel
M
,
Iliopoulos
CS
,
Reinert
K
.
GenMap: ultra-fast computation of genome mappability
.
Bioinformatics
.
2020
:
36
(
12
):
3687
3692
. https://doi.org/10.1093/bioinformatics/btaa222.

Pond
SLK
,
Frost
SDW
,
Muse
SV
.
Hyphy: hypothesis testing using phylogenies
.
Bioinformatics
.
2005
:
21
(
5
):
676
679
. https://doi.org/10.1093/bioinformatics/bti079.

Quinlan
AR
,
Hall
IM
.
BEDTools: a flexible suite of utilities for comparing genomic features
.
Bioinformatics
.
2010
:
26
(
6
):
841
842
. https://doi.org/10.1093/bioinformatics/btq033.

Raudvere
U
,
Kolberg
L
,
Kuzmin
I
,
Arak
T
,
Adler
P
,
Peterson
H
,
Vilo
J
.
G:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update)
.
Nucleic Acid Res
.
2019
:
47
(
W1
):
W191
W198
. https://doi.org/10.1093/nar/gkz369.

Robinson
JA
,
Brown
C
,
Kim
BY
,
Lohmueller
KE
,
Wayne
RK
.
Purging of strongly deleterious mutations explains long-term persistence and absence of inbreeding depression in island foxes
.
Curr Biol
.
2018
:
28
(
21
):
3487
3494
. https://doi.org/10.1016/j.cub.2018.08.066.

Robinson
JA
,
Ortega-Del Vecchyo
D
,
Fan
Z
,
Kim
BY
,
vonHoldt
BM
,
Marsden
CD
,
Lohmueller
KE
,
Wayne
RK
.
Genetic flatlining in the endangered island fox
.
Curr Biol
.
2016
:
26
(
9
):
1183
1189
. https://doi.org/10.1016/j.cub.2016.02.062.

Ruzzante
DE
,
McCracken
GR
,
Førland
B
,
MacMillan
J
,
Notte
D
,
Buhariwalla
C
,
Mills Flemming
J
,
Skaug
H
.
Validation of close-kin mark–recapture (CKMR) methods for estimating population abundance
.
Method Ecol Evol
.
2019
:
10
(
9
):
1445
1453
. https://doi.org/10.1111/2041-210X.13243.

Sedlazeck
FJ
,
Rescheneder
P
,
von Haeseler
A
.
NextGenMap: fast and accurate read mapping in highly polymorphic genomes
.
Bioinformatics
.
2013
:
29
(
21
):
2790
2791
. https://doi.org/10.1093/bioinformatics/btt468.

Sherman
BT
,
Hao
M
,
Qiu
J
,
Jiao
X
,
Baseler
MW
,
Lane
HC
,
Imamichi
T
,
Chang
W
.
DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update)
.
Nucl Acid Res
.
2022
:
50
(
W1
):
W216
W221
. https://doi.org/10.1093/nar/gkac194.

Simão
FA
,
Waterhouse
RM
,
Ioannidis
P
,
Kriventseva
EV
,
Zdobnov
EM
.
BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs
.
Bioinformatics
.
2015
:
31
(
19
):
3210
3212
. https://doi.org/10.1093/bioinformatics/btv351.

Smith
SA
,
Dunn
CW
.
Phyutility: a phyloinformatics tool for trees, alignments, and molecular data
.
Bioinformatics
.
2008
:
24
(
5
):
715
716
. https://doi.org/10.1093/bioinformatics/btm619.

Song
L
,
Shankar
D
,
Florea
L
.
Rascaf: improving genome assembly with RNA-Seq data
.
Plant Genome
.
2016
:
9
(
3
). https://doi.org/10.3835/plantgenome2016.03.0027.

Spurgin
LG
,
Richardson
DS
.
How pathogens drive genetic diversity: MHC, mechanisms and misunderstandings
.
Proc Biol Sci
.
2010
:
277
(
1684
):
979
988
. https://doi.org/10.1098/rspb.2009.2084.

Stamataxis
A
.
RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
.
Bioinformatics
.
2014
:
30
(
9
):
1312
1313
. https://doi.org/10.1093/bioinformatics/btu033.

Takahashi
H
,
Takeshita
N
,
Tanoue
H
,
Ueda
S
,
Takeshima
H
,
Komatsu
T
,
Kinoshita
I
,
Nishida
M
.
Severely depleted genetic diversity and population structure of a large predatory marine fish (Lates japonicus) endemic to Japan
.
Conserv Genet
.
2015
:
16
(
5
):
1155
1165
. https://doi.org/10.1007/s10592-015-0729-x.

Teixeira
JC
,
Huber
CD
.
The inflated significance of neutral genetic diversity in conservation genetics
.
Proc Natl Acad Sci U S A
.
2021
:
118
(
10
):
e2015096118
. https://doi.org/10.1073/pnas.2015096118.

van der Valk
T
,
Diez-del-Molino
D
,
Marques-Bonet
T
,
Guschanski
K
,
Dalén
L
.
Historical genomes reveal the genomic consequences of recent population decline in Eastern gorillas
.
Curr Biol
.
2019
:
29
(
1
):
165
170
. https://doi.org/10.1016/j.cub.2018.11.055.

Vaser
R
,
Adusumalli
S
,
Leng
SN
,
Sikic
M
,
Ng
PC
.
SIFT missense predictions for genomes
.
Nat Protoc
.
2016
:
11
(
1
):
1
9
. https://doi.org/10.1038/nprot.2015.123.

Vergeer
P
,
Wagemaker
NCAM
,
Ouborg
NJ
.
Evidence for an epigenetic role in inbreeding depression
.
Biol Lett
.
2012
:
8
(
5
):
798
801
. https://doi.org/10.1098/rsbl.2012.0494.

Vij
S
,
Kuhl
H
,
Kuznetsova
IS
,
Komissarov
A
,
Yurchenko
AA
,
Van Heusden
P
,
Singh
S
,
Thevasagayam
NM
,
Prakki
SRS
,
Purushothaman
K
, et al.
Chromosomal-level assembly of the Asian seabass genome using long sequence reads and multi-layered scaffolding
.
PLoS Genet
.
2016
:
12
(
4
):
e1005954
. https://doi.org/10.1371/journal.pgen.1005954.

Walker
BJ
,
Abeel
T
,
Shea
T
,
Priest
M
,
Abouelliel
A
,
Sakthikumar
S
,
Cuomo
CA
,
Zeng
Q
,
Wortman
J
,
Young
SK
, et al.
Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement
.
PLoS One
.
2014
:
9
(
11
):
e112963
. https://doi.org/10.1371/journal.pone.0112963.

Wang
L
,
Wan
ZY
,
Lim
HS
,
Yue
GH
.
Genetic variability, local selection and demographic history: genomic evidence of evolving towards allopatric speciation in Asian seabass
.
Mol Ecol
.
2016
:
25
(
15
):
3605
3621
. https://doi.org/10.1111/mec.13714.

Weisenfeld
NI
,
Kumar
V
,
Shah
P
,
Church
DM
,
Jaffe
DB
.
Direct determination of diploid genome sequences
.
Genome Res
.
2017
:
27
(
5
):
757
767
. https://doi.org/10.1101/gr.214874.116.

Wertheim
JO
,
Murrell
B
,
Smith
MD
,
Pond
SLK
,
Scheffler
K
.
RELAX: detecting relaxed selection in a phylogenetic framework
.
Mol Biol Evol
.
2015
:
32
(
3
):
820
832
. https://doi.org/10.1093/molbev/msu400.

Wilder
AP
,
Supple
MA
,
Subramanian
A
,
Mudide
A
,
Swofford
R
,
Serres-Armero
A
,
Steiner
C
,
Koepfli
K-P
,
Genereux
DP
,
Karlsson
EK
, et al.
The contribution of historical processes to contemporary extinction risk in placental mammals
.
Science
.
2023
:
380
(
6643
):
eabn5856
. https://doi.org/10.1126/science.abn5856.

Wu
C
,
Zhang
D
,
Kan
M
,
Lv
Z
,
Zhu
A
,
Su
Y
,
Zhou
D
,
Zhang
J
,
Zhang
Z
,
Xu
M
, et al.
The draft genome of the large yellow croaker reveals well-developed innate immunity
.
Nat Commun
.
2014
:
5
(
1
):
5227
. https://doi.org/10.1038/ncomms6227.

Xie
H-X
,
Liang
X-X
,
Chen
Z-Q
,
Li
W-M
,
Mi
C-R
,
Li
M
,
Wu
Z-J
,
Zhou
X-M
,
Du
W-G
.
Ancient demographics determine the effectiveness of genetic purging in endangered lizards
.
Mol Biol Evol
.
2022
:
39
(
1
):
msab359
. https://doi.org/10.1093/molbev/msab359.

Xu
G-C
,
Xu
T-J
,
Zhu
R
,
Zhang
Y
,
Li
S-Q
,
Wang
H-W
,
Li
J-T
.
LR_Gapcloser: a tiling path-based gap closer that uses long reads to complete genome assembly
.
Gigascience
.
2019
:
8
(
1
):
giy157
. https://doi.org/10.1093/gigascience/giy157.

Xue
Y
,
Prado-Martinez
J
,
Sudmant
PH
,
Narasimhan
V
,
Ayub
Q
,
Szpak
M
,
Frandsen
P
,
Chen
Y
,
Yngvadottir
B
,
Cooper
DN
, et al.
Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding
.
Science
.
2015
:
348
(
6231
):
242
245
. https://doi.org/10.1126/science.aaa3952.

Xue
D
,
Wang
G
,
Yan-lia
S
,
Min
Z
,
Yong-hua
H
.
Black rockfish C-type lectin, SsCTL4: a pattern recognition receptor that promotes bactericidal activity and virus escape from host immune defense
.
Fish Shellfish Immunol
.
2018
:
79
:
340
350
. https://doi.org/10.1016/j.fsi.2018.05.033.

Yamaguchi
T
,
Dijkstra
JM
.
Major histocompatibility complex (MHC) genes and disease resistance in fish
.
Cells
.
2019
:
8
(
4
):
378
. https://doi.org/10.3390/cells8040378.

Yang
Z
.
Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution
.
Mol Biol Evol
.
1998
:
15
(
5
):
568
573
. https://doi.org/10.1093/oxfordjournals.molbev.a025957.

Yang
Z
.
PAML 4: a program package for phylogenetic analysis by maximum likelihood
.
Mol Biol Evol
.
2007
:
24
(
8
):
1586
1591
. https://doi.org/10.1093/molbev/msm088.

Yang
Y
,
Smith
SA
.
Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics
.
Mol Biol Evol
.
2014
:
31
(
11
):
3081
3092
. https://doi.org/10.1093/molbev/msu245.

Author notes

Conflict of Interest The authors declare that there is no conflict of interest.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Bonnie Fraser
Bonnie Fraser
Associate Editor
Search for other works by this author on:

Supplementary data