Genomic analysis reveals multiple mismatches between biological and management units in yellowfin tuna (Thunnus albacares)


 The South African (SAF) yellowfin tuna (Thunnus albacares) fishery represents a potential example of misalignment between management units and biological processes. The SAF fishery spans an operational stock with a boundary at 20°E, either side of which fish are considered part of Atlantic or Indian Ocean regional stocks. However, the actual recruitment of fish from Atlantic and Indian Ocean spawning populations into SAF waters is unknown. To address this knowledge gap, genomic analysis (11 101 SNPs) was performed on samples from Atlantic and Indian Ocean spawning sites, including SAF sites spanning the current stock boundary. Outlier loci conferred high discriminatory power to assignment tests and revealed that all SAF fish were assigned to the Indian Ocean population and that no Atlantic Ocean fish appeared in the SAF samples. Additionally, several Indian Ocean migrants were detected at the Atlantic spawning site demonstrating asymmetric dispersal and the occurrence of a mixed-stock fishery in Atlantic waters. This study highlights both the spatial inaccuracy of current stock designations and a misunderstanding of interactions between the underlying biological units, which must be addressed in light of local and global declines of the species. Specifically, the entire SAF fishery must be managed as part of the Indian Ocean stock.

Genomic analysis reveals multiple mismatches between biological and management units in yellowfin tuna (Thunnus albacares)

Introduction
The worldwide depletion of fish communities (Myers and Worm, 2005) with evidence of fishery induced economic (Botsford et al., 1997) and biological extinctions (Jackson et al., 2001) highlights the importance of identifying biologically distinct units within marine fishes for both sustainable management and conservation of marine biodiversity (Ruzzante, 2006). The ability to monitor the dynamics of such components within systems involving seasonal migration and potential spatial overlap is beneficial, as more easily exploited and/or less productive units may be susceptible to overharvesting, contributing to loss of diversity and adaptive potential (Iles and Sinclair, 1982) and producing negative effects on recruitment potential and population/fishery viability (Ryman et al., 1995). Recent advances in population genomic methods offer considerable potential as tools to meet the many challenges associated with sustainable fishery management (Allendorf et al., 2010). However, this potential is yet to be fully harnessed for a variety of reasons. The integration of genomics data and fishery management is particularly hampered in developing countries where threats to fishery sustainability may be most concentrated (Waples et al., 2008;Willette et al., 2014;Bernatchez et al., 2017).
Yellowfin tuna, Thunnus albacares is globally distributed in tropical waters and supports fishery stocks that extend across V C International Council for the Exploration of the Sea 2018. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/ licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
ICES Journal of Marine Science (2018), doi:10.1093/icesjms/fsy102 boundaries of national exclusive economic zones (EEZs) and into the high seas. The species accounts for the second largest worldwide catch of tuna and tuna-like species (after skipjack tuna Katsuwonus pelamis) in terms of catch weight and volume (Miyake et al., 2010;Juan-Jordá et al., 2013;FAO, 2017), with an average annual catch of $1.25 million metric tons over the past decade (Pecoraro et al., 2017). At present, four regional tuna management units are described, Atlantic, Indian, Eastern Pacific, and West Central Pacific, and each is managed as a single stock by the respective tuna Regional Fisheries Management Organisations (tRFMOs) (Pecoraro et al., 2017). Recent genomic studies have confirmed genetic differences and restricted interbreeding among regional groups (Pecoraro et al., 2016;Barth et al., 2017). As previous studies have been macrogeographic in scale, there is still considerable uncertainty over the fine-scale boundaries and interactions among genetic groups and their congruence with current operational stock boundaries. Mismatch between biological and management units is recognized as a major threat to global fishery sustainability (Reiss et al., 2009), and clarifying any such inconsistencies for yellowfin tuna is an urgent consideration as the management units are currently described as fully exploited and may even be overexploited (Majkowski, 2007), with the species being described as "near threatened" by the IUCN (Collette et al., 2011).
The South African (SAF) yellowfin tuna fishery provides a striking case where management units may be incongruent with biological population processes of dispersal, interbreeding and adaptation. The present management boundary between the Atlantic and Indian Ocean operational stocks lies off South Africa at 20 E (FAO, 2017), and the two stocks are assessed and managed by the International Commission for the Conservation of Atlantic Tunas (ICCAT) and the Indian Ocean Tuna Commission (IOTC), respectively ( Figure 1). Yellowfin tuna catches west of the 20 E are reported to ICCAT for inclusion in the Atlantic Ocean stock assessment, whereas catches east of this line are reported to the IOTC. However, this boundary is not based on any recognized biogeographic boundary or speciesspecific biological information but is based on a geographic feature, the southernmost tip of Africa at Cape Agulhas at the confluence of the Atlantic and Indian Oceans. Yellowfin tuna do not spawn off the southern coast of South Africa, as the local environment does not provide conditions for optimal survival of early life stages (Pecoraro et al., 2016), and thus adults occurring there represent allochthonous feeding migrants. Barth et al. (2017) suggested that the Benguela current system (BCS) along southwest African coast may be a barrier to dispersal of individuals from Atlantic spawning areas north of the BCS into SAF waters. It could therefore be hypothesized that the SAF fishery is being sustained solely by individuals from Indian Ocean spawning population(s). If so, the accuracy of current stock monitoring methods is fundamentally compromised. These inaccuracies need to be urgently resolved, because the Standing Committee on Research and Statistics of ICCAT has reported the Atlantic stock as over-fished (ICCAT, 2017), and the Scientific Committee of the IOTC has declared the Indian Ocean stock as overfished and has proposed an interim rebuilding plan (IOTC, 2016).
The present research aimed to build on and extend previous studies of yellowfin tuna by using RAD-Seq genotyping to assess the membership of yellowfin tuna adults sampled in SAF waters, from sites spanning the operational stock boundary, to Atlantic and Indian Ocean genetic stocks. A specific hypothesis being tested was that, in line with current management unit delineation, Western Cape and Southern/Eastern Cape yellowfin are derived from distinct Atlantic and Indian Ocean spawning populations, respectively. This study reports robust genetic patterns that both corroborate previous studies and reveal new aspects of yellowfin biocomplexity that collectively highlight both the spatial inaccuracy of current stock designations and a misunderstanding of dispersal patterns and population boundaries. These discrepancies prevent the accurate assessment of population productivity and dynamics, and so undermine the effectiveness of management actions that may compromise resilience and sustainability of the resource.

Material and methods
Sample collection, DNA isolation, and species verification Our sampling strategy (Table 1 and Figure 1) was devised to include fish off the Western (Atlantic Ocean waters) and Eastern (Indian Ocean waters) Cape Provinces, as well as fish from putative spawning areas of regional stocks in the Atlantic Ocean [Gulf of Guinea (CG), ICCAT, 2016 and Indian Ocean (KwaZulu-Natal), IOTC, 2014]. Fin clips from individual yellowfin tuna were fixed in 95% ethanol. DNA was extracted from fin clips using a phenol/chloroform/isoamyl alcohol (PCIA) method following Winnepenninckx et al. (1993). A segment of the mitochondrial Control Region was amplified with the polymerase chain reaction (PCR) and sequenced in both directions using the species-specific primers (5 0 -TCCTACCCCTAACTCCCAAAG-3 0 ; and reverse primer: 5 0 -AAACTGTGGGGATTCTCAC-3 0 ). Sequences were used to confirm species identity using the BLAST function in GenBank. MtDNA summary statistics and estimates of inter-sample differentiation were calculated following McKeown et al. (2015).

SNP genotyping
SNP genotyping by sequencing was performed using tunable Genotyping By Sequencing (tGBS), a modified version of RAD- Seq that incorporates digestion with two enzymes for genome reduction and results in an increased number of reads per site (Ott et al., 2017). The tGBS libraries were sequenced on a Life Technologies' Ion Proton System. Sequenced reads were analysed using a custom Perl script (available at https://github.com/orgs/ schnablelab; accessed 20 July 2018), which assigned each read to a sample and removed barcode sequences. Seqclean (sourceforge.net/projects/seqclean) was then used to remove proton adaptor sequences and chimaeric reads harbouring internal restriction enzyme sites. Retained reads were subjected to quality trimming in two phases using the software Lucy2 (Li and Chou, 2004) in which bases with PHRED scores <15 (of 40) were removed. In the first phase, sequences were scanned at each end; whereas in the second phase, sequences were scanned using overlapping 10 bp windows. Quality trimmed sequence reads were aligned to the reference genome of Pacific bluefin tuna, Thunnus orientalis (Nakamura et al., 2013), using GSNAP (Wu and Nacu, 2010). Sequence alignments were then scanned for polymorphisms. A SNP was called homozygous if at least 15 reads supported the genotype at the site and at least 90% of all reads covering that site shared the same nucleotide. A SNP was considered heterozygous if each of the two nucleotide variants were reported at least 10 times, and each allele was represented in >30% of the total reads.

SNP summary statistics and outlier analysis
Allele frequencies and observed (H O ) and expected (H E ) heterozygosities were estimated using ARLEQUIN 3.4.2.2 (Excoffier and Lischer, 2010). ARLEQUIN was also used to test for departures from expectations of Hardy-Weinberg Equilibrium (HWE). There are numerous challenges to detect SNP "outlier" loci that are putatively under selection (Narum and Hess, 2011). Even though the concept of balancing selection is well established, there remain methodological limitations for its identification in hitchhiking mapping (Hansen et al., 2010). Therefore, we restricted our analysis to the detection of signals of directional selection using two conceptually different approaches. First, we used the FDIST2 outlier detection method (Beaumont and Nichols, 1996) implemented in LOSITAN (Antao et al., 2008). We used the "force mean F ST " option to obtain the genome-wide mean (neutral) F ST value, to reduce the rate of false-positive detection of outliers. We used a null model distribution of the mean F ST as a function of H E under an island model of population subdivision using 50 000 simulations. Positive outlier SNPs were identified as those falling above the 99% confidence intervals of the null distribution. Second, the Bayesian approach implemented in BAYESCAN 2.1 (Foll and Gaggiotti, 2008) was performed using default Markov Chain Monte Carlo (MCMC) parameters. Following the suggestion of Foll and Gaggiotti, (2008) to minimize false positives, we used the "decisive" criterion under Jeffreys' scale of evidence (Jeffreys, 1961) to identify outliers. Loci were considered to be under directional selection if identified as outliers using both methods, as the detection of outliers through multiple methods increases the confidence that these loci are non-neutral. The results between global and pairwise sample tests were compared to confirm consistency of outlier designations. Functional significance of loci was investigated using BLAST in GenBank following Milano et al. (2014).

Geographical structuring of genetic variation
Genetic differentiation among samples was quantified by global and pairwise F ST (Weir and Cockerham, 1984) with statistical significances evaluated in ARELQUIN with 10 000 permutations. Genetic structuring was also investigated using the Bayesian clustering method in STRUCTURE 2.3.4 (Pritchard et al., 2000), which identifies the most probable number of genetically distinct groups (K) represented by the data and estimates assignment probabilities (Q) for each individual (specifically their genomic components) to these groups. The analysis can be run without prior information; however, incorporating prior information using the LOCPRIOR model allows the clustering algorithm to assume that the probability of assignment varies among samples thereby increasing power while not biasing results (Hubisz et al., 2009). The analysis was performed with and without the LOCPRIOR model, in both cases assuming admixture. Simulations were run 10 times for each proposed value of K (1-5) to assess convergence. Each run had a burn-in of 100 000 MCMC samples followed by 1 000 000 MCMC repetitions. Models were assessed using DK (Evanno et al., 2005). To complement the STRUCTURE analysis, classical assignment tests were performed in GENECLASS 2.0 (Piry et al., 2004). The "detect migrants" function was used to identify first generation migrants born in a population other than the one in which they were sampled Piry et al., 2004). The test statistic L h was used following Paetkau et al. (2004) and employing Nei's D A (Nei et al., 1983), which does not require conformance to HWE. The simulation method in GENECLASS (10 000 simulations) was also used to exclude individuals from a candidate population if that individual's probability of assignment to that population fell below a 0.05 threshold. Genetic structuring was also assessed using the discriminant analysis of principal components (DAPC) implemented in ADEGENET (Jombart et al., 2010). Whereas STRUCTURE assigns cluster membership by minimizing Hardy-Weinberg and linkage disequilibria within clusters, DAPC employs fewer assumptions and simply maximizes differences between groups while minimizing differences within groups with the optimal number of groups identified using the Bayesian Information Criterion (BIC).

MTDNA variation
Alignment of 633 bp sequences across 93 individuals revealed 137 variable sites that defined 91 haplotypes, of which 89 were found in a single individual. BLAST confirmed that all sequences were yellowfin tuna. Haplotype diversity was high (h ¼ 0.999) across samples (Table 1). Global and pairwise tests of differentiation among samples yielded non-significant values (global

Nuclear variation
A total of 179 521 855 raw sequenced reads across 96 yellowfin tuna were obtained with between 56 871 and 10 442 486 raw reads (average ¼ 1 870 018) per individual. Seven individuals from the WC sample with >60% of identified SNPs missing were removed from the dataset, resulting in a dataset of $30 000 SNP loci for 89 individuals. The number of genotypes was further reduced by removing SNP loci that were missing in >10% of individuals; this resulted in a final SNP genotype dataset of 11 101 SNP loci for 89 fish with a mean of 10 495 (SD ¼ 398) SNPs per individual. Levels of genetic variability were similar across samples, and all samples conformed to HWE (Table 1). Global genetic structuring was weak (F ST ¼ 0.002) but statistically significant (p < 0.0001). Pairwise tests revealed this significant structuring largely resulted from the divergent GG sample, which showed significant F ST values with all other samples ( Table 2). All other pairwise F ST values were not significant, and excluding GG global structuring was not significant (F ST ¼ À0.0002).
The LOSITAN outlier analyses consistently reported a greater number of positive outlier loci than the BAYESCAN analysis in the various global and pairwise tests. For example, in the initial global outlier test LOSITAN identified 756 positive outliers compared with 11 by BAYESCAN. In all cases, outliers identified by BAYESCAN were also among the outliers identified by LOSITAN. Therefore, we restricted our discussion to the smaller number of outliers identified using BAYESCAN, as the consensus outliers were ultimately defined by the BAYESCAN results. The results of outlier tests revealed a clear pattern that aligned with the differentiation of GG from the remaining samples. Firstly, no outliers were identified in global tests excluding the GG sample compared with the detection of 11 in global test including GG. Second, in tests between pairs of samples outliers were only identified in comparisons involving GG. All outliers identified in pairwise tests were among the 11 identified initially in the global test and certain outlier were common across different pairwise comparisons. BLAST analysis did not provide insight into potential functional associations of any of the outlier loci.
When outlier loci were excluded, the pattern of significant differentiation of the GG sample from all other (non-differentiated) samples was still evident in F ST values (Table 2; Figure 2a) and PCA clustering (Figure 2b). Analysis of the 11 outlier loci reported considerably higher levels of differentiation of the GG sample while retaining the pattern of homogeneity among the remaining samples (Table 2; Figure 3a and b). Bayesian clustering analysis of outlier genotypes unanimously supported a model of K ¼ 2, with identical results for the analyses with and without LOCPRIOR. Under K ¼ 2, all non-GG fish strongly assigned to group B with high individual Q values, whereas the majority of GG fish had higher or intermediate (overlapping confidence  intervals for Q values) assignment probabilities of belonging to group A (Figure 4). Four individuals in the GG sample had larger Q membership values for group B. On the basis of the lack of differentiation among the SAF/Indian Ocean samples, the samples were pooled for the classical assignment tests between GG and the remaining samples. Assignment tests identified the same four GG individuals that had higher group B Q values as being first generation migrants from the pooled SAF/Indian Ocean group. No other migrants were detected with all individuals correctly assigned to the CG or Pooled group. Exclusion analysis using the strict 0.05 threshold further highlighted the genetic differentiation between both groups; first, of 16 non-migrants in the GG sample, 13 could be excluded from the Atlantic group, second, 42 of 69 non-migrants in the Atlantic/Indian ocean group could be excluded from GG.

Discussion
Mismatches between fishery management units and biological population processes of dispersal, interbreeding and adaptation are recognized as fundamental threats to fishery sustainability and to long-term resilience to climate change (Reiss et al., 2009). In light of global declines in yellowfin tuna numbers and continued intensive harvesting, a primary objective of our study was to assess the biological validity of the current Atlantic Ocean and Indian Ocean operational stocks through genomic analysis of fish spanning the current stock boundary off the south coast of South Africa. The salient findings were (i) a lack of population structuring among the SAF samples, and (ii) genetic differentiation of the GG population from the SAF population. These patterns were evident in F ST estimates from the entire SNP and neutral SNP data sets. A major hindrance to the integration of genetics into fishery management has been the inability of traditional F ST -based measures to infer short-term demographics because these measures reflect connectivity over multiple generations (Whitlock and McCauley, 1999;Hellberg, 2009;Berry et al., 2012). Additionally, divergence is expected to be slow for large populations experiencing little genetic drift. Increasingly, individual-based kinship and assignment tests are being used to directly assess dispersal (Christie et al., 2010). Combining assignment tests with outlier loci can provide high discriminatory power even with low levels of background differentiation among populations (Helyar et al., 2011;Bekkevold et al., 2015). This approach revealed two genetic clusters in which all SAF specimens were assigned to one cluster with the other comprising only GG fish. Collectively the neutral and non-neutral genetic data concurrently resolved two genetic groups.
On the basis of our sampling scheme, it can be assumed that the two genetic groups (GG vs. the rest) correspond respectively to the distinct Atlantic and Indian ocean genetic stocks previously suggested by allozyme variation (Ward et al., 1997) and confirmed by genomic studies (Pecoraro et al., 2016;Barth et al., 2017). The results therefore demonstrate that Western Cape yellowfin tuna, currently assigned to the Atlantic operational stock, are derived from an Indian Ocean genetic stock. Tagging data corroborate the movement of yellowfin tuna individuals from the western equatorial Indian Ocean to the southern Benguela region off western South Africa Murua et al., 2015). The concordant results among the present and previous genetic studies and the temporal sampling range represented point to a temporally stable pattern in which the Atlantic population is not contributing to recruitment in SAF waters. Nevertheless, it cannot be ruled out that there may be seasonal variation in stock mixing or overlap.
The identification of isolating mechanisms is a crucial facet for inferring demographic independence with a central topic in fisheries genetics being the relative roles of environmental forcing and behaviour (Heath et al., 2008;McKeown et al., 2017). The spatial genetic structure reported here and in Barth et al. (2017) is compatible with a role for the cool upwelled waters of the BCS as a potential barrier to dispersal of Atlantic yellowfin tuna into SAF waters. Cold water barriers have been reported for several marine  Genomic analysis of yellowfin tuna reveals multiple mismatches between biological and management units fishes (Gwilliam et al., 2018). Big eye tuna, T. obesus, is an interesting comparison because genetically distinct Atlantic and Indian Ocean fish mix in SAF waters. Yellowfin and bigeye tuna have similar geographical distributions but occupy different ecological niches and vertical habitats (Schaefer et al., 2009); bigeye tuna are more tolerant of cooler water temperatures than are yellowfin tuna (Brill and Lutcavage, 2001). Hence, the cold water of the BCS may have less influence on the movement of bigeye tuna from Atlantic Ocean waters north of the BCS into SAF waters, than on yellowfin tuna. Barth et al. (2017) reported evidence of asymmetric Indo-Pacific-to-Atlantic gene flow across the BCS in yellowfin tuna, and Reid et al. (2016) described a similar pattern in the cosmopolitan bluefish (Pomatous saltatrix). In the present study, the Bayesian clustering analysis revealed patterns compatible with unidirectional gene flow from South Africa to GG. However, such gene flow is typically regarded as being historical and no studies have yet revealed recurrent dispersal of adults across the BCS in species for which the BCS is a barrier to contemporary gene flow. The multilocus identification of four Indian Ocean yellowfin tuna among GG fish is striking as it shows that Indian Ocean spawned yellowfin tuna are crossing the BCS and are mixing with members of the Atlantic genetic stock. These dispersals may be linked to Agulhas warm-core rings (Schouten et al., 2000) that can facilitate the transfer of larvae/adults across the BCS and/ or to active swimming. In the latter case, unidirectional dispersals may be facilitated by distinct genetic/plastic thermal tolerances of Indian Ocean spawned fish. This suggests that a cryptic mixedstock fishery exists north of the BCS. Temporal studies are needed to assess the consistency with which Indian Ocean migrants cross the BCS.
The maintenance of genetic integrity of the Atlantic Ocean stock in the face of potentially high rates of dispersal of Indian Ocean genotypes implicates additional mechanisms that may be isolating both stocks. The capacity for active dispersal of adults highlights the potential importance of natal homing (Gaggiotti et al., 2009;McKeown et al., 2017). Indian Ocean fish may migrate into Atlantic spawning areas but return to the Indian Ocean to spawn. If this is occurring samples collected during spawning times should contain fewer migrants than at other times. In this study the GG sample was collected outside of the spawning period and so comparative analysis of a spawning sample from this area could be highly informative regarding the role of homing. The two genetic stocks may also be isolated by adaptation (Nosil et al., 2009) with migrant offspring being selected against (Hendry, 2004). The increased level of genetic divergence between groups at outlier loci might be explained by directional selection, with the resolved groups corresponding to differentially adapted units (Milano et al., 2014). Alternatively, outlier loci may reflect endogenous forces stemming from pre-and/or post-zygotic incompatibilities between populations with different genetic backgrounds to produce intrinsic incompatibilities, rather than reflecting adaptive environmental selection through genotypeenvironment associations (Bierne et al., 2011). However, adaptation and intrinsic incompatibilities are not mutually exclusive. Along with natal homing, the identification of slowly changing isolating factors is important, as these factors may lead to resistance to hybridization, colonization of new habitats, and by extension, recovery of stocks (e.g. Svedäng et al., 2007).

Applications and management implications
The use of management units that include only a portion of a larger population can present problems with understanding population stock dynamics and environmental linkages (Frisk et al., 2008). Conversely, management units containing multiple biological populations can lead to inaccurate descriptions of population-specific abundances and productivity, and may conceal the declines of vulnerable populations (Kell et al., 2009;Ying et al., 2011). Our study reveals that the management of the yellowfin tuna fishery around southern Africa presents both types of problems. At a local level, it is recommended that the South Africa Department of Agriculture, Forestry and Fisheries report all future yellowfin tuna catches in SAF waters to the IOTC for inclusion in Indian Ocean stock assessments. Our results indicate that the SAF catch is derived entirely from Indian Ocean spawned fish. It may be more accurate to move the boundary between the Atlantic and Indian Ocean management stocks to the western extent of South Africa's EEZ at 13.35 E (Figure 1). At a regional level, it is essential to understand the proportions of individuals derived from the Atlantic and Indian Ocean genetic stocks over space and time in the mixed-stock fishery north of the BCS. The proportion of "consensus" outliers identified here (0.9%) was generally smaller than the proportions of outlier loci reported by other studies (5.2%, Bradbury et al., 2013;4.5%, Milano et al., 2014;3.65%, Hess et al., 2013;0.99%, Guo et al., 2015). These outlier loci represent a resource that can be used to identify migrants and to characterize mixed-stock dynamics. At present the indiscriminate harvesting of both stocks in the region means that the Atlantic spawning stock biomass is overestimated.