Ploidy Variation and Its Implications for Reproduction and Population Dynamics in Two Sympatric Hawaiian Coral Species

Abstract Standing genetic variation is a major driver of fitness and resilience and therefore of fundamental importance for threatened species such as stony corals. We analyzed RNA-seq data generated from 132 Montipora capitata and 119 Pocillopora acuta coral colonies collected from Kāneʻohe Bay, Oʻahu, Hawaiʻi. Our goals were to determine the extent of colony genetic variation and to study reproductive strategies in these two sympatric species. Surprisingly, we found that 63% of the P. acuta colonies were triploid, with putative independent origins of the different triploid clades. These corals have spread primarily via asexual reproduction and are descended from a small number of genotypes, whose diploid ancestor invaded the bay. In contrast, all M. capitata colonies are diploid and outbreeding, with almost all colonies genetically distinct. Only two cases of asexual reproduction, likely via fragmentation, were identified in this species. We report two distinct strategies in sympatric coral species that inhabit the largest sheltered body of water in the main Hawaiian Islands. These data highlight divergence in reproductive behavior and genome biology, both of which contribute to coral resilience and persistence.


Introduction
Given ongoing climate change, it is critical to understand how rapidly changing ocean conditions impact coral population biology and resilience and how the innate adaptability of coral populations may contribute to their persistence (Cant et al. 2021; Fischer et al. 2021). For coral reef ecosystems, which depend on the nutritional symbiosis between scleractinian coral hosts and their single-celled dinoflagellate (algal) endosymbionts in the family Symbiodiniaceae (LaJeunesse et al. 2018), thermal stress may lead to dysbiosis and mortality. This phenomenon is known as coral "bleaching," whereby symbiotic cells and pigments are expelled or lost from the host tissue, leaving the bright white color of the underlying coral animal body and skeleton (van Oppen and Lough 2009). Bleaching is the primary cause of mass coral mortality (Hughes et al. 2017). Coral reefs are also threatened by ocean acidification resulting from the increased amount of CO2 in the atmosphere that dissolves in the surface ocean, changing carbonate chemistry and lowering the pH (Hoegh-Guldberg et al. 2007). Understanding the mechanisms that underlie coral response to long-term environmental stress is, however, challenging, given the genetically diverse collection of organisms (cnidarian animal host, algal symbionts, prokaryotic microbiome, fungi and other eukaryotes, and viruses) that comprise the holobiont and contribute to its health and resilience (Veron 2000). Furthermore, corals are impacted by persistent abiotic stresses (e.g., diurnal and seasonal light and temperature variation) and a plethora of interacting taxa (e.g., algae, fish, and viruses) that are of nonholobiont provenance, making these complex models for field studies.
We previously generated high-quality genome assemblies from two Hawaiian coral species (Stephens et al. 2022). The first is at chromosome-level from the rice coral, Montipora capitata, and comprises 14 large scaffolds that likely represent the 14 chromosomes predicted in other Montipora species (Kenyon 1997). The second is from the cauliflower coral, Pocillopora acuta, and is the first polyploid (i.e., triploid) genome assembly generated from Scleractinia. Whereas the mechanisms that give rise to polyploidy in corals, and its effects on organismal fitness, are unknown, it can result from genome duplication within a species (autopolyploidy) or from hybridization of two different species (allopolyploidy). This process often precipitates drastic changes in cell organization and genome structure and can alter gene expression, genome stability, cell physiology, and the cell cycle (Wertheim et al. 2013). In some animals, triploidy may be beneficial with respect to improved growth and pathogen resistance (Kang and Rosenwaks 2008). This observation increases interest in corals with respect to how changes in their genomic configuration may contribute to the evolution of stress-resistant genotypes. To advance understanding of ploidy variation in corals and differences in reproductive strategies of sympatric species, we generated and analyzed RNA-seq data from fragments (i.e., nubbins) of M. capitata and P. acuta colonies collected from across six reefs in Kāneʻohe Bay, a 45 km² sheltered water body in Oʻahu, Hawaiʻi. Analysis of single-nucleotide polymorphisms (SNPs) in each coral sample was used to investigate genetic diversity, ploidy, and reproductive strategy in these two sympatric species.

Ploidy Differences in Kāneʻohe Bay Corals
Transcriptome data were collected from 119 P. acuta  1C). Analysis of the P. acuta RNA-seq data using the program nQuire predicted (using data pre- and post-denoising) that 44 (37% of the 119 total) samples are derived from diploid genets (i.e., at genomic loci with 2 alleles, each present in ∼50% of the reads, producing an allele frequency distribution with a single peak at roughly 0.5; supplementary fig. S1A and B, supplementary table S3, and supplementary data S1, Supplementary Material online). In contrast, 75 (63%) samples are from triploid genets (i.e., at genomic loci with 2 alleles, 1 present in ∼33% and the other in ∼66% of the reads, producing an allele frequency distribution with 2 peaks at roughly 0.33  to be a triploid, was used to generate the P. acuta reference genome (Stephens et al. 2022). This genome was shown to be triploid using k-mer-based methods (see Stephens et al. 2022). The presence of this sample in our analysis supports the accuracy of our approach for ploidy determination using RNA-seq data. nQuire predicted that all additional P. acuta samples (n = 32) from sites outside of Kāneʻohe Bay, acquired from the NCBI Sequence Read Archive (SRA; see Materials and Methods), were diploid (supplementary fig. S1A fig. S1F, Supplementary Material online). Instead, the distribution has no visible middle peaks but does have a much higher frequency of alleles at the tails of the distribution. The distribution also has a higher frequency of alleles with values around 0.1 and 0.9, with an increase in the frequency of alleles occurring around 0.2 and 0.8. This pattern is not observed in any of the other M. capitata samples, all of which have a single (approximately) uniform peak at 0.5 and no increase in the frequency of alleles with values around the tails of the distribution.
This result is likely explained by the sample being derived from a chimeric colony, which also likely explains the variability in the prediction of the ploidy of Mcapitata_ATAC_TP11_1644, although obviously to a lesser degree (see Discussion for a description of ploidy vs. chimeric allele frequency distribution patterns). Montipora capitata samples (n = 27, see Materials and Methods for criteria) downloaded from SRA to compare with this study were predicted by nQuire to be diploids (supplementary table S3 and supplementary data S4, Supplementary Material online). One of the M. capitata samples (SRR5453755) was identified as a triploid in the non-denoised data, although this is likely caused by the sample being from a colony that is comprised of multiple genets.
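The allele frequency patterns described above can be made concrete with a short calculation. The following Python sketch is illustrative only and is not part of the nQuire workflow; the function names are hypothetical. It assumes that read fractions mirror allele copy number (i.e., no allele-specific expression) and that a chimera is a mixture of two diploid genets, with a fraction p of cells from one genet.

```python
from fractions import Fraction


def ploidy_peaks(ploidy):
    """Expected read fractions for an allele at biallelic sites.

    An allele present on k of the `ploidy` chromosome copies is expected
    in k/ploidy of the reads (assumes expression mirrors copy number).
    """
    return [Fraction(k, ploidy) for k in range(1, ploidy)]


def chimera_peaks(p):
    """Expected allele fractions in a mixture of two diploid genets.

    `p` is the fraction of cells from genet A (pass a Fraction); the
    remainder comes from genet B. The allele has dosage 0, 1, or 2 in
    each genet; fixed outcomes (0 or 1) are not variable sites and are
    dropped.
    """
    q = 1 - p
    fracs = {p * a / 2 + q * b / 2 for a in (0, 1, 2) for b in (0, 1, 2)}
    return sorted(f for f in fracs if 0 < f < 1)


diploid = ploidy_peaks(2)    # single peak at 1/2
triploid = ploidy_peaks(3)   # peaks at 1/3 and 2/3
skewed_chimera = chimera_peaks(Fraction(9, 10))
```

Under these assumptions, a 9:1 mixture places additional expected fractions at 1/20, 1/10, 9/10, and 19/20, that is, toward the tails of the distribution, whereas a triploid places them only at 1/3 and 2/3.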

Population Structure of P. acuta in Kāneʻohe Bay
Several approaches were used to make pair-wise comparisons of the 119 P. acuta RNA-seq samples to assess relatedness and determine if any were derived from clones (i.e., colonies derived from the same genet), given indications of clonal relationships in past studies (Yeoh and Dai 2010; Combosch and Vollmer 2013; Gorospe and Karl 2013). Sample relationship was initially assessed using the proportion of shared SNPs (fig. 2A; see Materials and Methods), with a threshold of >94% used to aggregate samples into groups that are assumed to represent clonal samples. This threshold was chosen based on the distribution of shared SNPs (fig. 2B) between each pair-wise combination of samples; the set of pair-wise comparisons captured by this threshold (regions shaded bright yellow in fig. 2) is clearly separated from the other distinct sets of comparisons observed in the distribution. This threshold is very close to the 95% similarity threshold applied in another coral study (Locatelli and Drew 2019). In total, there are eight clonal groups (Groups 1-8) which comprise 113/119 (94.96%) of the P. acuta samples (fig. 2). Groups 1-4 are triploids, whereas Groups 5-8 are diploid. Only two triploid and four diploid samples were ungrouped. Generally, the ungrouped samples had higher similarity with samples of the same ploidy; however, the diploid sample Pacuta_HTHC_TP5_1415 had higher similarity to the triploid samples compared with the diploid samples (fig. 2). An additional 32 P. acuta samples (collected from locations not in Kāneʻohe Bay) were downloaded from SRA and incorporated into the SNP analysis. These samples were all derived from a single experiment (BioProject: PRJNA435468; Poquita-Du et al. 2019) and represent three genotypes. 
Each genotype was collected from a separate reef near Singapore and had been fragmented into multiple ramets, each of which underwent RNA sequencing as part of a stress experiment (supplementary table S4, Supplementary Material online). The proportion of shared SNPs between the samples derived from each of the three genotypes was ∼98%, which is very similar to the values observed between many of the putative clonal samples generated in this study (e.g., ∼97% between the samples in The relatedness metric produced by vcftools (originally proposed by Manichaikul et al. 2010; see Materials and Methods) agrees with the relationship between samples established in figure 2, with clear grouping within, and separation of samples between, the identified groups (fig. 3). In addition, the samples in each of the groups all have relatedness values > 0.43; values of 0.5 denote samples that are monozygotic twins (i.e., perfect clones), suggesting that the samples in each of these groups are clones, albeit not identical, because each colony has accumulated some segregating variants (Vasquez Kuntz et al. 2022). Furthermore, the majority of P. acuta samples have a relatedness of around 0.25 (equivalent to the relatedness of parent and offspring or full siblings), with almost all samples having a relatedness > 0.06 (third degree relatives; fig. 3). It is also notable that within each of the groups (in particular, the larger Groups 2, 3, and 6), there appear to be subgroups of samples that have slightly higher similarity with each other ( fig. 3 fig. S3, Supplementary Material online). This result supports our hypothesis that these groups represent samples from colonies that have spread throughout the bay via asexual reproduction. Furthermore, the relatedness between the three SRA genotypes, and between the SRA genotypes and the samples generated in this study, is at, or close to, 0 (i.e., they are unrelated individuals).
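As an aid to reading the relatedness values quoted above, the following Python sketch maps a pair-wise relatedness value to a relationship category. The cut-offs are the inference ranges given by Manichaikul et al. (2010) for their kinship estimator and are an assumption here, because this article states only the reference values 0.5, 0.25, and 0.06. The function name is hypothetical, and clipping negative values to zero follows the treatment described in Materials and Methods.

```python
# Lower bounds of the kinship ranges from Manichaikul et al. (2010);
# assumed here, not taken from this study.
KINSHIP_BOUNDS = [
    (0.354, "duplicate or monozygotic twin (clone)"),
    (0.177, "first degree (parent-offspring or full sibling)"),
    (0.0884, "second degree"),
    (0.0442, "third degree"),
]


def classify_relatedness(phi):
    """Return the relationship category for a pair-wise relatedness value."""
    phi = max(phi, 0.0)  # negative estimates are treated as zero
    for lower_bound, label in KINSHIP_BOUNDS:
        if phi > lower_bound:
            return label
    return "unrelated"


within_group = classify_relatedness(0.43)
between_groups = classify_relatedness(0.25)
sra_versus_bay = classify_relatedness(-0.02)
```

With these cut-offs, the within-group values (> 0.43) fall in the clone category and the typical between-group value (around 0.25) falls in the first-degree category.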
The program PCAngsd identified four ancestral populations of P. acuta in Kāneʻohe Bay. The admixture results (fig. 4) are consistent with the groups shown in figure 2, with uniformity of ancestral population profiles within each group and separation of profiles between different groups. Interestingly, the ancestry of Pacuta_HTHC_TP5_1415 (the diploid sample that clustered with triploids, based on the SNP similarity scores) is derived from the same ancestral populations as triploid Group 1, albeit with an increased abundance of the second ancestral population. This analysis demonstrates that the ancestry of each clonal group is distinct and that they have all likely arisen from separate asexual propagation events that occurred in different ancestral lineages. Principal component analysis (PCA) performed by PCAngsd using the Kāneʻohe Bay P. acuta samples (fig. 5) supports the pair-wise similarity and admixture results (fig. 4). That is, the samples in each clonal group form clusters along both PC1 and PC2 and the clonal groups are separated from each other. Whereas the triploid clonal groups (Groups 1-4) are clearly separated from diploid clonal groups (Groups 5-8) along PC1 and PC2, the difference between the largest triploid groups (Groups 2 and 3) is roughly the same as that between these groups and the largest diploid group (Group 6). Reanalysis using a single sample per clonal group with the highest read mapping rate to the reference genome is consistent with the results produced using all samples (supplementary fig. S4, Supplementary Material online). The representative samples have congruent ancestral population profiles (albeit with only two ancestral populations inferred and not four; possibly due to the significantly reduced number of samples used in the analysis) and relative positions in the inferred PCA plots. These results reinforce our hypothesis that the majority of Kāneʻohe Bay P. acuta samples are derived from colonies that have arisen via asexual reproduction and that these events are likely to have occurred in separate related (i.e., previously mixing) lineages over an extended (currently unknown) period of time.

Discussion
In this study, we generated RNA-seq data from colonies of P. acuta and M. capitata collected from six reefs distributed across Kāneʻohe Bay, Oʻahu, Hawaiʻi. We report significant differences in ploidy and reproductive strategies between the two sympatric species, with P. acuta derived from a mix of diploid and triploid clonal lineages and M. capitata being a highly heterozygous, panmictic outbreeder.

The Adaptive Advantage of Triploidy in Corals Is Currently Not Known
The role of triploidy (or any form of polyploidy) in corals is not well understood. Although triploids are rare in wild populations, they occur frequently in commercially farmed plants and animals, such as oysters and some banana cultivars, often conferring beneficial commercial traits such as improved growth, pathogen resistance, and, through infertility, protection of superior, adapted genotypes (Kang et al. 2013). Triploids may also enhance the rate of autotetraploid formation (Husband 2004). Triploids occur in the coral Acropora palmata and may be a path to generating different ploidy levels in different members of this genus (Kenyon 1997; Baums et al. 2005). Our results show that triploidy is common in Kāneʻohe Bay P. acuta (supplementary fig. S1 and supplementary table S3, Supplementary Material online) and has a higher abundance (63% at the sites sampled) than diploids (only 37%). This stands in clear contrast to M. capitata, which is completely (barring a single, possibly chimeric, sample) diploid. All methods for assessing sample relatedness (i.e., shared SNPs, relatedness metrics, PCA, and admixture analysis) predict that diploid samples are different from triploids, that is, there is clear separation between these groups (figs. 2, 3, 4, and 5). The only exception is a single diploid sample (Pacuta_HTHC_TP5_1415) that has higher similarity to triploid than diploid samples (although not high enough to be considered part of the closely related triploid Group 1). This individual could be an example of reversion (i.e., from triploid to diploid) or be the extant member of the progenitor lineage of triploid Group 1. Our results suggest that the diploid P. acuta are both sexual outbreeders and generate asexual brooded larvae, as previously described (Richmond and Jokiel  Nakajima et al. 2018). Triploidy may have arisen from self-fertilization of a P. acuta egg, followed by fertilization by a foreign sperm, or from one of the two gametes being diploid and providing two closely related sets of alleles. Alternatively, failure of the ovum to extrude the second polar body after fertilization could lead to triploidy. These are the most common mechanisms for generating triploid plants and animals (Rosenbusch 2008; Carson et al. 2018). The evolution of triploid genotypes in P. acuta may be explained by adaptation to local conditions in Kāneʻohe Bay, possibly allowing them to outcompete ancestral diploid genotypes.
It is plausible that SNP allele frequency distributions, which are the basis of our estimation of ploidy, are explained by chimeric P. acuta colonies. Evidence exists for chimerism in corals through fusion of two or more genetically distinct individuals (Willis et al. 2006;Rinkevich et al. 2016;Oury et al. 2020) as well as mosaicism via somatic cell mutations (Willis et al. 2006;Schweinsberg et al. 2015). Nonetheless, the two-peaked SNP distributions of P. acuta are difficult to explain under chimerism because the fused colonies would have to comprise roughly equal amounts of one haploid and one diploid individual to generate this result, which is unlikely to have occurred in so many closely related colonies from across the bay. If cells from one of the fused colonies were present at a much higher frequency than the other (which is more likely than them having equal proportions), then we would see an increase in the frequency of SNP alleles with support toward the ends of the distribution (as observed for the one, putative chimeric M. capitata sample). In addition, k-mer analysis of the reference triploid genome of P. acuta from Kāne'ohe Bay (Stephens et al. 2022) provides results that are consistent with our current findings. Given these results, we hypothesize that the most likely scenario to explain our data is triploidy in many P. acuta individuals, rather than fused/mixed diploids. Our results also suggest that P. acuta in Kāne'ohe Bay almost exclusively undergoes asexual reproduction, with only a few genets giving rise to colonies in the bay. This "genotypes everywhere" result has previously been found for Kāneʻohe Bay Pocillopora damicornis populations (Gorospe and Karl 2013). Using microsatellite data, these authors studied a single reef and found that >70% of the colonies comprised seven genotypes with high clonal propagation. Neighboring reefs however conformed to a genetic panmixia model with no interreef genetic structure. 
Our results support this model, showing the existence of at least eight groups of P. acuta samples (with each group representing a genet that has given rise to multiple independent colonies [ramets]) with broad distribution across Kāneʻohe Bay. The presence of a limited number of genets in the bay, and the absence of isolation by distance, even when individual reefs show genetic structure is puzzling. This result may be explained by microhabitat variability that selects for particular genotypes that occupy specific niches in each reef (Gorospe and Karl 2011). These locally adapted genotypes disperse via asexual reproduction given that no major barriers exist for larval dispersal in Kāneʻohe Bay. This result might also be explained by a genetic bottleneck. A recent natural event, such as severe bleaching that caused mass coral mortality (e.g., the 2014-2015 Kāneʻohe Bay bleaching event (Bahr, Jokiel, Toonen 2015)), may have removed much of the P. acuta from this region. The subsequent repopulation of Kāneʻohe Bay by surviving corals, or recolonization from different regions, coupled with asexual reproduction, would result in the observed, low genetic diversity. In addition, this would also explain why all the P. acuta samples analyzed in this study, even those not in the same clonal group, have relatively high relatedness. The majority of samples (even between diploids and triploids) have relatedness values around 0.25 (i.e., the relatedness expected between parent and offspring or full sibling; fig. 3 (Concepcion et al. 2014; Caruso et al. 2021). 
[Figure caption, partial] The relatedness values that correspond to a particular relationship between samples (described in Manichaikul et al. (2010)) are annotated on the histogram using vertical red lines. The colors used in the heatmap and histogram were manually chosen to highlight the distinct sets of relatedness scores observed in the histogram presented in (B).
A recent survey of nearly 600 colonies of this species in Kāneʻohe Bay found very few clonal individuals and no evidence of isolation by distance (Caruso et al. 2021). Colonies that were potentially derived from the same genet were almost exclusively found at the same collection site, consistent with our observations.

Study Limitations
Our study makes extensive use of RNA-seq data that were originally generated as part of a mesocosm experiment not relevant to the results of this research. Whereas RNA-seq data are not commonly used in population genetics, in this case, we believe that they provide valuable insights into coral biology that can inform follow-up DNA-based sequencing projects. We acknowledge, however, that ploidy is more challenging to interpret using RNA-seq data. We have previously described, using DNA sequencing data, a triploid P. acuta genet from Kāneʻohe Bay (Stephens et al. 2022), which was included in this study and was identified using RNA-seq data as a triploid. To the best of our knowledge, all bioinformatic approaches for ploidy determination (such as nQuire and visualization of allele frequencies, which were used by this study) are designed for use with DNA, not RNA data. Thus, we cannot fully discount allele-specific expression (ASE) as an alternative explanation for the patterns that we observe. However, we believe it is unlikely that ASE has affected our results, for the following four reasons: 1) We have clear DNA evidence for triploidy from one of the samples (Stephens et al. 2022).
2) If ASE is affecting our results, it would have to be strongly affecting some groups of samples and not others (i.e., ASE is only occurring in putative triploid and not diploid lineages and not in the triploid with DNA evidence). 3) ASE typically occurs at different rates across the genome, that is, ASE produces an uneven distribution of expression ratios. Our results suggest that all loci are being affected at the same rate, which would support variation in chromosome copy number and not locus-specific allelic expression modification. And 4) coral genomes have relatively low rates of methylation (Trigg et al. 2022): 11.4% of CpG sites are methylated in the M. capitata genome and 2.9% in the P. acuta genome. Given that methylation would be the most obvious mechanism for ASE, the low methylation rate in P. acuta makes ASE a less likely explanation for the ploidy results.
Regarding the analysis of samples with mixed ploidy, few of the available population genetic techniques accept nondiploid data and none that we are aware of accept data with mixed ploidies. As a result, all samples were treated as diploid, which may adversely affect results from the putative triploid samples because it would bias our analysis to just biallelic sites. However, given that we expect most variant sites in the genome to be biallelic (because multiple mutations occurring at a single site to create a multiallelic site are less likely than a single mutation to create a biallelic site), we believe that our approach is valid given the current techniques and data available. Furthermore, given that a variety of data analysis tools were used, and all led to the same conclusions, we believe our results are robust. These insights should prove valuable for designing DNA-based studies that focus on generating additional population genetic data not only from Kāneʻohe Bay but also from other locations in the Hawaiian Islands. There are currently very few data available for P. acuta, preventing us from comparing our results with other populations or studies done in this region.

Final Remarks
The data presented in this study underline how selection may be acting in a divergent manner to forge ecologically successful lineages. We find that two sympatric species, living in a sheltered Hawaiian bay, follow disparate strategies that enable their persistence in an environment that is strongly impacted by human activity, including warming events, freshwater incursion, and dredging (Bahr, Jokiel, Toonen 2015). Montipora capitata relies on strict outbreeding to generate high standing genetic variation, likely as a "defense" against changing local environments. In contrast, P. acuta appears to undergo periodic polyploidization events, perhaps triggered by local stress, that putatively generate fitter, clonal groups that allow persistence (or reestablishment after stressful events) of populations in Kāneʻohe Bay. The next steps in this research are to expand our understanding of how these patterns relate to organismal fitness by studying the response of Hawaiian corals with divergent genotypes to the same regime of environmental stress.

Sample Processing
The coral samples (one ∼5 × 5 cm fragment per colony) were collected from six reef areas ranging across the north to south span and fringing to patch reefs of Kāneʻohe Bay (fig. 1C) under Hawaiʻi Department of Aquatic Resources Special Activity Permit 2019-60, between September 4 and 10, 2018. RNA was extracted from nubbins that had been snap-frozen and stored at −80 °C. A small piece was clipped off using clippers sterilized in 10% bleach, deionized water, isopropanol, and RNase-free water and then placed in a 2-mL microcentrifuge tube containing 0.5-mm glass beads  with 1,000 μL of DNA/RNA shield. A two-step extraction protocol was used to extract RNA, with the first step being a "soft" homogenization to reduce shearing of RNA. Tubes were vortexed at high speed for 1 and 2 min for P. acuta and M. capitata fragments, respectively. The supernatant was removed and designated as the "soft extraction". Second, 500 μL of DNA/RNA shield was added to the bead tubes, which were placed in a Qiagen TissueLyser for 1 min at 20 Hz. The supernatant was removed and designated as the "hard extraction". Subsequently, 300 μL of sample from both soft and hard homogenate was extracted with the Zymo Quick-DNA/RNA Miniprep Plus Kit. RNA quality was measured with an Agilent TapeStation System. RNA-seq samples were sequenced by GENEWIZ (Azenta; https://www.genewiz.com) using the Illumina NovaSeq 6000 platform.

Processing of Coral Data Not Generated in This Study
Additional RNA-seq samples were acquired on June 15, 2022, from NCBI's SRA database for use as outgroups in downstream analysis. A list of all sequencing "Runs" from scleractinian species was acquired by searching the NCBI's SRA database using the following search term: "Scleractinia[Organism]" (without the double quotes). The resulting list of 19,050 entries was filtered, keeping only Runs that were generated on an Illumina platform and that had a library strategy of "RNA-Seq", a library layout of "PAIRED", and >1.5 billion "bases". These filters were chosen to keep the types of samples selected from SRA uniform with the samples generated in this study (i.e., paired-end RNA-seq reads generated on an Illumina platform). The 1.5 Gbp threshold was chosen as it is roughly half the minimum number of bases (∼3.2 Gbp; supplementary table S2, Supplementary Material online) generated for a single sample from this study before base quality filtering was applied; samples with fewer bases than this would likely have far fewer sites with sufficient coverage for downstream analysis and so would be less informative and harder to interpret and integrate with the existing samples if included. Samples derived from colonies identified as P. acuta and M. capitata were extracted from the resulting list of filtered Runs using the species name listed in the "ScientificName" column. The "geographic_location" of each Run was extracted from its associated BioSample and used to identify Runs generated from samples collected from Hawaiʻi. Runs generated from colonies collected in Kāneʻohe Bay were excluded, as were Runs without a listed geographic location. As there were no Runs listed on SRA of Hawaiian P. acuta colonies that were not from Kāneʻohe Bay, Runs from other non-Hawaiian locations were selected. This resulted in a total of 27 M. capitata and 32 P. acuta Runs (samples) that were used for downstream analysis (supplementary table S4, Supplementary Material online).
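The Run-level filters described above can be expressed as a simple predicate. The following Python sketch is illustrative and is not the code used in this study. Apart from "ScientificName", the column names ("Platform", "LibraryStrategy", "LibraryLayout", and "bases") are assumed to follow the SRA run metadata table, and the example records are invented for demonstration.

```python
MIN_BASES = 1_500_000_000  # the 1.5 Gbp threshold described in the text
TARGET_SPECIES = {"Pocillopora acuta", "Montipora capitata"}


def keep_run(run):
    """Return True if a Run record passes the platform, library, and size filters."""
    return (
        run["Platform"] == "ILLUMINA"
        and run["LibraryStrategy"] == "RNA-Seq"
        and run["LibraryLayout"] == "PAIRED"
        and int(run["bases"]) > MIN_BASES
    )


def select_runs(runs):
    """Apply the Run filters, then keep only the two study species."""
    return [
        run for run in runs
        if keep_run(run) and run["ScientificName"] in TARGET_SPECIES
    ]


# Hypothetical records; the Run identifiers are placeholders, not real accessions.
example_runs = [
    {"Run": "RUN_A", "Platform": "ILLUMINA", "LibraryStrategy": "RNA-Seq",
     "LibraryLayout": "PAIRED", "bases": "3200000000",
     "ScientificName": "Pocillopora acuta"},
    {"Run": "RUN_B", "Platform": "ILLUMINA", "LibraryStrategy": "RNA-Seq",
     "LibraryLayout": "SINGLE", "bases": "4000000000",
     "ScientificName": "Montipora capitata"},
    {"Run": "RUN_C", "Platform": "ILLUMINA", "LibraryStrategy": "RNA-Seq",
     "LibraryLayout": "PAIRED", "bases": "900000000",
     "ScientificName": "Montipora capitata"},
]
selected = [run["Run"] for run in select_runs(example_runs)]
```

The geographic filtering step is omitted from this sketch because it depends on BioSample records rather than on the Run table.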
The samples were aligned against the P. acuta (V2) or M. capitata (V3) reference genomes using the same workflow as the RNA-seq samples generated in this study, with the only exception being that read-group information was added manually to the reads (using gatk AddOrReplaceReadGroups; setting read-group platform to be "illumina" and the read-group library, platform unit, sample name, and ID to be the SRA ID of the sample) instead of using rgsam to extract the information automatically from the read names. This was done because read-group information needs to be set for some of the downstream tools (such as gatk MarkDuplicates), but this information is not required or preserved during upload of read data to SRA, so cannot be reliably extracted from the downloaded samples. The resulting BAM files were used as the input for downstream population structure, sample relatedness, and ploidy analysis.

Variants (SNPs and insertion and deletion variants [INDELs]) were identified across the RNA-seq samples from each species (both generated in this study and from SRA) using the GATK (McKenna et al. 2010) (v4.2.0.0) framework, adhering to their best practices workflow (https://gatk.broadinstitute.org/hc/en-us/articles/360035531192?id=4067) whenever possible. The base quality recalibration steps suggested in the GATK best practices workflow could not be applied to our samples because a set of expected high confidence variants (which is required for the recalibration process) is not available for either of the species being studied. Haplotypes were called using the aligned postprocessed reads (i.e., after alignment using STAR with read-group information added, duplicate reads marked, and reads spanning exon boundaries split) using gatk HaplotypeCaller (--dont-use-soft-clipped-bases -ERC GVCF). For each species, the VCF files produced by HaplotypeCaller (one per sample) were combined using gatk CombineGVCFs before being jointly genotyped using gatk GenotypeGVCFs (-stand-call-conf 30 --annotation AS_MappingQualityRankSumTest --annotation AS_ReadPosRankSumTest). For the gatk analysis, the ploidy of each sample was treated as diploid (even when the samples were believed to be triploid). This was done because the downstream tools are incapable of processing triploid variants or data sets with mixed diploid and triploid variants.
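The three GATK steps described above can be laid out as command lines. The following Python sketch only assembles the argument lists and does not run GATK. The option values are those reported in the text (with the HaplotypeCaller long option written with a double dash); the file names are hypothetical, and the -R, -I, -V, and -O arguments are the standard GATK reference, input, variant, and output arguments, which the text does not list explicitly.

```python
def haplotype_caller_cmd(reference, bam, gvcf):
    """Per-sample haplotype calling in GVCF mode."""
    return ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", gvcf,
            "--dont-use-soft-clipped-bases", "-ERC", "GVCF"]


def combine_gvcfs_cmd(reference, gvcfs, combined):
    """Merge the per-sample GVCF files for one species."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd + ["-O", combined]


def genotype_gvcfs_cmd(reference, combined, vcf):
    """Joint genotyping of the combined GVCF file."""
    return ["gatk", "GenotypeGVCFs", "-R", reference, "-V", combined, "-O", vcf,
            "-stand-call-conf", "30",
            "--annotation", "AS_MappingQualityRankSumTest",
            "--annotation", "AS_ReadPosRankSumTest"]


# Hypothetical file names, for illustration only.
samples = ["sample1", "sample2"]
gvcf_files = [f"{name}.g.vcf.gz" for name in samples]
commands = [haplotype_caller_cmd("reference.fa", f"{name}.bam", gvcf)
            for name, gvcf in zip(samples, gvcf_files)]
commands.append(combine_gvcfs_cmd("reference.fa", gvcf_files, "combined.g.vcf.gz"))
commands.append(genotype_gvcfs_cmd("reference.fa", "combined.g.vcf.gz", "joint.vcf.gz"))
```

Each list could be passed to subprocess.run in a working GATK environment. No ploidy argument is set, in keeping with the statement that all samples were treated as diploid.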

Analysis of Sample Ploidy
The ploidy of the RNA-seq samples generated in this study and acquired from SRA was assessed using nQuire (Weiss et al. 2018) (retrieved on July 7, 2021, from https://github.com/clwgg/nQuire), which was run using the BAM files produced by gatk SplitNCigarReads (i.e., aligned RNA-seq reads that have had duplicates removed and that have been split if they span an intron-exon boundary). The aligned BAM files were converted into "BIN" files, filtering for reads with a minimum mapping quality of 20 and sites with a minimum coverage of 20 ("nQuire create -q 20 -c 20 -x"). Denoised BIN files were created using the "nQuire denoise" command run on the initial BIN files. The delta Log-Likelihood values for each ploidy model were calculated by the "nQuire lrdmodel" command for each of the initial and denoised BIN files (supplementary table S3, Supplementary Material online). A second round of analysis using nQuire was conducted to generate, for each sample, a distribution of the proportion of reads that support each allele at biallelic sites. Briefly, "BIN" files were created, filtering for reads with a minimum mapping quality of 20, sites with a minimum coverage of 20, and a minimum fraction of reads supporting an allele of 5% ("nQuire create -f 0.05 -q 20 -c 20 -x"). Denoised BIN files were created using the "nQuire denoise" command. The "nQuire view" command was used to extract the biallelic sites from the denoised BIN files. The number of aligned reads reported by nQuire that support each of the alleles at the biallelic sites was used to generate a distribution, for each of the samples, of the proportion of reads that support each of the alleles; this distribution was used to visually confirm the ploidy of each sample (supplementary data S1, S2, S3, and S4, Supplementary Material online).
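The final step, turning per-site read counts into a distribution of allele proportions, can be sketched as follows. This Python example is illustrative and does not parse nQuire output; it assumes the counts for the two alleles at each biallelic site are already available as pairs, and the function names and example counts are hypothetical. The coverage and minimum-fraction filters mirror the values given above.

```python
def allele_fractions(site_counts, min_coverage=20, min_fraction=0.05):
    """Return the read fraction of each allele at sites passing both filters."""
    fractions = []
    for count_a, count_b in site_counts:
        total = count_a + count_b
        if total < min_coverage:
            continue  # site has too few reads
        frac_a, frac_b = count_a / total, count_b / total
        if min(frac_a, frac_b) < min_fraction:
            continue  # minor allele supported by too few reads
        fractions.extend([frac_a, frac_b])
    return fractions


def histogram(fractions, n_bins=20):
    """Count fractions in equal-width bins over the interval [0, 1]."""
    counts = [0] * n_bins
    for value in fractions:
        index = min(int(value * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Invented counts: one 1:1 site, one 1:2 site, one low-coverage site,
# and one site where the minor allele is below the 5% fraction.
example_sites = [(10, 10), (20, 40), (5, 5), (1, 99)]
example_fractions = allele_fractions(example_sites)
example_histogram = histogram(example_fractions)
```

A diploid sample would be expected to concentrate counts in the bins around 0.5, and a triploid sample in the bins around 0.33 and 0.66.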

Exploration of Population Structure
The population structure of the P. acuta and M. capitata samples collected from Kāneʻohe Bay and SRA was assessed using multiple approaches. The genotype likelihoods of the samples from each species were estimated by ANGSD (v0.935; "-GL 2 -doGlf 2 -doMajorMinor 1 -SNP_pval 1e-6 -doMaf 1") (Korneliussen et al. 2014) and used with PCAngsd (v1.10) (Meisner and Albrechtsen 2018) to perform PCA (with estimates for individual allele frequencies; default parameters), with the resulting covariance matrix used for visualization. The genotype likelihoods produced by ANGSD were also used with PCAngsd to perform an admixture analysis ("--admix --admix_alpha 50") of the samples from each species.
Genome Biol. Evol. 15(8) https://doi.org/10.1093/gbe/evad149 Advance Access publication 11 August 2023
The variants produced by "gatk GenotypeGVCFs" were filtered using vcftools (v0.1.17; "--remove-indels --min-meanDP 10 --max-missing 1.0 --recode --recode-INFO-all") to remove indels, variants with low average read coverage across all samples, and sites which do not have called genotypes across all samples. A minor allele frequency (MAF) threshold was not applied to the data, because we knew a priori from preliminary analysis of the data that some of the P. acuta clonal groups are comprised of very few samples. That is, the use of a MAF (e.g., 0.05, which would remove alleles that appear in <5% of samples) would disproportionately remove variants that segregate the small clonal groups of samples (e.g., P. acuta Groups 7 and 8 which are each comprised of two samples [1.7% of the 119 samples]). This would reduce our ability to resolve the smaller groups of clonal samples in our data set, biasing our analysis toward resolving only larger clonal groups. By not applying a MAF threshold, we increase the chances of incorporating false positive variants into our analysis; however, this would likely only slightly reduce the relatedness between the samples and obscure their ploidy. Thus, it would only result in increased noise in the data, not inflation of sample relatedness or ploidy. Vcftools ("--relatedness2") was also used to compute the relatedness statistic developed by Manichaikul et al. (2010). Negative relatedness values produced by vcftools were converted to zero before downstream analysis. The number of SNPs shared between each pair of samples from each species was assessed using the "vcf_clone_detect.py" script from https://github.com/pimbongaerts/radseq (retrieved December 6, 2022), which was run using the vcftools-filtered VCF file. A threshold of >94% shared SNPs, chosen based on the distribution of shared SNPs across all pair-wise combinations of samples (fig. 2), was used to aggregate samples into groups that were used for all downstream analysis.
For each species, two sets of analyses using ANGSD, PCAngsd, vcftools, and "vcf_clone_detect.py" were performed, one using only the RNA-seq samples produced in this study and the other using the RNA-seq samples from this study and the samples downloaded from SRA.
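The aggregation of samples into clonal groups using the >94% shared-SNP threshold can be illustrated with a small clustering routine. The following Python sketch is not the algorithm implemented in "vcf_clone_detect.py"; it assumes single-linkage grouping (any pair above the threshold joins two samples into the same group), and the sample names and percentages are invented.

```python
def clonal_groups(samples, shared_pct, threshold=94.0):
    """Group samples linked by pair-wise shared-SNP percentages above threshold.

    `shared_pct` maps (sample_a, sample_b) pairs to the percentage of
    shared SNPs. Returns (groups, ungrouped), where groups have >= 2 members.
    """
    parent = {sample: sample for sample in samples}

    def find(sample):
        # Follow parent links to the group representative (with path halving).
        while parent[sample] != sample:
            parent[sample] = parent[parent[sample]]
            sample = parent[sample]
        return sample

    for (sample_a, sample_b), pct in shared_pct.items():
        if pct > threshold:
            parent[find(sample_a)] = find(sample_b)

    members = {}
    for sample in samples:
        members.setdefault(find(sample), []).append(sample)
    groups = sorted(sorted(group) for group in members.values() if len(group) > 1)
    ungrouped = sorted(group[0] for group in members.values() if len(group) == 1)
    return groups, ungrouped


# Invented example: A, B, and C form one group; D remains ungrouped.
example_samples = ["A", "B", "C", "D"]
example_shared = {("A", "B"): 97.0, ("B", "C"): 95.0, ("A", "C"): 93.5,
                  ("A", "D"): 80.0, ("B", "D"): 81.0, ("C", "D"): 79.0}
example_groups, example_ungrouped = clonal_groups(example_samples, example_shared)
```

Under single linkage, samples A and C are placed in the same group through their shared link to B even though their direct comparison (93.5%) is below the threshold; a stricter rule requiring every pair in a group to exceed the threshold would give a different result in such cases.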

Supplementary Material
Supplementary data are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).