-
PDF
- Split View
-
Views
-
Cite
Cite
Bryan Piatkowski, David J Weston, Blanka Aguero, Aaron Duffy, Karn Imwattana, Adam L Healey, Jeremy Schmutz, A Jonathan Shaw, Divergent selection and climate adaptation fuel genomic differentiation between sister species of Sphagnum (peat moss), Annals of Botany, Volume 132, Issue 3, 18 August 2023, Pages 499–512, https://doi.org/10.1093/aob/mcad104
- Share Icon Share
Abstract
New plant species can evolve through the reinforcement of reproductive isolation via local adaptation along habitat gradients. Peat mosses (Sphagnaceae) are an emerging model system for the study of evolutionary genomics and have well-documented niche differentiation among species. Recent molecular studies have demonstrated that the globally distributed species Sphagnum magellanicum is a complex of morphologically cryptic lineages that are phylogenetically and ecologically distinct. Here, we describe the architecture of genomic differentiation between two sister species in this complex known from eastern North America: the northern S. diabolicum and the largely southern S. magniae.
We sampled plant populations from across a latitudinal gradient in eastern North America and performed whole genome and restriction-site associated DNA sequencing. These sequencing data were then analyzed computationally.
Using sliding-window population genetic analyses we find that differentiation is concentrated within ‘islands’ of the genome spanning up to 400 kb that are characterized by elevated genetic divergence, suppressed recombination, reduced nucleotide diversity and increased rates of non-synonymous substitution. Sequence variants that are significantly associated with genetic structure and bioclimatic variables occur within genes that have functional enrichment for biological processes including abiotic stress response, photoperiodism and hormone-mediated signalling. Demographic modelling demonstrates that these two species diverged no more than 225 000 generations ago with secondary contact occurring where their ranges overlap.
We suggest that this heterogeneity of genomic differentiation is a result of linked selection and reflects the role of local adaptation to contrasting climatic zones in driving speciation. This research provides insight into the process of speciation in a group of ecologically important plants and strengthens our predictive understanding of how plant populations will respond as Earth’s climate rapidly changes.
INTRODUCTION
Adaptation has long served as one of the foundational principles bolstering the theory of evolution by natural selection and can drive the process of speciation to generate biodiversity. For example, divergent selection in populations of a species distributed along an environmental gradient might favour different alleles such that individuals are fitter in their own habitat compared to others originating elsewhere, a process termed local adaptation (Savolainen et al., 2013). Over time, epistatic interactions involving these loci could result in genetic incompatibilities that act to reduce gene flow between populations and thereby promote speciation (Orr, 1996). While there is much debate about what constitutes a ‘species’ (see De Queiroz, 2007; Mishler and Wilkins, 2018), it is widely understood that clades of organisms differ from one another genetically and that molecular evolution plays an important role in cladogenesis (Presgraves et al., 2010). Here, we use evolutionary analysis of genome sequence data to better understand the process of speciation in two sister species of peat moss.
Over the nearly two centuries since Darwin’s treatise on the origin of species (Darwin, 1859), different models of speciation have emerged to explain this phenomenon (reviewed in Gavrilets, 2014) but all involve the evolution of mechanisms that limit or abolish gene flow between populations (Coyne and Orr, 2004). Speciation need not involve natural selection, as neutral evolutionary processes can cause reproductive isolation to evolve in populations such as those in allopatry that are separated by geographical barriers (Mayr, 1942). However, ecological adaptation frequently drives the evolution of reproductive isolation via divergent selection on new mutations, the fixation of different alleles in response to similar pressures or selection on standing variation (Schluter and Conte, 2009; Schluter et al., 2009). Despite numerous examples of such ecological speciation in various eukaryotic lineages (Dettman et al., 2007; Jones et al., 2012; Wright et al., 2013), understanding the relative importance of adaptive and neutral processes in generating new units of biodiversity remains largely an empirical endeavour. Nevertheless, molecular adaptation will probably play an increasingly important role in shaping global patterns of biodiversity as climates rapidly change during the Anthropocene.
The evolutionary history of Sphagnum (peat moss) is intimately associated with climate and these plants hold enormous sway over global carbon cycling. Peat mosses are keystone species of peatland ecosystems in the Northern Hemisphere that harbour roughly one-quarter of global terrestrial carbon (Loisel et al., 2021). Sphagnum species construct their own niches within northern peatlands through functional traits (Piatkowski and Shaw, 2019). In so doing, they store carbon in the form of decay-resistant biomass and reduce competitive pressure between sympatric species. Phylogenetic analysis suggests that the diversification of extant peat moss species, numbering nearly 400 worldwide (World Flora Online, 2023), happened quite rapidly in the Miocene during a period of global cooling (Shaw et al., 2010). Most species of Sphagnum diversified at northern latitudes but there have been several independent range expansions to lower latitudes suggesting that adaptation to warming has historically played an important role in shaping current biogeographical distributions (Shaw et al., 2019).
An exciting system for studying the evolutionary biogeography of Sphagnum is the S. magellanicum species complex. Historically, S. magellanicum has been a valuable model for peatland ecologists due to its global distribution and red–violet pigmentation that makes it easily discernible from other species in subgenus Sphagnum (Shaw et al., 2003a). More recently, genetic studies have demonstrated that this lineage is a complex of morphologically cryptic species: S. magellanicum that is restricted to South America, S. divinum and S. medium that are present in Europe and North America, and S. diabolicum and S. magniae that are known from eastern North America (Kyrkjeeide et al., 2016a; Yousefi et al., 2017; Hassel et al., 2018; Shaw et al., 2022). Phylogenomic analysis has demonstrated that S. diabolicum and S. magniae are sister species that have not reached reciprocal monophylly across all chromosomes suggesting that these clades are in the early stages of speciation (Shaw et al., 2022, 2023).
Sphagnum diabolicum is distributed throughout northeastern North America south to the Appalachian Mountains in North Carolina, whereas populations of S. magniae are generally found at coastal plain sites in the southeastern United States and throughout the Gulf Coast west to Arkansas (Fig. 1A). The species also have distinct niches, with S. magniae occurring in pine and hardwood forests (Fig. 1B) and S. diabolicum occurring in bogs and fens that are more characteristic of boreal peatlands (Fig. 1C). While their biogeographical ranges have minor overlap in the southern Appalachians, the two species have never been found to co-occur within localities, suggesting that the ecological factors defining their respective niches are probably different from those contributing to within-habitat niche differentiation known best from northern peatlands (e.g. height-above-water-table). Climate varies dramatically along this latitudinal gradient (De Frenne et al., 2013) and it may be that these two species are locally adapted to different climatic zones. Indeed, the warm temperate to subtropical distribution of S. magniae is distinct from the cold temperate to boreal distributions of other species in the S. magellanicum complex present in North America and suggests that this species evolved from cold-adapted ancestors (Shaw et al., 2023). Bioclimatic variables such as temperature and precipitation are important determinants of species distributions in many eukaryotic lineages (Franks and Hoffman, 2012; De Kort et al., 2021) but there is limited understanding of how they might contribute to reproductive isolation between S. diabolicum and S. magniae.

Populations from two sister species in the Sphagnum magellanicum complex were sampled from along a latitudinal gradient in eastern North America. (A) DNA sequences from plants of S. diabolicum (orange) and S. magniae (blue) were obtained using full genome resequencing (circles) and RADseq (squares). (B) A hardwood hammock in Florida representing the typical habitat of the largely southern species, S. magniae. (C) A kettlehole bog in New York representing the typical habitat of the northern species, S. diabolicum.
In spore-producing plants such as bryophytes, it has generally been argued that the geographical scale at which populations are genetically structured is greater than that for seed plants due to a high capacity for long-distance dispersal. Indeed, other mosses can have little genetic structure between continents (e.g. Piñeiro et al., 2012; Biersma et al., 2020) although this dispersal capacity might be limited across latitudinal bands. For Sphagnum, some species show little population differentiation between continents (Stenøien and Såstad, 1999; Kyrkjeeide et al., 2016a) and within countries (Gunnarsson et al., 2005). Other Sphagnum species have pronounced genetic structure within continents particularly along latitudinal gradients (Kyrkjeeide et al., 2016b; Duffy et al., 2022). In addition to sexual reproduction, Sphagnum can reproduce asexually via fragmentation and this mode predominates in populations of species such as S. fuscum (Gunnarsson et al., 2007) and S. subnitens (Karlin et al., 2011). While S. diabolicum and S. magniae are phylogenetically distinct, it is unknown whether populations of these sister species are genetically structured and how such differentiation might relate to their biogeography.
To date, most research on the process of speciation in Sphagnum has focused on polyploid species and their progenitors. Some 36 allopolyploid and one homoploid hybrid species are known in Sphagnum (Meleshko et al., 2018), which is comparable to the rate of polyploid speciation observed in other plant groups (Wood et al., 2009). However, polyploid speciation is just one of the mechanisms by which reproductive isolation can evolve in plant populations (Rieseberg and Willis, 2007) and accounts for a minority of peat moss diversity. Our understanding of how non-polyploid Sphagnum species arise is much more limited and no studies to date have investigated the genomic architecture of cladogenesis. In other plants, divergent selection can promote speciation between ecologically differentiated populations by rendering certain portions of the genome less permeable to gene flow, thereby elevating genetic divergence (Via, 2009). In some cases, divergent selection at a small number of loci can cause speciation through the evolution of epistatic genetic incompatibilities (Bateson–Dobzhansky–Muller incompatibilities) that reduce hybrid fitness (Lexer and Widmer, 2008). With emerging genomic resources for Sphagnum (Weston et al., 2018; Healey et al., 2023), we are finally able to determine how the peat moss genome changes during speciation and how such changes relate to their ecology.
We sought to characterize the architecture of genomic differentiation between S. diabolicum and S. magniae to better understand the process of speciation. Using a combination of genome resequencing and RADseq data, we asked the following questions: (1) What is the genetic structure of populations across the geographical landscape? (2) Which ecological variables differentiate the two species? (3) How is genetic differentiation distributed across the genome? (4) Is there evidence for natural selection in the speciation process? In pursuit of these questions, we aimed to determine whether climate adaptation has had an important role in driving the recent speciation of these ecologically important plants.
MATERIALS AND METHODS
Study system
We sampled plants in the S. magellanicum Brid. species complex along a latitudinal gradient in eastern North America from subarctic Canada to subtropical Florida (Supplementary Data Table S1). We used the R package ggmap v3.0.0 (https://github.com/dkahle/ggmap) to visualize the geographical distribution of these samples. Previous phylogenetic analyses of genetic structure within the complex suggest that four species exist within this geographical range: S. divinum, S. medium, S. diabolicum and S. magniae (Shaw et al., 2022). Of these species, S. diabolicum and S. magniae are sister taxa and differ biogeographically. Given the relatively recent speciation of S. diabolicum and S. magniae, we chose to investigate the genomic architecture of differentiation between these two lineages to better understand the mechanisms by which speciation took place.
Ecological differentiation between species
Bioclimatic variables were compared between the two sister species to assess ecological differentiation. Such environmental variables are commonly used for ecological niche and species distribution modelling (Wiens et al., 2009; Searcy and Shaffer, 2016) as well as demonstrating local adaptation in other plant lineages (e.g. Keller et al., 2012; Hämälä et al., 2020). We also calculated annual water balance and the monthly standard deviation of water balance following Campbell et al. (2021). Raster layers for temperature and precipitation were downloaded from Worldclim (v2.1, Fick and Hijmans, 2017) and for potential evapotranspiration from Trabucco and Zomer (2019) at a resolution of 2.5 arc minutes (~8.5 km). Annual water balance reflects the net amount of water that a location receives over the course of a year. Bioclimatic variables and those from Campbell et al. (2021) were obtained using the R packages raster v3.6-3 (Hijmans et al., 2022a) and dismo v1.3-9 (Hijmans et al., 2022b). We extracted these variables for each set of plant collection coordinates, then compared the two species using Mann–Whitney U tests and principal component analysis in the R statistical programming environment (R Core Team, 2022).
Genetic data
We obtained whole genome resequencing data for 17 plants: eight samples of S. diabolicum and nine samples of S. magniae. DNA was extracted from the plants using CTAB (Shaw et al., 2003b), measured for purity using a NanoDrop spectrophotometer, sheared to 400 bp using a Covaris LE220, and size-selected using the Pippin system (Sage Science, Beverly, MA, USA). Illumina libraries were prepared using the KAPA library preparation kit (KAPA Biosystems, Potters Bar, UK) and sequenced 2 × 150 bp on the Illumina HiSeq2500 sequencing platform. Illumina reads were filtered to remove those with >95 % simple sequence and those contaminated with PhiX and organellar sequence. Reads were aligned to the S. divinum v1.1 reference genome (Healey et al., 2023) using the BWA-MEM algorithm of BWA v0.7.12 (Li, 2013) and duplicates in the resulting BAM files were marked using Picard v2.2.6.2 (http://broadinstitute.github.io/picard/). Variants were called using HaplotypeCaller and joint genotyping was performed using GenotypeGVCFs from the GATK v4.2.2.0 toolkit (Van de Auwera and O’Connor, 2020). Biallelic single nucleotide polymorphisms (SNPs) were then filtered using the following criteria: QD ≥ 2.0, MQ ≥ 40.0, FS ≤ 60.0, SOR ≤ 3.0, MQRankSum ≥ −12.5 and ReadPosRankSum ≥ −8.0.
We sequenced additional individuals using double digestion restriction site-associated DNA sequencing (ddRADseq) following Duffy et al. (2020). DNA libraries were sequenced 1 × 150 bp at the Genome Sequencing Shared Resource at the Duke Center for Genomic and Computational Biology (https://biostat.duke.edu/research/center-genomic-and-computational-biology/gcb-core-facilities) using the Illumina NextSeq500 platform. Raw reads were demultiplexed using ipyrad v0.7.29 (Eaton and Overcast, 2020), inspected for quality using FastQC v0.11.9 (Andrews, 2019) and trimmed using Fastp v0.23.1 (Chen et al., 2018) to remove Illumina adapter sequences and low-quality bases. Reads were aligned to the reference genome and variants were then called as described above for the genome resequencing data.
Genetic structure and demographic modelling
We explored genetic structure within the combined dataset (resequencing and ddRADseq) using population genetic methods. Prior to analysis, sites were filtered to keep those genotyped in at least 80 % of individuals and for a minimum minor allele frequency of 0.05 using VCFtools v0.1.16 (Danecek et al., 2011). Variants were then pruned for linkage disequilibrium using PLINK v1.90b6.24 (Chang et al., 2015) with the parameters ‘--indep 50 10 1.5’. The program ADMIXTURE v1.3.0 (Alexander et al., 2009) was used to estimate ancestry among these samples with 10-fold cross-validation to select the best number of populations (K = 2 to K = 5) and 200 bootstrap replicates to quantify the standard errors of parameter estimates.
To better understand the evolutionary dynamics of speciation between S. magniae and S. diabolicum, we performed coalescent simulations with the program fastsimcoal2 v2.7.0.2 (Excoffier et al., 2021). This software allowed us to simulate what the expected site frequency spectrum (SFS) might be under various evolutionary models while marginalizing potential impacts of incomplete lineage sorting (i.e. deep coalescence). We used biallelic SNPs from the combined dataset and pruned for linkage disequilibrium using PLINK. To estimate the unfolded SFS in the presence of missing data, we used easySFS (https://github.com/isaacovercast/easySFS.git) and projected down to 20 samples of each species. The simplest model that was fitted was one in which the two species diverged in the absence of gene flow. Other models included migration rates between the clades and had gene flow occurring consistently since divergence at equal rates, occurring consistently but with a shift in migration rate at some time since divergence, occurring early in the process of speciation but absent later in time, and occurring recently but restricted immediately following species divergence (Supplementary Data Fig. S1). We fitted each model 100 times using 100 000 coalescent simulations with 40 cycles of expectation conditional maximization. The Akaike information criterion (AIC) was used to select the best run for each model and values from these five best runs were used to determine which model received the most support. We report our results in terms of generations instead of absolute time as this is how fastsimcoal models the timing of demographic events and the generation time of peat mosses is currently unknown.
Genomic architecture of differentiation
To determine how genetic differentiation is distributed within the genome, we calculated population genetic summary statistics and identified fixed biallelic differences between S. magniae and S. diabolicum. For these analyses, we used only the genome resequencing samples and did not impose a filter on minor allele frequency or linkage disequilibrium. The program pixy v1.2.5.beta1 (Korunes and Samuk, 2021) was used to calculate relative divergence (Hudson’s FST, Hudson et al., 1992), genetic distance (Dxy) and nucleotide diversity (π) for 10-kb non-overlapping windows of the genome. Genetic distance represents the average pairwise distance between samples from different populations whereas nucleotide diversity represents the average pairwise distance between samples from the same population. Relative divergence (FST) is derived from both these values and increases as the ratio of π to Dxy decreases. Fixed biallelic differences between species were identified using bcftools v1.13 (Danecek et al., 2021). To demonstrate that windows with elevated FST clustered together in genomic islands, we compared the distribution of distances between adjacent high-FST windows within a given chromosome to a null distribution generated by sampling an equal number of windows randomly without replacement 1000 times using Mann–Whitney U tests. Finally, we estimated the magnitude of linkage disequilibrium within genomic windows by calculating the average squared correlation coefficient for variants using PLINK.
Gene–environment association and selection analyses
Genome resequencing data were used to identify loci associated with environmental variables and genetic structure. For these analyses, the dataset was filtered to include only those sites genotyped in every individual with a minor allele frequency of 0.1 at minimum. We first used latent factor mixed models (LFMMs) to relate the molecular markers with PC1 of the environmental variables described above using the R package lfmm v0.0 (Cayes et al., 2019). The number of latent factors was determined using a dataset pruned for linkage disequilibrium and the full LFMM analysis was performed on the unpruned dataset using least-squares estimation with a ridge penalty. The resulting P-values were adjusted using the Benjamini–Hochberg procedure (Benjamini and Hochberg, 1995) and the significance level was set to 0.05. In a similar fashion, we used the R package pcadapt v4.3.3 (Privé et al., 2020) to identify outlier markers that are significantly related to genetic structure.
We also performed a redundancy analysis (RDA) using the R package vegan v2.6-4 (Oksanen et al., 2022) to identify groups of loci that covary with the environmental variables (Forester et al., 2018). Multicollinearity among environmental variables was assessed by estimating variance inflation factors (VIFs) and we sequentially removed the variable with the largest VIF until all remaining variables had VIF values below 10. The variables used in the final analysis included BIO2 (mean diurnal range), BIO8 (mean temperature of the wettest quarter), BIO9 (mean temperature of the driest quarter), BIO14 (precipitation of the driest month), BIO16 (precipitation of the wettest quarter), BIO17 (precipitation of the driest quarter), elevation and the monthly standard deviation of water balance. SNP loadings exceeding 2.5 standard deviations were considered significant.
To determine which genes had evidence for positive selection, we used McDonald–Kreitman tests which compared levels of within-species polymorphism to between-species divergence. For each genome resequencing sample, we used FastaAlternateReferenceMaker from the GATK toolkit to create an alternative sequence of the S. divinum reference genome with a dataset that contained sites genotyped in all resequenced individuals. Predicted transcripts were then extracted from these sequences using gffread v0.12.7 (Pertea and Pertea, 2020) and tested for selection using the R package PopGenome v2.7.5 (Pfeifer et al., 2014). Finally, for each gene we estimated the rate ratio of non-synonymous (dN) to synonymous (dS) substitutions using the Fixed Effects Likelihood algorithm (Kosakovsky Pond and Frost, 2005) implemented in the program HyPhy v2.5.42 (Kosakovsky Pond et al., 2019).
Functional enrichment of adaptive loci
We sought to identify the biological processes associated with genes and sequence variants that might be involved in local adaptation and species differentiation. SNPs were annotated using SnpEff v5.1d (Cingolani et al., 2012) and the S. divinum reference genome to determine if they are near or within a gene and what their functional consequences might be (e.g. if an SNP is predicted to be a missense mutation). Genes within islands of differentiation were identified using BEDTools v2.27.1 (Quinlan et al., 2010). Gene Ontology (GO) term enrichment analyses were performed using the R package clusterProfiler v4.4.4 (Wu et al., 2021).
RESULTS
Genetic structure and ecological differentiation
Analysis of genomic data using principal components and software designed to estimate ancestry suggest strong genetic structure between S. diabolicum and S. magniae, in agreement with previous phylogenomic findings (Shaw et al., 2022). Ancestry analysis using ADMIXTURE at K = 2 revealed that most samples have no evidence of admixture (Fig. 2A). However, several samples collected from where the species’ ranges overlap (including North Carolina and Tennessee) show evidence of admixture with 3–28 % mixed ancestry, although this could possibly reflect a combination of admixture and incomplete lineage sorting. Nevertheless, principal component analysis of SNP data resolved two major groups along PC1 (12.1 % variance explained) corresponding to the two species (Fig. 2B). To better understand the process of speciation, we performed coalescent simulations and found that the best speciation model was a scenario in which the two species diverged from one another roughly 223 000 generations ago in the absence of gene flow and experienced secondary contact more recently (Fig. 2C).

Molecular data demonstrate genetic differentiation between S. diabolicum (orange) and S. magniae (blue). (A) Ancestry estimation using ADMIXTURE suggests that most samples are of non-admixed ancestry but some show evidence of admixture where the species’ ranges overlap. (B) Principal component analyses demonstrate that the two species are genetically distinct. (C) Coalescent analyses favour a model of secondary contact with gene flow initially restricted following divergence from a common ancestor (grey) but occurring more recently. Rectangle width reflects the log10-transformed estimates of effective population size.
Largely due to their differentiation along a latitudinal gradient (Fig. 2A), these two species of Sphagnum experience quite different climates. Principal component analysis of bioclimatic variables extracted from sample coordinates revealed that the species were separated along PC1 (52.2 % variance explained) which is driven primarily by BIO11 (mean temperature of coldest quarter, 8.5 % contribution), BIO1 (mean annual temperature, 8.4 % contribution), BIO6 (minimum temperature of coldest month, 8.4 % contribution) and BIO3 (isothermality, 8.0 % contribution) (Supplementary Data Fig. S2). The bioclimatic variables relating to temperature had the largest effect sizes between species, but variables relating to precipitation and water balance also differentiate the species (Mann–Whitney U tests, Table S2).
Genomic landscape of differentiation
Genome-wide FST scans revealed that genetic differentiation between S. diabolicum and S. magniae is heterogeneously distributed across chromosomes and concentrated within distinct regions of the genome (Fig. 3A). Average FST across genomic windows is 0.12 [median 0.07, interquartile range (IQR) 0.13]. The distribution of mean FST within 10-kb windows is positively skewed (skewness = 2.39) and leptokurtic (kurtosis = 9.41), and some windows have exceptionally high differentiation where mean FST is elevated to ≥0.59 (z-score >3; ~3 % of all windows). Across these outlier windows FST is 0.72 on average (median = 0.71, IQR = 0.14). Chromosomes with the largest proportion of FST outlier windows include chromosome 16 (15.8 %), chromosome 11 (13.15 %) and chromosome 1 (10.2 %) whereas those with the smallest proportion include chromosome 10 (0.5 %), chromosome 4 (1.2 %) and chromosome 15 (1.45 %). Contiguous FST outlier windows of the genome (i.e. adjacent 10-kb FST outlier windows) are roughly 44 kb on average (median = 20 kb, IQR = 30 kb) but can reach in excess of 400 kb on chromosomes such as chromosome 11 and chromosome 16. Comparison of the distances between adjacent FST outlier windows to those from randomly selected windows demonstrates that the outlier windows tend to be clustered together on chromosomes within genomic islands (Supplementary Data Table S3).

The genomic landscape of differentiation between sister species of Sphagnum is heterogeneous and shows hallmarks of linked selection. (A) Manhattan plot of FST estimated across 10-kb sliding windows. Mean FST across the genome is shown by a solid blue line and the threshold for FST outliers is depicted by a dotted blue line. (B) FST is negatively correlated with reduced nucleotide diversity (π) but not genetic distance (Dxy) across all autosomal sliding windows. (C) There are ‘islands’ of genomic differentiation within chromosomes, such as chromosome 16 shown here, that have high levels of genetic divergence (FST and Dxy) and low nucleotide diversity.
Across the entire set of 10-kb windows, the relative divergence metric of FST is uncorrelated with the absolute divergence metric of genetic distance (Dxy) (Spearman rank test; ρ = 0, P > 0.05), negatively correlated with nucleotide diversity (π) (ρ = -0.28, P < 0.001) and positively correlated with linkage disequilibrium (LD) (ρ = 0.14, P < 0.001) (Fig. 3B; Supplementary Data Fig. S3). However, across FST outlier windows a positive relationship between FST and Dxy was apparent (ρ = 0.16, P < 0.001; Fig. S3). Indeed, relative to the rest of the genome these FST outlier windows have significantly higher Dxy (Mann–Whitney U test; r = 0.06, P < 0.001), lower π (r = 0.22, P < 0.001) and a greater magnitude of LD (r = 0.17, P < 0.001) (Fig. S4). Together, these islands of differentiation are characterized by correlated patterns of variation (Fig. 3C). Importantly, however, there are many regions of the genome that have high FST and reduced π, but Dxy is roughly equal to or even lower than genome-wide average Dxy (Figs S5–S14).
Islands of differentiation, molecular adaptation and gene–environment associations
Regions of the genome with elevated FST harbour 542 genes and contain nearly all (~91 %) of the 74 344 fixed nucleotide differences between S. diabolicum and S. magniae. Indeed, the proportion of segregating sites that are fixed differences between species is much greater within these regions compared to the rest of the genome (Fisher’s exact test, P < 2.2E-16). Roughly 10 % of the fixed differences within these genomic islands of differentiation are present within genes and 645 result in missense mutations. Results from the pcadapt method show that all of these fixed differences are outliers with respect to their association with genetic structure, as are over 71 000 other SNPs (Fig. 4A). Furthermore, we found that predicted transcripts from protein-coding loci within islands of differentiation have greater rates of non-synonymous to synonymous substitution (median dN/dS = 0.46, IQR = 0.61) relative to transcripts outside of these islands (median dN/dS = 0.39, IQR = 0.54) (Mann–Whitney U test; r = 0.03, P < 0.001).

Integration of results across analyses highlights the genetic loci responsible for local adaptation. (A) Venn diagram showing single nucleotide polymorphisms (SNPs) that are fixed differences between species, significantly associated with genetic structure as inferred using pcadapt, and significantly associated with bioclimatic variables using redundancy analysis (RDA) and latent factor mixed model (LFMM) analysis. (B) Upset plot showing the genes that have fixed missense differences between species, are present in genomic islands of differentiation (FST outliers) and lie within 5 kb of SNPs significantly associated with genetic structure or bioclimatic variables.
To determine whether genomic islands of differentiation might be caused by molecular adaptation, we performed McDonald–Kreitman tests for positive selection and found that 20 transcripts from ten genes show significant evidence of adaptive evolution (Table 1). Based on sequence homology, these genes are likely related to disease resistance (Sphmag04G015800 and Sphmag04G080100), intracellular signalling pathways (Sphmag08G014600), aldehyde oxidation (Sphmag09G009500), ubiquitination (Sphmag04G080100 and Sphmag12G041600) and regulation of gene expression (Sphmag14G070000, Sphmag16G011900 and Sphmag16G013600). Seven of the ten genes are also found within islands of differentiation. It should be noted, however, that this test relies only on rates of substitution within coding sequences and was not designed to detect adaptation involving changes to other genomic features such as regulatory elements.
Genes with significant evidence for molecular adaptation as determined by McDonald–Kreitman tests. For each gene, information is provided for the best BLAST hit to the Arabidopsis thaliana TAIR10 annotation, a brief description of the gene and whether this gene is present within a genomic island of differentiation.
Locus . | Transcript(s) . | Best Arabidopsis hit . | Description . | Within island . |
---|---|---|---|---|
Sphmag04G015800 | Sphmag04G015800.1 | AT4G27220 | NB-ARC domain-containing disease resistance protein | No |
Sphmag04G080100 | Sphmag04G080100.1 | AT2G42030 (MUSE2) | RING-U-box superfamily protein | No |
Sphmag04G080800 | Sphmag04G080800.1 | AT5G62670 (HA11) | H+-ATPase | No |
Sphmag08G014600 | Sphmag08G014600.1 Sphmag08G014600.2 | AT5G10720 (HK5) | Histidine kinase | Yes |
Sphmag09G009500 | Sphmag09G009500.1 Sphmag09G009500.2 | AT1G44170 (ALDH3H1) | Aldehyde dehydrogenase | Yes |
Sphmag12G041600 | Sphmag12G041600.4 Sphmag12G041600.5 Sphmag12G041600.6 | AT3G49600 (UBP26) | Ubiquitin-specific protease | Yes |
Sphmag14G070000 | Sphmag14G070000.1 Sphmag14G070000.2 Sphmag14G070000.3 | AT3G30530 (bZIP42) | Basic leucine-zipper transcription factor | Yes |
Sphmag16G010100 | Sphmag16G010100.2 | N/A | N/A | Yes |
Sphmag16G011900 | Sphmag16G011900.1 | AT5G64220 (CAMTA2) | Calmodulin-binding transcription activator | Yes |
Sphmag16G013600 | Sphmag16G013600.1 Sphmag16G013600.2 Sphmag16G013600.3 Sphmag16G013600.4 Sphmag16G013600.5 | AT5G19400 (SMG7) | Telomerase activating protein | Yes |
Locus . | Transcript(s) . | Best Arabidopsis hit . | Description . | Within island . |
---|---|---|---|---|
Sphmag04G015800 | Sphmag04G015800.1 | AT4G27220 | NB-ARC domain-containing disease resistance protein | No |
Sphmag04G080100 | Sphmag04G080100.1 | AT2G42030 (MUSE2) | RING-U-box superfamily protein | No |
Sphmag04G080800 | Sphmag04G080800.1 | AT5G62670 (HA11) | H+-ATPase | No |
Sphmag08G014600 | Sphmag08G014600.1 Sphmag08G014600.2 | AT5G10720 (HK5) | Histidine kinase | Yes |
Sphmag09G009500 | Sphmag09G009500.1 Sphmag09G009500.2 | AT1G44170 (ALDH3H1) | Aldehyde dehydrogenase | Yes |
Sphmag12G041600 | Sphmag12G041600.4 Sphmag12G041600.5 Sphmag12G041600.6 | AT3G49600 (UBP26) | Ubiquitin-specific protease | Yes |
Sphmag14G070000 | Sphmag14G070000.1 Sphmag14G070000.2 Sphmag14G070000.3 | AT3G30530 (bZIP42) | Basic leucine-zipper transcription factor | Yes |
Sphmag16G010100 | Sphmag16G010100.2 | N/A | N/A | Yes |
Sphmag16G011900 | Sphmag16G011900.1 | AT5G64220 (CAMTA2) | Calmodulin-binding transcription activator | Yes |
Sphmag16G013600 | Sphmag16G013600.1 Sphmag16G013600.2 Sphmag16G013600.3 Sphmag16G013600.4 Sphmag16G013600.5 | AT5G19400 (SMG7) | Telomerase activating protein | Yes |
Genes with significant evidence for molecular adaptation as determined by McDonald–Kreitman tests. For each gene, information is provided for the best BLAST hit to the Arabidopsis thaliana TAIR10 annotation, a brief description of the gene and whether this gene is present within a genomic island of differentiation.
Locus . | Transcript(s) . | Best Arabidopsis hit . | Description . | Within island . |
---|---|---|---|---|
Sphmag04G015800 | Sphmag04G015800.1 | AT4G27220 | NB-ARC domain-containing disease resistance protein | No |
Sphmag04G080100 | Sphmag04G080100.1 | AT2G42030 (MUSE2) | RING-U-box superfamily protein | No |
Sphmag04G080800 | Sphmag04G080800.1 | AT5G62670 (HA11) | H+-ATPase | No |
Sphmag08G014600 | Sphmag08G014600.1 Sphmag08G014600.2 | AT5G10720 (HK5) | Histidine kinase | Yes |
Sphmag09G009500 | Sphmag09G009500.1 Sphmag09G009500.2 | AT1G44170 (ALDH3H1) | Aldehyde dehydrogenase | Yes |
Sphmag12G041600 | Sphmag12G041600.4 Sphmag12G041600.5 Sphmag12G041600.6 | AT3G49600 (UBP26) | Ubiquitin-specific protease | Yes |
Sphmag14G070000 | Sphmag14G070000.1 Sphmag14G070000.2 Sphmag14G070000.3 | AT3G30530 (bZIP42) | Basic leucine-zipper transcription factor | Yes |
Sphmag16G010100 | Sphmag16G010100.2 | N/A | N/A | Yes |
Sphmag16G011900 | Sphmag16G011900.1 | AT5G64220 (CAMTA2) | Calmodulin-binding transcription activator | Yes |
Sphmag16G013600 | Sphmag16G013600.1 Sphmag16G013600.2 Sphmag16G013600.3 Sphmag16G013600.4 Sphmag16G013600.5 | AT5G19400 (SMG7) | Telomerase activating protein | Yes |
Locus . | Transcript(s) . | Best Arabidopsis hit . | Description . | Within island . |
---|---|---|---|---|
Sphmag04G015800 | Sphmag04G015800.1 | AT4G27220 | NB-ARC domain-containing disease resistance protein | No |
Sphmag04G080100 | Sphmag04G080100.1 | AT2G42030 (MUSE2) | RING-U-box superfamily protein | No |
Sphmag04G080800 | Sphmag04G080800.1 | AT5G62670 (HA11) | H+-ATPase | No |
Sphmag08G014600 | Sphmag08G014600.1 Sphmag08G014600.2 | AT5G10720 (HK5) | Histidine kinase | Yes |
Sphmag09G009500 | Sphmag09G009500.1 Sphmag09G009500.2 | AT1G44170 (ALDH3H1) | Aldehyde dehydrogenase | Yes |
Sphmag12G041600 | Sphmag12G041600.4 Sphmag12G041600.5 Sphmag12G041600.6 | AT3G49600 (UBP26) | Ubiquitin-specific protease | Yes |
Sphmag14G070000 | Sphmag14G070000.1 Sphmag14G070000.2 Sphmag14G070000.3 | AT3G30530 (bZIP42) | Basic leucine-zipper transcription factor | Yes |
Sphmag16G010100 | Sphmag16G010100.2 | N/A | N/A | Yes |
Sphmag16G011900 | Sphmag16G011900.1 | AT5G64220 (CAMTA2) | Calmodulin-binding transcription activator | Yes |
Sphmag16G013600 | Sphmag16G013600.1 Sphmag16G013600.2 Sphmag16G013600.3 Sphmag16G013600.4 Sphmag16G013600.5 | AT5G19400 (SMG7) | Telomerase activating protein | Yes |
Both RDA and LFMM analysis point to associations between thousands of SNPs and environmental variables (Fig. 4A). Using RDA, we found that the constrained ordination explained roughly 7.5 % of the genotypic variance and the first constrained axis (RDA1) was the most important (Supplementary Data Fig. S15). All of the 79 215 SNPs associated with the first constrained axis (RDA1) were also significant in the LFMM analysis, while most of the 142 765 SNPs associated with environmental PC1 were also significant using RDA and pcadapt. Over 1500 genes had variants that were significant in both RDA and LFMM analyses, while another 3818 genes were significant using LFMM but not when using RDA (Fig. 4B).
Integrating the results from all of our analyses, we identified 153 genes that differentiate species and are likely involved in local adaptation to contrasting environments (Supplementary Data Table S4). These genes are those that are present within islands of differentiation, have fixed differences between species that represent missense mutations, and are within 5 kb of SNPs that were found to be significant using pcadapt and the gene–environment association analyses (Fig. 4B). These genes are enriched for biological processes such as photoperiodism, detection of abiotic stimuli, abiotic stress responses and hormone-mediated signalling (Table S5). GO term enrichments for all genes within 5 kb of significant SNPs are largely similar to those found exclusively within islands of differentiation, except the latter category does not include biological processes related to anatomical development, biotic response or cell wall biosynthesis (Table S6).
DISCUSSION
This research demonstrates that the recent speciation of two sister clades in the Sphagnum magellanicum complex, S. diabolicum and S. magniae, was driven in large part by natural selection on genetic loci in response to bioclimatic ecological variables. We find that ecological divergence of these two clades along a latitudinal gradient in eastern North America involves factors relating to temperature, precipitation and water balance. Genetic data suggest that most populations along this gradient are not admixed, but some plants occurring where the species’ biogeographical ranges nearly overlap have hybrid ancestry comprising up to one-quarter of their genome. Genetic differentiation between S. diabolicum and S. magniae is heterogeneously distributed across the genome, with some genomic intervals having FST elevated to nearly eight-fold the genome-wide average. These outlier intervals also have reduced nucleotide diversity, suppressed recombination and often greater genetic distance. Furthermore, such regions harbour SNP variation that is statistically associated with bioclimatic variables and genes bearing the footprints of positive selection.
Ecological divergence between sister species such as these is widespread throughout eukaryotes (Hernández-Hernández et al., 2021) but the role of molecular adaptation in the early stages of speciation is less clear. Radiations involving the emergence of many closely related species within a short period of time are often linked to rapid rates of niche evolution, as evidenced in groups such as Darwin’s finches (Lamichhaney et al., 2015), stickleback fishes (Marques et al., 2018) and New World lupins (Nevado et al., 2016). Nevertheless, other mechanisms including chromosomal rearrangement and (particularly for plants) polyploidy can cause reproductive isolation (Levin, 2019). In such cases, ecologically mediated selection might occur later during the speciation process and potentially reinforce species boundaries (Pfennig, 2016) but not initiate reproductive isolation. In the case of S. diabolicum and S. magniae, the data are consistent with both divergent selection contributing to and reinforcing reproductive isolation.
In the genomic era, there is increasing evidence for the presence of ‘islands of differentiation’ in the genomes of closely related species in which differentiation among loci is particularly high relative to the genome-wide background (Wolf and Ellegren, 2017). Such islands with elevated FST are present on virtually all chromosomes when comparing S. diabolicum to S. magniae. Within-species nucleotide diversity is low within these islands consistent with natural selection (Cruickshank and Hahn, 2014), but different patterns of between-species nucleotide diversity (Dxy) exist that can provide insight into how selection forms such islands (Irwin et al., 2018; Supplementary Data Fig. S16). Regions with higher-than-average Dxy are consistent with a ‘divergence with gene flow’ model of island formation in which selection at a locus causes reproductive isolation to evolve between populations. Nearly 44 % of FST outlier windows in this study have Dxy values in the upper quartile consistent with this first model and the most prominent such island is within the first few megabases of chromosome 16. The formation of genomic islands with Dxy approximately equal to that of the genome-wide average is best explained using a ‘selection in allopatry’ model in which selection acts within populations following divergence. In our study, roughly 37 % of FST outlier windows have Dxy values within the interquartile range which probably correspond to this allopatric selection model. Finally, both the ‘recurrent selection’ and ‘geographical sweep before differentiation’ models predict low Dxy in genomic islands due to either selection occurring in both the ancestral and daughter populations or acting to rapidly spread beneficial alleles across a geographically structured species, respectively. Around 20 % of FST outlier windows in this study have low levels of Dxy consistent with these final two models of island formation and can be seen on chromosomes such as chromosome 2 and chromosome 4.
Our demographic modelling and ADMIXTURE results suggests that secondary contact has occurred between these two species, and this could have contributed to the formation of genomic islands through antagonism between divergent selection and gene flow (Via and West, 2008). Nevertheless, the absence of high genetic distance in the majority of such islands and elevated LD suggests a stronger role of linked selection and divergence hitchhiking in island formation (Ma et al., 2018). These two scenarios are not mutually exclusive, however, and low levels of admixture could provide new targets for divergent selection to act on rather than relying on standing ancestral polymorphism or emergence of novel mutations. Adaptive introgression is well known in other groups such as butterflies (Pardo-Diaz et al., 2012) and poplars (Rendón-Anaya et al., 2021), but more rigorous population-level sampling of these peat moss species would be necessary to further investigate this possibility.
Genes within genomic islands that differentiate S. diabolicum and S. magniae are largely related to abiotic stress tolerance, suggesting a role for local adaptation in speciation. Genes including heat shock proteins (HSPs) such as Sphmag01G114100, heat shock factors (HSFs) such as Sphmag16G012100 or those involved in the ubiquitin–proteasome system such as Sphmag12G041600 are likely to be directly involved with tolerance to abiotic stress, for example high temperature (Xu and Xue, 2019; Bourgine and Guihur, 2021). Others might be responsible for coordinating the signalling cascades involved in stress perception and response, such as the ABSCISIC ACID INSENSITIVE 3 (ABI3) homologue Sphmag06G045200 or members of the APETALA2/ethylene response factor (AP2/ERF) transcription factor family. Indeed, local adaptation to temperature and precipitation in other plant species such as poplar (Zhang et al., 2019) often involves hormonal crosstalk and similar changes to components of gene regulatory networks. Those transcription factors that might be particularly important, given their presence within islands of differentiation and significant McDonald–Kreitman tests, involve the calmodulin-binding transcription factors encoded by Sphmag16G011900 or the basic leucine zipper (bZIP) transcription factors encoded by Sphmag14G070000 that have homology to proteins involved in abiotic stress responses in plants such as switchgrass (Liu and Chu, 2015). However, detailed systems biology studies characterizing the molecular response of Sphagnum to different abiotic stressors would be necessary to confirm this hypothesis.
Divergent selection on genes involved in perceiving light levels and regulating photoperiod is often detected in plant species distributed along environmental gradients, such as in Arabidopsis (Samis et al., 2008) or poplar (Keller et al., 2012). In this study, we found that these biological processes were enriched in genes that differentiate species and show gene–environment associations. For example, genomic signatures of local adaptation can be found in proteins that regulate photoreceptor expression. Intriguingly, genes putatively related to the production of flavonoid pigments, such as homologues of TRASNPARENT TESTA 7/ FLAVONOID 3ʹ-HYDROXYLASE and TRANSPARENT TESTA 8 (Appelhagen et al., 2014), are present in islands of differentiation and might be involved in the pigmentation differences between the generally red S. diabolicum and often green S. magniae. In addition to mitigating the harmful effects of ultraviolet radiation, red–violet pigments such as the anthocyanins of flowering plants can protect against oxidative stress and mediate biotic interactions (Landi et al., 2015). Biotic factors also probably drive divergent selection between S. diabolicum and S. magniae, as evidenced by adaptive evolution of the disease resistance R-protein Sphmag04G015800 and the presence of genes in the Bcl-2-associated athanogene (BAG) family within islands of differentiation (van Ooijen et al., 2008; You et al., 2016). Future research involving common garden studies and reciprocal transplants along latitudinal gradients could provide further insight into which abiotic and/or biotic factors might represent particularly strong selective pressures for S. diabolicum and S. magniae. Such studies could be supplemented with functional characterization of candidate genes, as inferences regarding gene function in Sphagnum that are based on sequence homology to model systems such as Arabidopsis may not always be accurate. Nevertheless, there is increasing evidence for conservation of certain stress response genes throughout land plants (e.g. the abscisic acid signalling pathway: Komatsu et al., 2009; Takezawa et al., 2015) which highlights the possibility that our findings in Sphagnum could help to elucidate the core components of climate resiliency that are shared across plant lineages.
In summary, we found that divergent selection has played an important role in the recent speciation of S. diabolicum and S. magniae. Our results demonstrate that adaptation to warmer climates drove the evolution of S. magniae by contributing to reproductive isolation from the more northern S. diabolicum. These two species are an excellent system for studying the process of speciation at the genomic level, as their recent divergence allows for dissecting specific islands of differentiation that are not present even in other species comparisons within the S. magellanicum complex (Supplementary Data Fig. S17). By identifying the loci responsible for the divergence of S. diabolicum and S. magniae and relating their molecular evolution to climate, this study provides a framework to further investigate how plants might adapt to changing environments.
SUPPLEMENTARY DATA
Supplementary data are available online at https://academic.oup.com/aob and consist of the following.
Table S1: Collection information for samples used in this study. Table S2: Environmental differentiation between samples of S. magniae and S. diabolicum. Table S3: Mann–Whitney U tests demonstrate that the distributions of distances between adjacent FST outlier windows on chromosomes are smaller than distributions of distances between randomly selected windows. Table S4: Genes that are present within islands of differentiation, have fixed differences between species that represent missense mutations, and are within 5 kb of SNPs that were significant using pcadapt and the gene–environment association analyses. Table S5: Gene ontology term enrichments for those genes that are present within islands of differentiation, have fixed differences between species that represent missense mutations, and are within 5 kb of SNPs that were significant using pcadapt and the gene–environment association analyses. Table S6: Gene ontology term enrichments for those genes that are within 5 kb of SNPs that were significant using pcadapt and the gene–environment association analyses. Figure S1: Models of divergence used for demographic analyses. Figure S2: Biplot from the principal component analysis of bioclimatic variables for all samples used in this study. Figure S3: Plots depicting the relationship of FST to other statistics calculated from all non-overlapping 10-kb windows of the genome and those windows that have outlier FST values. Figure S4: Boxplots depicting the distributions of population genetic statistics for 10-kb genomic windows that have outlier FST values and for those windows from the rest of the genome. Figure S5: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 1 and 2. Figure S6: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 3 and 4. Figure S7: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 5 and 6. Figure S8: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 7 and 8. Figure S9: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 9 and 10. Figure S10: Plot of relative genetic distance, absolute genetic distance, and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 11 and 12. Figure S11: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 13 and 14. Figure S12: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 15 and 16. Figure S13: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosomes 17 and 18. Figure S14: Plot of relative genetic distance, absolute genetic distance and nucleotide diversity values calculated from 10-kb non-overlapping windows within the autosomal chromosome 19 and the sex-linked chromosome 20. Figure S15: Biplot and screeplot from the redundancy analysis of environmental and SNP data. Figure S16: Predictions of relative genetic divergence, absolute genetic divergence and nucleotide diversity for the four models of genomic differentiation island formation. Figure S17: Analysis of data demonstrating that FST is elevated across most of the genome in pairs of North American species from the Sphagnum magellanicum complex in contrast to the ‘islands of differentiation’ seen in the S. diabolicum–S. magniae comparison.
FUNDING
B.P. was supported by funding from the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract no. DE-AC05–00OR22725. The work (proposal: 10.46936/10.25585/60001030) (A.J.S.) conducted by the U.S. Department of Energy Joint Genome Institute (https://ror.org/04xm1d337), a DOE Office of Science User Facility, is supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC02-05CH11231. This research used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory. Additional support was provided by the DOE BER Early Career Research Program to D.J.W. and by NSF grants DEB-1737899 and DEB-1928514 to A.J.S. This research was also supported in part by the Tom and Bruce Shinn Fund from the North Carolina Native Plant Society.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
ACKNOWLEDGEMENTS
We thank Jane Grimwood and Jenell Webber of the Genome Sequencing Center at HudsonAlpha for preparing and sequencing the genome resequencing libraries. This paper has been authored at UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide licence to publish or reproduce the published form of this paper, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public.access-plan).
DATA AVAILABILITY
Sequencing data have been deposited in the NCBI BioProject database under BioProject accession number PRJNA907771 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA907771) and accession numbers for individual libraries can be found in Supplementary Data Table S1. The genome assembly and annotation for the S. divinum reference (v.1.1) is freely available at Phytozome (https://phytozome-next.jgi.doe.gov/). Since the collection and sequencing of the reference genome began, the taxonomy of the S. magellanicum complex has undergone substantial revision (Hassel et al., 2018; Shaw et al., 2022, 2023). As such, the S. divinum reference genome is currently listed as ‘S. magellanicum’ on Phytozome and we are in the process of updating that database.
AUTHOR CONTRIBUTIONS
Bryan Piatkowski: conceptualization (lead), methodology (lead), software, formal analysis, investigation, resources (equal), writing – original draft (lead), writing – review & editing (lead), visualization, funding acquisition (supporting). David Weston: resources, writing – original draft (supporting), writing – review & editing (supporting), funding acquisition (lead). Aaron Duffy: methodology (supporting), resources (equal), data curation (equal). Blanka Aguero: resources (equal). Karn Imwattana: methodology (supporting), resources (equal), writing – original draft (supporting). Adam Healey: resources (equal), data curation (equal), writing – original draft (supporting), writing – review & editing (supporting). Jeremy Schmutz: resources (equal). A. Jonathan Shaw: conceptualization (supporting), resources (equal), writing – original draft (supporting), writing – review & editing (supporting), funding acquisition (lead).