Genome-wide patterns of genetic diversity, population structure and demographic history in mānuka (Leptospermum scoparium) growing on indigenous Māori land

Abstract Leptospermum scoparium J. R. Forst et G. Forst, known as mānuka by Māori, the indigenous people of Aotearoa (New Zealand), is a culturally and economically significant shrub species, native to New Zealand and Australia. Chemical, morphological and phylogenetic studies have indicated geographical variation of mānuka across its range in New Zealand, and genetic differentiation between New Zealand and Australia. We used pooled whole genome re-sequencing of 76 L. scoparium and outgroup populations from New Zealand and Australia to compile a dataset totalling ~2.5 million SNPs. We explored the genetic structure and relatedness of L. scoparium across New Zealand, and between populations in New Zealand and Australia, as well as the complex demographic history of this species. Our population genomic investigation suggests there are five geographically distinct mānuka gene pools within New Zealand, with evidence of gene flow occurring between these pools. Demographic modelling suggests three of these gene pools have undergone expansion events, whilst the evolutionary histories of the remaining two have been subjected to contractions. Furthermore, mānuka populations in New Zealand are genetically distinct from populations in Australia, with coalescent modelling suggesting these two clades diverged ~9–12 million years ago. We discuss the evolutionary history of this species and the benefits of using pool-seq for such studies. Our research will support the management and conservation of mānuka by landowners, particularly Māori, and the development of a provenance story for the branding of mānuka based products.


Introduction
According to Māori lore, all trees in Aotearoa are the children of Tāne Mahuta, God of the forest [1,2]. Māori have many uses for the taonga (culturally significant/treasure) tree species mānuka (L. scoparium J. R. Forst et G. Forst) including for medicine [3,4], food [5,6], hunting [7], fishing [6,8], weaponry [6,9,10] and as a building material [6]. More recently, mānuka has been used to produce high value honey, including by Māoriowned agribusinesses, pushing the export value of the New Zealand honey industry to $348 million (NZD) in 2018 [11,12]. Indigenous communities and businesses across New Zealand are seeking knowledge about the genetic variation and evolutionary history of mānuka as it is a culturally, ecologically and economically important species. This is so that an authentic and higher value honey industry based on indigenous plants of known New Zealand provenance can be established, and so that this genetic resource can be managed more sustainably for the future.
The genus Leptospermum J. R. Forst et G. Forst of the Myrtaceae family comprises approximately 87 species, predominantly distributed throughout southeast Australia (Victoria, New South Wales, Tasmania); species are also present in South East Asia, New Guinea, Rarotonga and New Zealand [13][14][15][16][17][18]. Specifically, one native species resides in New Zealand -L. scoparium. L. scoparium, which is also native to south-east Australia, is a woody tree species, distributed from coastal to subalpine environments (1800 m above sea level) [11]. In New Zealand, L. scoparium is indigenously known as mānuka or kahikātoa, and is found across all of New Zealand's major islands, including the Chatham Islands [19]. Mānuka is able to establish in adverse environments (e.g. low fertility soils, peat swamps, volcanic soils, geothermal areas, exposed coastal, sub alpine) and is a primary coloniser of recently disturbed habitats (e.g. post-fire and deforestation) [19,20].
Current understanding of the population structure within New Zealand mānuka is predominantly based on studies of its antibacterial metabolic compounds and morphological variation [20][21][22][23]. Perry, et al. [22] were able to establish via common garden experiments that regional differences in mānuka's triketone chemotypes are not determined by abiotic factors, demonstrating there is a genetic contribution to the regional differentiation of essential oil compounds in mānuka. Additionally, Perry, et al. [22] identified distinct chemotypic variation between L. scoparium samples from Australia and New Zealand, and recommended a taxonomic revision of this species. Expanding on this earlier work, Douglas, et al. [21] carried out extensive chemotypic sampling of mānuka across New Zealand, elucidating 11 chemotypes that displayed degrees of geographic association. Unique compounds are also found in the nectar of mānuka (dihydroxyacetone) and its honey (methylglyoxal) [24][25][26][27], leading to the development of several medical wound care products in recent years, as well as demand as a health food. Studies by Williams, et al. [28] and Stephens [29] indicate regional variation in the dihydroxyacetone content of mānuka's nectar and the non-peroxide antibacterial activity of mānuka honey, respectively, across New Zealand. Additionally, morphological and phenological variation of mānuka across New Zealand was found to correlate with geographic and environmental factors, and common garden experiments indicated that differences were likely linked to genetic variation [20].
Recently, a molecular study of mānuka was published [30], revealing the phylogenetic grouping of New Zealand mānuka into three clades -a Northland clade, a central and southern North Island clade and a South Island clade. However, as an inadequate sample size prevented Buys et al. [30] from carrying out population genetic analyses, the population genetic structure of mānuka across New Zealand remains unexplored. Similarly, gaps remain in our knowledge regarding the demographic and evolutionary history of L. scoparium in New Zealand and the relationships between New Zealand and Australian populations of the species. It is hypothesised that L. scoparium arrived in New Zealand from Australia (either from the mainland or from Tasmania) during the Miocene (23.03-5.33 Mya) [13,32]. During this time, New Zealand has been subjected to geological events (mountain building, volcanism) [33,34] and glacial cycling (12000-5 Mya) [35,36] that may have shaped the diversity and distribution of mānuka following its establishment in New Zealand.
The use of pooled sequencing (pool-seq) to obtain genome-wide variants provides a relatively economical means to study genomic variation at a population scale [37][38][39][40][41][42]. Equal quantities of high-quality DNA from multiple individuals (preferably n= >25 [43],) are pooled for each population and sequenced using next-generation sequencing (NGS). Mapping to a high-quality reference genome reveals genetic variants in the NGS data -commonly single nucleotide polymorphisms (SNPs). Provided that read coverage across sites is high (>100x) to ensure reliability [43,44], various analyses can then be carried out, including: population genomic analyses [37,45,46]; genome wide association studies [47,48]; gene environment association studies [38,49]; and the modelling of evolutionary histories [50,51]. Pool-seq is an increasingly popular method, and provided the correct quality control steps are taken, validation studies have proven it to be a valuable and informative tool [39,44,52,53]. Alternatives to pool-seq include genotype-by-sequencing (GBS) methods, however such approaches only capture a fraction of the genome variants, are error-prone and produce missing data [54].
A complete assembly of the mānuka genome was recently developed, scaffolded into the expected 11 pseudo-chromosomes, which are syntenic to the related Myrtaceae model species Eucalyptus grandis [55]. In the present study we applied pooled whole genome resequencing of mānuka and kānuka (Kunzea robusta de Lange et Toelken, used as outgroup) collected from naturally occuring stands on indigenous Māori owned land across New Zealand and in native stands on public land in Australia to identify genome-wide DNA variants. With this data, we differentiate local provenances of L. scoparium within New Zealand, and between New Zealand and Australian populations, as well as provide insights into the complex demographic and evolutionary history of this species.

Whole genome re-sequencing
The DNA of approximately 30 trees was pooled for each New Zealand site (n = 68) and 15 to 30 trees from each Australian site (n = 6) (Supplementary material 1; 2). For some populations we were unable to extract adequate DNA from all 30 samples, thus pools for some populations comprised fewer than 30 individuals. From the Novaseq 6000 sequencing runs a total of 5811.53x10 9 nucleotidic bases ((1x10 9 = Gb) from 19.27x10 9 sequencing reads were generated, with an average 253.61 billion sequencing reads and 76.46 Gb per collection site. The quality of the sequencing reads was high, with most sequences having a quality score greater than Q30 across the 150 bp reads (data not shown). Read mapping to the reference genome was consistent among populations and ranged between 50 and 64% of the L. scoparium reads mapping to the reference genome of L. scoparium "Crimson Glory" and 26% of the kānuka reads mapping to the mānuka reference genome (Supplementary material 3). The remaining sequences

Variant detection
A total of 10.14 million candidate variants (SNPs and indels) were detected across the 76 DNA pools and throughout the genome prior to filtering. A stringent filtering protocol was applied to the raw set of variants to remove sequencing errors, rare variants and variants occurring in low depth regions, only keeping SNPs with an average read depth greater than 100X and no missing data, resulting in 5 503 881 SNPs being detected across the reference assembly of "Crimson Glory". When further filtering was applied using minor allele frequencies (MAF) of 5% and 2%, 1 498 405 and 2 513 694 SNPs were detected across all 76 collection sites (including both kānuka and Australian L. scoparium samples), respectively (Table 1). When both kānuka populations were excluded (i.e. only L. scoparium populations taken into account), a total of 2 580 451 and 4 049 649 SNPs were detected using MAF of 5% and 2%, respectively. When only the 68 New Zealand mānuka populations were included (i.e. Australian L. scoparium and kānuka populations excluded), a total of 2 526 589 and 3 270 864 SNPs were detected using MAF of 5% and 2%, respectively.

Genetic differentiation and population structure
Allele frequencies generated from 1 498 405 SNPs for 76 populations (including Australian L. scoparium and kānuka) revealed the dataset consists of seven clusters (Fig. 1A) as determined by the optimized K-means clustering algorithm find.clusters in Adegenet v2.1.1 [59]. The function was used to run 30 successive Kmeans analyses, with an optimal K of seven selected based on Bayesian Information Criterion (BIC) (Fig. 1A). A Discriminant Analysis of Principal Components (DAPC) [59] using this optimal K value supported the segregation of the 76 populations into seven genetically and geographically distinct clusters ( Fig. 1 C,D). Linear dimensions (LD) one, two, three and four (LD1, LD2, LD3 and LD4) of the DAPC (K = 7) explained 69.55%, 13.55%, 7.54% and 4.56% of the data variation, respectively ( There was also evidence of strong partitioning between L. scoparium populations in New Zealand and those sampled from Australia (including Tasmania), with the six populations of L. scoparium from Australia forming their own cluster. The exploration of DAPC posterior-membership probabilities at five different K values (K = 5 to K = 9) further demonstrated the strong geographic structuring of the New Zealand mānuka populations (Fig. 2). When K = 5 three New Zealand clusters were identified -the NNI cluster, a central, eastern and southern North Island cluster and one cluster containing all South Island populations. When K = 6 the central, eastern and southern North Island cluster split into two -into the CSNI cluster and the ECNI cluster. When K = 7 (the optimal K value), the South Island cluster divided into the NESI and SWSI clusters. The single Tasmanian population was identified as an additional cluster when K = 8, however at K = 9 this cluster was lost, with the NNI and SWSI clusters both dividing in two. The NNI populations that formed an additional cluster at K = 9 (P037, P040 and P041) are distributed at the northern tip of the North Island, and the four SWSI populations that formed an additional cluster at K = 9 are all distributed on the West Coast of the South Island, indicating additional geographical structuring. P050 is the only population to demonstrate mixture between clusters (NNI and CSNI), predominantly being clustered with CSNI at K = 5, K = 6, K = 7 and K = 8, and is entirely clustered with NNI at K = 9. This population occurs at the northern edge of the CSNI distribution, and is in close proximity to an NNI population (P066).
Pairwise Fst genetic distances were calculated in PoolFstat v1.0.0 60 using mpileup and sync input files (Fig. 3A). Overall average Fst distances for all New Zealand populations was 0.128 (SD = 0.057) ( Table 2). Similarly, an unrooted, neighbour-joining network generated in SplitsTree4 v4.14.8 [61] using the pairwise Fst distance matrix revealed strong structuring between the aforementioned gene pools; however, there was also an apparent conflicting signal observed within the dataset, visible in the boxes formed within the SplitsTree network (Fig. 3B). An Isolation by Distance (IBD) analysis revealed a significant relationship between genetic distances and geographic distances when Australian populations were included (p-value = 0.022) but no significant relationship within New Zealand populations when analysed alone (p-value = 0.383).

Genetic diversity and demographic modelling
Summary statistics generated by NPStats v1 [46] were calculated in non-overlapping 10 kb windows and results are presented as the weighted medians of these windows averaged per gene pool (Table 3) [62] analysis. Based on the explained variance of eleven different migration events, it was determined that five to six migration events best explained the observed data used in the TreeMix model (Fig. 4A, B). Phylogenetic trees produced by this analysis support the clustering of New Zealand mānuka into five gene pools (NNI, CSNI, ECNI, and two South Island clades), with clear division also seen between New Zealand mānuka and Australian L. scoparium (Fig. 4C). The five migration event analysis suggests two major migration events have occurred between New Zealand and external gene pools: one between mainland Australia and Northland (NNI), and another from an ancestral population in Australia or Tasmania (not sampled in our study) into the NNI gene pool. Within New Zealand, there have been major migration events between the SWSI gene pools and NNI, and between the CSNI gene pool and NESI. An additional migration event is seen within the CSNI gene pool. When a sixth migration event was applied alongside the aforementioned events, the model determined an additional event from NNI to the SWSI gene pool (Fig. 4C). Migration events with the greatest weight are seen between the CSNI and NESI gene pools, and between NNI and SWSI gene pools. The results suggest no substantial migration events have occurred from New Zealand back to Australia.
SFS-based demographic models that explored four hypotheses (population stability, bottleneck, expansion and contraction models) suggest that NNI, CSNI and SWSI gene pools have all undergone expansion events, whilst the ECNI, NESI and Australian gene pools have all undergone contractions ( Fig. 5A; Table 4). Parameter estimates and associated summary statistics inferred from 100 parametric bootstraps of the most suitable model for each gene pool, as well as output of the best overall model based on AIC can be found in Supplementary material 8. In the case of the three gene Table 3. Summary statistics averaged by gene pool of population genetic parameters calculated for using NPStats. S = segregating sites, π = nucleotide diversity, SD = standard deviation, NNI = northern North Island, CSNI = central and southern North Island, ECNI = East Cape North Island, NESI = north eastern South Island, SWSI = south western South Island. Individual summary statistics for each populations can be found in Supplementary material 7 The model that best explains the evolutionary history of Australian and New Zealand populations (represented by the NNI gene pool) and the arrival of mānuka to New Zealand was the D-CGF-CE model ( Fig. 5B; Fig. 6; Table 5). The parameter estimates for this model suggest there has been continuous, asymmetrical gene f low between the two countries, with more gene f low occurring from Australia to New Zealand than vice versa (Australia to NNI = 1.03E-05, NNI to Australia = 2.67E-06). Results of this model also suggest that the split between New Zealand and Australian populations occurred

A genomics study supporting indigenous branding
This first population genomic study of L. scoparium indicates clear geographic structuring of mānuka across New Zealand and a strong genetic differentiation between New Zealand and Australian populations. This research was co-developed in response to demand for this information by indigenous agribusinesses, with an interest in developing strong regional branding associated with their cultural connections to the land. Such evidence will assist Māori by addressing economic, cultural and ecological aspects. In terms of economic benefits for Māori, our research will underpin the veracity of novel mānuka-based products and assist Māori to achieve the potential of this natural resource, leveraging off global demand for provenance information -particularly for food. Determining the provenance of the plants will provide indigenous innovators producing honey with marketing opportunities unique to Māori. This branding will identify these products as being premium and authentic, and create a strong differentiation from other types of honeys, including from Australia. Finally, understanding mānuka genetic diversity will contribute to better management of this resource, improving the resilience of the honey industry.

Pool-seq
Pooled whole genome re-sequencing (pool-seq) was applied in this study as an alternative to individual genotyping. We chose this strategy as opposed to reduced representation methods such as genotyping by sequencing (GBS) as it gives access to the full set of variants across the genome. In our experience, GBS can yield variable read depth across genomic regions, with many errors and missing data [54]. Whilst poolseq does have its limitations [63] (no true sample genotypes, inability to calculate linkage disequilibrium, limited downstream applications), we believe our dataset is superior to a GBS dataset as we obtained a large proportion of the reference genome with a consistent read depth. Additionally, pool-seq is an economical choice compared with barcoding the complete sampling set, as only a limited number of costly sequencing indices are required. This allowed us to sequence a larger set of individual trees across more regions (2325 trees across 76 sites). One hundred times coverage and > 25 individuals per pool were recommended to attain reliable results [43,44], and we were able to achieve this for the majority of populations. This enabled us to estimate allelic frequencies within populations. Our results are consistent with morphological [20], phenological [20] and chemotypic [21,22] studies that have suggested there is geographical structuring of mānuka across New Zealand, and builds on previous genetic work by Buys et al. (2019).

Population genetic structure
Genetics are commonly mentioned as a factor influencing the observed variation in the chemotypic compounds, morphology and phenology across mānuka's range. However, no previous studies have investigated the population genetic structure of L. scoparium across New Zealand, or between New Zealand and Australia, or provided insight into the demographic history of this species. Our population genetic investigation indicates that there are five genetically and geographically distinct gene pools of L. scoparium in New Zealand, and that these populations are genetically distinct from Australian L. scoparium populations. DAPC, Fst, SplitsTree and IBD analyses all support the genetic differentiation of these gene pools. However, the TreeMix analyses suggest that despite this genetic differentiation, gene flow has occurred between these gene pools -including between New Zealand and Australian populations. Genetic differentiation (Fst) indicates that New Zealand populations have been separated from Australian populations for a long period of time; however again, the TreeMix analyses suggest that multiple migration events have occurred from Australia to New Zealand (most likely to NNI populations). There is no evidence to support significant migration occurring from New Zealand back to Australia.  was the result of conf licting signal within their dataset.
We can now confirm that the conf licts in signal they encountered are likely the result of gene f low occurring between populations and gene pools within New Zealand and between Australian and New Zealand populations. Between the New Zealand populations, pairwise Fst was relatively low (cf. difference between New Zealand and Australian populations), particularly between neighbouring populations where gene f low is more readily facilitated. This is demonstrated by ECNI and CSNI gene pools having the lowest average pairwise Fst, and NESI and SWSI also exhibiting relatively low average pairwise Fst. Interestingly, the highest pairwise Fst were not between the most geographically distant gene pools, however, but between ECNI and NESI. The seeds and pollen of mānuka are dispersed via wind [64], and it is possible that wind may more easily carry seeds and pollen from the South Island to the northern North Island, than across the North Island to the East Cape.

Evolutionary history
The genus Leptospermum is thought to have evolved in Australia during the Miocene (5 -23Mya), with the split between L. scoparium and its sister species Leptospermum trinervium estimated to have occurred 15.9 Mya (95% confidence interval = 6.7-26.8 Myr) in Australia; arrival of an ancestral L. scoparium lineage to New Zealand would Table 5. Maximum likelihood and AIC statistics for the demographic models used to investigate the divergence of Australian and New Zealand Leptospermum scoparium gene pools. The model considered to be the best fit is in bold. These models were then run 100x (See Supplementary material 8). DeltaL = the difference between the maximum estimated likelihood and the maximum observed likelihood. AIC = Akaike's Information Criterion have occurred subsequently [13,32]. Results of our demographic models are congruent with this. If we assume a generation time of 15-20 years for mānuka presuming that larger and more mature L. scoparium trees contribute more to reproduction, our results suggest that the New Zealand lineage of L. scoparium split from the Australian lineage ∼9-12 Mya. This timing also fits with the arrival and divergence of other woody angiosperm species in New Zealand, following the end of the Miocene Thermal Optima ∼15 Mya [65,66]. However, mānuka can flower from three to five years after germination, which would correspond to ∼2-5 Mya. Therefore, the population structure we have observed in New Zealand mānuka (into five gene pools) likely reflects phylogeographic patterns associated with climatic and glacial cycling of the Plio-Pleistocene (12 000 Ka -5 Mya) [35,36], and/or New Zealand's turbulent geological history of mountain building and volcanic activity [34,67]. For two gene pools (NNI, SWSI), the individual demographic models indicate that N e began increasing ∼4-7 Mya, suggesting these gene pools may have benefitted from the cooler climate of the Plio-Pleistocene, expanding their ranges in the north of the North Island and in the south western South Island, respectively.
These two regions are considered to have been relatively geologically and climatically stable over the last ∼5 Myr [68]. In particular, the southern South Island during this time would have consisted predominantly of tussock grasslands and shrub-lands, which may have provided suitable open habitat for mānuka to establish and then to expand its range [34,69,70]. If we take the estimated date of expansion of NNI from the Australian and New Zealand divergence model, it suggests that NNI did not begin expanding until more recently (∼260 000-350 140 years ago). This timing is similar to that estimated for the CSNI gene pool, with expansion estimated to have begun ∼350 000-475 000 years ago. Fires occurred regularly in acid mires during the Plio-Pleistocene in the central and northern North Island, whilst volcanic activity in the central North Island would have also caused fires and significant disturbance to landscapes due to ash fall [34,[71][72][73]. L. scoparium is a seral species, known to establish in disturbed landscapes, particularly after fire, an ancestral adaptation that evolved to enable it to survive forest fires in Australia [73]. Serotinous L. scoparium populations are more common in the North Island (correlating with the history of natural fires in New Zealand) and this may be one reason for the success and expansion of NNI and CSNI gene pools in these two regions over the last ∼350 000 years [73].
For the two remaining New Zealand gene pools (ECNI, NESI), demographic modelling suggests N e has been decreasing over the last ∼8250-19 000 years. This timing coincides with the end of the Last Glacial Maximum (LGM) (when global ice sheets were at their maximum integrated volume) [74] and warming of the climate, suggesting that as glaciers, tussock grassland and alpine shrub-land receded, so too did the distribution of mānuka in the north of the South Island and in the East Cape in the North Island. For the northern South Island in particular, there would have been considerable habitat loss at this time as sea levels rose, and the land bridge connecting the North and South Islands was inundated [34,69], potentially explaining the contraction of the NESI gene pool. Results of the demographic modelling also suggest that N e of the Australian gene pool has decreased significantly (∼98.98% decrease) over the last ∼17 000-23 000 years, however, our sampling of this gene pool is very limited (considering the distribution of L. scoparium in Australia). Sampling across the complete distribution of L. scoparium in Australia is required to draw any further conclusions about the evolutionary history of this gene pool.
It has been hypothesised, based on morphological data, that L. scoparium originated in Tasmania and simultaneously dispersed to Australia and New Zealand [13]. Although we only have one population from Tasmania, our results provide no evidence to support the hypothesis that L. scoparium originated in Tasmania and dispersed to Australia and New Zealand simultaneously. Based on our findings, it is likely that L. scoparium originated on mainland Australia, and subsequently dispersed to Tasmania and then to New Zealand, with migration events also occurring from mainland Australia to New Zealand. Wider sampling within Australia and Tasmania would help to confirm this hypothesis, however similar conclusions were reached by Buys et al. (2019) and Stephens (2006) in their phylogenetic and chemotype studies, respectively. Similarly to those of Stephens (2006), our findings suggest that there have been multiple introductions of L. scoparium from Australian populations to New Zealand. Eastward, long-distance dispersal of flora [75][76][77][78][79] and fauna [80][81][82] from Australia to New Zealand is not uncommon, and is often attributed to the Tasman Current and the Antarctic Circumpolar Current phenomenon of the Southern Hemisphere (formed following the separation of Antarctica from South America and then Australia in the mid Oligocene ∼28-35Mya) [77,83,84].

Conclusion
The findings from this study will support the commercialisation of mānuka honey based on its indigenous and geographic origins. Additionally, data produced in this study can contribute to the conservation management of the mānuka germplasm, ensuring genetic diversity is maintained in an industry where selective planting is occurring. Furthermore, the genetic structure of New Zealand mānuka as five different clusters, all distinct from one another and from Australian populations, could form the basis for a future reclassification and taxonomic definition of this species. Finally, knowledge of genome variants across its natural range can be used in future research to understand key traits such as tolerance to the tree pathogen myrtle rust (Austropuccinia psidii) (Smith et al. 2020) and adaptation to climatic and environmental conditions.

Indigenous considerations and data sovereignty
Māori, the indigenous people of New Zealand, currently have a claim under the Waitangi Tribunal (the mechanism for redress via Te Tiriti o Waitangi (The Treaty of Waitangi) -the founding document of New Zealand), relating to intellectual property pertaining to native flora and fauna [56]. This claim has not been settled to date, but current convention in New Zealand research is to recognise Māori connection and guardianship over native flora and fauna, acknowledging that this claim is still before the tribunal. Therefore, before samples for genetic analysis were collected from natural stands of mānuka grown on Māori freehold land, mana whenua (Māori who have historical and territorial rights over the land) were individually approached and were briefed about the objectives of the research study. The privilege to access taonga material was granted for the purpose of this research and further research on these samples would require consent, including access to more samples, and further analysis or disclosure of the exact identity of the trees. For this reason, our sampling sites are named according to their region of origin and the exact location of the samples and identity of the people who contributed has purposefully been occulted to address the sensitivity of this indigenous intellectual property. For further comments on the involvement and interaction of Māori and science see Morgan et al. [57] and Hudson, et al. [58].

Sampling
In total, samples from 2325 individual mānuka (n = 2265) and kānuka (Kunzea robusta) (n = 60) trees were collected from 70 sites (mānuka = 68 sites, kānuka = two sites) around New Zealand and six sites in Australia (Supplementary material. 1; 2). All trees were collected from naturally established stands, and not from commercially grown plantations. For each tree, fifteen young expanding leaves were sampled in duplicate, with at least 50 m between trees to ensure they were unlikely siblings, and metadata (e.g. latitude, longitude) were recorded. Samples were kept on silica beads in 2 mL screw cap tubes before being stored at −80 • C. For all New Zealand samples DNA was extracted using a modified CTAB protocol [85] and quantified using the Quant-iT™ PicoGreen™ dsDNA Assay kit (Invitrogen, Burlington, ON, Canada) and a SpectraMax ® Gemini EM Microplate Reader (Molecular Devices, San Jose, California, USA). Australian samples were extracted using the Qiagen DNeasy Plant Mini Kit on the QIAGEN QIAcube, the ISOLATE II Plant DNA Kit (Bioline) using buffer PA1 to manufacturer's protocol and extending the lysis incubation to 40 minutes, or by using the Machery-Nagel Nucleospin Plant II Kit with the PL2/PL3 buffer system. Results were examined using SoftMax ® Pro Software (Molecular Devices, San Jose, California, USA). DNA of each sample was normalised to the same concentration and then pooled for each collection site to obtain two micrograms of total DNA per site (Supplementary material 2).

Pooled sequencing and variant calling
All 76 DNA pools were indexed using Illumina (Illumina, Inc., San Diego, USA) commercial indices, following the manufacturer's protocol. DNA pools were sequenced using the Illumina NovaSeq 6000 System utilising S2 flow cell technology and seven lanes of 2x150 base pair (bp) output at the Australian Genome Research Facility (AGRF, Melbourne, Australia). The libraries were approximately 600 bp. Raw fastq sequencing files from the 76 pools were separated by indices and quality control was carried out using md5 checksum v3.3 [86] and FastQC v0.11.8 [87]. Read mapping was performed against the reference genome of mānuka "Crimson Glory" [55] using Bowtie2 v3.4.3 [88], applying the -end-to-end parameter for each site independently. SAMtools v1.7 [89] was utilised to convert output SAM files into BAM files, which were sorted and indexed against "Crimson Glory". Coverage statistics for each BAM file were acquired by applying the SAMtools flagstat utility. BCFtools v1.9 [90] was used to detect variants for each pseudo-chromosome of the reference genome, implementing the -m and -v options of BCFtools. Filtering of raw variants into high confidence SNPs was carried out using VCFtools v0.1.14 [91], applying the following parameter settings: a minor allelic frequency (MAF) of 5%, 2% and 0% (-maf 0.05, -maf 0.02 and -maf 0.00); no indels (-remove-indels); a minimum mean read depth of 100X for each site (-min-meanDP 100); and no missing data (-max-missing 1) (Table 1).

Population genetics and demographic modelling
Allele frequencies were calculated following SNP calling in the R v3.5.0 [92] package vcfR v1.8.0 [93]. Population structure was examined via k-means clustering (find.clusters) and a Discriminant Analysis of Principal Components (DAPC) functions from the R package Adegenet v2.1.1 [94]. The find.clusters analysis was performed using max.n = 30 and n.pca = 100, whilst an initial DAPC analysis was performed using n.pca = 100, and n.da = 100. An optimum number of Principal Components (PC) was discerned from this initial run using the optim.a.score function (also from the Adegenet package), before a final DAPC analysis was run using both optimised K and PC values. Population pairwise Fixation Indices (Fst) were calculated utilising PoolFstat v1.0.0 [60], using mpileup and sync input files. Neighbour-net networks and IBD analyses were implemented in Split-sTree4 v4.14.8 61 and ade4 v1.7-15 [95], respectively, both using the aforementioned Fst genetic distance matrix. Diversity statistics and neutrality tests (segregating sites (S), nucleotide diversity (π ), Watterson's θ and Tajima's D) were carried out using NPStats v1 [46], applying -l 10 000 -mincov 25 -maxcov 500. An allele count dataset consisting of 74 populations (filtered by MAF of 0.05, and excluding kānuka) was utilised for the TreeMix v1.13 analyses [62], where eleven migration events (0-10) were explored in 1000 SNP blocks, and the trees were rooted by the Australian population VLs. Covariance data were extracted from the TreeMix output files and an equation provided by Pickrell and Pritchard [62] (fraction f, page 5) was used to calculate the explained variance of each migration run.
To explore the demographic history within and between New Zealand and Australian gene pools (established in our DAPC, Fst and TreeMix analyses), we applied a coalescent simulation-based method in fastsimcoal v2.6.0.3 [96]. Rare variants were called in VCFtools using the settings: MAF 0.00, missing = 0, depth = 100, resulting in a dataset of 5 503 881 SNPs. The minor allele frequencies of these SNPs were averaged across the populations within each gene pool, to fully encapsulate the genetic diversity and structure of each. Unfolded Site Frequency Spectra (SFS) and two-dimensional SFS (2D-SFS) (using P011 (kānuka) as an outgroup) were calculated for each gene pool using SweepFinder2 [97] and a custom R script (Supplementary material 9), respectively. For every demographic model tested, 100 000 coalescent simulations were applied in fastsimcoal2, with maximum likelihood estimates calculated based on differences between the input observed SFS and the output expected SFS. Models were repeated 100 times, and a global maximum likelihood estimate was obtained from these independent runs and Akaike's Information Criterion (AIC) calculated for model comparison and selection.
Individual demographic models were implemented for each gene pool in fastsimcoal2, with neutral, bottleneck, expansion and contraction models explored. Multi-gene pool models were then carried out. Additionally, a 2D unfolded SFS matrix between the Australian gene pool and a representative New Zealand gene pool (Northland) was calculated to explore the demographic history of divergence and gene f low between New Zealand and Australian L. scoparium. Five models were tested: divergence followed by isolation (D-I); divergence followed by continuous gene f low (D-CGF); divergence with ancestral gene f low followed by isolation (D-AGF-I); divergence with only recent gene f low (D-RGF); and divergence with continuous gene f low as well as contraction for Australia and expansion for New Zealand (D-CGF-CE). The structure of the final model was based on findings of the initial individual demographic models.