Geographical structure of genetic diversity in Loudetia simplex (Poaceae) in Madagascar and South Africa

Ecologically dominant species are primary determinants of ecosystem function, especially in grassy ecosystems, but the history and biology of grassy ecosystems in Madagascar are poorly understood compared to those of Africa. Loudetia simplex is a C4 perennial grass that is adapted to fire and common to dominant across Africa. It is also widespread across central Madagascar in what are often thought to be human-derived grasslands, leading us to question how recently L. simplex arrived and how it spread across Madagascar. To address this, we collected population genetic data for 11 nuclear and 11 plastid microsatellite loci, newly developed for this study, for > 200 accessions from 78 populations of L. simplex, primarily from Madagascar and South Africa. Malagasy and African populations are genetically differentiated and harbour distinct plastid lineages. We demonstrate distinct geographically clustered diploid, tetraploid and hexaploid groups. The Malagasy hexaploid populations cluster into northern and southern types. In South Africa, diploid populations in the Drakensberg are distinct from tetraploid populations in north-eastern South Africa. Different genetic clusters are associated with significantly different precipitation and temperature. We conclude that L. simplex is native to both Madagascar and South Africa, probably with a single colonization event from Africa to Madagascar followed by pre-human diversification of L. simplex populations in Madagascar.


INTRODUCTION
Ecologically dominant grasses are ecosystem architects that drive the distribution and function of grassy biomes (Linder et al., 2018). Open-canopy biomes dominated by C3 grasses have been present on all continents since 5-40 Mya, before C4 grasslands came to prominence during the Miocene grassland expansion 3-8 Mya (Strömberg, 2011). Tropical open-canopy ecosystems are dynamic: their distribution is determined by interactions among climate, soil fertility, vegetation traits and disturbance regimes of fire and mammalian herbivory (Lehmann et al., 2011; Hempson et al., 2019). The diversity of tropical grassy biome landscapes, ranging from treeless grasslands to miombo woodlands, has in a number of cases led to a failure to recognize these systems as ancient and as supporting a unique biodiversity (Parr et al., 2014; Bond, 2016; Lehmann & Parr, 2016). Particularly striking is the historically contrasting interpretation of tropical grassy biomes in Africa versus Madagascar. African savannas have long been respected as ancient iconic ecosystems (e.g. Dunphy & Leonard, 1999). In contrast, Malagasy grassy biomes have been classified as being of anthropogenic origin, appearing in the last 7000 years (Koechlin, Guillaumet & Morat, 1974; Koechlin, 1993; Lowry, Schatz & Phillipson, 1997; Moat & Smith, 2007; Gautier et al., 2018). However, there is a growing body of research into the nature of grasses and grassy ecosystems in Madagascar and the role of human activities in shaping Malagasy biodiversity (Bond et al., 2008; Vorontsova et al., 2016; Solofondranohatra et al., 2018, 2020; Helmstetter et al., 2020), and such new research is fundamental to developing informed land management policy.
Grasses are challenging subjects for population genetic studies due to their frequently complex histories of introgression and hybridization, polyploidy and diverse reproductive systems, often including apomixis (de Wet, 1986; Gibson, 2009). Nevertheless, it is unwise to assume that such widespread species act as uniform entities. Population genetic data are crucial to understanding the diversity and history of grassy ecosystems. After decades of research restricted to crops and sometimes forages, common wild grasses within easy reach of laboratories are beginning to receive attention; e.g. (1) in Europe, genetic signatures of populations of Festuca rubra L. reveal their post-glacial history, with higher ploidies occurring more often in genetically poor northern populations (von Cräutlein et al., 2019); (2) genetic diversity of Andropogon gerardi Vitman in North America (McAllister et al., 2015; McAllister & Miller, 2016) demonstrates a long complex history with numerous locally adapted populations and (3) several species of Triodia R.Br. in Australia show infraspecific cytotype diversity reflecting range expansion with aridification (Anderson et al., 2017, 2019). However, few modern studies have focused on African grasses beyond crop relatives, invasive species and commercially significant lawn and forage grasses such as Cynodon Rich. (Wu et al., 2004) and Urochloa P.Beauv. (e.g. Jungman et al., 2010). Investigations into the cytotype diversity and geographical structure of common African grasses of no commercial relevance have not been attempted to date.
Loudetia simplex (Nees) C.E.Hubb. (Poaceae, Panicoideae, Tristachyideae) constitutes the primary ground cover in the hills surrounding Antananarivo, Madagascar (Solofondranohatra et al., 2020; Fig. 1A). According to the standard reference on the grasslands of Madagascar (Koechlin, 1993), it dominates a 'vast and nearly flat plateau' in the central highlands and occupies thin ferritic soils in the western savannas. In tropical Africa, L. simplex is also one of the most common grasses (White, 1983), but it achieves dominance in only some of its ecosystems such as grassland in the uKahlamba Drakensberg Park (Fig. 1B), rangelands in Mufulira in Zambia (Mukutu, 2019) and the Cuito catchment area in Angola (Goyder et al., 2018). Loudetia simplex is a tufted, perennial C4 bunchgrass 30-150 cm tall (Clayton, 1974; Clayton et al., 2006). The dense bases of its leaf sheaths protect the tussock from fire, enabling it to re-sprout from buds close to ground level. The morphology of L. simplex is variable, and the species encompasses significant morphological diversity as circumscribed by Clayton (1974). Loudetia is a morphologically fairly uniform pantropical genus of 26 species centred in Africa (Clayton & Renvoize, 1986), but it is probably polyphyletic and remains difficult to delimit phylogenetically in relation to the other members of Tristachyideae: Tristachya Nees, Trichopteryx Nees and Loudetiopsis Conert (Phipps, 1967; Clayton, 1972; Hackel et al., 2018). The base chromosome number for Tristachyideae is 10-12 (Kellogg, 2015), and chromosome counts of 20, 40 and 60 have been recorded for L. simplex, representing diploids, tetraploids and hexaploids, respectively (Moffett & Hurcombe, 1949; Li, Lubke & Phipps, 1966; Dujardin & Beyne, 1975); isolated counts of 24 have also been recorded (Rice et al., 2015).
Is L. simplex native to Madagascar or was it brought there by people since their first arrival on the island c. 10 kya (Hansford et al., 2018)? Bosser (1969) cited all the components of the modern L. simplex as endemic and therefore native to Madagascar, and L. simplex is not included in published lists of the introduced plants and weeds of Madagascar (Kull et al., 2012; Le Bourgeois et al., 2019). Nevertheless, it lacks a recognized native status. A possible non-native origin was implied by Koechlin (1993), who stated that the grasslands of Madagascar are 'almost entirely made up of secondary communities with grasses dominating', and by the discussion in Lowry et al. (1997) of 'secondary grasslands with extreme floristic impoverishment, the most dominant of which are widespread, e.g. Loudetia spp.' To answer the question of whether L. simplex is native to Madagascar, we aim to explore the genetic structure of its populations in Africa and Madagascar, to understand whether spatial genetic diversity is random or geographically structured. If populations in Madagascar derive from a single introduction event, we would expect them to be more genetically uniform than populations in other regions. If they originate from multiple introductions, we would expect polyphyly of the Malagasy lineages. We use a population genetic approach to compare L. simplex from mainland Africa and Madagascar, analysing nuclear and plastid microsatellite loci, and briefly compare climate envelopes of the population clusters.

Sampling
Two hundred and thirty-six accessions from 78 populations of L. simplex from Madagascar and Africa were sampled (Hackel et al., 2018; Supporting Information, Table S1). We included seven DNA extracts from previous work (Hackel, 2017; Hackel et al., 2018). Twenty-five sites dominated by L. simplex (22 from Madagascar and three from South Africa) were sampled at the population level during the field seasons 2016-2018: at each site, five to 11 leaf samples were preserved in silica gel from individuals separated by c. 200 m; one herbarium voucher was collected from each site. Single samples were also collected, and 28 leaf samples were removed from herbarium specimens in Madagascar (TAN herbarium, three samples, herbarium acronyms fide Thiers, 2019) and South Africa (PRE herbarium, 25 samples).

Genome size estimation
Seeds of L. simplex from Ibity in Madagascar (accession number 226251) and Burkina Faso in mainland Africa (accession number 184113) were provided by the Millennium Seed Bank Partnership (MSB, Royal Botanic Gardens, Kew) and germinated. Nuclear DNA content was estimated for one individual of each accession following Doležel, Greilhuber & Suda (2007), using a Partec Cyflow SL3 flow cytometer (Partec GmbH, Münster, Germany). Three replicates from each individual were measured separately and Petroselinum crispum (Mill.) Fuss (2C = 4.5 pg) was used as the calibration standard. The mean value of the genome size (2C) of the three replicates was calculated.
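The genome size calculation described above can be sketched as follows, assuming the usual internal-standard relation in flow cytometry (sample 2C = sample peak mean / standard peak mean × standard 2C). The peak positions below are hypothetical illustration values, not measurements from this study; only the standard value of 4.5 pg comes from the text.

```python
from statistics import mean

STANDARD_2C_PG = 4.5  # Petroselinum crispum calibration standard, as stated in the text


def genome_size_2c(sample_peak, standard_peak, standard_2c_pg=STANDARD_2C_PG):
    """Estimate the 2C value (pg) of a sample from the ratio of G1 peak means."""
    return sample_peak / standard_peak * standard_2c_pg


# Hypothetical (sample peak, standard peak) fluorescence means for three replicates.
replicates = [(100.0, 200.0), (101.0, 200.0), (99.0, 200.0)]

estimates = [genome_size_2c(s, r) for s, r in replicates]
mean_2c = mean(estimates)  # mean 2C across the three replicates
print(f"replicate estimates: {estimates}, mean 2C = {mean_2c:.3f} pg")
```

Averaging per-replicate estimates, rather than averaging the raw peak positions first, mirrors the description of measuring each replicate separately.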

Nuclear microsatellites
Total genomic DNA was extracted using 15-20 mg of dry plant material and 20 mg of fresh leaf material according to the CTAB method of Doyle & Doyle (1987), followed by a purification step using the QIAquick PCR Purification Kit (Qiagen). DNA from two individuals of L. simplex from Madagascar (IB-09-A and AN-06-B) was sequenced on an Illumina MiSeq after double-digest restriction site-associated library preparation following Peterson et al. (2012). Restriction enzymes used were EcoRI and MspI, and size selection was performed on a Pippin prep (468-546 bp). Restriction-associated DNA libraries were used for microsatellite discovery. Primers were designed using the msatcommander-1.0.8-beta software (Faircloth, 2008). Microsatellite sequences, which had between six and 19 motif repeats with each motif consisting of two to four nucleotides, were selected. Only microsatellite sequences nested in a sequence fragment for which a BLAST search (Altschul et al., 1990) led to hits representing the nuclear DNA of plants (in particular of grasses) were considered; plastid sequences were excluded. Corresponding primers that were likely to form secondary structures or that had multiple annealing sites were also excluded. The final primer pairs (Table 1) were chosen based on successful amplification of DNA fragments of the expected size in test PCRs. Forward primers were ordered with FAM or JOE fluorophore labels (Eurofins Genomics).
Amplification of the microsatellite regions was carried out in a volume of 10 μL with 10 ng genomic DNA, 6 μL 2× DreamTaq PCR Mastermix (ThermoFisher Scientific), 0.5 μL 0.4% bovine serum albumin (w/v), 2.5 pmol reverse primer, 2.5 pmol FAM- or JOE-labelled forward primer and deionized water, in a GeneAmp PCR System 9700 (Applied Biosystems). PCR conditions were: initial denaturation at 94 °C for 3 min followed by 30 (25 for primers LS6) cycles of denaturation at 94 °C for 30 s, annealing at primer-specific temperatures (Table 1) for 30 s and extension at 72 °C for 45 s, and one final extension step at 72 °C for 10 min. The microsatellite-specific PCR products were quantified on a 1% agarose gel; products with a high yield were diluted with deionized water to bring them to the same level as products with a moderate yield.
Allele size was estimated on an ABI3730 DNA Analyzer (Applied Biosystems). An aliquot (1 μL) of the PCR product (diluted if necessary) was added to a mix of 10 μL HiDi formamide (Applied Biosystems) and 0.15 μL GeneScan 500 ROX Size Standard (Applied Biosystems) and denatured at 94 °C for 3 min. GeneMapper Software 5 (Applied Biosystems) was used for allele calling, and results were visually inspected. Peaks differing by one nucleotide were rounded to the closest allele size to avoid the overestimation of genetic variation due to stuttering.
The ploidy of populations was also estimated based on the maximum number of alleles (MNA) at a locus. The power of this method, however, depends on the number and frequency of alleles revealed at each locus (Besnard & Baali-Cherif, 2009). Considering the Drakensberg samples diploid (see results), we estimated allele frequencies for the five loci (LS2, LS4, LS5, LS7 and LS10) for which at least three alleles were revealed in the uKahlamba Drakensberg Park population (31 individuals). Based on these allele frequencies, we estimated the probability (P_3x; Besnard & Baali-Cherif, 2009) that a non-diploid individual (i.e. a triploid) present in this population is revealed with these five loci (i.e. at least one locus showing three alleles in its genotype). This method was not applied for higher ploidies since we can precisely estimate allele frequencies only for diploids, without assuming random mating or a fixed rate of selfing (De Silva et al., 2005; Meirmans, Liu & van Tienderen, 2018).
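One way to compute such a detection probability is sketched below. It assumes, as a simplification that may differ from the exact procedure of Besnard & Baali-Cherif (2009), that a triploid genotype at each locus is three independent draws from the population allele frequencies and that loci are independent. Under that assumption the probability of seeing three distinct alleles at a locus is 1 − 3Σp² + 2Σp³. The allele frequencies used here are hypothetical.

```python
def p_three_alleles(freqs):
    """Probability that three independent allele draws are all different.

    Uses P(all distinct) = 1 - 3*sum(p^2) + 2*sum(p^3), which follows from
    subtracting the 'all identical' and 'exactly two identical' cases from 1.
    """
    s2 = sum(p ** 2 for p in freqs)
    s3 = sum(p ** 3 for p in freqs)
    return 1 - 3 * s2 + 2 * s3


def p_detect_triploid(loci_freqs):
    """Probability that at least one locus shows three alleles in a triploid."""
    p_missed = 1.0
    for freqs in loci_freqs:
        p_missed *= 1 - p_three_alleles(freqs)  # triploid looks diploid at this locus
    return 1 - p_missed


# Hypothetical allele frequencies for two loci (each list sums to 1).
example_loci = [
    [1 / 3, 1 / 3, 1 / 3],
    [0.5, 0.3, 0.2],
]
print(round(p_detect_triploid(example_loci), 4))
```

A locus with only two alleles contributes nothing to detection under this model, which is why only loci with at least three alleles are informative.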

Plastid microsatellites
We analysed polymorphisms from the maternally inherited plastid genome. Because low polymorphism was revealed in plastid DNA barcodes (rbcL and matK; data not shown), we decided to analyse 12 microsatellite loci (plastid SSR) that are more variable (Table 2). Based on a complete plastid genome of L. simplex (MF563366; Piot et al., 2018), we defined primers in regions flanking a mononucleotide stretch with a minimum of ten repeats. Four loci were simultaneously amplified following the PCR protocol described in Besnard et al. (2011) and using the universal M13 primer labelled with the YAK, 6-FAM or AT550 fluorochrome (Table 2). PCR products were then multiplexed together with GeneScan 600 LIZ (Applied Biosystems) and separated on an ABI Prism 3730 DNA Analyzer (Applied Biosystems). Allele size was determined with Geneious v.9.0.5 (Kearse et al., 2012). Multistate plastid SSRs were coded by the number of repeated motifs for each allele (e.g. number of T or A), as described by Besnard et al. (2011). The combination of polymorphisms at all plastid SSRs allowed us to define a plastid haplotype for each genotyped individual. We thus analysed 173 and 54 individuals of L. simplex from Madagascar and South Africa, respectively (Supporting Information, Table S1). Other taxa were also analysed (including the three L. simplex accessions from Burundi and Burkina Faso; Supporting Information, Table S1).
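The step of combining alleles across loci into a plastid haplotype can be illustrated with a short sketch. The individual names, locus labels and repeat counts below are hypothetical; the only idea taken from the text is that a haplotype is the combination of repeat-number alleles over all plastid SSR loci.

```python
from collections import Counter

# Hypothetical repeat counts per plastid SSR locus for four individuals.
genotypes = {
    "ind1": {"locusA": 10, "locusB": 12, "locusC": 9},
    "ind2": {"locusA": 10, "locusB": 12, "locusC": 9},
    "ind3": {"locusA": 11, "locusB": 12, "locusC": 9},
    "ind4": {"locusA": 10, "locusB": 12, "locusC": 9},
}

# Fix the locus order once so that haplotypes are comparable between individuals.
loci = sorted({locus for alleles in genotypes.values() for locus in alleles})


def haplotype(alleles):
    """Return the multi-locus haplotype as a tuple of repeat counts."""
    return tuple(alleles[locus] for locus in loci)


haplotype_counts = Counter(haplotype(a) for a in genotypes.values())
for hap, n in haplotype_counts.most_common():
    print(hap, n)
```

Because the plastid genome is non-recombining and haploid, a single tuple per individual is sufficient; no phasing step is needed.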

Genetic diversity and population differentiation
To understand genetic differentiation among populations, Bayesian clustering of individuals was performed using the software STRUCTURE v.2.3.4 (Pritchard, Stephens & Donnelly, 2000; Falush, Stephens & Pritchard, 2003; Hubisz et al., 2009), which has been shown to be the least biased clustering method in mixed-ploidy populations (Stift, Kolář & Meirmans, 2019). We used a combination of the RECESSIVE-ALLELE model and the ADMIXTURE model; the RECESSIVE-ALLELE model is appropriate when the information about allele dosage is incomplete (i.e. the number of copies of each allele is not known for all the genotypes; Falush, Stephens & Pritchard, 2007). The ADMIXTURE model assumes the genome of each individual to consist of fractions of every genetic cluster, which was the most appropriate ancestry model for our dataset available in STRUCTURE. Analyses were run with a burn-in period of 5 × 10^5 and subsequent 5 × 10^5 Markov chain Monte Carlo replicates, testing 1 ≤ K ≤ 5 using GNU parallel (Tange, 2011).
The number of genetic clusters (K value) best fitting the data was selected according to the method of Evanno, Regnaut & Goudet (2005) by using the log-likelihood values and the second-order rate of change of the likelihood distribution of the K values (ΔK) inferred with the online tool STRUCTURE HARVESTER (Earl & von Holdt, 2012). The STRUCTURE analysis was performed on the total dataset and on the dataset containing only individuals from Madagascar, using the same parameters.
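The ΔK statistic can be sketched as below. This is a simplified reading of the Evanno approach, not the STRUCTURE HARVESTER code: ΔK is taken as the absolute second-order difference of the mean log-likelihood, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of L(K) across replicate runs. The log-likelihood values and the number of replicates are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute delta-K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of log-likelihoods from replicate runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_order = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta[k] = second_order / stdev(lnp_by_k[k])
    return delta


# Hypothetical replicate log-likelihoods for K = 1..5.
runs = {
    1: [-5000, -5002, -4998],
    2: [-4000, -4001, -3999],
    3: [-3900, -3910, -3890],
    4: [-3850, -3870, -3830],
    5: [-3820, -3860, -3780],
}

delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Note that ΔK is undefined at the smallest and largest K tested, so with 1 ≤ K ≤ 5 only K = 2 to 4 can be compared, and K = 1 can never be selected by this criterion.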
The assignment of individuals to genetic clusters and their membership coefficients were computed using the program CLUMPP v.1.1.2 (Jakobsson & Rosenberg, 2007), applying the Greedy algorithm. Therefore, the assignment of samples to populations was not taken into consideration. The software DISTRUCT v.1.1 (Rosenberg, 2004) was used to display the clustering results and the individual membership coefficients graphically.
GenoDive v.2.0b27 (Meirmans & Van Tienderen, 2004) was used to account for the different ploidies in our dataset. We calculated genetic diversity indices (Nei, 1987) for the individuals occurring in mainland Africa and Madagascar, by applying the maximum likelihood method to correct for the unknown dosage of alleles (Meirmans & Van Tienderen, 2004). The following indices were calculated: number of observed alleles (N_A), effective number of alleles (Eff-N_A), expected frequency of heterozygotes (H_S) and total heterozygosity (H_T). Results with a P value < 0.05 were considered to be statistically significant. Tests for deviations from the Hardy-Weinberg proportions were only attempted for the diploid uKahlamba Drakensberg Park population, as information about allelic dosage was incomplete (Meirmans, Liu & van Tienderen, 2018; Gargiulo et al., 2019); tests for both heterozygosity excess and deficiency were performed in GENEPOP (Raymond & Rousset, 1995; Rousset, 2008). Population structure was further evaluated in GenoDive with analysis of molecular variance (AMOVA; Excoffier, Smouse & Quattro, 1992; Michalakis & Excoffier, 1996), by employing the 'ploidy independent infinite allele model' that computes the F_ST-analogous ϱ (Rho) (Ronfort et al., 1998), with 999 permutations. We also performed a pairwise population differentiation analysis within groups with the same ploidy with 999 permutations. We used the R package POLYSAT v.1.7-3 (Clark & Jasieniuk, 2011) to calculate a pairwise matrix of genetic distances (using the Bruvo distance; Bruvo et al., 2004) and to carry out principal component analyses (PCA) on the matrix. Finally, plastid DNA data were analysed separately. The probability that two L. simplex individuals taken at random from Malagasy or South African populations display a different plastid haplotype was computed as D = 1 − Σ p_i^2, where p_i is the frequency of the i-th plastid haplotype (Nei & Tajima, 1981; Nei, 1987). We then investigated plastid haplotype relationships using minimum spanning networks based on Bruvo's genetic distance (Bruvo et al., 2004). The networks were reconstructed with the R package poppr (Kamvar, Tabima & Grünwald, 2014).
Table footnote: *with a M13 tail (TGTAAAACGACGGCCAGT) added at the 3′ extremity (Schuelke, 2000).
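The plastid haplotype diversity D = 1 − Σ p_i² can be computed directly from a list of haplotype labels, as in the following minimal sketch. The labels are hypothetical, and the function implements the formula exactly as written in the text, without a small-sample correction such as multiplying by n/(n − 1); whether such a correction was applied in the analysis is not stated.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Probability that two randomly drawn individuals differ in haplotype.

    Implements D = 1 - sum(p_i^2), with p_i the frequency of haplotype i.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return 1 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical sample: one frequent haplotype and two rare ones.
sample = ["H1", "H1", "H2", "H3"]
print(haplotype_diversity(sample))
```

A sample dominated by a single haplotype gives a low D even when many rare haplotypes are present, which is relevant when one locality is sampled much more intensively than others.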

Environmental envelope analysis
To look for potential relationships between genetic diversity and ecological niche, annual mean temperature (Bio_1) and annual precipitation (Bio_12) were downloaded at 30 arc-second spatial resolution from WorldClim Global Climate Data v.2 (Fick & Hijmans, 2017) for each sample location. We compared these climate variables for the four largest nuclear clusters with an analysis of variance while accounting for spatial autocorrelation.

Genome size estimation
A comparable content of nuclear DNA was measured for L. simplex from Burkina Faso in Africa (2C = 2.23 pg) and Ibity in Madagascar (2C = 2.52 pg), tentatively suggesting the same ploidy for both samples.

Nuclear microsatellite analyses
Allele size ranges for the nuclear microsatellite markers are reported in Table 1. All loci were polymorphic across the individuals analysed. Allele calling in GeneMapper suggested different ploidies among African samples (Tables 3, S1 in the Supporting Information), with plants sharing the same apparent number of alleles occurring closer together. Dosage information was recorded when possible. However, uncertainties in allelic dosage were common, as expected with high ploidies (Dufresne et al., 2014). Plants from the Drakensberg Region consistently showed a maximum of two different alleles at a locus, and therefore were assumed to be diploid. The absence of a significant excess of homozygosity in the uKahlamba Drakensberg Park population (H_O = 0.39 vs. H_E = 0.44; computed on loci LS2, LS4, LS5, LS7 and LS10) is also consistent with a diploid state and a low frequency of null alleles at these loci. The probability that a triploid individual present in this population is revealed with these loci remains relatively low (P_3x = 51.18%), suggesting that the maximum number of alleles revealed at a locus (MNA) for a given individual is indicative only of its minimum ploidy. Based on this assumption, the one sample from Burkina Faso appears to be at least hexaploid, whereas the two plants from Burundi and the plants from north-eastern South Africa are at least tetraploid (Table 3). All L. simplex from Madagascar are polyploid, as samples had up to six different peaks indicative of hexaploidy (or higher ploidy) (Table 3).
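The MNA reasoning used above can be expressed as a short sketch: the minimum ploidy of an individual is taken as the largest number of distinct alleles seen at any single locus. The allele sizes and individuals below are hypothetical; the locus names are reused from the text only as labels.

```python
def minimum_ploidy(genotype):
    """Return the maximum number of distinct alleles at any locus (MNA).

    genotype maps a locus name to the list of allele sizes (bp) scored for
    one individual. The result is a lower bound on ploidy, because alleles
    present in several copies are counted once.
    """
    return max(len(set(alleles)) for alleles in genotype.values())


# Hypothetical genotypes (allele sizes in bp).
diploid_like = {"LS2": [150, 152], "LS4": [200], "LS5": [310, 314]}
hexaploid_like = {"LS2": [150, 152, 154, 156, 158, 160], "LS4": [200, 204, 208]}

print(minimum_ploidy(diploid_like), minimum_ploidy(hexaploid_like))
```

Because the estimate is only a lower bound, an individual with few distinct alleles may still have a higher ploidy, which is the reason the text speaks of samples being "at least" tetraploid or hexaploid.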
Analysis of the genetic structure according to the Evanno method showed that the most appropriate K value for all L. simplex from Africa and Madagascar was 2, corresponding to two genetic clusters. The analysis with GenoDive led to the same result. The samples were assigned to two genetic clusters, reflecting the populations of Africa and Madagascar (Fig. 3A). When considering only the individuals from Madagascar, the most likely K value associated with ΔK was 3, whereas the K value estimated by GenoDive was 2. The barplot for K = 3 showed high levels of admixture, with some samples collected from the same location belonging to two different clusters (Fig. 3C). The barplot for K = 2 showed two clusters that matched their geographical distribution almost perfectly, samples being grouped into clusters from northern and southern Madagascar with only slightly mixed membership (Fig. 3B). K = 2 was therefore considered to be more appropriate. Genetic diversity indices averaged over the microsatellite marker loci are shown in Table 4. On average, 7.60 and 12.64 alleles per locus were observed in African and Malagasy populations, respectively. The effective number of alleles per locus was more comparable with 2.41 alleles in Africa and 3.47 alleles in Madagascar. The total heterozygosity matched the expected frequency of heterozygosity, which was slightly lower in the African populations (H_T: 0.46) than in Madagascar (H_T: 0.54). In the Drakensberg population, tests for deviations from the Hardy-Weinberg proportions conducted on polymorphic loci (LS2, LS3, LS4, LS5, LS7, LS9, LS10) revealed no significant heterozygote excess. A heterozygote deficit was detected at LS9 (P ≤ 0.001).
Pairwise differentiation analysis within groups with the same ploidy showed that Africa and Madagascar are significantly different from each other (P ≤ 0.001) with a moderate level of differentiation and an F_ST value of 0.141 (Table 5). South African populations group into two significantly different genetic clusters (P ≤ 0.001) with a moderate level of differentiation (F_ST = 0.107). The populations from southern Madagascar are also significantly different from the populations of the northern part (P ≤ 0.001); differentiation between samples from northern and southern Madagascar is low with an F_ST value of 0.01. The AMOVA analysis implemented in GenoDive resulted in a value of ϱ = 0.517, indicating strong population structure.
The PCA scatter-plot of all L. simplex populations (Fig. 4A) showed that the populations from Madagascar were more densely clustered than the populations from Africa. The South African samples were divided into two sub-clusters, reflecting their different ploidies and their spatial grouping into samples from the North-East and from the Drakensberg Region. The samples from Madagascar were roughly divided into sub-clusters that matched their spatial occurrence in northern and southern Malagasy highlands. The samples from Burundi and Burkina Faso were not part of any sub-clusters, being placed between South Africa and Madagascar. The relationship between the genetic clustering, ploidy grouping and geographical location is illustrated in Figures 2 and 7. Other species of Loudetia and Tristachya nodiglumis were not close to the African or Malagasy populations (Fig. 4B).

Plastid microsatellite analyses
Of the 12 plastid SSR loci tested, 11 were successfully amplified for all L. simplex samples from Madagascar and South Africa. Ten of these loci revealed length polymorphisms with three to nine alleles (Table 2). Plastid microsatellite genotypes are available in the Supporting Information (Table S1). Most polymorphisms were probably due to single-step mutations (1-bp polymorphisms), but loci LScp-2 and LScp-12 showed longer indels (of 5 and 9 bp, respectively; Table 2) that were coded separately. When analysing all Loudetia taxa, long indels (> 10 bp) and/or null alleles were also observed for loci LScp-2 and LScp-8, limiting their use for assessing relationships above the species level. Among the 173 Malagasy individuals of L. simplex, 50 plastid haplotypes were identified, with 40 observed in just one or two accessions. Similarly, 28 plastid haplotypes were identified among the 54 South African individuals, of which 24 were observed in just one or two accessions. Overall, the plastid haplotype diversity was higher in South Africa than in Madagascar (D = 0.902 vs. 0.773, respectively). This lower diversity in Madagascar could, however, be the result of repetitive sampling on the northern High Plateau where a single plastid haplotype was observed in 80 individuals.
Analysis of the plastid haplotype relationships demonstrated that, similarly to nuclear data, L. simplex accessions from Madagascar and South Africa form two large highly diversified clusters (Fig. 6). Other Loudetia accessions, including L. simplex from Burundi and Burkina Faso, are only distantly related to South African and Malagasy L. simplex matrilineages. Matrilineage  relationships among such distantly related taxa should be interpreted with care as these are probably subject to a high rate of homoplasy and could be biased by a complex history of insertions and deletions. The plastid haplotype network of Malagasy L. simplex accessions highlights two major clusters formed by most northern High Plateau accessions and the south-western accessions, respectively (Fig. 5A), in relative concordance with the clusters defined with nuclear microsatellites (Figs 2, 7). In contrast, the plastid haplotypes from the Central High Plateau and a few plastid haplotypes from the northern High Plateau are scattered through the network without particular clustering pattern (Fig. 5A) suggesting that complex processes of seed mediated gene flow have shaped the plastid genetic makeup of Malagasy populations. Furthermore, the network indicates that two plastid haplotypes are relatively frequent on the northern High Plateau (Fig. 5A). The most frequent plastid haplotype (80 accessions) is shared by most populations from the northern High Plateau, and six rare, closely related plastid haplotypes were also observed in a star-like pattern (Fig. 5A).
Phylogenetic relationships among the South African plastid haplotypes of L. simplex were similarly analysed. South-eastern diploid accessions cluster in the centre of the plastid haplotype network (Fig. 5B; uKahlamba Drakensberg Park, Loteni NR and Underberg localities), whereas tetraploid accessions are scattered without a strong clustering pattern (Fig. 8). The same clusters are presented geographically in Figure 2B.
Figure caption (fragment): … Figures 3B and 4, minimum ploidy summarized from the Supporting Information (Table S1). B, Plastid haplotypes, same clusters as shown in Figure 5 defined by geographical proximity. Countries of L. simplex native occurrence are shown in green, data from WCSP (wcsp.science.kew.org). Map by Sarah Z. Ficinski.
The comparisons between the annual precipitation and the annual mean temperature for the nuclear clusters are presented in Figure 9. For precipitation there are significant differences between all pairs of clusters (all P < 0.001). For temperature there are significant differences between: Drakensberg and Madagascar North clusters, Drakensberg and Madagascar South, Drakensberg and North-East South Africa, and Madagascar South and North-East South Africa (all P < 0.001); and also Madagascar North and Madagascar South (P = 0.027). For temperature there is also a marginally significant difference between Madagascar North and North-East South Africa (P = 0.057).

DISCUSSION
This study presents a population genetic analysis of the commonly dominant grass L. simplex in Madagascar and South Africa. The data from 11 nuclear polymorphic microsatellite loci and ten polymorphic plastid SSRs analysed from 211 and 230 accessions, respectively, are consistent: populations in Madagascar and South Africa are genetically distinct and variable in both locations, with strong geographical structuring of the genetic diversity. Genetic diversity in Madagascar is high. Plastid data indicate that a single maternal lineage has colonized Madagascar from Africa. Climate envelopes of the major clusters show differences suggesting that L. simplex occupies a range of ecological niches across space.

Evolution and biogeographical history
The flora of Madagascar is allied primarily with Africa (Buerki et al., 2013) and its grasses are allied with the African grass flora more than with any other (Hackel et al., 2018). The median stem ages for clades of Poaceae endemic to Madagascar are between 1 and 5 Myr. Estimates available for the arrival of non-endemic C4 lineages in Madagascar are similar: c. 1.4 Mya for Themeda triandra Forssk. (Dunning et al., 2017) and c. 1 Mya for Alloteropsis semialata (R.Br.) Hitchc. (Olofsson et al., 2016). The best current estimate for the split of L. simplex from African Loudetia is the Late Miocene or Early Pleistocene (3-8 Mya; Hackel et al., 2018), from an analysis of plastid DNA sequences including just two samples of L. simplex from southern and south-central Madagascar. These broadly consistent results suggest that palaeotropical perennial C4 grasses colonized Madagascar c. 1-8 Mya, at the same time as endemic species originated in the Malagasy savannas (e.g. Salmona et al., 2020), significantly before human arrival. Our plastid SSR analysis shows that Malagasy and South African populations of L. simplex harbour clearly distinct plastid lineages and a remarkable diversity of plastid haplotypes. This not only suggests that more extensive sampling of other regions may unravel greater diversity, but also indicates that the processes of diversification have taken place over long periods of time in both regions. Loudetia simplex is thus native to both Madagascar and South Africa. Although we do not date the crown age of South African and Malagasy lineages or the time of their divergence (i.e. the dispersal event out of Africa), the high diversity of the plastid genome confirms the antiquity of the Malagasy lineage diversification, which could have co-occurred with one of the worldwide waves of the savanna biome expansion (Edwards et al., 2010; Salmona et al., 2020).
These findings are in agreement with the recent work on Malagasy open habitat biota demonstrating that Malagasy grasses are highly diverse and endemic (Solofondranohatra et al., 2020). The spread of L. simplex across Madagascar through multiple climate change episodes is likely to have been complex and locally variable (Burney, 1987a, b; Burney et al., 2004). Not all the genetic diversity documented here is necessarily pre-human: the star-like shape of the network around the most frequent plastid haplotype (Fig. 5A) can be interpreted as a signal of a recent expansion of L. simplex in the northern High Plateau (Fig. 2B). Native Malagasy populations of L. simplex are likely to have expanded since the broad establishment of fire-driven agriculture in Madagascar c. 1000 years ago (Crowley, 2010) and could be a part of an anthropogenically driven landscape transition and further expansion of L. simplex out of its ancient habitat range. Further sampling across Madagascar would be necessary to understand the ancient range of L. simplex.

Polyploidy and climate
Figure 5. Plastid haplotype networks of Loudetia simplex in A, Madagascar and B, South Africa. Line length is proportional to Bruvo's genetic distance between plastid haplotypes. Node size is proportional to the number of plastid haplotype observations. All edges of equal weight are represented. Numbers in nodes indicate the number of accessions sharing the plastid haplotype. Haplotypes are represented using a colour code based on their geographical origin shown in Figure 2B.

Multiple chromosome counts have previously been documented for L. simplex: 2n = 60 (Moffett & Hurcombe, 1949; Davidse, Hoshino & Simon, 1986); 2n = 24 (de Wet, 1958); 2n = 24, 40 and 60 (compiled by Oyen, 2012); and 2n = 20 and 40 (Kammacher et al., 1973). The most likely chromosome base number based on all the literature is x = 10 and sometimes x = 12 (in line with a base chromosome number of 10-12 for Tristachyideae, fide Kellogg, 2015); diploid, tetraploid and hexaploid populations are known in the wild. Most samples from Madagascar have five or six alleles for at least one microsatellite locus. From the same populations, a few individuals had four alleles (full counts in the Supporting Information, Table S1). The total number of alleles scored clearly indicates that the Malagasy populations are hexaploid. Genome size measurements for the two living accessions provided an additional source of evidence: the Malagasy living sample analysed had a maximum of only four different alleles per locus, whereas the sample from Burkina Faso had a maximum of six. However, similar amounts of DNA in both samples indicate the same ploidy.
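The relationship between the reported chromosome counts and the candidate base numbers can be checked with simple arithmetic: a count 2n is compatible with a base number x when 2n is a whole multiple of x, and the quotient is the implied ploidy level. The following is a minimal sketch of that check, using only the counts and base numbers cited above; the function name is illustrative and not part of the study.

```python
# Somatic chromosome counts (2n) reported in the literature cited above.
REPORTED_2N = [20, 24, 40, 60]
# Candidate base chromosome numbers discussed above.
BASE_NUMBERS = [10, 12]


def ploidy_levels(two_n, base_numbers):
    """Return {x: ploidy level} for every base number x that divides 2n evenly.

    Illustrative helper, not taken from the study.
    """
    return {x: two_n // x for x in base_numbers if two_n % x == 0}


for two_n in REPORTED_2N:
    # e.g. 2n = 40 is only compatible with x = 10 (tetraploid)
    print(f"2n = {two_n}: {ploidy_levels(two_n, BASE_NUMBERS)}")
```

Under these assumptions, 2n = 20, 40 and 60 correspond to diploid, tetraploid and hexaploid levels with x = 10, whereas 2n = 24 is only compatible with x = 12. A count of 2n = 60 is also divisible by 12, but the implied odd level (5x) is less consistent with the diploid, tetraploid and hexaploid populations reported.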
Figure 6. Plastid haplotype network of Loudetia simplex from all geographical locations, L. arundinacea, L. filifolia, L. lanata and Tristachya nodiglumis. Line length is proportional to Bruvo's genetic distance between plastid haplotypes.

South African samples of L. simplex analysed consistently show a lower number of alleles, suggesting that there are diploid and tetraploid populations (Table 3, Fig. 2A, Supporting Information, Table S1), although a lower number of alleles could also be a consequence of null alleles or higher homozygosity in southern Africa. The sole exception is represented by the single sample from Burkina Faso, which shows a hexaploid pattern, although hexaploids have also been reported from South Africa and Zimbabwe (Moffett & Hurcombe, 1949; Davidse et al., 1986). Other chromosome counts indicate further complexity across Africa: diploids have been reported from Cameroon (Dujardin & Beyne, 1975). Extensive additional sampling is necessary to understand the situation across Africa.
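The allele-count reasoning used above can be sketched as follows: the largest number of distinct alleles seen at any single locus gives a lower bound on ploidy, because an individual cannot carry more alleles at a locus than it has chromosome sets. This is only a sketch under stated assumptions. The locus names and fragment sizes are hypothetical, and rounding up to an even ploidy level is an assumption made here for illustration, not a step described in the study.

```python
def max_alleles(genotype):
    """Largest number of distinct alleles observed at any single locus."""
    return max(len(set(alleles)) for alleles in genotype.values())


def minimum_ploidy(genotypes):
    """Lower bound on ploidy for a group of individuals.

    Assumption (not from the source): ploidy is even, so an odd maximum
    such as five alleles is rounded up to six.
    """
    most = max(max_alleles(g) for g in genotypes)
    return most + (most % 2)


# Hypothetical fragment sizes (bp) at two made-up loci.
malagasy = [
    {"locus_A": [150, 154, 158, 162, 166], "locus_B": [201, 205]},
    {"locus_A": [150, 154, 158, 162], "locus_B": [201, 205, 209]},
]
south_african = [
    {"locus_A": [150, 154], "locus_B": [201]},
    {"locus_A": [150], "locus_B": [201, 205]},
]

print(minimum_ploidy(malagasy))       # lower bound for the first group
print(minimum_ploidy(south_african))  # lower bound for the second group
```

Because the result is only a lower bound, a low value does not rule out higher ploidy: null alleles or high homozygosity reduce the number of distinct alleles observed, which is the caveat raised above for the South African samples.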
The analyses of genetic differentiation we have conducted (Figs 3, 4; Table 5) show a consistent difference in ploidy between South Africa and Madagascar (diploid/tetraploid vs. hexaploid). Alternative hypotheses to account for this difference could include a hybridization or introgression event after L. simplex colonized Madagascar leading to hexaploidy without remarkable morphological changes (Keeler, 1998; Kolář et al., 2017), an autopolyploidization event (Ramsey & Schemske, 1998) or the occurrence of cryptic taxa in Madagascar and Africa. These hypotheses could also account for the higher heterozygosity in the Malagasy populations. A broad population genomic analysis of L. simplex populations from Africa and related species from Madagascar will be necessary to clarify the origin of different ploidies in different geographical regions and their relationship with the history of the colonization of Madagascar. Plastid haplotypes of South African polyploids are not closely related but instead scattered throughout the network (Fig. 8). This pattern suggests that the polyploids had a polytopic origin rather than a single origin, probably due to recurrent genetic exchanges between diploid and polyploid populations.
No consistent and globally applicable relationships have so far been observed between climate, elevation, plant genome size and polyploidy, except for the fact that larger genomes require more nutrients (e.g. Pellicer et al., 2018). The only conclusion we are able to make for L. simplex is that its geographical ploidy structure does not seem directly climate driven or habitat responsive but is more a consequence of historic constraints in the evolution of this species. Genetic clusters occur in significantly different climates, suggesting a long history of local adaptation similar to that shown for Andropogon gerardi (McAllister et al., 2015; McAllister & Miller, 2016) and Triodia (Anderson et al., 2017, 2019). In the Australian populations of Themeda triandra, genomic data indicated that polyploid populations are associated with warmer climates (Ahrens et al., 2020), and field experiments found a higher polyploid fitness under heat and drought stress (Godfree et al., 2017). Apart from clarifying the origin of polyploidy in L. simplex, expanding population sampling would also shed light on the factors associated with the persistence of polyploid populations.

[...] Figure 5A) compared with nuclear clusters (indicated in colours). The nuclear Madagascar South (cluster 1) and Madagascar North (cluster 2) correspond to those shown in Figures 2A, 3B and 4. Line length is proportional to Bruvo's genetic distance between plastid haplotypes. Node size is proportional to the number of plastid haplotype observations. All edges of equal weight are represented.

[...] Figure 5A) mapped against ploidy (indicated in colours). Line length is proportional to Bruvo's genetic distance between plastid haplotypes. Node size is proportional to the number of plastid haplotype observations. All edges of equal weight are represented.

Morphology, taxonomy and genetic clustering
Loudetia simplex is a polymorphic species with a complex taxonomic history and 11 heterotypic synonyms (Clayton, 1974). The grasses of Madagascar were mostly described in France by Camus from specimens sent to Paris (Leandri, 1966), largely independently of the African grass taxonomy carried out in the UK, Belgium and Germany. These species names based on a few collections each were revised by Bosser, the only resident agrostologist in Madagascar with extensive field experience. Bosser's (1966) revision of Loudetia in Madagascar stated that the variation seen across Africa is so polymorphic that Hubbard's (1936) infrageneric classification of Loudetia is almost arbitrary. Bosser (1966) recognized three endemic taxa in Madagascar now included in the modern concept of L. simplex: L. simplex subsp. stipoides (Hack.) Bosser dominating central highlands, the smaller L. madagascariensis (Baker) Bosser with filiform leaves occurring at > 1000 m elevation and L. perrieri A.Camus from a single collection with a variant lower glume. The distinction between L. simplex subsp. stipoides and L. madagascariensis was maintained by Bosser (1969) until L. madagascariensis was subsumed into a broader L. simplex by Clayton (1974), who noted that 'variation in East Africa readily engulfs' all segregate taxa of L. simplex.
Clusters identified in this study lend support to some of the earlier taxonomic divisions between African and Malagasy L. simplex, even though no macromorphological distinction is apparent. Our results are also in agreement with a genetic difference between L. simplex and the related species L. filifolia, L. flavida, L. lanata and Tristachya nodiglumis, lending support to the current classification. The separation of Malagasy populations into northern and southern clusters does not, however, have any correlation with Bosser's (1966, 1969) recognition of a higher elevation taxon L. madagascariensis as separate from L. simplex subsp. stipoides. Since L. perrieri was defined on the basis of a single specimen with no exact locality data, assessing its relationship to the clusters identified by this study is not possible. Analysis of leaf anatomy (Lubke & Phipps, 1973) also found that the Malagasy L. madagascariensis and L. simplex subsp. stipoides did not consistently cluster with the African L. simplex subsp. simplex.
Improving population sampling from lower elevations in the western and southern parts of Madagascar and a greater part of Africa (especially western Africa) would lend greater power to this analysis. Phylogenomics and population genomics would allow a reconstruction of the spatiotemporal history of this species. Integration of functional traits into this analysis could also provide deeper insights into the significance of morphological variability for ecological dominance of L. simplex, in the context of the evolutionary history of the group.

CONCLUSIONS
We show that L. simplex populations of Madagascar are genetically different from those occurring in continental Africa and occupy different environmental niches. This genetic diversity pattern is a clear indication that Malagasy populations colonized the island long before human arrival, probably via a single colonization event. We conclude that L. simplex is a native Malagasy species. Malagasy ecosystems dominated by L. simplex are in need of more in-depth studies to ascertain the detailed history of this species in Madagascar and the drivers of its dominance.
Widespread and common grasses have the potential to serve as excellent models for the reconstruction of biome history. Prerequisites for such a model species should include easy access to collection localities, a small genome of low complexity and ideally a diploid genome. Genetic population work on such grasses would allow reconstruction of past population size fluctuations and population splits over the past hundreds to millions of years.

ACKNOWLEDGEMENTS
[...] was enrolled during 2017-2018. The Howard Lloyd Davies Legacy Fund supported the RADseq work. We thank the staff of the Kew Madagascar Conservation Centre, Stuart Cable (RBG Kew), Vololoniaina Jeannoda (University of Antananarivo) and Parc de Tsimbazaza for their long-term collaboration and support for Madagascar Poaceae research. The Direction Générale des Forêts, Madagascar National Parks, and Ezemvelo KZN Wildlife generously granted our research permits. Field collections in South Africa were made possible by a Newton grant awarded to Caroline Lehmann and Gareth Hempson. The microsatellite marker primers were designed with the help of Méline Saubin (RBG Kew). Jaume Pellicer Moscardó, Robyn Faye Powell and María Conejero (RBG Kew) kindly performed the estimation of nuclear DNA contents. The curators of BM, K, PRE and TAN herbaria provided access to their collections; curators of TAN and PRE gave permission to sample herbarium specimens in their collections. Sarah Z. Ficinski drew and edited the figures. JS, US and GB are members of the EDB lab that is supported by LABEX TULIP (ANR-10-LABX-0041) and CEBA (ANR-10-LABX-25-01) and LIA BEEG-B (Laboratoire International Associé - Bioinformatics, Ecology, Evolution, Genomics and Behaviour, CNRS). They were also funded by an ERA-NET BiodivERsA project: INFRAGECO (Inference, Fragmentation, Genomics, and Conservation, ANR-16-EBI3-0014).