DNA barcoding British Euphrasia reveals deeply divergent polyploids but lack of species-level resolution

Background and aims
DNA barcoding is emerging as a useful tool not only for species identification but for studying evolutionary and ecological processes. Although plant DNA barcodes do not always provide species-level resolution, the generation of large DNA barcode datasets can provide insights into the mechanisms underlying the generation of species diversity. Here, we use DNA barcoding to study evolutionary processes in taxonomically complex British Euphrasia, a group with multiple ploidy levels, frequent self-fertilization, young species divergence and widespread hybridisation.

Methods
We sequenced the core plant barcoding loci, supplemented with additional nuclear and plastid loci, in representatives of all 19 British Euphrasia species. We analyse these data in a population genetic and phylogenetic framework. We then date the divergence of haplotypes in a global Euphrasia dataset using a time-calibrated Bayesian approach implemented in BEAST.

Key results
No Euphrasia species has a consistent diagnostic haplotype. Instead, haplotypes are either widespread across species, or are population specific. Nuclear genetic variation is strongly partitioned by ploidy levels, with diploid and tetraploid British Euphrasia possessing deeply divergent ITS haplotypes (DXY = 5.1%), with haplotype divergence corresponding to the late Miocene. In contrast, plastid data show no clear division by ploidy, and instead reveal weakly supported geographic patterns.

Conclusions
Using standard DNA barcoding loci for species identification in Euphrasia will be unsuccessful. However, these loci provide key insights into the maintenance of genetic variation, with divergence of diploids and tetraploids suggesting that ploidy differences act as a barrier to gene exchange in British Euphrasia, with rampant hybridisation within ploidy levels. The scarcity of shared diploid-tetraploid ITS haplotypes supports the polyploids being allotetraploid in origin. Overall, these results show that even when lacking species-level resolution, DNA barcoding can reveal insightful evolutionary patterns in taxonomically complex genera.
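
The DXY statistic reported in the abstract is the average proportion of nucleotide differences between sequences drawn from two groups. As a minimal sketch of that calculation (not the authors' pipeline), the Python below computes DXY for two sets of aligned sequences. The toy sequences and the function names `p_distance` and `dxy` are hypothetical, and gap handling is simplified to skipping any site that is not A, C, G or T in both sequences.

```python
from itertools import product

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or ambiguity code in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between sequences")
    return differences / compared


def dxy(group_x, group_y):
    """Mean pairwise p-distance over all between-group sequence pairs."""
    pairs = list(product(group_x, group_y))
    if not pairs:
        raise ValueError("both groups must contain at least one sequence")
    return sum(p_distance(x, y) for x, y in pairs) / len(pairs)


# Hypothetical toy alignments, for illustration only (not real ITS data).
diploid = ["ACGTACGTAC", "ACGTACGTAT"]
tetraploid = ["ACGAACGTTC", "ACGAACGTTC"]

print(f"DXY = {dxy(diploid, tetraploid):.3f}")
```

Only between-group pairs enter the average, so DXY measures absolute divergence between the two groups without being lowered by low diversity within either group.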


INTRODUCTION
... descriptive statistics, and then tested the cohesiveness of taxa using analysis of molecular ...

... included the number of haplotypes, as well as hierarchical AMOVA in groups according to: (1) ploidy levels (diploid vs tetraploid); (2) geographic regions (Wales, England, Scotland); (3) species. AMOVAs were performed on all taxa, and repeated for ploidy levels and ...

... due to differences among populations (FCT) for a given number of genetic clusters (K-value).

We considered the best grouping to have the highest FCT value after 100 repetitions. This ...

... included as a separate partition with a restriction site (binary) model. We sampled every 1000th generation and used a burnin of 2,500,000, and default priors. We confirmed chain ...

... gives crude dates due to the lack of available fossils for calibration, but these estimates allow us to compare between very recent (postglacial) divergence, and much older divergence events. We only analysed ITS sequences, due to the lack of support obtained for the plastid ...

... from 0.38 x 10^-9 to 8.34 x 10^-9 substitutions/site/year. Data analyses were run for 50,000,000 generations, logging every 50,000th generation. We compared the fit of the clock models ...

... the network present in tetraploids is haplotype H18, found in a single sample of E. ostenfeldii.
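
To illustrate why the substitution rate range quoted in the dating methods yields only crude dates, consider a strict clock with rate μ: the divergence time implied by a pairwise distance d is T = d / (2μ), because both lineages accumulate substitutions. The sketch below applies this textbook relationship to the quoted rate bounds (0.38 x 10^-9 and 8.34 x 10^-9 substitutions/site/year) and to the DXY of 5.1% from the abstract. It is a back-of-envelope illustration only, not the BEAST analysis used in the study. The function name `divergence_time_years` is hypothetical, and the distance is assumed to be uncorrected for multiple substitutions.

```python
def divergence_time_years(distance, rate_per_site_per_year):
    """Strict-clock divergence time T = d / (2 * mu), in years."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_year)


# Values quoted in the text; using them with a strict clock is an
# assumption of this sketch, not the authors' dating method.
D_XY = 0.051          # diploid-tetraploid ITS divergence (5.1%)
RATE_SLOW = 0.38e-9   # substitutions/site/year, lower bound
RATE_FAST = 8.34e-9   # substitutions/site/year, upper bound

oldest = divergence_time_years(D_XY, RATE_SLOW)
youngest = divergence_time_years(D_XY, RATE_FAST)

print(f"Implied divergence: {youngest / 1e6:.1f} to {oldest / 1e6:.1f} Ma")
```

The more than twentyfold spread between the two rate bounds carries over directly to the implied age, so without further calibration such estimates can separate postglacial from much older divergence but cannot resolve it more finely.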

Within the tetraploid cluster, widespread haplotype H2 is at the centre, surrounded by ...

... other European taxa, and is sister to a mixed clade of alpine diploid species and tetraploid E. ...

... Node 1). The median crown age of the major group of diploids (also including Nearctic ...

... 1.00, Fig. 3, Node 2). Due to low support of internal branches, the age of the most recent ...

... nearest dated node gaining support is the broader clade of Euphrasia, which includes the diploid and tetraploid groups, in addition to two additional clades that include divergent ...

... 95% HPD = 6.5-9.8 Ma (Fig. 3, Node 3).

... sequencing was performed for this region and it was excluded from further analyses.

The final matK alignment was 844 bp with one indel, and the rpl32-trnL region was 630 bp ...

... significantly different from zero (-0.27).

... taxa that seldom co-occur in the wild. One possibility is that the species are not discrete genetic entities, and that the current ...

In contrast to the factors above, it seems that self-fertilization, hybridity and incomplete ...

... haplotype. This pattern of local fixation of haplotypes, and scarcity of widespread haplotypes, ...

... as the prevalence of hybridisation in genetic data (Stone, 2013, Liebst, 2008, ...

... point to genomic regions that have atypical inheritance and patterns of evolution (i.e. plastids).

Analyses of many nuclear genes (or entire genomes) would be particularly valuable for ...

The deep divergence of ITS haplotypes between diploid and tetraploid Euphrasia suggests ...

... our phylogeny make the divergence between ploidy groups difficult to date, but it must ...