High molecular diversity in the true service tree (Sorbus domestica) despite rareness: data from Europe with special reference to the Austrian occurrence

Background and Aims: Sorbus domestica (Rosaceae) is one of the rarest deciduous tree species in Europe and is characterized by a scattered distribution. To date, no large-scale geographic studies on population genetics have been carried out. Therefore, the aims of this study were to infer levels of molecular diversity across the major part of the European distribution of S. domestica and to determine its population differentiation and structure. In addition, spatial genetic structure was examined together with the patterns of historic and recent gene flow between two adjacent populations.
Methods: Leaf or cambium samples were collected from 17 populations covering major parts of the European native range from north-west France to south-east Bulgaria. Seven nuclear microsatellites and one chloroplast minisatellite were examined and analysed using a variety of methods.
Key Results: Allelic richness was unexpectedly high for both markers within populations (mean per locus: 3.868 for nSSR and 1.647 for chloroplast minisatellite). Moreover, there was no evidence of inbreeding (mean FIS = -0.047). The Italian Peninsula was characterized as a geographic region with comparatively high genetic diversity for both genomes. Overall population differentiation was moderate (FST = 0.138) and it was clear that populations formed three groups in Europe, namely France, Mediterranean/Balkan and Austria. Historic gene flow between two local Austrian populations was high and asymmetric, while recent gene flow seemed to be disrupted.
Conclusions: It is concluded that molecular mechanisms such as self-incompatibility and high gene flow distances are responsible for the observed level of allelic richness as well as for population differentiation. However, human influence could have contributed to the present genetic pattern, especially in the Mediterranean region. Comparison of historic and recent gene flow may mirror the progress of habitat fragmentation in eastern Austria.


INTRODUCTION
The true service tree (Sorbus domestica), a light-demanding, insect-pollinated, rosaceous species, is distributed in the Mediterranean as well as in parts of France, Switzerland, Germany and Austria (Kutzelnigg, 1995), and also has a very scarce occurrence in the British Isles. It is still unclear to what extent this current distribution is natural in view of the species' cultivation since Roman times (Rotach, 2003). There is both historic (Kausch-Blecken von Schmeling, 2000) and recent evidence (Termentzi, 2008) for use of the fruits for cider production and for medical purposes, which could have enhanced the species' dispersal by humans after the last glacial period. However, rareness and scattered distribution characterize its present range, mainly as a consequence of preference for high forests instead of coppices with standards (Frochot et al., 2008) and for dense coniferous forests instead of open broad-leafed stands (Ferrini et al., 2008) during the last century. Theoretically, habitat fragmentation decreases the number of species (MacArthur and Wilson, 1967) and erodes genetic variation within and among populations due to increasing genetic drift, elevated inbreeding and reduced gene flow, and eventually increases the probability of local extinction of sub-populations (Young et al., 1996). This has been recently confirmed in Fagus sylvatica (Jump and Penuelas, 2006). Traits conferring adaptation to a fragmented habitat are, among others, long-distance pollen and seed dispersal (Nason et al., 1998; Levey et al., 2005). In S. domestica, rare long-distance dispersal events in a Swiss population were reported (maximum 16 km for pollen and maximum 12 km for seed dispersal, respectively; Kamm et al., 2009). From an evolutionary point of view, it can be assumed that this tree species is well adapted to a patchy habitat structure and that these patches are part of a metapopulation dynamic (Rotach, 2003).
These dynamics are characterized by demographic and genetic stochasticity, where the degree of fragmentation and the size of the remaining populations play a key role in the risk of local extinction (Hanski, 1998). Moreover, the persistence of a few large populations, which could act as refugia, is necessary for protecting genetic variation in the long term and to avoid extinction (Gilpin, 1991). Similar effects have been observed in several studies for the related species Sorbus torminalis, which shares life characteristics in terms of habitat structure and reproduction ecology. A main outcome of these studies was the finding that high genetic diversity is maintained even in populations of small census size (e.g. Hoebee et al., 2006; Angelone et al., 2007).
Up to now, only a few population genetic and morphometric studies in S. domestica have been carried out, which mainly focused on a small geographical scale (e.g. Kamm et al., 2009, 2010, 2011; Nyari, 2010; Brus et al., 2011). A study which encompasses populations covering the main part of its distribution has been lacking. Hence, this study has two objectives: (1) in a first step we investigate levels of molecular diversity and differentiation in S. domestica by analysing samples from the western to the eastern European distribution at both the chloroplast and the nuclear DNA level; and (2) in the second step we zoom in on two putative natural populations located in Austria, which are geographically close to each other (about 60 km), to find out more about their population history by scrutinizing spatial genetic structure and the extent of historic and recent gene flow between them.

Population sampling
Leaf or cambium samples were collected from 17 populations of Sorbus domestica L. covering major parts of the European native range from France (including Corsica), Switzerland, Italy (including Sicily and Elba), Slovenia, Croatia, Serbia, Bosnia-Herzegovina, Bulgaria and Austria (Table 1). For some populations it was not possible to obtain an adequate sample size because of the scattered nature of the species, but we considered this in our analysis. One of the Austrian populations, AT (NC), is not a natural population, but a national collection of individuals originating from different regions including Burgenland (eastern Austria), Lower Austria and Vienna, and was established for long-term ex situ conservation of the species. For samples from the central and southern part of France, we also resorted to clonal archives (French National Collection, clone bank of INRA Avignon). Although these individuals were collected across a wider geographical area, we pooled them into two populations (southern France and central France), which seems to be justifiable given the dispersal distances (Kamm et al., 2009) as well as the results of the cluster analysis (see below; Supplementary Data Fig. S1 shows the respective origins of these individuals; the original spatial positions of these trees in the field were reconstructed with data from INRA Avignon). There is a large number of intermediate species in the genus Sorbus, which are thought to have originated through hybridization. However, until now, no hybrids between S. domestica and any other Sorbus species have been observed (e.g. Nelson-Jones et al., 2002; Ludwig et al., 2013). Therefore, we can exclude a bias through introgression. Localization of individual trees was supported by forest inventory data. Fresh leaf material was sampled from trees and stored in silica gel. For the Austrian populations, we used exclusively cambium samples, because sampling was done outside of the growing season.
A piece of approx. 1 cm² was taken from the stem base and stored in silica gel until DNA extraction. We sampled only trees with a dbh (diameter at breast height) between 30 and 50 cm to obtain an approximately even-aged sample.

DNA extraction, PCR and genotyping
Dried material was shock-frozen in liquid nitrogen for 3 min and crushed for 1 min at 20 Hz using a Qiagen Tissue Lyser device (Qiagen, Germany). DNA was extracted from 40-60 mg of dry leaf or cambium material using a DNeasy Plant Mini Kit (Qiagen, USA). The protocol provided by the manufacturer was slightly modified by using 600 µL of AP1 buffer and 150 µL of elution buffer (two elution steps instead of one). We determined the concentration and purity of DNA using an ND-1000 spectrophotometer (NanoDrop, USA). DNA was stored at 4 °C. Initially we used eight nuclear microsatellite loci and one chloroplast (cp) minisatellite. The nuclear microsatellite loci used were MSS5, MSS16, BGT23b, MS14h03, CH01h01, CH01h10, CH02c09 and CH02d08, formerly also employed by Kamm et al. (2009). The cp-minisatellite is a 22 bp repeat with a (CATTATATTATTGATTTTAGTT)n motif (see Supplementary Data Table S1 for further information on the markers used). It was found by screening the rps16 region, which had also been described as polymorphic for several other related species (e.g. Oxelmann et al., 1997; Chester et al., 2007). For the nuclear microsatellites each reaction volume contained approx. 10 ng of DNA, 1× PCR buffer, 1 mM MgCl2, 0.15 µM of each primer (forward and reverse), 1.6 mM dNTPs, 0.6 U of polymerase (Peqlab, Germany), and ddH2O to a total volume of 10.75 µL. Primers were combined in two multiplex-PCR amplifications with four primers in each set. The reaction volume for the chloroplast minisatellite contained approx. 10 ng of DNA, 1× reaction buffer, 0.5 mM MgCl2, 1 µM of each primer, 0.8 mM dNTPs, 0.5 U of polymerase, and ddH2O to a total volume of 12.5 µL. PCR conditions for nuclear markers were initialized by a denaturation step at 94 °C for 5 min and followed by 30 cycles of 94 °C for 45 s, 50 or 60 °C (depending on the locus) for 45 s, 72 °C for 1 min and a final extension step at 72 °C for 10 min.
For the chloroplast marker, PCR started with denaturation at 94 °C for 5 min, followed by 35 cycles of 94 °C for 1 min, 58 °C for 1 min, 72 °C for 30 s and a final extension step at 72 °C for 10 min. PCRs were carried out on a PTC100 (MJ, USA). Fragment analysis was done through capillary gel electrophoresis using a CEQ8000 sequencer (Beckman-Coulter, USA). For the analysis of the cp-minisatellite, we sequenced a sub-sample of 100 individuals to make sure that haplotypes were identical by descent. No signs of homoplasy were found.

Data analysis
Microsatellite data were checked for the occurrence of null alleles, genotyping errors and large allele dropouts using Micro-Checker (van Oosterhout et al., 2004). Deviations from Hardy-Weinberg proportions were assessed using Genepop (Raymond and Rousset, 1995) with 10 000 dememorizations, 100 batches and 10 000 iterations using Markov chain Monte Carlo (MCMC) simulations. Linkage disequilibrium was tested using Fstat 2.9.3 (Goudet, 1995). Descriptive population genetic statistics such as allele frequencies, effective number of haplotypes/alleles, unbiased haplotype diversity, number of private haplotypes/alleles, observed and expected heterozygosity and FIS values were calculated with Genalex 6.5 (Peakall and Smouse, 2012). Allelic richness with a rarefaction to ten individuals was calculated with Fstat for nuclear microsatellites and using Contrib (Petit et al., 1998) for the cp-minisatellite.
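Rarefaction makes allelic richness comparable across populations with unequal sample sizes. The following is a minimal sketch, assuming the standard hypergeometric rarefaction formula (expected number of distinct alleles in a random subsample of gene copies); it is not the code of Fstat or Contrib, and the function name and the example counts are hypothetical. For a diploid nuclear locus, ten individuals would correspond to 20 gene copies; for the haploid chloroplast marker, to ten.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, n_genes):
    """Expected number of distinct alleles in a random subsample.

    allele_counts: observed copies of each allele at one locus in one population
    n_genes: size of the subsample in gene copies (hypothetical parameter name)
    """
    total = sum(allele_counts)
    if n_genes > total:
        raise ValueError("subsample is larger than the observed sample")
    denom = comb(total, n_genes)
    # An allele with c copies is missed with probability C(total - c, n) / C(total, n);
    # math.comb returns 0 when n exceeds total - c, so common alleles count fully.
    return sum(1 - comb(total - c, n_genes) / denom for c in allele_counts if c > 0)


# Hypothetical counts: 4 gene copies, allele A seen three times, allele B once.
print(rarefied_allelic_richness([3, 1], 2))
```

Averaging this value over loci would give a per-population mean comparable to the "mean per locus" figures reported in the results, under the stated assumption about the formula.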
We used two methods to infer the relationship among populations. First we computed an unweighted pair group method arithmetic average dendrogram (UPGMA) from 1000 bootstrapped matrices based on Cavalli-Sforza and Edwards' chord distances (Cavalli-Sforza and Edwards, 1967) created by Microsatellite Analyser (MSA) (Dieringer and Schlötterer, 2003). To find the 'lowest common denominator' we constructed a consensus tree following the majority rule out of the 1000 produced matrices using the programs Neighbour and Consense in the Phylip 3.63 package (Felsenstein, 1989). Additionally, we performed a principal co-ordinate analysis (PCoA) to visualize configurations among interpopulation genetic distances. This was carried out in Genalex using Nei's standard genetic distance (Nei, 1972).
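As an illustration of the distance underlying the dendrogram, the sketch below computes a chord distance between two populations from their allele frequencies. It assumes the commonly cited form of the Cavalli-Sforza and Edwards distance, (2/π)·sqrt(2·(1 − Σ sqrt(p_i·q_i))), averaged over loci; MSA may scale or combine loci differently, and the function name and data layout are hypothetical.

```python
from math import pi, sqrt


def chord_distance(pop_a, pop_b):
    """Mean per-locus chord distance between two populations.

    pop_a, pop_b: lists with one dict per locus mapping allele -> frequency
    (hypothetical layout; frequencies at each locus are assumed to sum to 1).
    """
    if not pop_a or len(pop_a) != len(pop_b):
        raise ValueError("both populations need the same, non-empty set of loci")
    per_locus = []
    for p, q in zip(pop_a, pop_b):
        # Only alleles present in both populations contribute to the overlap term.
        overlap = sum(sqrt(p[a] * q[a]) for a in set(p) & set(q))
        # max() guards against tiny negative values from floating-point rounding.
        per_locus.append((2 / pi) * sqrt(2 * max(0.0, 1 - overlap)))
    return sum(per_locus) / len(per_locus)


# Hypothetical single-locus example: identical vs. completely different populations.
same = chord_distance([{"a": 0.5, "b": 0.5}], [{"a": 0.5, "b": 0.5}])
different = chord_distance([{"a": 1.0}], [{"b": 1.0}])
print(same, different)
```

Bootstrapping over loci, as described above, would amount to recomputing such a distance matrix on resampled loci before building each tree.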
To infer population structure, we used the software Structure 2.3.4 (Pritchard et al., 2000), which uses a Bayesian clustering algorithm, where individuals are assigned to a pre-defined number of clusters (K). Assignment is done in such a way that deviations from Hardy-Weinberg proportions and gametic phase disequilibrium within clusters are minimized. This analysis was performed with K-values from 2 to 6 and a total run length of 1 000 000 with a burn-in period of 200 000. Each run was replicated four times. To find the appropriate number of clusters, we used the ΔK statistic of Evanno et al. (2005) implemented in the software Structure Harvester (Earl and von Holdt, 2012).
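The ΔK criterion can be illustrated with a short sketch. It assumes the usual definition, the absolute second-order difference of the mean log-likelihood L(K) divided by the standard deviation of L(K) across replicate runs, and works on mean values as a simplification; it is not the Structure Harvester code. All log-likelihood values below are invented for illustration. Note that ΔK is only defined for interior values of K, so with K from 2 to 6 it can be evaluated for K = 3 to 5.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of log-likelihoods from replicate runs
    (hypothetical input layout; at least two replicates per K are needed).
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # second-order difference undefined at the ends
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = (mean(lnp_by_k[k + 1])
                       - 2 * mean(lnp_by_k[k])
                       + mean(lnp_by_k[k - 1]))
        result[k] = abs(second_diff) / spread
    return result


# Invented replicate log-likelihoods, two runs per K.
runs = {2: [-1000, -1002], 3: [-900, -902], 4: [-890, -892], 5: [-885, -887]}
scores = delta_k(runs)
print(max(scores, key=scores.get))
```

In this toy input the likelihood gain flattens after K = 3, so the largest ΔK falls at K = 3.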
To obtain information about the levels of differentiation among populations, we performed an analysis of molecular variance (AMOVA) using Arlequin 3.5 (Excoffier and Lischer, 2010) where global FST and RST values were calculated. Groups for AMOVA were defined according to their geographical proximity and also by the above-mentioned tree construction method, PCoA and individual-based population assignment. Allelic richness, observed and expected heterozygosity, tree construction and PCoA as well as AMOVA were performed only for populations with a sample size ≥10. To extract information also from populations with lower sample size, we included them in another approach and calculated the allelic distance (D0) according to Gregorius (1974) among four groups (Austria, Balkan, France and Mediterranean). D0 is the mean of differences (absolute values) in relative allele frequencies between populations and can vary between 0 and 1 (D0 = 1, populations are genetically completely different; D0 = 0, populations are genetically identical). D0 was not calculated pairwise, but as a distance of one group of populations against the residual gene pool of the remaining groups.
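The group-versus-residual comparison can be sketched as follows. The example assumes that D0 is half the sum of absolute allele frequency differences at a locus (which gives the 0 to 1 range described above) and that the residual gene pool is formed by pooling raw allele counts of all other groups; both points, like the function names and the counts, are assumptions for illustration rather than the authors' implementation.

```python
def frequencies(counts):
    """Convert allele counts (dict allele -> copies) to relative frequencies."""
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def d0(p, q):
    """Half the summed absolute frequency differences: 0 = identical, 1 = disjoint."""
    alleles = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in alleles)


def distance_to_residual_pool(group_counts):
    """For each group, D0 against the pooled counts of all remaining groups."""
    result = {}
    for name, counts in group_counts.items():
        rest = {}
        for other_name, other in group_counts.items():
            if other_name == name:
                continue
            for allele, c in other.items():
                rest[allele] = rest.get(allele, 0) + c
        result[name] = d0(frequencies(counts), frequencies(rest))
    return result


# Hypothetical single-locus allele counts for three groups.
groups = {"A": {"a": 5, "b": 5}, "B": {"a": 10}, "C": {"a": 5, "b": 5}}
print(distance_to_residual_pool(groups))
```

For several loci, the per-locus values would typically be averaged; pooling by counts weights larger groups more heavily, which is a design choice of this sketch.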
Finally, we inferred spatial genetic structure within each of the two natural Austrian populations using the kinship coefficient of Loiselle (1995) as implemented in SPAGeDi 1.4 (Hardy and Vekemans, 2002). Individual sampling locations were permuted 10 000 times. The software Migrate-n (Beerli, 2006) was used to investigate the amount and direction of historic gene flow by calculating the parameters θ and M, where θ is four times the effective population size multiplied by the mutation rate per site and generation, and M is the immigration rate divided by the mutation rate. The latter parameter gives information on how much more important migration is relative to mutation to bring new variants into a population. Multiplying θ by M then gives the number of migrants into a population per generation. In this case 'historic' refers back to the date of the most recent common ancestor which is assumed to be 4Ne generations (for diploid organisms). We used the Bayesian inference method implemented in Migrate-n and also the continuous Brownian motion model. Runs were started with FST values to estimate θ and M. We assumed a constant mutation rate and used the following MCMC settings: 500 000 recorded steps, 100 000 discarded trees per chain, and an increment of 20 and 500 bins for θ and M, respectively. In the following runs, we used the newly estimated values for θ and M as calculated by the software until the posterior distributions for both parameters converged. Instead of conventional convergence diagnostics, we used the width of the 95 % confidence interval as the evaluation criterion: if these intervals were overlapping in at least three final runs for both parameters, we assessed the results as accurate. The run with the highest marginal likelihood was used for interpretation.
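The relation between the two Migrate-n parameters and the migrant number can be written out explicitly. The symbols μ (mutation rate) and m (immigration rate per generation) are labels introduced here for illustration; the definitions of θ and M are those given in the text.

```latex
\theta = 4 N_e \mu, \qquad M = \frac{m}{\mu}
\quad\Longrightarrow\quad
\theta \, M = 4 N_e \mu \cdot \frac{m}{\mu} = 4 N_e m
```

The mutation rate cancels in the product, so the migrant estimate does not require μ to be known. Under these definitions the product equals 4·Ne·m; whether it is reported directly as the number of migrants or divided by four to give Ne·m depends on the scaling convention, which is worth keeping in mind when comparing the values with other studies.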
Recent gene flow was inferred with Bayesass 3.0 (Wilson and Rannala, 2003) where the probability of migration for the last two generations within and between populations is calculated. Bayesass uses a Bayesian method to estimate recent migration rates, population allele frequencies, inbreeding coefficients and individual migrant ancestries. It assumes that levels of differentiation between populations are high and that migration rates are low for an accurate estimation. A further assumption is that all populations which are considered to exchange migrants have been sampled. These assumptions are well fulfilled by our two model populations. We performed 10 000 000 iterations with a burn-in period of 1 00 000 to let the Markov chain reach a steady state, and chose a sampling interval of 1000. To be sure that our results were reliable, we repeated the runs with different mixing parameter values (from 0.1 to 0.5) and also started each time from a different initial random seed. These analyses were performed exclusively for the two Austrian populations, because here we had sufficient sample sizes (>40) and nearly equal census sizes (approx. 150 individuals in both populations), and because we were able to exclude a potential bias from non-sampled populations that nevertheless might have contributed to gene flow (Beerli, 2004).

Chloroplast and nuclear genetic diversity
The cp-minisatellite revealed a total of five haplotypes with a minimum of two and a maximum of four haplotypes per population. Haplotype 244 was most frequent and occurred in all populations with a sample size >10. Haplotype 222 was absent in northern Italy and Slovenia and most frequent in Bulgaria and central Italy. Haplotype 266 was common in all populations except Switzerland, Serbia and Bulgaria. The shortest and the longest haplotype (200 and 288, respectively) showed peculiarities concerning their occurrences. While haplotype 200 occurred only twice (once each in Croatia and in Austria-Wolkersdorf), haplotype 288 occurred more frequently, but only in northern Italy and in Elba. Private haplotypes were not observed. Haplotype richness was highest in Elba and Croatia as well as in north-western and central France, whereas Bulgaria, Serbia, Slovenia and Switzerland showed low levels of haplotype richness (see Table 2 and Fig. 1).
In the nuclear microsatellite analysis, locus CH02d08 showed evidence for the occurrence of null alleles in some populations and was excluded from further analysis. The remaining seven loci showed no signs of null alleles, large allele dropouts or genotyping errors due to stutter bands. Furthermore, all populations were in linkage equilibrium. All samples within populations originated from sexual reproduction as no identical multilocus genotypes were found. A total of 78 alleles with a minimum of five and a maximum of 19 alleles per locus were observed. Only populations in Elba and northern Italy were fixed for one allele at locus MSS16. Allelic richness was slightly higher in northern Italy, in Corsica, in southern France as well as in Bosnia-Herzegovina and Bulgaria. In contrast to high cpDNA diversity, lower levels of allelic richness were observed in Elba. In Slovenia, low allelic richness was congruent with low levels of cpDNA diversity. Additionally, only the population in Slovenia showed a significant excess of heterozygotes (see FIS values in Table 2). Private alleles were found in nine populations with a maximum of three private alleles in Bulgaria. Frequencies of private alleles ranged from 0.013 (north-western France) to a considerably higher value of 0.214 (in Elba). Of the 17 observed private alleles, 11 had a frequency <0.05 (data not shown). Allelic richness within populations was not correlated between the two genomes (R² = 0.006).

Genetic differentiation among populations
The UPGMA dendrogram based on Cavalli-Sforza and Edwards (1967) chord distance revealed distances between populations corresponding to geographic adjacency in the case of the French and Austrian group. Populations from the Mediterranean and Balkan group (except Bosnia-Herzegovina) were pooled in an unresolved clade when the majority rule was used as the decision criterion (Fig. 2). A slightly different result was provided by PCoA using Nei's standard genetic distance (Nei, 1972), where the population from southern France clustered more clearly with populations from northern Italy, Corsica, Slovenia, Croatia and Bosnia-Herzegovina compared with the UPGMA dendrogram. Populations from Bulgaria and Elba were not assigned explicitly to this group (Fig. 3). The first two axes of the PCoA explained 74.1 % of variation (58.26 and 15.84 %, respectively). Allelic distance according to Gregorius (1974) revealed an average of 0.297. Populations of the Mediterranean group showed the least differentiation from the remaining gene pool (0.199), followed by populations from the Balkan region (0.253). The Austrian and French populations were most dissimilar from the residual gene pool (0.328 and 0.355, respectively; Fig. 4). Structure analysis showed consistently high assignment values among different runs and the results were partly in contrast to geographical proximity, but supported the results from the UPGMA dendrogram as well as from PCoA. Populations from Austria and France formed more distinct clusters than populations from Balkan and Mediterranean regions (Fig. 5). Based on the rate of change of the likelihood distribution, a value of K = 3 was chosen as the most likely number of clusters (Supplementary Data Fig. S2). The AMOVA revealed an FST value of 0.138 and an RST value of 0.102 when groups were defined according to their geographical proximity.
For this arrangement, the percentage variation among groups decreased roughly when taking into account squared allele size differences and even reached a negative value, suggesting that stepwise mutations did not play a key role for differentiation even on a large geographical scale (Table 3). When groups were rearranged according to the results of the chord distance of Cavalli-Sforza and Edwards, Nei's standard genetic distance as well as individual population assignment, variation patterns among and within groups changed slightly, but RST never exceeded FST, indicating the absence of a phylogeographic structure that is based on stepwise mutations (Supplementary Data Tables S2-S4).

Spatial genetic structure and historic vs. recent gene flow
Mean values of distance intervals for spatial genetic analysis ranged from 207 to 3690 m in Merkenstein and from 397 to 4598 m in Wolkersdorf, and seven distance classes were defined for both populations. Kinship coefficients fluctuated in the same order of magnitude in both populations (-0.015 to 0.015), but did not deviate significantly from the permuted mean value in any of the seven distance classes (Supplementary Data Figs S3 and S4).
Analysis of recent gene flow showed that the majority of individuals originated from self-recruitment in both populations (96.8 and 96.5 %, respectively) and only one individual in Wolkersdorf was found to be a potential first-generation migrant from Merkenstein (posterior probability of migrant ancestry 72 %, data not shown). We did not find potential second-generation migrants from Merkenstein to Wolkersdorf or the reverse (Table 4). In contrast, by analysing historic gene flow, there was strong evidence that both populations previously exchanged a considerable number of migrants during many generations and that the direction of migration had been highly asymmetric (approx. 12 individuals per generation from Merkenstein to Wolkersdorf vs. only three in the opposite direction) (Table 5).

Range-wide genetic pattern and diversity
The analysed populations had nuclear diversity levels similar to those reported for other tree species with similar life characteristics, in particular with a comparable scattered population structure such as Pyrus pyraster (Volk et al., 2006), Malus sylvestris (Cornille et al., 2012) or Sorbus torminalis (Demesure et al., 2000; Hoebee et al., 2006). Furthermore, almost all populations showed evidence of outbreeding (FIS values, Table 2). Considering that the degree of scattered occurrence is even higher for S. domestica compared with the mentioned species, we may explain this pattern by the existence of molecular self-incompatibility (SI) mechanisms, observed in many rosaceous species (De Nettancourt, 2001; Cheng et al., 2006; Schüler et al., 2006), in combination with its long-distance dispersal strategy (Kamm et al., 2009). These traits could have been responsible for maintaining genetic diversity at a high level. Until now, however, pre-zygotic SI systems have not been investigated in S. domestica. Nevertheless, we can assume their existence, considering data obtained from seed orchards, where numerous potentially mating trees had significantly more viable seeds per fruit compared with isolated trees (Bariteau et al., 2006). Even when pre-zygotic SI breaks down and selfing occurs, embryonic lethal factors such as albinism may effectively prevent inbreeding (Kamm et al., 2012). Based on the data from the nuclear microsatellites, we found regions of comparatively high genetic diversity in southern and south-eastern Europe (Italy, southern France, Serbia and Bulgaria). However, this is not surprising, because these areas were potential refugia during the last glacial maximum for a wide range of taxa (e.g. Hewitt, 1996; Taberlet et al., 1998; Leroy and Arpe, 2004). An interesting result is that the decline in allelic richness from these potential refugia to the northern distribution limit (e.g. from southern France to north-western France) or to at least more northerly located populations (e.g. from Serbia to Austria) was only weak. This is in contrast to other studies that describe a clear pattern of 'southern richness and northern purity' [see Hewitt (1999) for a general overview and Comps et al. (2001) for an example dealing with tree species]. Unfortunately, we cannot confirm whether this pattern also holds true for other populations from the northern distribution limit, because some important populations (e.g. from northern Germany or Great Britain) were missing in our data set. However, cpDNA data showed a slightly opposite pattern. Only the northern Italian population showed high molecular diversity on both the nuclear and chloroplast level, while cpDNA diversity in the other two potential refugia (southern France and south-eastern Europe) was lower than in more distantly located populations. How can this pattern and the fact that even a small island like Elba encompassed the highest number of haplotypes be explained? An appropriate scenario is that historic trade of S. domestica fruits has shaped the current distribution and has been partially responsible for the diversity pattern. The Apennine Peninsula and Elba, located in the central Mediterranean area and at the centre of the Roman Empire, may have acted as a receiver rather than as a donor of fruits from western and south-eastern Europe, especially from countries with direct or indirect access to the Mediterranean Sea. There is strong evidence that the Romans were heavily dependent on imports of food to feed the growing population and that these imports came partly from distant provinces (Temin, 2006; Kessler and Temin, 2007).
[Figure caption, beginning lost in extraction: ... Gregorius (1974). The distance of each group to the residual gene pool is represented by the radius and the indicated numbers in parentheses. The angle of each segment shows the relative sample size. The thin line shows the average allelic distance.]
Support for this hypothesis is also provided by the results from PCoA and individual population assignment which revealed that one cluster contains nearly all populations near to the Mediterranean Sea (except Slovenia). This cluster is also evident in the UPGMA dendrogram, but represents an unresolved clade and we presume that historic admixture through humans (20-50 generations ago) could be the reason. Considering the mutation rates for microsatellites (10⁻⁶ to 10⁻² per generation) (Schlötterer, 2000) and assuming an average generation time of 80-100 years for S. domestica, we may conclude that this time frame is insufficient for splitting into further clades. The Mediterranean group also had the lowest allelic distance to the residual gene pool (Fig. 4), even though it had the smallest sample size. Similar relationships among populations from distant regions within the Mediterranean were also observed for other tree species with long-standing economic importance such as Olea europaea (Belaj et al., 2002) or Castanea sativa (Villani et al., 1991).
Haplotype 288 was private to the Italian region; it occurred neither in central Europe nor in the south-east. Three possible scenarios can be discussed: (1) this haplotype persisted during the last glacial maximum in the southern part of the Apennine Peninsula and spread northward during more favourable conditions, but was not able to pass the Alps; (2) given the repeat number of this haplotype, we can assume that it is the youngest of all observed haplotypes (Schlötterer, 2000) and may have developed rapidly, when the former land bridge of the Adriatic Sea had already vanished; and (3) it is of unknown origin from a region which was not sampled. Considering the low mutation rate for the chloroplast genome (Provan et al., 1999) we may exclude hypothesis (2), but it remains unclear if the presence of this haplotype is of natural or anthropogenic origin.
While the French populations showed a slight decrease in nuclear allelic richness from south to north, the cpDNA diversity pattern was different, with higher allelic richness in the central and in the north-western part of the country. A similar picture can be drawn for the Austrian populations compared with those of southern Europe. A 'melting pot hypothesis' as suggested by Petit et al. (2003) must be at least in part rejected as the explanation. Despite their higher cpDNA diversity, these populations harboured no haplotypes which were not also found in the southern regions.
It was obvious that the population in Slovenia is peculiar. Unexpectedly low levels of both nuclear and chloroplast diversity and a significant heterozygote excess compared with Hardy-Weinberg expectations were found. We cannot deduce its population history, but the data strongly support an anthropogenic origin.

Population differentiation and structure
Overall population differentiation based on allele identity was moderate (FST = 0.138) and is in accordance with values reported for tree species with similar life characteristics concerning long-distance gene flow such as S. torminalis (Oddou-Muratorio et al., 2001a; Angelone et al., 2007) or M. sylvestris (Cornille et al., 2012) on a comparable geographical scale. When taking into account the allele size under the assumption of a stepwise mutation model, RST could not explain more of the variation than FST either on a large (among groups) or on a relatively small geographical scale (among populations within groups). Although there has been considerable criticism about the comparison of FST and RST concerning gene flow estimation, mutation model assumptions and sampling variance (e.g. Whitlock and McCauley, 1999; Balloux and Lugon-Moulin, 2002; Hardy et al., 2003), this pattern might be explained through the extensive gene flow ability of the species (Kamm et al., 2009) in combination with human-mediated seed transfer at an even larger scale. As a consequence, migration (either natural or anthropogenic) always counteracts population differentiation based on stepwise mutations. This is also mirrored by the relatively low number of private alleles, which was expected to be higher, especially in the potential refugia. The fact that our whole sample set was differentiated into only three clusters with detectable transitions provides further support for extensive gene flow. Only the Austrian populations and north-western and central France formed more or less distinct clusters (Fig. 5). This is not surprising, as these populations probably have been genetically well connected during their population history.
Spatial genetic structure and historic vs. recent gene flow in the Austrian populations
Spatial genetic structure did not deviate significantly from a random permuted pattern, indicating that nearby individuals are not more related to each other than more distant individuals. While in barochorous broad-leafed species, clusters of related individuals are common (e.g. Geburek and Tripp-Knowles, 1994), a weak spatial genetic structure of a zoochorous species is very likely. In consequence, we may presume effective pollen and seed dispersal in these two populations.
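The logic of testing spatial genetic structure against a permuted pattern can be sketched as follows. This is a simplified illustration, not the procedure used in this study: it assumes planar coordinates and a precomputed pairwise kinship matrix (both hypothetical inputs), uses the Pearson correlation between spatial distance and kinship as the test statistic, and shuffles individuals among locations to build the null distribution.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sgs_permutation_test(coords, kinship, n_perm=999, seed=0):
    """One-sided permutation test for spatial genetic structure.

    coords:  list of (x, y) positions, one per individual (assumed planar)
    kinship: square matrix of pairwise kinship coefficients (assumed given)
    Returns (observed correlation, p-value). Spatial structure is expected
    to give a negative distance-kinship correlation, so the p-value is the
    share of permutations at least as negative as the observed value.
    """
    rng = random.Random(seed)
    n = len(coords)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    kin = [kinship[i][j] for i, j in pairs]

    def statistic(order):
        # order[i] is the location assigned to individual i
        dist = [math.dist(coords[order[i]], coords[order[j]])
                for i, j in pairs]
        return pearson(dist, kin)

    observed = statistic(list(range(n)))
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        if statistic(order) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A non-significant p-value from such a test corresponds to the situation described above, in which neighbouring trees are no more related than distant ones.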
The pattern of historic gene flow showed frequent exchange of migrants with conspicuous asymmetry (approximately three individuals per generation from Wolkersdorf to Merkenstein, but 12 individuals in the opposite direction). It can be expected that both populations were larger in the past and also better connected by single individuals that might have acted as stepping stones. Thus, it would have been much easier for migrants to bridge the current distribution gap of approx. 60 km. Apart from the river Danube, there are no significant natural landscape barriers between these populations (the landscape is low-lying and undulating), and none that would have been more easily crossed in one direction than in the other. We therefore presume that this asymmetric pattern is to some extent the result of human-mediated seed transfer. This is also mirrored in the cpDNA haplotype distribution, which shows significant differences between these two populations, whereas it seems to be well balanced among populations across France, for instance. Fruits of S. domestica were formerly of some importance for the production of wine and cider, which has a long tradition in eastern Austria, in particular around Wolkersdorf.
The pattern of recent gene flow differed from that of historic gene flow. Only one individual had a moderate probability of being a first-generation migrant. Its migration direction confirmed the asymmetry towards the more northerly population, Wolkersdorf. The decline in recent gene flow can have several explanations. Habitat fragmentation has been intensive in this area during the last 200 years, and a considerable part of the land has been converted to agriculture and urban development, resulting in a severe reduction of mature S. domestica trees and reduced movement ranges for both pollinators and seed dispersers (e.g. wild boar, roe deer, martens and foxes). Usually, dispersal distances achieved by these species are far below the distance that actually lies between Merkenstein and Wolkersdorf [e.g. Ellenberg (1978) reported 1-50 ha home ranges for roe deer, and Briedermann (1990) stated that wild boar regularly move 5 km per day]. Moreover, these species would have to cross long stretches of open land, because the forest patches left in between are few and small. The only frugivores that are theoretically able to cover such distances are birds, mainly migratory species, for which mean distances of 60 km per day were reported (Ellegren, 1993). Another probable factor is the change in forest management within the remaining patches. The early successional S. domestica in both populations is under strong competition from late-successional and economically more important tree species such as beech, oak and pine. Hence, dominant individuals with well-developed crowns are rare in the forest. This has led to reduced fruiting success and thus to a lower probability of being visited by frugivores. Oddou-Muratorio et al. (2005) have shown for S. torminalis that the mating success of single trees is heavily dependent on tree size and competition status (i.e. the larger the individual, the greater its fecundity).

Conclusions
Our study showed high levels of molecular diversity for S. domestica despite its scattered distribution, with a likely human influence on its present diversity pattern. The key results of this study are: (1) S. domestica populations form three distinct groups in Europe (French, Mediterranean/Balkan and Austrian); (2) the Apennine Peninsula encompasses comparatively high genetic diversity in both genomes; and (3) the Austrian populations form a distinct cluster in Europe and showed a significant decline in their exchange of migrants during the last two generations in comparison with several previous generations. Additional population samples from the Iberian Peninsula, the Carpathians and south-eastern Europe, but also from further north, could help to provide a more definitive picture of the diversity and post-glacial colonization history of S. domestica in Europe.

SUPPLEMENTARY DATA
Supplementary data are available online at www.aob.oxfordjournals.org and consist of the following. Figure S1: origin of individuals from the French National Collection and scattered single trees from central and southern France. Figure S2: change of the log probability between k-values 2-4. Figure S3: microspatial genetic structure for the population Merkenstein. Figure S4: microspatial genetic structure for the population Wolkersdorf. Table S1: overview of all markers used in the study. Table S2: AMOVA results according to groups from PCoA based on Nei's standard genetic distance. Table S3: AMOVA results according to groups from UPGMA based on Cavalli-Sforza and Edwards chord distance. Table S4: AMOVA results based on genetic clustering calculated with Structure.