Genetic differentiation and hybrid identification using microsatellite markers in closely related wild species

Despite extensive morphological diversity and different floral syndromes, two wild Petunia species are closely related, present high genetic similarity, and field observations suggest natural hybridization between them. Here we described helpful tools for evolutionary studies addressing genetic population, interspecific hybridization, and plant speciation. Based on previously described microsatellites of Petunia hybrida we were able to identify private alleles characterizing species and their putative hybrids. These profiles could also be useful to study gene flow, population structure, genetic conservation and landscape.


Introduction
Genetic diversity and population structures are relevant for understanding the evolutionary history, breeding systems, geographical distributions and ecological requirements of species. In evolutionary studies, molecular markers that differentiate close species and their putative hybrids greatly increase our understanding of the genetic basis of speciation and the effects of introgression on species integrity.
The Petunia genus encompasses 14 taxonomically accepted wild species, which are distributed exclusively in southern South America (Stehmann et al. 2009) and present low genetic variability (Ando et al. 2005;Kulcheski et al. 2006;Lorenz-Lemke et al. 2010;Fregonezi et al. 2013), and one artificial hybrid known worldwide, the commercial Petunia hybrida that is the result of crossings between Petunia axillaris and Petunia interior . Moreover, species of this genus are commonly used as a scientific model species because of their intrinsic features (Gerats and Vandenbussche 2005).
Petunia axillaris is the only species of the genus that presents white corollas (Stehmann et al. 2009) and is pollinated by hawkmoths (Ando et al. 1995;Venail et al. 2010;Klahre et al. 2011) (Fig. 1). This species is widespread in the Pampas grasslands of southern South America and currently consists of three taxonomically accepted subspecies: P. axillaris ssp. axillaris, P. axillaris ssp. parodii and P. axillaris ssp. subandina. These subspecies are morphologically distinguishable from each other by floral traits such as corolla tube length, relative positioning of the stamen and corolla, and corolla diameter (Ando 1996;Turchetto et al. 2014a, b) (Fig. 1). Plastid data analysis revealed low genetic differentiation among these subspecies, whereas the nuclear genome contained greater genetic variability (Turchetto et al. 2014a, b). The three subspecies occur in adjacent geographic regions, and the P. axillaris ssp. axillaris could be found in the same narrow region as another congeneric species, Petunia exserta, that presents red and birdpollinated flowers Stehmann et al. 2009;Segatto et al. 2014a) (Fig. 1). Although they grow in sympatry, P. exserta and P. axillaris ssp. axillaris are partially isolated in adjacent microhabitats: whereas P. exserta plants grow only inside the shaded cracks (shelters) of sandstone towers where they are protected from direct rain and sunlight, P. axillaris ssp. axillaris individuals grow in open and sunny habitats (Fig. 1).
Previous studies addressing the genetic diversity of these species showed lower indices of genetic variability in P. exserta, in agreement with its narrow endemic status , and extensive divergence between the species. Moreover, natural hybridization between subandina morphology; (F and G) typical Petunia exserta morphology and habitat (shelters); (H) landscape of Serra do Sudeste, highlighting the sandstone towers and habitats of P. exserta (black arrow 1) and P. axillaris ssp. axillaris (black arrow 2). these species has been suggested based on their morphology as well as plastid and nuclear genomes data Segatto et al. 2014a).
Although the genetic diversity and population structure of these taxa have been previously addressed Segatto et al. 2014a;Turchetto et al. 2014a, b), here we provide a new approach and obtain more precise taxa characterization by using microsatellites derived from expressed sequence tag-simple sequence repeat (EST-SSR) markers. Bossolini et al. (2011) developed, mapped and used 83 EST-SSRs to construct linkage maps in Petunia and these markers cover the seven chromosomes of Petunia. However, despite their great potential for evolutionary studies, the markers have not been used to access the genetic variability in wild Petunia species. Expressed sequence tag-simple sequence repeat markers offer advantages over genomic DNA-based markers: they can be applied to analyse the functional diversity; are more conserved among species; and are obtained from gene-rich regions of genome (Zhang et al. 2005;Guo et al. 2006). They can also be used efficiently to assess genetic relationships (Gupta and Rustgi 2004) and to compare closely related taxa (Fluch et al. 2011).
The main goals of this work were to evaluate genetic polymorphism in a subset of the Petunia EST-SSRs in wild P. axillaris subspecies and P. exserta, thereby identifying alleles private to each taxon and to employ these genetic profiles to detect putative natural hybrids.

Plant material and genomic DNA isolation
We collected young leaves of individuals from the three natural occurring P. axillaris subspecies and P. exserta. The young leaves were dried in silica gel and pulverized in liquid nitrogen to allow genomic DNA extraction with cetyltrimethylammonium bromide (CTAB) (Roy et al. 1992). Additionally, we included genomic DNA from the previously analysed natural hybrids (P. axillaris ssp. axillaris × P. exserta; Lorenz-Lemke et al. 2006) in our analyses (Table 1). The quality and quantity of the isolated DNA was estimated by measuring the absorbance at 260 and 280 nm on a Nanodrop Spectrophotometer (NanoDrop 1000 spectrometer, Thermo Scientific Corp., San Jose, CA, USA). The DNA was finally electrophoresed on a 1.0 % agarose gel using GelRed TM (Biotium, Inc., Hayward, CA, USA). Samples were stored until further study at 220 8C.

Expressed sequence tag-simple sequence repeat amplification
The genomic DNA of each individual was used to amplify the previously characterized EST-SSR markers ; available online at http://www.botany.unibe. ch/deve/caps/index.html) scattered throughout the genome. Initially, we selected 34 primers based on their chromosomal location and polymorphic index content to test the cross-transferability among taxa and putative hybrids. Only 14 were successfully amplified and were subsequently used to estimate genetic variability. Polymerase chain reactions (PCRs) were conducted in a final volume of 10 mL containing 10 ng of genomic DNA as a template, 200 mM of each dNTP (Invitrogen, Carlsbad, CA, USA), 1.7 pmol of fluorescently labelled M13(-21) primer, 3.5 pmol of reverse primer, 0.35 pmol of forward primer with a 5 ′ -M13(-21) tail, 2.0 mM MgCl 2 (Invitrogen), 0.5 U of Platinum Taq DNA polymerase (Invitrogen) and 1× Platinum Taq reaction buffer (Invitrogen). The PCR conditions were as follows: an initial denaturation at 96 8C for 3 min; 32 cycles of 96 8C for 15 s, 50 -52 8C for 30 s and 72 8C for 1 min; and a final extension cycle at 72 8C for 7 min. The forward primers were FAM-, NED-, or HEXlabelled. The amplified product was visualized on a 2 % ultra-resolution agarose gel stained with 2 mL 0.001 % of GelRed (Biotium Inc.). The DNA fragments were denatured and size-fractionated using capillary electrophoresis on a MegaBACE 1000 automated sequencer (GE Healthcare Biosciences, Pittsburgh, PA, USA) with an ET-ROX 550 internal size ladder (GE Healthcare). The manufacturer's software was used to identify the alleles. The primer sequences, repeat motif, fragment size range as estimated by the number of base pairs, and respective annealing temperatures are shown in Table S1 [Supporting Information].

Statistical analysis
We used the FSTAT 2.9.3.2 software (Goudet 1995) to evaluate summary statistics such as the number of alleles per locus (A), gene diversity, allelic richness (RA) and number of private alleles (E) per locus. We estimated the frequencies of null alleles and the polymorphic information content using the CERVUS 3.0.3 software (Marshall et al. 1998;Kalinowski et al. 2007).
The tests for outlier loci were performed with the BAYESCAN 2.1 software (Foll and Gaggiotti 2008) based on the multinomial-Dirichlet model. The BAYESCAN software incorporates the uncertainty of allele frequencies due to small sample sizes. Selection was introduced by decomposing the locus-population F ST coefficients into a population-specific component (b) shared by all loci and a locus-specific component (a) shared by all populations using a logistic regression approach. Selection is assumed to occur when the a component is necessary to explain the diversity observed at a locus.
To investigate the genetic similarity between taxa, we carried out a principal coordinates analysis (PCoA) using the GENALEX 6.4 software Smouse 2006, 2012). We used a distance matrix based on the alleles that were shared among individuals to depict the relationships among individuals in all taxa; the original data were bootstrapped 1000 times using the MICROSAT software (Minch et al. 1997). In addition, an unweighted neighbour-joining (NJ) tree (Saitou and Nei 1987) was constructed based on the mean bootstrapped matrix of microsatellite alleles shared among the 126 individuals calculated from 14 EST-SSRs using the MEGA6 software (Tamura et al. 2013).

Evaluating hybrid individuals
We amplified and analysed 13 putative hybrid individuals from a contact zone between P. axillaris ssp. axillaris and P. exserta that presented intermediate morphological characters, such as corolla colour, reproductive organs size and relative position, as previously described (  (Pritchard et al. 2000) to unravel the underlying genetic structure in complex groups that are not necessarily populations (Jombart et al. 2010). Discriminant analysis of principal components is not based on pre-defined population genetics models and makes no assumptions about Hardy -Weinberg equilibrium or linkage disequilibrium. We set an a priori group number of five in the DAPC analysis. We used a factorial correspondence analysis (FCA) as performed in GENETIX version 4.05 (Belkhir et al. 2004) to plot multilocus genotypes and visualize species discreteness based on the most distinctive loci. In this analysis, each row (individuals) and each column (alleles) were depicted as a point. The hyperspace had as many dimensions as there were alleles for each locus, and the algorithm attempted to find independent directions in this hyperspace. We also ran a clustering analysis, as implemented in the STRUCTURE 2.3 software (Pritchard et al. 2000), to compare with the DAPC results. The parameters were correlated allele frequencies (Falush et al. 2003) and no prior population information was used. The number of groups (K ) was evaluated from 1 to 10, with 10 independent runs per K value, to determine the maximum value of the posterior likelihood [ln P(D)] and the best value of K. Each run was performed using 2.5 × 10 5 burn-in periods and 10 6 Markov chain Monte Carlo repetitions after burn-in, and the convergence was assessed. The optimal value of K was calculated using the maximum value of DK (Evanno et al. 2005). We used the CLUMPP 1.1.2 software to summarize the results of the optimal K value based on the average pairwise similarity of individual assignments across runs using Greedy's method and the G ′ statistic (Jakobsson and Rosenberg 2007). We used the DISTRUCT 1.1 software (Rosenberg 2004) to visualize the STRUCTURE results after processing with the CLUMPP software. Substructures within each main cluster were detected by the same approach using STRUCTURE.

Primer validation and cross-transferability
Fourteen of the 34 previously described loci reached our criteria for markers that had the potential to discriminate among the four wild Petunia taxa and their putative interspecific hybrids. The 14 EST-SSR loci revealed 143 alleles in 126 sampled individuals (Table 2). Petunia axillaris ssp. axillaris presented the highest diversity index, whereas P. axillaris ssp. subandina was the least diverse (Tables 2 and 3). The combined probability of identity for the 14 EST-SSR loci was almost zero for all taxa, which indicated that two unrelated individuals would not share the same multilocus genotype. The PIC values (Table 3) ranged from 0.42 (PM184) to 0.93 (PM177), which showed that these markers are polymorphic across taxa. We found private alleles in all taxa, with a higher value for P. axillaris ssp. axillaris (22 alleles), followed by P. exserta and P. axillaris ssp. parodii (nine alleles each) and P. axillaris  ssp. subandina (seven alleles). The majority of the private alleles found in P. exserta or P. axillaris ssp. axillaris was present in their putative hybrids. In total, 12 private alleles were shared between P. axillaris ssp. axillaris (eight alleles) or P. exserta (four alleles) and their hybrids [see Supporting Information- Table S2]. Moreover, hybrid individuals shared one private allele found in P. axillaris ssp. subandina, which grows far 1000 km. The PM21 and PM88 loci presented two and one private alleles in the putative hybrids, respectively, which suggests that these alleles came from parental plants that were not sampled in our study [see Supporting Information- Table S2].
The frequency of null alleles varied from 20 to 79 % across all taxa. Null alleles were present at high frequency (.5 %) in all taxa. The tests for outlier loci among all 126 individuals revealed that PM177 (P ¼ 1), PM88 (P ¼ 0.829), PM74 (P ¼ 0.983) and PM192 (P ¼ 1) were under selection with negative a values, which suggests the occurrence of a balancing or purifying selection. However, this result should be considered carefully because we did not sample a representative population.

Genetic similarity
Principal coordinates analysis (Fig. 2) was conducted to assess the clustering of individuals based on EST-SSR polymorphisms. Plotting the first two axes showed that P. axillaris ssp. axillaris and P. exserta were closely related in the coordinate space and that P. axillaris ssp. parodii formed a wider scattered group, whereas the other two subspecies each formed only one exclusive group. The NJ tree (Fig. 3) of the 126 individuals obtained from a matrix of shared microsatellite alleles presented two major clusters: the first grouped P. axillaris ssp. axillaris and P. exserta individuals (Group A); whereas individuals of P. axillaris ssp. parodii and P. axillaris ssp. subandina preferentially composed the second group (Group B). Group A was divided into two subgroups, one corresponding to P. exserta individuals and only two individuals of P. axillaris ssp. axillaris and another composed of only individuals of the P. axillaris ssp. axillaris. Some P. axillaris ssp. axillaris and P. exserta individuals clustered with Group B (Fig. 3). The tree topology showed a close relationship between P. exserta and P. axillaris ssp. axillaris, in agreement with PCoA results.

Evaluating hybrid individuals
The polymorphism of these 14 EST-SSR loci indicated that putative hybrid individuals with a morphology between P. axillaris ssp. axillaris and P. exserta presented similar genetic profiles to those observed for morphologically typical P. exserta or P. axillaris ssp. axillaris individuals   (Fig. 4). In the DAPC (Fig. 4A) morphological hybrids grouped more closely to P. exserta individuals. Typical P. exserta individuals were more close related to P. axillaris ssp. axillaris than to the other P. axillaris subspecies. In the FCA (Fig. 4B) the proximity of P. axillaris ssp. axillaris and P. exserta was more apparent, and the hybrids were placed within the distribution of these taxa. This analysis considers the entire allele to be a representation of one individual for plotting in the hyperspace. The STRUCTURE analysis (Fig. 4C) confirmed these results. In STRUCTURE analysis, the best K ¼ 4 was obtained by Evanno's method with the inclusion of putative hybrids in the P. exserta group, and the other clusters were associated with the P. axillaris subspecies.

Discussion
In general, EST-SSRs have been found to be significantly more transferable across taxonomic boundaries compared with the traditional 'anonymous' SSRs (Tiffin and Hahn 2002;Varshney et al. 2005;Woodhead et al. 2005). In the present study, we aimed to evaluate a set of microsatellite markers developed from EST-SSRs to obtain information about their cross-transferability in different wild Petunia species. We intended to identify markers that can be used in population dynamic analyses, breeding systems and interspecific hybridization events.
Here, we successfully cross-amplified 14 EST-SSR loci and estimated the genetic differentiation of close related wild Petunia taxa (the P. axillaris subspecies and P. exserta). These loci were obtained from an EST database with sequences that were isolated preferentially from the roots and floral organs of Petunia hybrida .
Overall, P. axillaris ssp. axillaris had a higher number of alleles per locus and allelic richness (Tables 2 and 3), which illustrated the high diversity of this taxon. This is consistent with the findings of previous studies of plastid sequences, cleaved amplified polymorphic sequence (CAPS) nuclear markers Turchetto et al. 2014a), and the phylogenetic position of these taxa (Reck-Kortmann et al. 2014). The mean number of alleles per locus obtained using EST-SSRs in these species (10 for a complete data set) was higher than that obtained in Petunia integrifolia based on genomic SSRs (a mean of three alleles per locus; Kriedt et al. 2011). A recent study involving wild Petunia species and commercial P. hybrida  showed that P. integrifolia species complex share several alleles and also have a low mean number of alleles per locus (five in Petunia inflata, four in P. integrifolia and three in Petunia interior). Here, we found that all taxa contained private alleles, which indicated that EST-SSR loci are a useful tool for evolutionary analyses of these species. The morphological and ecological differentiation between P. axillaris and P. exserta is clear, and although they share plastid haplotypes, there is no doubt about the placement of these lineages at the species level ). We included samples from different locations to represent the entire geographic distribution of all taxa and to minimize the population effect (population inferences were not included here).
Flower morphology affects the behaviour of pollinators by advertising for reward and restricting access. The maintenance of morphological traits is responsible for  preserving the genetic boundaries between species, although past hybridization events could have occurred. The individuals mentioned in this study presenting an intermediate morphology were considered to be interspecific hybrids due to their morphological traitsespecially the corolla colour and position of reproductive organs, plastid polymorphism and positional distribution inside caves. These individuals had been analysed previously and were considered hybrids between P. exserta and P. axillaris . The putative hybrids presented alleles that were private to P. axillaris ssp. axillaris or P. exserta (Table 3), thereby highlighting the ability of these markers to identify hybridization events between these species. However, only one marker private to P. axillaris ssp. axillaris and P. exserta was shared between hybrid individuals (PM188; Supporting Information- Table S2). These results provide evidence that the putative hybrids evaluated may not be F1 hybrids and could have backcrossed with the parental species or crossed among themselves. Studies analysing the morphology of individuals, in addition to canonical morphological data coupled with analysis of these markers, can be useful for understanding the evolution of these closely related wild species. In the STRUCTURE analysis (Fig. 4), putative hybrids showed signs of admixture with P. axillaris ssp. axillaris, but the same level of admixture has been found in the P. exserta group. This pattern could be explained by backcrossing with parental species or by shared ancestral polymorphisms. Moreover, the putative morphological hybrids occurred only inside caves (P. exserta habitat) and it is possible that the hybrids have habitat or ecological restrictions. Another more controversial possibility is that speciation with gene flow occurred, and some of the morphological traits were not uniform as demonstrated to other species (Wu 2001). The consequences of hybridization differ depending on whether the species diverged with gene flow or underwent a secondary contact (Abbott et al. 2013). These EST-SSRs enable the unravelling of these evolutionary processes in future population studies.
In P. axillaris, the maintenance of morphological traits is responsible for preserving the boundaries of the subspecies. The clear ecological and morphological differentiation of the P. axillaris subspecies is primarily associated with corolla tube length (Turchetto et al. 2014a, b), and differences in tube length raise interesting questions about the feeding strategy of pollinators. Under controlled greenhouse conditions, Manduca sexta moths prefer P. axillaris flowers with a larger limb size, which may simply reflect its better visibility under low light conditions (Venail et al. 2010). Moreover, tube length is associated with a CAPS marker derived from the Flavonoid hydroxylase 1 (HF1) gene, and the differences between P. axillaris subspecies are associated with different genotypes at this locus (Turchetto et al. 2014b). The HF1 gene is also informative for distinguishing Petunia species (Chen et al. 2007). Further experiments, particularly considering morphological traits under field conditions, are necessary to evaluate the remaining questions involving P. axillaris and P. exserta and their pollinators; highly polymorphic EST-SSRs will be informative in these studies.
The close relationship between P. axillaris ssp. axillaris and P. exserta (Kulcheski et al. 2006;Lorenz-Lemke et al. 2006;Reck-Kortmann et al. 2014;Segatto et al. 2014a) was confirmed and supported by this study. Moreover, the EST-SSRs used were able to differentiate taxa. Our results suggest that these EST-SSR markers may constitute a useful tool for evolutionary and ecological studies involving these species. The original description of these EST-SSRs ) used very few Petunia individuals that had been maintained in a greenhouse for decades. Here, we demonstrated that these markers may cross-amplify wild individuals and are able to differentiate species through private alleles in each species or even by combined genotypes.
Evolutionary studies require many universal primers that can be used in multiple species and that allow comparisons between close related taxa to address questions of population divergence and speciation processes (Noor and Feder 2006). Expressed sequence tag-simple sequence repeats are easily transferable across species (Liewlaksaneeyanawin et al. 2003;Chapman et al. 2009) and may be a useful tool in evolutionary studies, especially of genera such as Petunia, with low genetic diversity and great morphological variability (Fregonezi et al. 2013). The EST-SSR markers analysed in Picea abies (Fluch et al. 2011) presented a high degree of variability and were well suited for an analysis of the stress-related functional variation present in this species. Although EST-SSRs are related to transcribed regions, they can evolve neutrally. In fact, several studies have found that the population structure measures of EST-SSRs are very similar to those derived from anonymous SSRs (e.g. Woodhead et al. 2005). Moreover, this type of molecular marker has been used for hybrid identification, population genetics studies, distinguishing close related taxa and studying functional genetic diversity (Naresh et al. 2009;Chapman et al. 2010;Barati and Arzani 2012;Li et al. 2013;Yang et al. 2013).
Expressed sequence tag-simple sequence repeat polymorphisms are associated with transcribed regions of the genome and reflect the genetic diversity inside or adjacent to genes (Clark et al. 2003). The major criticism regarding the use of these markers to estimate population parameters is that divergent selection would increase the differentiation among and reduce variability within populations, whereas balancing selection could produce the opposite effect. Chagné et al. (2004) showed that estimates of population differentiation based on EST-SSRs are comparable with those based on anonymous SSRs or large-scale comparative analyses. Moreover, only a very small percentage of genes experiencing positive selection has been identified (Li et al. 2013;Yang et al. 2013). Thus, one should expect that only a small fraction of EST-SSRs would be subject to selection.
In this study, only four loci were found to be under selective pressure. Two of them are of unknown function (PM177 and PM88), while PM192 and PM74 are located in transcription factors [see Supporting Information- Table S1]. More analyses, including population studies of each taxon, may clarify the role of these markers in functional differentiation.
The effects of morphological differences on the feeding strategy of the pollinators and adaptive genetic differentiation to maintain the processes of speciation involving close related species with the ability to hybridize in nature raise interesting questions, and Petunia species are an excellent model for this area of study. Further experiments, especially analyses of morphology, gene flow and crossbreeding systems, are necessary to answer the remaining evolutionary questions about the Petunia species and speciation processes within this genus. Understanding the processes involved in the diversification and speciation of wild Petunia species is not only important for ecological and environmental issues but may also be useful for breeding commercial hybrid species; the EST-SSRs tested here constitute useful tools for these purposes.

Conclusions
The EST-SSR markers scattered throughout the Petunia genome are very efficient tools for characterizing the genetic diversity in wild taxa of this genus and are able to classify interspecific hybrids based on the presence of private alleles. These factures indicate that the markers presented here are helpful tools for evolutionary studies.