-
PDF
- Split View
-
Views
-
Cite
Cite
Chien-Hsun Huang, Caifei Zhang, Mian Liu, Yi Hu, Tiangang Gao, Ji Qi, Hong Ma, Multiple Polyploidization Events across Asteraceae with Two Nested Events in the Early History Revealed by Nuclear Phylogenomics, Molecular Biology and Evolution, Volume 33, Issue 11, November 2016, Pages 2820–2835, https://doi.org/10.1093/molbev/msw157
Close - Share Icon Share
Abstract
Biodiversity results from multiple evolutionary mechanisms, including genetic variation and natural selection. Whole-genome duplications (WGDs), or polyploidizations, provide opportunities for large-scale genetic modifications. Many evolutionarily successful lineages, including angiosperms and vertebrates, are ancient polyploids, suggesting that WGDs are a driving force in evolution. However, this hypothesis is challenged by the observed lower speciation and higher extinction rates of recently formed polyploids than diploids. Asteraceae includes about 10% of angiosperm species, is thus undoubtedly one of the most successful lineages and paleopolyploidization was suggested early in this family using a small number of datasets. Here, we used genes from 64 new transcriptome datasets and others to reconstruct a robust Asteraceae phylogeny, covering 73 species from 18 tribes in six subfamilies. We estimated their divergence times and further identified multiple potential ancient WGDs within several tribes and shared by the Heliantheae alliance, core Asteraceae (Asteroideae–Mutisioideae), and also with the sister family Calyceraceae. For two of the WGD events, there were subsequent great increases in biodiversity; the older one proceeded the divergence of at least 10 subfamilies within 10 My, with great variation in morphology and physiology, whereas the other was followed by extremely high species richness in the Heliantheae alliance clade. Our results provide different evidence for several WGDs in Asteraceae and reveal distinct association among WGD events, dramatic changes in environment and species radiations, providing a possible scenario for polyploids to overcome the disadvantages of WGDs and to evolve into lineages with high biodiversity.
Introduction
The evolution of new life forms and the increase in biodiversity require initial genetic variation within populations and subsequent natural selection (Stebbins 1999) or fixation (Kimura 1983; Hughes 2007). Genetic variations are generated by mutation and recombination; if the variations are beneficial, then they are more likely to be passed on to new generations. Systematic comparisons of genes, proteins, and their interactions between different organisms are beginning to yield insights into the molecular basis of diversity (Sharma et al. 2011; Zhang et al. 2013). The rapid advances of genomics also greatly facilitate the understanding about the relatednesses and differences between organisms at a genome-wide level (Paterson et al. 2010).
Whole-genome duplications (WGDs), or polyploidizations, duplicate all genes simultaneously and provide abundant genetic materials for evolutionary processes such as chromosomal rearrangement (Pontes et al. 2004; Madlung et al. 2005), neofunctionalization (the acquisition of a new function in a duplicate copy) (Blanc and Wolfe 2004a), subfunctionalization (the division of the original function in the two duplicates) (Cusack and Wolfe 2007), and gene conservation owing to dosage effects (the increased production of a beneficial gene product) (Freeling 2009; Bekaert et al. 2011; Hudson et al. 2011). These processes contribute to genomic novelty, speciation, adaptive radiation, and organismal complexity (Stebbins 1940; Levin 1983; Soltis et al. 2003; Rieseberg and Willis 2007; Maere and Van de Peer 2010; Mayrose et al. 2011; Arrigo and Barker 2012). Such changes likely allow organisms to consequently take advantage of new ecological opportunities or to cope with new environmental challenges (Ohno 1970; Hahn 2009; Maere and Van de Peer 2010; Schranz et al. 2012; Fawcett et al. 2013). For example, the angiosperm family Brassicaceae experienced rapid and nested radiations of five major lineages based on a recently resolved phylogeny (Huang et al. 2015) after WGD at the crown node of this family (Henry et al. 2006; Schranz and Mitchell-Olds 2006; Edger et al. 2015).
However, recently formed polyploids appear to be an evolutionary dead end. Observations supporting this idea include increased abnormalities through improper meiotic chromosome pairing between two subgenomes of a recent polyploid, resulting in genomic instability that detrimentally affects fertility and fitness (Madlung et al. 2005). Even when the polyploids are stable, suitable mating partners are often lacking, leading to the failure in establishing a viable population for propagation (Levin 1975; Vanneste et al. 2014). The genomic instability and the minority cytotype may be responsible for the lower speciation rates and higher extinction rates of polyploids compared with diploids (Mayrose et al. 2011).
Nevertheless, ancient WGDs have been detected in many evolutionary lineages associated with major species radiations, including vertebrates, fish, and flowering plants (e.g., Levin 1983; McLysaght et al. 2002; Soltis et al. 2003; Dehal and Boore 2005; Rieseberg and Willis 2007; Mayrose et al. 2011). In particular, Jiao et al. (2011) found that the ancestor of all extent angiosperms likely experienced an ancient WGD event, as did the ancestor of the extant seed plants. WGD events have also been found in many specific plant lineages by analyses of sequenced genomes of Arabidopsis (Arabidopsis thaliana), soybean (Glycine max), poplar (Populus trichocarpa), and several grasses including rice (Oryza sativa), sorghum (Sorghum bicolor), Brachypodium distachyon, and maize (Zea mays) (Blanc et al. 2000; Paterson et al. 2000; Vision et al. 2000; Simillion et al. 2002; Bowers et al. 2003; Paterson et al. 2004; Maere et al. 2005; Tuskan et al. 2006; Schmutz et al. 2010; Tang et al. 2010). The fact that two of the most successful groups of life forms, angiosperms, and vertebrates, are paleopolyploids (Putnam et al. 2008; Jiao et al. 2011) argues strongly for the long-term positive effects of polyploidizations. Considering the aforementioned contrasting fates of polyploids and the resulting controversies (Soltis and Burleigh 2009; Soltis et al. 2009; Abbasi 2010; Van de Peer et al. 2010), the effect of WGD and the underlying mechanisms in evolution and diversification remains an area of vigorous research.
The sunflower family (Asteraceae or Compositae) is one of the largest angiosperm families, with 24,000–35,000 species, accounting for ∼10% of all angiosperm species, with diverse habitats, floral structures, and biochemistry (Funk et al. 2009b). Species of Asteraceae range from tiny annual herbs to large trees and have predominated for millions of years in numerous biomes worldwide, primarily in open habitats with seasonal climates such as the Mediterranean region, prairies, steppes, mountains, and deserts (Funk et al. 2009b; Mandel et al. 2015). Asteraceae species are generally known for their attractive inflorescences, the capitula or flower heads, which promote outcrossing, pollination specialization and fitness and are thought to contribute to the evolutionary success and rapid tribal radiation of this large family (Marshall and Abbott 1984; Stuessy et al. 1986; Sun and Ganders 1990; Endress 1999; Cubas 2004; Sargent 2004; Andersson 2008). Several other characters were also proposed to possibly promote the adaptive radiation of Asteraceae, such as the pappus that assists seed dispersal and secondary metabolites providing defense against herbivores (Stuessy and Garver 1996; Jeffrey 2007), but there is little evidence for a link between these innovative traits and species radiation.
A paleopolyploidization event early in Asteraceae history was proposed to have affected the genes involved in the innovation of the capitulum (Chapman et al. 2008; Chapman et al. 2012). Barker et al. (2008) analyzed distributions of Ks values for paralogs of 18 species in four tribes, one in each of four subfamilies (Asteroideae, Cichorioideae, Carduoideae, and Mutisioideae) and found that Ks peaks at ∼0.75 in these species after correction for rate heterogeneity, suggesting that a WGD event might be shared by these four subfamilies. Another putative event was reported for the clade encompassing Heliantheae s. l. and Eupatorieae by comparisons of phylogeny with chromosome numbers (Smith 1975; Robinson 1981; Yahara et al. 1989; Berry et al. 1995; Gentzbittel et al. 1995; Baldwin et al. 2002). Barker et al. (2008) also analyzed Ks distributions using EST datasets of several Helianthus species and detected evidence for a WGD event with corrected Ks of ∼0.37 shared by members of this genus. A putative event also observed in the Ks distribution of Gerbera hybrida with Ks of ∼0.56 (Barker et al. 2008). Thus, ancient duplication events have been reported for Asteraceae, but the relatively small number of taxon samples did not allow precise positioning of those events. In addition, it is not clear whether these events are related to the evolutionary success of Asteraceae in species diversity, habitats, and morphology.
Morphological and molecular analyses both indicate Asteraceae to be monophyletic (Panero and Funk 2008; Funk et al. 2009b; Panero et al. 2014). To date, the most comprehensive phylogeny is a meta-tree (Funk and Specht 2007; Funk et al. 2009a) constructed using a base tree of 10 chloroplast loci (Panero and Funk 2008; Baldwin 2009; Funk and Chan 2009; Pelser and Watson 2009), including ∼900 of the 1700 genera in the family. However, genetic information provided by phylogenetic markers used in these previous studies were insufficient to resolve relationships of several areas. Comparing with chloroplast sequences, nuclear genes provide genetic information from both parents and are known to be highly effective in resolving phylogeny (Zhang et al. 2012; Wickett et al. 2014; Zeng et al. 2014; Huang et al. 2015; Yang et al. 2015); however, large datasets for Asteraceae are quite limited. Advances in next-generation sequencing technology has provided a means to produce large datasets, facilitating the phylogenetic reconstruction with abundant nuclear markers and the investigation of possible polyploidy events using large scale information from transcriptomic datasets.
Here, we generated transcriptomes for 62 Asteraceae species and two close relatives as outgroups (Calyceraceae Nastanthus ventosus and Menyanthaceae Menyanthes trifoliate, members of Asterales), and retrieved public datasets of 11 additional Asteraceae members, resulting in a total of 75 species for the investigation of the phylogeny and polyploidizations of Asteraceae. By carefully screening to avoid possible hidden-paralogs, which are single-copy genes resulting from loss of distinct paralogs in different taxa after duplication, 175 nuclear genes and a subset of 71 genes were selected to reconstruct a family-wide Asteraceae phylogeny. A highly supported phylogeny including 18 tribes of six subfamilies is produced from both concatenation and coalescence methods. We estimated divergence times with extensive outgroups and 12 fossil constraints, and found this family to originate during late Cretaceous with a rapid radiation of at least 10 core-subfamilies within 10 My. We also present evidence for multiple ancient WGD events, including nested duplications in the early history of this family, and several independent events in different lineages. The resulting history of Asteraceae reveals two rounds of WGDs that each coincide with an extinction event in the Earth history and followed by two kinds of biodiversity burst in the family. Altogether, our results reveal repeated WGDs during the evolution of Asteraceae, and also support the hypothesis that the success of this proliferative family might have resulted from the combined effect of WGD and rapidly changing environments as indicated by the mass extinction events. The results here on the history of Asteraceae may shed new light on the evolutionary success of polyploids and mechanisms associated with species diversification.
Results
A Highly Supported Family-Wide Asteraceae Phylogeny Using Nuclear Orthologs
Low-copy nuclear genes have been successfully used to reconstruct phylogenetic relationship of land plants (Wickett et al. 2014), angiosperms (Zhang et al. 2012; Zeng et al. 2014), Brassicaceae (Huang et al. 2015), Caryophyllales (Yang et al. 2015), and many others. Here, we sequenced transcriptomes for 62 Asteraceae species (supplementary table S1, Supplementary Material online) in three largest subfamilies (Asteroideae, Cichorioideae, and Carduoideae), a small subfamily (Gymnarrhenoideae), and two early divergent subfamilies (Mutisioideae and Barnadesioideae), representing 98% of the species richness of the family (Funk et al. 2009b). Also two species Nastanthus ventosus (Calyceraceae) and Menyanthes trifoliate (Menyanthaceae) were included as closely related outgroups within Asterales. The illumina HiSeq2000 sequencing yielded 13,245,776 to 62,351,587 reads, and the subsequent de novo assembly of these datasets resulted in 25.492 to 119.998 contigs with average lengths of 546–1064 nucleotides. Datasets for 11 other Asteraceae species were retrieved from public resources (supplementary table S2, Supplementary Material online). Among the 75 organisms included in this study (supplementary table S2, Supplementary Material online), 18 species representing distinct tribes and an outgroup were adopted for selection of single-copy putative orthologs by using OrthoMCL (Li et al. 2003) and subsequent filtering procedures (see Materials and methods section). A total of 175 orthologous groups (OGs) were identified (supplementary table S3, Supplementary Material online) for phylogenetic tree reconstructions; among them, 71 OGs having gene length longer than 900 amino acids on an average were also used in separate analyses.
Both maximum-likelihood (ML) and Bayesian inference (BI) analyses using concatenated supermatrices of the two sets of OGs lead to trees with the same topology for all 75 species (fig. 1). Furthermore, coalescence analyses with gene trees from both sets of OGs (supplementary fig. S1, Supplementary Material online) resulted in phylogenetic relationships completely identical to the supermatrix trees. Overall, the topological hypothesis here presents monophyly of each subfamily and tribe and supports their relative positions that were previously defined using morphological characters and/or chloroplast sequences (Panero and Funk 2008; Funk et al. 2009b; Panero et al. 2014). The selected nuclear markers seemed also effective in resolving relationships within tribes; our results largely agree with relationships obtained using chloroplast sequences (Funk et al. 2009a) and also provide resolution for some genera within tribes Astereae, Eupatorieae, and Heliantheae of Asteroideae. Detailed descriptions about the phylogeny and comparison with previous reported topologies are presented in supplementary material, Supplementary Material online. It is noteworthy that our 71 OGs contain sufficient taxonomic signals to infer identical and robust Asteraceae phylogeny and could be considered as phylogenetic marker genes for further studies. Future investigation using nuclear markers with more taxa in various subgroups will aid in reconstructing a more complete phylogeny.

Asteraceae phylogeny using 175 or 71 low-copy orthologous nuclear genes inferred by ML and BI methods. Values indicate the BS support and posterior probabilities from ML and BI analyses, respectively. Branches without values are with maximum support from all analyses, and asterisks indicate maximum support from the corresponding analysis.
Estimation of Divergence Time Suggests the Asteraceae Origin in Late Cretaceous
Molecular clock analysis was performed to estimate divergence times of Asteraceae by penalized-likelihood (PL) method implemented in r8s (Sanderson 2003) using concatenation of 71 nuclear markers (fig. 1). We used a total of 12 fossil calibrations in this analysis and included 27 more outgroup species in addition to the two species used in phylogeny reconstruction for implementation of fossils of other plant lineages (supplementary fig. S2 and table S2, Supplementary Material online). To strive for a relatively accurate estimation of divergence times for this large and diverse family, four Asteraceae fossils were used here (supplementary fig. S2, Supplementary Material online). Assignments of fossils (supplementary table S4, Supplementary Material online) were in accordance with Smith et al. (2010) and Magallón et al. (2015) except those of Asteraceae described in supplementary material, Supplementary Material online. A pollen fossil of Tubulifloridites lilliei type A (76 − 66 Mya) was recently found in deposits from the late Cretaceous of Antarctica (Barreda et al. 2015, 2016; Panero 2016; Panero and Crozier 2016) (also see supplementary material, Supplementary Material online), but its assignment to specific lineage was controversial. To test effect of different possible assignments, we performed estimations with three different positions of calibration for T. lilliei type A (supplementary fig. S3, Supplementary Material online): (1) crown Asteraceae, which is the nearest position applicable in the context of our samplings regarding the original proposed assignment to crown Dasyphyllum based on the single-most parsimonious tree (Barreda et al. 2015), (2) at the most recent common ancestor (MRCA) of Asteraceae + Calyceraceae, and (3) at the MRCA of Asteraceae + Calyceraceae + Menyanthaceae. The estimated mean and 95% highest posterior distribution (HPD) ages of several nodes are listed in supplementary table S5, Supplementary Material online.
Using the 71 nuclear gene dataset with the above calibrations, our estimated ages showed ∼20 Mya for crown Asteraceae using all three calibration sets (supplementary fig. S3, Supplementary Material online); in particular, the age of ancestor of Asteraceae + Calyceraceae + Menyanthaceae using calibration 3 was much older than the age constraint of T. lilliei type A at the same node, thus T. lilliei type A calibration at this position had no practical influence in the estimation. Results from calibrations 1 and 2 were very similar to each other with only 2–6 My differences. Our ages of crown Asteraceae and the MRCA of Asteraceae + Calyceraceae using T. lilliei type A with calibration 1 were 10 − 14 My younger than scenario 1 of Barreda et al. (2015) and this is possibly due to the different position for the T. lilliei type A fossil in the context of our incomplete sampling, whereas the difference in the age of core Asteraceae (from Mutisioieae to Asteroideae, but not Barnadesioideae) is smaller (∼6 Mya). Therefore, our current results should be viewed as tentative and the ages of Asteraceae deep nodes could be older if more Dasyphyllum or Barnadesieae species are sampled. Our ages using T. lilliei type A with calibration 2 are very close to scenario 2 of Barreda et al. (2015) when the position of T. lilliei type A is the same in the two analyses. Interestingly, the ages reported by a recently published paper (Panero and Crozier 2016) using chloroplast sequences without implementing T. lilliei type A are very close to our results using this fossil, showing only 7 My (calibration 1) and 1 My (calibration 2) differences in the age of crown Asteraceae. These age estimates all support that Asteraceae very likely originated during late Cretaceous, older than earlier estimates (Bremer et al. 2004; Kim et al. 2005; Barres et al. 2013; Beaulieu et al. 2013; Magallón et al. 2015; Tank et al. 2015).
Our results showed that branches leading to crown nodes of Asteraceae, subfamilies, even most of the tribes included here were nested during Paleocene to Eocene. It was the hottest and most humid period in Cenozoic Era including Early Eocene Climate Optimum (EECO) and Paleocene–Eocene Thermal Maximum (PETM) (Zachos et al. 2008). Especially, most subfamilies of core Asteraceae (i.e., 10 out of 13 subfamilies, except Barbadesioideae and Famatinanthoideae) diverged within ∼10 My (61 − 50 and 57 − 47 Mya for calibrations 1 and 2, respectively; supplementary table S5, Supplementary Material online). Most tribes included here diverged during Eocene, including Heliantheae alliance separated from their sister clade in very late Eocene, followed by subsequent divergences of these tribes during Oligocene.
Ks Distributions of Paralogs Suggest Multiple Polyploidizations in Asteraceae History
The large number of simultaneously generated duplicates (paralogs) from WGDs tends to exhibit a cluster of Ks values, providing a means to detect potential WGD (Blanc and Wolfe 2004b; Kagale et al. 2014; Vanneste et al. 2014; Cannon et al. 2015). It was previously proposed by Barker et al. (2008) that Asteraceae experienced ancient WGDs from the Ks analysis of paralogs from 18 species representing four tribes, one of each of four subfamilies (the three largest plus Mutisioideae). To detect further evidence for ancient polyploidy in Asteraceae, we identified paralog pairs by using BLAST to identify the reciprocal best hits (RBH) for each species (see Materials and methods section) and monitored their Ks distributions (supplementary fig. S4, Supplementary Material online). We followed Kagale et al. (2014) to use log-transformed Ks values in the analyses for better resolution on the distribution of low Ks values. Furthermore, we performed mixture model analyses to identify the potential WGD events by assuming that each Ks peak represents a large-scale duplication event following a Gaussian distribution (Blanc and Wolfe 2004b; Schlueter et al. 2004). Datasets with the number of total RBH gene pairs fewer than 3000 were not included in analyses below (fig. 2a, white columns) to avoid possible biases due to the large differences between the number of paralog pairs in these species and those of others. Overall, the results provide evidence for WGDs in the history of most of the species examined here (fig. 2 and supplementary fig. S4, Supplementary Material online).

Identifying and mapping gene duplications by Ks distribution and phylogenomic analyses. (a) Right: the presence of peaks was determined according to Ks distributions from RBH gene pairs. Blocks colored with red, green or purple indicate the presence of peak in the corresponding Ks range. For those with Ks of 0.2 − 0.6, the peaks mark with one or two asterisks are considered as representative or large duplications when the number of paralogs is higher than 900 or 1500, respectively. For those with more than one component in this Ks range, we take the counts of the ones most prominent or corresponding to a peak in the contour. Gray blocks denote the absence of a recognized peak within the range. Numbers of total paralog pairs are provided for each species (the rightmost column); datasets with numbers fewer than 3000 were not used here. Left: WGDs inferred by phylogenomic analysis. Values are percentage of total duplicated gene families for the adjacent nodes; only those > 2% are shown with the actual numbers of duplicated gene trees presented in parenthesis. Gene duplications were recorded only when the two sub-branches shared two or more taxa with BP values > 50 for the node. Results with BP > 70 share the same conclusion (supplementary fig. S6, Supplementary Material online). WGD events with stronger evidence are highlighted with colored circles at the node followed by colored branches, and the ambiguous ones are depicted with open circles. Phylogeny here is according to our chronogram in supplementary fig. S2Supplementary Material online. (b) Scatterplot of parameters from the fitted Gaussian mixture model of Ks distributions of BLAST (left) or RBH (right) gene pairs. For each Gaussian component of the fitted model, the mean of log 2 (Ks) value was plotted against standard deviation (SD), whereas X-axis labels here are Ks value instead of log 2 (Ks). Classification using Mclust package resulted in five clusters in both datasets, while Clusters I, II, and III are in the range of Ks between 0.2 − 2.
We also employed a recently reported clustering analysis using statistical modeling with means and standard deviations of Gaussian components from multiple species (Kagale et al. 2014) as an exploratory tool for potential ancient events and their Ks ranges. Peaks of Ks > 3 might have resulted from WGDs well before the diversification of Asteraceae with possible saturation effect and are not further discussed in this study. Our result included three major clusters (fig. 2b): Cluster I (Ks 0.2 − 0.6), Cluster II (Ks 0.7 − 1.4), and Cluster III (Ks ∼2). We mapped peaks of each species onto our phylogeny to ascertain whether members of the same clade have Ks peaks within the same range of values (fig. 2a). For recent duplications with Ks of 0.2 − 0.6 (red blocks), we denoted with asterisks when the numbers of paralogs were > 900 (one asterisk) or 1500 (two asterisks). Specifically, all species of Eupatorieae, Madieae, Heliantheae, and Helenieae showed peaks in this Ks range, suggesting the possibility of a shared WGD in their common ancestor. We also observed obvious presence of peaks for many of the members of subfamily Mutisioideae, tribe Calenduleae, and a subtribe of Senecioneae (i.e., Tussilaginae, represented here by Ligularia fischeri and Petasites hybridus) (fig. 2a); these peaks were all with large numbers of paralogs, providing evidence for large-scale parallel/independent duplications in these lineages. Notably, Barker et al. (2008) found a substantial amount of rate heterogeneity across the Asteraceae with a nearly 30% difference in background molecular evolutionary rate between the fastest (Heliantheae) and slowest (Carduoideae) lineages among their samplings, providing an explanation for the wide-range in mean Ks of clusters and peaks of species that are suggested to share common ancient duplication event.
All Asteraceae species showed Ks peaks between ∼0.7 − 1.4, except Tragopogon porrifolius, Tragopogon dubius, Proustia cuneifolia, and Dasyphyllum diacanthoides, all of which have in their Ks distributions a very strong peak near by that could possibly mask an adjacent peak. These common peaks supported a potential paleopolyploidization across the family or at least core Asteraceae (Asteroideae–Mutisioideae), consistent with a previous study (Barker et al. 2008), which observed Ks value of 0.74 lying within ranges of our Cluster II. The third group of Ks peaks at ∼2 in Cluster III is shared not only by members of Asteraceae but also by the two outgroup species (fig. 2a and supplementary fig. S4, Supplementary Material online), suggesting an ancient duplication before the split of Asteraceae and these Asterales lineages. In addition to these three clusters, there is another group of Ks peaks (fig. 2b, blue dots) between ∼0.1 and 0.3; although some of these might not represent true WGD, others might be evidence for relatively recent lineage-specific WGDs.
Placement of Polyploidizations on the Asteraceae Phylogeny Using Gene Family Trees
Results from Ks distributions and clustering analysis suggest that the ancestors of Asteraceae and some major branches might have experienced one or more ancient WGDs. To test this hypothesis and to investigate the timing of such possible WGDs, we performed a phylogenomic analysis as done in several recent studies (Jiao et al. 2011; Jiao et al. 2012; Cannon et al. 2015; Li et al. 2015; Yang et al. 2015) and recognized as a reliable method for investigating the presence and positions of WGDs (Kellogg 2016). We used all the datasets in determining the phylogeny with six additional outgroup species (supplementary table S2, Supplementary Material online). Homologous gene groups (families) were identified using BLASTp and OrthoMCL among these 81 species, and gene families with species coverage less than 85% were removed. The remaining 6,700 gene families were subjected to ML tree reconstructions followed by comparison with the organismal tree (fig. 1) to detect gene duplications with timings relative to lineage divergence (for detail see Materials and methods section). The result is presented as the percentage of duplicated gene families in fig. 2a adjacent to nodes. Nodes potentially having WGD events with > 2% duplicated gene families were listed; among them, there were five showing prominent proportions (> 5%) of gene duplications (nodes 1, 3, 5, 14, and 15; see below), which also correspond well to the observations of Ks peaks shared by clades. We further investigated each of the duplicated gene trees with three possible topologies regarding the presence or absence of the two subclades following a node (fig. 3a); among them, Type I topology provides stronger support than Types II and III for a gene duplication at the node of interest (see supplementary material, Supplementary Material online for detailed descriptions). Note that only clades with more than three species were included in this follow-up investigation.

Distribution of three types of topologies of duplicated homolog trees. (a) Three possible topologies of duplicated homolog trees. For the two sub-clades derived from one node, red clade represents the large subclade (containing A and B), whereas blue clade indicates the small sub-clade (C). Resulting duplication nodes based on Types II and III have the possibilities to be biased in misplacement of the member of small subclade due to long-branch attraction or others. Detailed descriptions are in supplementary material, Supplementary Material online. (b) Percentages of the three types of homolog trees supporting duplications for the corresponding WGD nodes. Nodes with > 2% Type I trees are denoted with asterisks. Note that only clades with more than three species are available for this investigation.
Among the 6.17% and 7.35% trees supporting duplications at nodes 3 and 5, respectively, we found 5.18% and 2.54% trees have the highly supportive Type I topology (fig. 3b), while the 4.63% supporting trees for node 4 only contains 1.71% of Type I. Barker et al. (2008) observed an ancient WGD before the divergence of core Asteraceae. Here, we found signals for ancient WGD events for three adjacent nodes at the base of the family (nodes 3–5), while those of the core-Asteraceae and Asteraceae–Calyceraceae (which is not shared by Goodeniaceae; unpublished data) were relatively strong and constitute “nested duplications”. The signal for crown Asteraceae remains ambiguous and awaits further investigations with more samplings. The detection of a possible additional WGD shared by Asteraceae and its sister family Calyceraceae is intriguing. Truco et al. (2013) found Lactuca sativa to be an ancient hexaploid based on the high-density linkage map of its genome as compared with grape, while hexaploids are probably from two successive genome duplications rather than a single triplication event (Harlan and deWet 1975; Ramsey and Schemske 1998). The nested duplications of core-Asteraceae and Asteraceae–Calyceraceae provide another case of successive events.
In addition, WGD at nodes 1, 14, and 15 were supported by relatively high percentage of corresponding trees. There were 7.71% genes trees showing duplications at MRCA of the Heliantheae alliance (node 1) with 3.36% having Type I topology. Previously a WGD was proposed to be shared by members of Heliantheae alliance as supported by cytology, phylogeny and Ks distributions (Smith 1975; Robinson 1981; Yahara et al. 1989; Berry et al. 1995; Gentzbittel et al. 1995; Baldwin et al. 2002; Barker et al. 2008); our results further restricted its timing to that after the divergence of Athroismeae and the Heliantheae alliance. In our result, there were also 4.5% homologs duplicated at the node containing Centipeda (node 2), but most of them belonged to Type II (fig. 3b), which had the possibility to inferred wrongly from the derived node due to possible biased placement of Centipeda in a group with one of the duplicated clades. One possible event was at the MRCA of Ligularia/Petasites (21.94%, node 14), representing a duplication event for ancestor of subtribe Tussilagininae. It was suggested that the base chromosome number for this subtribe was x = 30 (Robinson et al. 1997), supporting a possible WGD early in its evolution. The other potential event was shared by the two Tragopogon species (6.63%, node 15), intimating a possible WGD at the level ranging from subtribe Scorzonerinae to this genus in the context of our samplings, whereas allopolyploidizations have been observed in members of Tragopogon (Soltis et al. 2012, 2016).
We also detected WGD signal for a Gnaphalieae subclade except Leontopodium (node 11), having 2.72% supporting trees with 2.29% belonging to Type I. Interestingly, the three genera are relatively large in this tribe. However, obvious peaks can be observed on Ks distribution of Anaphalis margaritacea and Helichrysum petiolare, but not Pseudognaphalium affine. For nodes 9 and 13, only low percentage (< 2%) of gene duplications were inferred by Type I gene trees, and 6 and 2 of their members, respectively, do not have peaks at the Ks range of Cluster I. Other nodes (6, 7, 8, 10, 12, 16, and 17) showing 2–4.6% of gene trees showing duplication had only two terminals in our samplings and did not permit further investigations for types of topology; future investigations with more taxa are beneficial for a robust conclusion of WGDs for these clades. Nevertheless, it is interesting that all species of Calenduleae and Mutisioideae examined here showed strong peaks within Ks ranges of Cluster I, but only fewer than 20 homologs duplicated at ancestors of these crown groups; therefore, WGDs might have occurred for part of these clades (e.g., node 12 for Calendula and Tripteris and node 17 for Gerbera and Mutisia), although the evidence is not strong. Nevertheless, these results intimate multiple possible independent WGDs for members of these clades.
Detection of Two Major Net Diversification Rate Accelerations during Asteraceae History
Comparing with the numerous species in the Asteraceae family, our transcriptome data lack sufficient taxon samplings to profile species richness for various subfamilies or tribes. Thus, we took the summarized meta-tree of Funk et al. (2009b) based on several previous studies (Panero and Funk 2008; Baldwin 2009; Funk and Chan 2009; Pelser and Watson 2009) for a hypothesis on phylogeny with comprehensive inclusion of major lineages and their species richness (supplementary fig. S5, Supplementary Material online), which was then subjected to diversification rate shift analysis using Modeling Evolutionary Diversification Using Stepwise Akaike Information Criterion (AIC) (MEDUSA; Alfaro et al. 2009). Our result showed three positions of acceleration (red circles) and three occasions of slowdown (blue circles) in net diversification rate (fig. 4). The slowdowns happened at branches leading to specific subfamilies and tribes with fewer than three species. There was moderate acceleration in the Mutisioideae lineage. One of the major acceleration events was located at the node leading to Asteroideae–Carduoideae clade with almost 3-fold of the background rate (0.175 vs. 0.064 species/My), and the other was at the clade of tribes Madieae, Eupatorieae, Perityleae, Millerieae, and Heliantheae (i.e., the Heliantheae alliance clade except the basal tribe Helenieae) with 4-fold of the background rate (0.247 species/My). In our analysis, we did not include some of the basal subfamilies and thus the results in rate shift are tentative, but we could identify a potential relationship between WGDs and net diversification rate shifts in accordance with WGD analyses in the context of our samplings, demonstrating two accelerations each after two of the duplication events reported in this study (fig. 5).

Detecting diversification rate shifts by MEDUSA. Number of taxa (n.taxa) and phylogeny of lineages not sampled here are according to Funk et al. (2009b) and is also illustrated as a hypothesized phylogeny in supplementary fig. S5, Supplementary Material online. Partitions with higher diversification rate are colored with orange, red or pink, and those with lower rate are with blue. Positions of rate shifts are depicted with red (acceleration) and blue (slowdown) circles numbered by the order of rate shifts added by the stepwise AIC procedure. Terminals representing the combination of multiple tribes are indicated with * (Neurolaeneae, Tageteae, Chaenactideae, and Bahieae) and ** (Corymbieae, Doronicum, Abrotanella, Senecioneae, Calenduleae, Gnaphalieae, Astereae, and Anthemideae). The improved AIC scores (ΔAIC), fitted models, estimated net diversification rates (Rate), and relative extinction rates (ϵ) are shown on the table in the upper left corner. NA, relative extinction rate cannot be estimated because the shifts fitted to the pure-birth (yule) model; bg, background.

A proposed evolutionary history of Asteraceae. The top part of this figure shows a stacked deep-sea benthic foraminiferal oxygen-isotope curve adapted from fig. 2 in Zachos et al. (2008) corresponding to the evolution of global climate over the last 65 My. Red stars indicate the phylogenetic positions of major putative paleopolyploidizations presented here, pink strips specify the estimated window in which duplications occurred (according to the ages in supplementary fig. S2, Supplementary Material online). Open stars and gray strips are WGD events showing consistency between results from Ks distributions and phylogenomic analyses but with more ambiguous supports from the percentage of duplicated homolog trees. Cases having inconsistent results for the two analyses are not shown in this figure. The positions of species radiations are highlighted as referred to a core-shift in net diversification rate (yellow circle) (fig. 4; Panero and Crozier 2016) and rapid branching of at least 10 subfamilies of core-Asteraceae (yellow strip). Chromosomal base numbers and biogeography distributions (or optimization) are according to Semple and Watanabe (2009) and Funk et al. (2009b). Green and blue dotted lines indicate KT extinction event and Eocene-Oligocene extinction event, respectively. The warm period of PETM and EECO is highlighted with a brown strip. Subfamilies not sampled here are integrated with gray dashed lines according to the hypothesized position in the meta-tree by Funk et al. (2009b). Pal, Paleocene; Oli, Oligocene; P, Pliocene; Q, quaternary; ETM, Eocene thermal maximum.
Discussion
The previously proposed Asteraceae phylogeny was based on analyses using chloroplast sequences (Funk et al. 2009a, 2009b). Our phylogenetic analyses using both concatenation and coalescence methods with two sets of orthologous nuclear genes from 64 newly generated transcriptomes and additional public datasets yielded a hypothesis with strong supports for monophyly of all subfamilies and tribes included here. The nuclear gene-based phylogenetic relationships among tribes and subfamilies are in complete agreement with the current conclusions according to chloroplast sequences (Panero and Funk 2008; Funk et al. 2009a; Panero et al. 2014), showing an example for the consistency between results using the nuclear and chloroplast genomes. We have also resolved relationships among genera of Asteroideae, including some that were previously ambiguous from analyses using chloroplast sequences, demonstrating the effectiveness of these informative nuclear markers.
In addition to a robust phylogeny, our results on divergence times of Asteraceae at a family-wide scale (supplementary figs. S2 and S3, Supplementary Material online) and the detection of strong evidence for multiple polyploidizations across the family (figs. 2 and 3) allow us to reconstruct the evolutionary history of Asteraceae (see fig. 5) combining the observations on species diversifications (fig. 4; Panero and Crozier 2016) (see below).
Using both Ks distribution and gene tree analyses, we found evidence for multiple WGD events across the family. Our findings support two nested duplications: one at core Asteraceae and the other before the split of Asteraceae from Calyceraceae. We also observed several additional events: at the crown node of the Heliantheae alliance, at clades Tussilaginae and Tragopogon-Scorzonerinae, and within Gnaphalieae. A subclade within Mutisioideae or Calenduleae might each have an ancient WGD, but other possibilities also existed with members of these clades.
In combination with the chronograms here, two WGDs occurred along the backbone around the time of global mass extinction events, shortly before species radiations with two kinds of evidence (a radiation of 10 subfamilies and high species richness; see below for more details). The earlier duplication event near the basal node of Asteraceae is close to the Cretaceous–Paleogene (KT) mass extinction event (fig. 5 and supplementary fig. S2, Supplementary Material online). Net diversification rate acceleration was detected by analysis using the MEDUSA program subsequent to the mass extinction and WGD events (figs. 4 and 5). This differs from the estimated core shifts in net diversification rate in the Phytomelanic Fruit (PF) and Vernonioid clades by Panero and Crozier (2016), but not near the base of the family, likely due to their samplings of more basal lineages. However, dramatic loss of ancient taxa can bias the result when determining diversification dynamics using only extant taxa (Quental and Marshall 2010; Morlon 2014). This is particularly relevant because the early diverging Asteraceae lineages had originated during the period of global catastrophe and evolved under the subsequent stressful environments, which resulted the extinction of ∼67% of the species (Jablonski and Chaloner 1994). On the contrary, at least 10 out of 13 subfamilies diverged very rapidly within 10 My (fig. 5, yellow strip); a similar time frame was also reported in studies using chloroplast sequences (within 13 − 10 Mya) (Barreda et al. 2015; Panero and Crozier 2016). This radiation of subfamilies is indicative of a burst in biodiversity.
The descendant clades after WGD and KT extinction events can be further divided into two groups showing distinct patterns of evolution. According to the observations of extant species, members of the more proliferative clades (Asteroideae, Cichoroideae, and Carduoideae) have a relatively low chromosomal base numbers of 9 − 10 in contrast to 27 of the basal clades (e.g., Mutisioideae, and other basal subfamilies not sampled here); furthermore, most of these large clades are distributed in Africa and Eurasia, whereas the members of the basal clades are found in South America, the proposed origin of Asteraceae (Funk et al. 2009a). It is possible that after polyploidization the common ancestor of these large clades might have experienced a large-scale chromosome rearrangement (Inoue et al. 2015), resulting in fewer chromosomes and other changes that facilitated adaptation to new habitats. On the contrary, Mutisioideae retained a high chromosomal base number (27) and remained in South America (Funk et al. 2009b); nevertheless, polyploidization might still have played a role in enhancing the ability of Mutisioideae species to adapt to environmental changes in their original habitat. In comparison, the relatively long branch of Barnadesioideae for the split of its genera after divergence from other subfamilies (Panero and Crozier 2016; our unpublished data) might be the result of extinction of ancient relatives under stressful environment.
Similarly, the WGD event at the base of the Heliantheae alliance was also close to the Eocene–Oligocene boundary (fig. 5), when an extinction event named Grande Coupure occurred in Europe (Hooker et al. 2004) under a short period of abrupt cooling icehouse climate (Zachos et al. 2001). The extant species of this WGD clade have largely distributed to North America (in contrast to Africa for the basal sister tribes), and their chromosomal base numbers are about twice of those of their sister clade (17 − 19 comparing with 7 − 10; fig. 5), providing additional support for paleopolyploidization at the MRCA of the Heliantheae alliance (Smith 1975; Robinson 1981; Yahara et al. 1989; Berry et al. 1995; Gentzbittel et al. 1995; Baldwin et al. 2002). Following this node is the diversification acceleration contributed by the great species richness (fig. 4), which is also supported by the core shift in net diversification rate of the PF clade by Panero and Crozier (2016). In addition, there is a period of delay between this WGD and the subsequent acceleration in diversification rate, consistent with the idea that significant lag-times often exist between WGDs and species radiations (Schranz et al. 2012). The extraordinary species richness of this lineage again demonstrates the possible role of the combined effect of WGD (and its subsequent modifications) and dramatic environmental changes on evolution. Notably, the subsequent global grassland expansion along with increasing aridity and seasonal change during Oligocene likely created various open niches for multiple lineages and facilitated diversifications after these events.
A possible correlation between polyploidy and KT extinction event has been proposed previously (Fawcett et al. 2009; Vanneste et al. 2014). The authors analyzed 38 plant genomes and three transcriptomes, with evidence for 20 independent WGDs across 19 observed plant families, and revealed a wave of successful genome duplications that were significantly and non-randomly clustered between 55 and 75 Mya (Vanneste et al. 2014), showing a strong indication of an association with the KT boundary (66 Mya). In the two analyses, they dated the ages of WGD paralogs with correction for substitution rate heterogeneity among different lineages, using Lactuca sativa as the only representative of Asterales; therefore, they could not locate the positions of WGD along the Asteraceae phylogenetic branches. The older ages of Asteraceae from this study as well as Panero and Crozier (2016) suggest that the WGD identified in Fawcett et al. (2009) and Vanneste et al. (2014) very possibly occurred after the divergence of Asteraceae (i.e., within the family), as is well supported by our result (61.3 Mya for node 3, fig. 2a). There was also a broader clustering centered at ∼22.9 Mya (Vanneste et al. 2014), also consistent with our results of a WGD at the clade of Heliantheae alliance. Overall, our findings in Asteraceae revealed that at least two independent scenarios occurred within a family (Asteraceae) with very high biodiversity for nearly ten percent of angiosperms, providing support for the hypothesis of major geological events and polyploidy. Vanneste et al. (2014) suggested that environmental stresses around the time of polyploidizations might have promoted the establishment of polyploids and the realization of their evolutionary potential, thus alleviating the rigid contrast in the proposed evolutionary fates of polyploids. Notably, several other WGDs supported by analyses here are not related to a mass extinction event and no obvious species radiations could be detected. Thus, establishment of polyploids are not restricted to the time of stressful environments, but stressful environments could allow polyploids manifest evolutionary potential, perhaps competing better with diploid counterparts and becoming the origins of species radiation.
In summary, the ability of Asteraceae to become one of the most successful plant families might be facilitated by multiple WGD events, particularly those that occurred near geological times of dramatic changes in the Earth environments. The WGD events likely allowed the evolution of a large number of new gene functions; at the same time, the stressful environments after mass extinctions might be strongly selective helping in establishing sustainable populations successfully (Voss et al. 2012; Chao et al. 2013) while also generating many open niches. Following WGD, some descendants of the recently formed polyploids were provided with new niches with different environmental conditions and diversified into very large subfamilies or tribes, whereas other descendants remained in relatively stable habitats.
Materials and Methods
Taxon Sampling and Transcriptomics
Information for the taxa included is listed in supplementary table S2, Supplementary Material online. Total RNA extraction with ZR plant RNA MiniPrep kit and subsequent sequencing by illumina HiSeq2000 and assembling using Trinity (Grabherr et al. 2011) and TGICLv2.1 (Pertea et al. 2003) were performed as described in Huang et al. (2015). Redundant contigs in each transcriptome were removed using CD-HIT 4.6 (Fu et al. 2012) with the threshold of 0.98. Sequence Read Archive (SRA) data were retrieved from GenBank (http://www.ncbi.nlm.nih.gov/sra/; last accessed August 5, 2016).
Selection of Putative Orthologs as Phylogenetic Markers
We used OrthoMCL v1.4 (Li et al. 2003) to identify the single-copy orthologous sets (SCOS) from 18 Asteraceae species of different tribes of Asteraceae (Arctotheca calendula, Bellis perennis, Carduus nutans, Centipeda minima, Centromadia pungens, Dasyphyllum diacanthoides, Eutrochium purpureum, Gymnarrhena micrantha, Helenium autumnale, Inula linariifolia, Leontopodium smithianum, Mutisia acuminate, Onoseris weberbaueri, Osteospermum jucundum, Proustia cuneifolia, Senecio vulgaris, Taraxacum officinale, and Xanthium strumarium) and one outgroup species Acicarpha spathulata. The resulting 271 SCOS among these 19 species were used as seeds in HaMStR (Ebersberger et al. 2009) to find orthologs in other species with E values less than e−20. Nucleotide sequences of the 271 OGs were aligned and trimmed using MUSCLE v3.8.31 (Edgar 2004) and trimAl v1.2 (Capella-Gutiérrez et al. 2009) with the default settings, respectively.
Possible sequence biases of long-branch attraction and saturation of partitions were detected by TreSpEx (Struck 2014), in which the long-branch attraction was determined by standard deviation and the average of upper quartile of long-branch scores, and the saturation degree was determined by slopes and R2 values of linear regressions of patristic distances against uncorrected distances p, with alignments of the 271 gene sequences. Genes showing high degrees of misleading signals as indicated by the four parameters were pruned, resulting in 175 OGs (supplementary table S3, Supplementary Material online). To avoid the possible effect of hidden-paralogs, persisting single-copy genes resulted from the loss of distinct paralogs in different species, a filtering process was performed by using the ML tree of 175-gene concatenation as a basic topology to investigate each of the 175 single-gene trees and remove the gene of particular species that shows significant differences in groupings within major clades. The filtered set of 175 OGs, and the 71 OGs with alignment length > 900 nucleotides, were both used in subsequent phylogeny reconstructions. Nucleotide sequences used for phylogenetic analyses in this study are available in the TreeBASE website (http://treebase.org/treebase-web/home.html) under the accession code S19664.
Phylogeny Reconstruction
Alignments of the nucleotide sequences of the 175 and 71 OGs were concatenated using SeaView (Gouy et al. 2010). Subsequent phylogenetic analyses were performed with GTRGAMMAI model as suggested by Modeltest (Posada and Crandall 1998) for both 175- and 71-gene sets. Phylogenetic reconstructions using concatenations of 175 and 71 genes by BI and ML methods and coalescence analysis were carried out using MrBayes 3.2.1 (Ronquist and Huelsenbeck 2003), RAxML 7.0.4 (Stamatakis 2006), and Astral 4.4.4 (Mirarab et al. 2014), respectively, as described before (Huang et al. 2015).
Fossil Calibrations and Divergence Time Estimation
We used 12 fossil constraints in our divergence time estimation (supplementary table S4, Supplementary Material online) according to the assignments by Smith et al. (2010) and Magallón et al. (2015), except those of the four Asteraceae fossils as described in supplementary material , Supplementary Material online. All the fossil calibrations were implemented as minimum constraint, whereas the tricolpate pollen grains (about 125 Mya; Hickey and Doyle 1977) were used to fix the age of crown group eudicots (Forest 2009; Magallón and Castillo 2009). Divergence times were estimated using r8s v1.7.1 (Sanderson 2002) as previously described (Huang et al. 2015), with the ML tree with branch lengths generated by RAxML using 71 OGs as the input tree. Oryza sativa was the most distant outgroup species in our analysis and was pruned as required by r8s. Mean and 95% HPD ages of nodes of interest were estimated and summarized across the 100 BS trees (supplementary table S5, Supplementary Material online).
Synonymous Substitution Rate of Paralogs and Mixture Model Analysis
Paralogs were determined from gene pairs identified by using BLASTP with an E-value cutoff of e−5 followed by filtering with > 70% match in alignment length and > 50% sequence identity, resulting in “BLAST” gene pairs; among them, the RBHs within each species were used as “RBH” pairs. We estimated Ks values between pairs of paralogs according to Blanc and Wolfe (2004b) by calculations based on codon alignments using the ML method implemented in codeml of the PAML package (Yang 2007) under the F3x4 model (Goldman and Yang 1994). To identify significant peaks, Gaussian mixture models were fitted to the distributions of log 2 transformed Ks values using the R package Mclust (Fraley and Raftery 2002, 2003), which determined the best-fitted model according to Bayesian Information Criterion (BIC). Histograms of log 2 (Ks) distribution and the estimated components of the fitted model of each species are provided in supplementary fig. S4, Supplementary Material online. Observations on Ks distributions found overfitting or skewed peaks owing to the multiple overlapping Gaussian components in specific cases; such complications were resolved as follows to combine the overlapping components as Gaussian mixtures. For a Gaussian mixture distribution with mixing proportions that sum to 1, means , and standard deviations , the combined mean were determined to be as and the combined SD as (Kagale et al. 2014). The resulting mean and SD were then plotted as single points in fig. 2b and single component in supplementary fig. S4, Supplementary Material online. A scatterplot was generated with the mean value of the log 2 (Ks) and SD of each Gaussian component, and their classifications were determined using Mclust with the fitted model selected via BIC.
Locating Gene Duplications by Comparing Gene Family Trees with Species Tree
Homologs of 81 species (supplementary table S2, Supplementary Material online) were identified by performing an all-against–all BLASTp comparison among the protein sequences of them with e-value cutoff of 10−5. In Paranoid 4.1 (O'Brien et al. 2005) was then applied on each BLAST match to calculate a global protein identity to filter out those with poor similarities (<50%) or gene coverage (<50%). Homologous groups were predicted using OrthoMCL v1.4 (Li et al. 2003) with the inflation value of 2.0. For each group composing of four or more genes, protein sequences were aligned using MUSCLE v3.8.31 (Edgar 2004) with default parameters and were further trimmed by using trimAl v1.4 (Capella-Gutiérrez et al. 2009) with the options “-gt 0.1 –resoverlap 0.75 –seqoverlap 80”. The alignment of corresponding nucleotide sequences was built guided by the trimmed protein alignment. Then, phylogenies for homolog groups with species coverage higher than 85% were reconstructed by RAxML with GTRCAT model for 100 BS replicates.
The resulting gene trees were reconciled with the species trees to calculate the numbers of gene duplication at each node, a method commonly used in many recent studies in detecting the presence and positions of large duplication events (Jiao et al. 2012; Cannon et al. 2015; Li et al. 2015; Yang et al. 2015) and is recognized as one of the more reliable method (Kellogg 2016). In detail, in each gene family tree, a node was recognized as valid for further analyses when the BS support was higher than 50 (fig. 2) or 70 (supplementary fig. S6, Supplementary Material online). Then, the last common ancestor (LCA) was determined for each of the two sub-branches. A gene duplication was then counted when meeting two requirements: (1) the LCAs of the two sub-branches had the same or close depth (the difference is smaller than or equals to one), where the depth of node was defined by the number of steps traveling from the node to the root on the species tree; (2) the two sub-branches shared two or more taxa. Numbers of gene duplications were summarized on the species tree by iterating all single gene family trees.
Diversification Rate Shift
Diversification rates were investigated by MEDUSA (Alfaro et al. 2009) using ultrametric tree data based on our date phylogeny with the missing taxa incorporated based on Funk et al. (2009b); details in the implemented phylogeny and species richness are in fig. 4 and supplementary fig. S5, Supplementary Material online. For the branches not sampled in this study (whose ages are unknown), because they could diverge from any points along the corresponding main branches, we set these clades to be branched out from their corresponding stem nodes, that is to set their ages the same as their basal sister clades, in order to eliminate their effect on the species richness of the sampled and dated clades here. The age of Goodeniaceae was set to the midpoint of Calyceraceae and Menyanthaceae close to those obtained by Panero and Crozier (2016) and our unpublished data. Corrected AIC criterion and mixed model of birth-death or Yule were implemented in the calculation.
Acknowledgments
We thank Drs Holly Forbes, Xuejun Ge, Jinshuang Ma, Michael Monosov, Weimin Ni, Jun Wen, Ji Yang, Liping Zeng, Ning Zhang, and Shu Zhang for help with sampling of plant materials. We also thank the two anonymous reviewers for providing valuable comments. This work was supported by grants from the National Natural Science Foundation of China (Grant no. 91131007 to H.M. and 31371330 and 31570224 to J.Q.); the China Postdoctoral Science Foundation (Grant no. 2014M551316); and funds from the State Key Laboratory of Genetic Engineering at Fudan University.
References
Author notes
Associate editor: Hideki Innan