A Phylogenomic Assessment of Processes Underpinning Convergent Evolution in Open-Habitat Chats

Abstract Insights into the processes underpinning convergent evolution advance our understanding of the contributions of ancestral, introgressed, and novel genetic variation to phenotypic evolution. Phylogenomic analyses characterizing genome-wide gene tree heterogeneity can provide first clues about the extent of ILS and of introgression and thereby into the potential of these processes or (in their absence) the need to invoke novel mutations to underpin convergent evolution. Here, we were interested in understanding the processes involved in convergent evolution in open-habitat chats (wheatears of the genus Oenanthe and their relatives). To this end, based on whole-genome resequencing data from 50 taxa of 44 species, we established the species tree, characterized gene tree heterogeneity, and investigated the footprints of ILS and introgression within the latter. The species tree corroborates the pattern of abundant convergent evolution, especially in wheatears. The high levels of gene tree heterogeneity in wheatears are explained by ILS alone only for 30% of internal branches. For multiple branches with high gene tree heterogeneity, D-statistics and phylogenetic networks identified footprints of introgression. Finally, long branches without extensive ILS between clades sporting similar phenotypes provide suggestive evidence for the role of novel mutations in the evolution of these phenotypes. Together, our results suggest that convergent evolution in open-habitat chats involved diverse processes and highlight that phenotypic diversification is often complex and best depicted as a network of interacting lineages.


Introduction
Molecular phylogenetics has unveiled many previously unknown examples of convergent evolution-here meant to refer to a phenotypic pattern in which non-sister species are phenotypically more similar to each other than to their respective sister species (following Arendt and Reznick 2008;Stern 2013). Under such an evolutionary outcome, species relationships based on morphometrics, coloration, behavior, or other ecological traits are discordant with the history of descent reflected in the species tree (Elmer and Meyer 2011;Aliabadian et al. 2012;Martin and Orgogozo 2013;Stern 2013;Jarvis et al. 2014;Schweizer et al. 2019aSchweizer et al. , 2019bPaterson et al. 2020). While the many observations of such discordances across the tree of life witness the abundance of convergent evolution, insights into the underlying processes remain more elusive.
Phylogenetic information from genomic data now provides unprecedented power to consolidate patterns of convergent evolution and obtain insights into the underlying processes. Many examples of putative convergent evolution are yet based on phylogenies reconstructed from a restricted number of genetic markers (Cresko et al. 2004;Colosimo et al. 2005;Aliabadian et al. 2012;Stern 2013;Brusatte et al. 2015). Since phylogenetic relationships at different positions in the genome, referred to as "gene trees", can vary substantially, many gene trees inevitably deviate from the species' history of descent Mol. Biol. Evol. 40(1):msac278 https://doi.org/10.1093/molbev/msac278 Advance Access publication December 29, 2022 1 Alaei Kakhki et al. · https://doi.org/10.1093/molbev/msac278 MBE reflected in the species tree (Degnan and Rosenberg 2006;Toews and Brelsford 2012). Hence, the mismatch of single gene trees with phenotypic similarities alone does not provide conclusive evidence for convergent evolution (Doyle 1997;Degnan and Rosenberg 2006;Lamers et al. 2012). Confirming instances of convergent evolution, therefore, requires species tree reconstructions from genome-wide variation. Once the evidence for convergent evolution is corroborated by the species tree, we can move forward to investigate the processes underlying gene tree heterogeneity that may also underpin convergent evolution.
Convergent evolution can occur via three processes: First, phenotypic similarities can evolve through independent mutations in the same or different genes ("parallel evolution" sensu Stern 2013) (Arendt and Reznick 2008;Martin and Orgogozo 2013;Stern 2013). In Mexican cavefish (Astyanax mexicanus), for instance, the evolution of decolored brown phenotypes and albinism in separate caves occurred through different mutations in the MC1R and OCA2 genes (Protas et al. 2006;Gross et al. 2009;Stahl and Gross 2015). Similarly, in plants, isoforms of PEPC found in C4 photosynthesis, and similar floral traits important for pollination have evolved multiple times independently (Whittall et al. 2006;Christin et al. 2007;Hoballah et al. 2007;Besnard et al. 2009;Preston and Hileman 2009).
The second, and likely most frequent process leading to convergent evolution that also accounts for most gene tree heterogeneity is incomplete lineage sorting (ILS; Stern 2013 includes this under "collateral evolution"), that is, the retention of alleles and traits that were already present in the ancestral lineage (Cresko et al. 2004;Colosimo et al. 2005;Stern 2013;Van Belleghem et al. 2018). ILS is prevalent in radiations characterized by large effective population sizes and the fast succession of speciation events, such as in the evolution of neoavian birds (Jarvis et al. 2014;Suh et al. 2015;Suh 2016) or in the diversification of sticklebacks (Colosimo et al. 2005;Jones et al. 2012;Roberts Kingman et al. 2021). In such cases, a high proportion of ancestral variation may be retained over subsequent species splits and segregate in the independently segregating gene pools of daughter species (Maddison 1997). Selection or drift in non-sister species may fix the same genotype (and phenotype), while sister species may fix a different genotype/ phenotype. For instance, in Humans 1% of the genome is genetically more similar to orangutans than to chimps due to ILS, even though these primates are characterized by small effective population sizes (Hobolth et al. 2011).
Third, in hybridizing lineages, convergent evolution and gene tree heterogeneity may be underpinned by introgression (the exchange of genetic material between species) that mingles genotypes and phenotypes among species (Stern 2013 includes this under "collateral evolution") (Song et al. 2011;Heliconius Genome Consortium 2012;Stryjewski and Sorenson 2017;Malinsky et al. 2018). In particular, introgression between non-sister species may result in these species being phenotypically more similar than they are to their respective sister species, such as exemplified by wing-pattern mimicry in Heliconius butterflies (Pardo-Diaz et al. 2012;Edelman et al. 2019), and by plumage coloration of Munia finches and of members of the Black-eared Wheatear (Oenanthe hispanica) complex (Stryjewski and Sorenson 2017;Schweizer et al. 2019a). Importantly, in an increasing number of instances, such as in Heliconius butterflies, Yellowstone wolves, Darwin's finches, and cichlid fish, introgression has exchanged alleles between species and resulted in the formation of beneficial phenotypes (Grant et al. 2005;Genner and Turner 2012;Lamers et al. 2012;Wallbank et al. 2016;Enciso-Romero et al. 2017). Given that over the last decade genomic studies have contributed increasing evidence for the abundance of such adaptive introgression, hybridization (the interbreeding of different species) may underpin convergent evolution more often than previously appreciated (Campagna et al. 2017;Han et al. 2017;Meier et al. 2018;Marques et al. 2019a).
Multiple factors influence which of these processes were most likely involved in specific cases of convergent evolution. These factors include the evolutionary time scale under consideration, the speed at which successive speciation events occurred, effective population sizes, and the opportunity for genetic exchange according to biogeographic history. Waiting times for beneficial mutations are long (Hermisson and Pennings 2005;Barrett and Schluter 2008;Hedrick 2013). Independent mutations with the same phenotypic effect are thus usually exceedingly rare (Eyre-Walker and Keightley 2007) and only over the course of millions of years may occur in sufficient number to be a source of convergent evolution (Hedrick 2013; but see Xie et al. 2019). Therefore, at short evolutionary time scales, convergent evolution may more often involve the recruitment of standing genetic variation (Barrett and Schluter 2008), notably from the pool of ancestral variation segregating in extant species, or variation introgressed from other species (Stern 2013); especially since in young lineages ancestral variation is still segregating and because reproductive isolation may still be incomplete between young species.
Phylogenomics can provide important indirect insights into the potential contribution of ILS, introgression, and novel mutations to convergent evolution: First, the species tree provides initial clues on whether speciation events occurred over short enough time scales for ancestral variation to be passed to descent lineages and thus remain incompletely sorted in important proportions beyond speciation events. Second, insights into the extent of ILS and presence of introgression can be gained from levels of gene tree heterogeneity (Funk and Omland 2003;Degnan and Rosenberg 2006;Jarvis et al. 2014;Nater et al. 2015;Suh et al. 2015;Suh 2016) and symmetries of gene tree frequencies (Hibbins and Hahn 2022). Gene tree heterogeneity is high under both ILS and introgression, but the two processes leave different proportions of alternative gene trees, based on which they can be distinguished Hibbins and Hahn 2022). In the presence of extensive ILS or of introgression, a parsimonious approach attributes the source of convergent evolution to these processes, even though independent mutations cannot be excluded as the source of convergent evolution MBE (Cresko et al. 2004;Colosimo et al. 2005;Pardo-Diaz et al. 2012;Stryjewski and Sorenson 2017). The absence of detecting these processes, conversely, would indirectly suggest novel mutations as a potential source of convergent evolution. Therefore, surveys of gene tree heterogeneity and symmetries of gene tree proportions represent a promising avenue to probe the potential of the alternative processes to contribute to convergent evolution.
Here, we reconstructed the species tree and assessed the contribution of ILS and introgression to gene tree heterogeneity in open-habitat chats (genera Campicoloides, Emarginata, Myrmecocichla, Oenanthe, Pinarochroa, and Thamnolaea), a monophyletic group of songbirds displaying a high incidence of convergent evolution (Mayr and Stresemann 1950;Aliabadian et al. 2012;Schweizer et al. 2019aSchweizer et al. , 2019b. The phylogenetic relationships among open-habitat chats inferred from mitochondrial data were entirely unexpected from a morphological perspective . Species similar in plumage coloration and other traits were often spread far apart across the mitochondrial phylogeny, suggesting convergent evolution of phenotypic similarities (Outlaw et al. 2010;Aliabadian et al. 2012;Schweizer and Shirihai 2013;Schweizer et al. 2019aSchweizer et al. , 2019b. For a limited subset of species studied, genome-wide variation (ddRAD data) confirmed the mitochondrial relationships (Schweizer et al. 2019b). Furthermore, hybridization resulted in substantial introgression in the O. hispanica complex (Schweizer et al. 2019a) and is suspected to have played a role in phenotypic and species evolution in the Oenanthe picata complex (Panov 2005). In these instances, introgression between non-sister taxa may well explain convergent evolution. However, genomic data is essential to corroborate and refine the species tree and assess the incidence of ILS and/or introgression across open-habitat chats.
Based on whole-genome resequencing data from 50 taxa of 44 open-habitat species (supplementary table S1, Supplementary Material online), we aimed to obtain insights into the potential roles of alternative processes in driving convergent evolution in these songbirds. To this end, we 1) reconstructed the species tree, 2) estimated gene tree variation across the genome, and 3) explored ILS and introgression as drivers of the underlying high gene tree heterogeneity. Our results reveal a comprehensive picture of open-habitat chat evolution involving high rates of ILS and multiple instances of introgression particularly in wheatears (genus Oenanthe). Footprints of ILS and introgression as well as considerable divergence times between the main clades of wheatears with convergent evolution suggest that, most likely, a combination of ILS, introgression, and novel mutations explains the convergent evolution observed in wheatears.

Sampling, Nuclear Data Preparation, and Mitogenome Assembly
To achieve an almost complete taxon sampling, we resequenced the genomes of 50 open-habitat chat taxa from 44 of 47 recognized species ( fig. 1; supplementary table S1, Supplementary Material online). A Saxicola maurus genome was included as outgroup (Sangster et al. 2010;Zuccon and Ericson 2010). We mapped the sequencing reads to the reference genome assembly of Oenanthe melanoleuca (Peona et al. 2022) and followed GATK best practices for nuclear data preparation. Mapping efficiency was not correlated with the degree of divergence from the reference genome, but data obtained from DNA extracted off museum skins mapped at a lower percentage (linear model, d XY : t = −0.41, P = 0.68; tissue museum : t = −6.56, P < 0.001; R 2 = 0.53). After mapping, sequencing coverage ranged from 4.6x to 40.6x, with an average coverage of 12.2x ± 6.2x (supplementary table S1, Supplementary Material online). We extracted mitochondrial sequence data for all 13 protein-coding genes and two rRNA genes from the resequencing data using MitoFinder 1.2 (Allio et al. 2020). To ensure that results did not depend on filtering strategy, all analyses were run with four sets of differently filtered data (see Materials and Methods).

Species Tree Reconstruction Based on Nuclear Genomic Data
We first set out to reconstruct and root the species tree based on regions of the genome least likely affected by mapping biases. To this end, we extracted data from genomic intervals hosting avian Benchmarking Universal Single-Copy Orthologs (BUSCO). This resulted in data from 7,335 BUSCO, with alignment lengths varying from 89,898 to 140,640 kb (depending on filtering strategy) for ML analyses of concatenated data, respectively, 2,091 BUSCO with alignment lengths varying from 10,575 to 15,290 kb for LD-pruned data free of inter-locus recombination for multispecies coalescent-based species tree reconstruction. Results were consistent among filtering strategies. Hence, we only report results based on the most stringent filtering of read depth (ii, minimum read depth, DP = 5; minimum percentage of the window covered by data, PW = 50%; missing data per site, MD = 15%). Both, maximum likelihood (ML) analyses in IQ-TREE 2 based on concatenated data and multispecies coalescent analyses in ASTRAL-III (based on BUSCO ML gene trees) established sub-Saharan species of the genera Campicoloides, Emarginata, Myrmecocichla, Pinarochroa, and Thamnolaea as the sister clade to all other openhabitat chats (supplementary fig. S1a, Supplementary Material online). For the subsequent analyses we excluded the Saxicola outgroup and rooted the trees on the sub-Saharan clade.
We then moved to reconstruct the species tree based on an as broad representation of the genome as possible.
To this end, we extracted alignments including variant and invariant sites for non-overlapping 10 kb windows. We henceforth refer to these windowed data as "loci". Analyses included only loci that fulfilled filtering criteria for read depth, alignment length, data missingness (see Materials and Methods), and absence of evidence MBE for intra-locus recombination. Furthermore, we subsampled filtered loci to be at least 10 kb apart to ensure free inter-locus recombination. Depending on filtering strategy, this left us with 5,267-6,791 loci with total alignment lengths of 34,556-52,243 kb (supplementary table S2, Supplementary Material online). We identified branches in the "anomaly zone" (Degnan and Rosenberg 2006) in several clades of wheatears: in the hispanica and picata complexes, and in the isabellina clade ( fig. 1). Nevertheless, the polytomy test based on local quartet supports in ASTRAL-III showed no evidence for polytomies in the species tree (P = 0 for all branches). The ML tree based on concatenated data and the multispecies coalescentbased species tree were fully supported and in agreement both with each other (except the position of Thamnolaea cinnamomeiventris within the sub-Saharan clade) and with Bold branches indicate internal branches for which ILS alone is statistically sufficient to explain the observed gene tree heterogeneity. Stars indicate branches that are in the phylogenetic anomaly zone. The character states of three selected characters: Sexual dimorphism (SD), monomorphic female-type (white), monomorphic male-type (pale green), dimorphic (dark green); Migratory behavior (Mig), sedentary (white), short-distance migrant (pale green), long-distance migrant (dark green); and throat coloration (throat), white (white), black (pale green), and polymorphic (white and pale green). Drawing courtesy of Chris Rose (www.chrisrose-artist.co.uk) with permission from Bloomsbury Publishing Plc.

Mitogenomic Relationships and Mito-Nuclear Discordances
We were interested in whether previously inferred relationships based predominantly on single mitochondrial genes Schweizer and Shirihai 2013;Alaei Kakhki et al. 2016) were supported by full mitogenomes and in inferring mito-nuclear discordances.
Mitogenomic relationships were in remarkable agreement with previously inferred phylogenetic relationships based predominantly on individual mitochondrial genes Schweizer and Shirihai 2013;Alaei Kakhki et al. 2016), yet showed several discordances with the species tree recovered from nuclear data ( fig. 2). Mito-nuclear discordances in wheatears were found in several places across the species tree but were mostly restricted to the placements of tip taxa: 1) In the lugens complex, nuclear data placed Oenanthe lugens persica within the complex, whereas the mitogenome placed it with Oenanthe xanthoprymna and Oenanthe chrysopygia.
2) In the picata complex, Oenanthe albonigra that by mitochondrial data was considered a sister taxon to the picata complex, was placed within the latter as a sister taxon to the phenotypically almost identical Oenanthe picata picata by nuclear data. 3) In the hispanica complex, Oenanthe cypriaca was placed as sister to either O. melanoleuca or Oenanthe pleschanka in nuclear and mitogenomic data, respectively. 4) In the isabellina clade, Oenanthe heuglini clustered as sister to either Oenanthe isabellina or O. bottae by nuclear or mitogenomic data, respectively. 5) Moreover, O. leucura and O. leucopyga formed a sister clade to the Oenanthe lugubris/lugentoides clade according to the nuclear species tree, but mitogenomes placed them consecutively at the root of the clade including Oenanthe finschi and the lugens complex. To understand whether nuclear gene trees were entirely discordant with mitogenomic relationships or in part reflected the latter, for each of the above discordances we checked for nuclear gene trees that agreed with the mitogenomic tree. This showed that for most of the mito-nuclear discordances, roughly 15% of the gene trees agreed with the mitogenomic relationships (picata complex: 14.40%, 4,282 of 29,730 gene trees; hispanica complex: 13.13%, 3,905 of 29,730 gene trees; isabellina clade: 15.71%, 4,671 of 29,730 gene trees; lugens complex: 2.77%, 824 of 29,730 gene trees).

Time Trees
In addition to species' relationships, we were interested in understanding the time scales at which species diverged. Due to the lack of appropriate fossils, we resorted to first estimating a time-calibrated mitochondrial phylogeny based on the 13 mitochondrial protein-coding genes, for which substitution rates are available (Lerner et al. 2011). The analysis in BEAST 2.6.6 showed high convergence of all parameters in three independent runs after 25% of the trees were discarded as burn-in (ESS >300). The results showed a high agreement with previous results obtained from single genes (Alaei Kakhki et al. 2016 We then used the diversification time of the openhabitat chats estimated from mitochondrial data as a time constraint in dating analyses based on nuclear data. For these analyses, we first provided the topology and branch lengths obtained from ML analyses of concatenated BUSCO data along with 1.8 Mb high-confidence nuclear data (see Materials and Methods) to generate the time-calibrated tree with RelTime-ML (Kumar et al. 2018). Compared with the mitochondrial results, the nuclear data mostly estimated similar divergence times between clades and shorter divergence times within clades (Pearson's r = 0.93, P < 0.001) ( fig. 1, supplementary fig.  S3, Supplementary Material online). Second, we performed dating analyses for windowed loci across the genome the same way as for BUSCO by providing 3.8 Mb highconfidence data. Divergence times based on BUSCO strongly correlated with ones estimated from windowed loci (Pearson's r = 0.99, P < 0.001) (supplementary fig. S4, Supplementary Material online). A test in which we re-ran the estimation of mitochondrial divergence times in RelTime-ML the same way as for nuclear data yielded the same divergence times as estimated in BEAST, thus confirming that differences in divergence times between mitochondrial and nuclear data are not due to the approach but reflect the different data types.

Extensive Gene Tree Heterogeneity
Having established the species tree, we aimed to quantify the levels of gene tree heterogeneity in wheatears to understand whether the processes generating gene tree heterogeneity could underly convergent evolution in this core group of open-habitat chats that displays the highest incidence of convergent evolution.
Several lines of evidence demonstrate extensive gene tree heterogeneity in wheatears. Remarkably, not a single gene tree out of 29,730 gene trees matched the species tree. Furthermore, many branches of the species treeincluding ones with local posterior probability 1-showed a high number of conflicting bipartitions compared with MBE concordant bipartitions, as evidenced by low Internode Certainty All (ICA) scores (supplementary fig. S5, Supplementary Material online), with ICA ranging from 1 to 0.35 and average ICA of 0.65 ± 0.19 (mean ± standard deviation). The high gene tree heterogeneities highlighted by ICA were further supported by low percentages of gene trees recovering the topology of the species tree at these internodes, as estimated by the gene concordance factor (gCF) ( fig. 1) that ranged from 1 to 0.06 with an average of 0.52 ± 0.30 (mean ± standard deviation). ICA and gCF were highly correlated (Pearson's r = 0.94, P < 0.001) (supplementary fig. S5, Supplementary Material online).
As expected, evidence for extensive gene tree heterogeneity was highest in clades with branches classified as within the phylogenetic anomaly zone. These included the lugens, picata, and hispanica complexes, the isabellina clade, and the placement of O. leucopyga and O. leucura.

Contributions of ILS to Gene Tree Heterogeneity
Next, we aimed to understand to which extent the levels of gene tree heterogeneity observed in wheatears can be explained by ILS alone. To this end, we first tested whether the multispecies coalescent without hybridization

MBE
adequately explains the gene tree heterogeneity observed across the entire species tree. The Tree Incongruence Checking in R (TICR) test (Stenz et al. 2015) showed an excess of outlier quartets (P < 0.01), indicating that a model including ILS but not introgression does not adequately explain the observed gene tree heterogeneity. This suggests that introgression occurred during the evolutionary history of wheatears.
Therefore, we moved on to infer for each branch in the species tree separately whether ILS alone may explain the level of gene tree heterogeneity. To this end, for each internal branch, we estimated the number of gene trees supporting the first and second alternative topologies, based on the rationale that under ILS the first and second alternative gene tree topologies should be supported by an equal number of gene trees . We identified 11 out of 37 internal branches (30%) for which the number of gene trees supporting the two alternative topologies were not significantly different (colored branches in fig. 1). At these 11 internal branches, ILS alone can thus explain gene tree heterogeneity, while asymmetries at the other 26 internal branches may need to invoke other processes.

Contributions of Introgression to Gene Tree Heterogeneity
Given that gene tree heterogeneity at many branches could not be explained by ILS alone, we set out to infer footprints of introgression across wheatears. To this end, we first applied the approach based on D-statistics (Durand et al. 2011) implemented in Dsuite, using > 58 million biallelic SNPs. This approach estimates D and f4 statistics across all possible combinations of trios in wheatears and then performs an f-branch test to assign gene flow to specific internal branches. The f-branch test suggested multiple events of introgression ( Finally, we corroborated the evidence for introgression in the hispanica, lugens, and picata complexes with multispecies coalescent network analyses in phyloNet, allowing for 0-5 introgression events. According to the Bayesian Information Criterion (BIC), models involving reticulation events better fit the data than strictly bifurcating trees in all three complexes (supplementary table S3

Discussion
The present study provides first genomic insights into the speciation history of open-habitat chats and into the processes involved in shaping gene tree heterogeneity that may also underpin the high incidence of convergent evolution in this group of songbirds. Our analyses reveal unambiguous species relationships despite considerable gene tree heterogeneity, including several mito-nuclear discordances that result from a combination of ILS and introgression. These relationships reconstructed from genomic data provide the strongest evidence yet for abundant convergent evolution in open-habitat chats, exemplified for three phenotypes in figure 1.
We first discuss how mito-nuclear discordances and incidences of introgression together with known histories of hybridization and biogeography mold into a comprehensive picture of open-habitat chat evolution. We close by concluding based on the indirect evidence presented here that convergent evolution in open-habitat chats likely involved a combination of ILS, introgression and novel mutations in independent lineages. Together, our results paint a picture of genomic and phenotypic evolution that is in part marked by the sharing of ancestral variation and an exchange of genetic variation between species. Our study therefore contributes to the increasing body of evidence that phenotypic and species evolution not only proceed from novel mutations but abundantly re-use genetic variation present in ancestral and related species (Seehausen et al. 2014;Meier et al. 2018;Marques et al. 2019b).

Mito-Nuclear Discordances, Patterns of Introgression, Hybridization History, and Biography Mold into a Coherent Picture of Complex Open-Habitat Chat Evolution
The species relationships inferred from nuclear genomic data were in good agreement with previous phylogenies based predominantly on single mitochondrial markers Schweizer and Shirihai 2013) and thereby confirmed the biogeographic history of openhabitat chats (Alaei Kakhki et al. 2016). Still, we recovered several species relationships discordant between the nuclear genome and the mitogenome (Toews and Brelsford 2012). In the light of 1) the histories of introgression also uncovered here, 2) the previously known hybridization history deduced from observed instances of hybridization, and 3) the here confirmed biogeography, most of these mito-nuclear discordances can be well embedded in a coherent history of open-habitat evolution.
The  . 4). However, as an exception for wheatears, even from a perspective of plumage coloration, the nuclear species tree implies a more parsimonious history of phenotypic evolution, as O. albonigra and O. p. picata display almost identical plumages. The high mitochondrial similarity of all subspecies currently treated under O. picata according to Panov (2005) may be a result of introgressive hybridization. Indeed, the high abundance of admixed phenotypes in zones of contact between the members of this species complex (Panov 2005) suggests a high incidence of hybridization. Different from the hispanica complex, where taxa meet in restricted zones, lineages of the picata complex all mold together in a relatively large area in southern Central Asia, and their degree of reproductive isolation is largely unknown. Further population genomic insights are required from the picata complex to obtain detailed insights into its history of hybridization and phenotypic evolution.
The evolution of the lugens complex was marked by two incidences of introgression that likely underpin the mitonuclear discordance observed in this complex ( fig. 4)  , a species with a very restricted range that is interesting from the perspective of phenotypic evolution: this species turns out to be highly similar to O. l. lugens at the genomic level, which contrasts with its marked phenotypic divergence ( fig. 4). This result is similar to the situation observed, for instance, in Hooded and Carrion Crows (Corvus cornix and Corvus corone, respectively) (Poelstra et al. 2014) and opens interesting questions on the evolutionary history of this taxon's coloration.
Finally, in the hispanica complex, the incomplete sorting of mitochondrial variation was previously well documented (Randler et al. 2012;Alaei Kakhki et al. 2018), and footprints of introgression came as no surprise: The complex is characterized by pervasive hybridization of O. melanoleuca with O. pleschanka in several geographic regions (Haffer 1977;Panov 1992) and population genomic analyses suggest rates of introgression of up to almost 20% between these species (Schweizer et al. 2019a). Research is underway to uncover the detailed histories of hybridization in this Eurasian wheatear complex.
The thus far discussed mito-nuclear discordances were all accompanied with high levels of gene tree heterogeneity (most within the phylogenetic anomaly zone). However, most of these cases were not explained by ILS alone but went along with footprints of introgression. Still, part of the observed mito-nuclear discordances might well be a consequence of ILS. In the picata complex, for instance, lineage divergence occurred in rapid succession ( fig. 1), and ILS might well be an alternative explanation for the mitochondrial divergence of the O. albonigra mitogenome. In addition, in the clade including O. heuglinii and the very widespread O. isabellina, species split in fast succession and the high levels of ILS likely explain the observed mito-nuclear discordance.
Taken together, our results demonstrate that the speciation history of open-habitat chats is similarly complex as their phenotypic evolution. Multiple events of introgression at both extant and ancestral time scales, along with abundant ILS, contributed to reticulate evolution and thus a mosaic of genomic variation in several clades of wheatears. Our study thus adds to an increasing number of examples (Enciso-Romero et al. 2017;Han et al. 2017;Meier et al. 2017;Lamichhaney et al. 2018) highlighting that species diversification is often complex and rather than by a linear process is at least in part a network of interacting lineages (Marques et al. 2019b)

Diverse Routes to Convergent Evolution in Open-Habitat Chats
The reconstruction of relationships among open-habitat chats using genomic data has a deep impact on our understanding of phenotypic evolution in these songbirds: the species tree provides firm evidence for an extraordinary incidence of convergent evolution ( fig. 1). For numerous traits, including plumage coloration, sexual dimorphism, and migration behavior, not related species display more similar phenotypes than sister species ( fig. 1). Almost  Furthermore, our results suggest (directly for introgression and ILS, indirectly for novel mutations) that convergent evolution in open-habitat chats is unlikely explained by a single process but may need to invoke all three processes (Hedrick 2013;Natarajan et al. 2015;Pease et al. 2016;Konečná et al. 2021;Montejo-Kovacevich et al. 2021), with the most likely processes depending on both demography and the phylogenetic scale.
For ILS to substantially contribute to convergent evolution, species must usually diverge in fast succession and maintain critically high effective population sizes to pass on ancestral variation and maintain it in daughter lineages. In open-habitat chats, such fast radiations occurred predominantly at rather recent time scales. The shortest split intervals are observed (in increasing order) in the picata, hispanica, and lugens complexes ( fig. 1). However, convergent evolution of species in the lugens complex and of the picata complex is only found with other clades but not within the complexes. Given that the levels of ILS at the root of the lugens complex are restricted, ILS is unlikely to have contributed to convergent evolution with other clades of wheatears sporting, for instance, similar plumages (see for instance the aforementioned example including O. warriae). Convergent evolution is, however, observed for back and neck-side coloration in the hispanica complex (Schweizer et al. 2019a), and could be explained by ILS of ancestral variation.
Likewise, introgression would need to have happened between taxa with similar phenotype to explain convergent evolution. Our analyses indeed uncovered several instances of in part substantial introgression (figs. 3 and 4). However, despite suggesting that introgression upon hybridization provided the opportunity to exchange phenotypes between species, none of the inferred introgression events can be tied to concrete examples of convergent evolution. This raises the question, whether the methods applied here are underpowered to infer footprints of introgression relevant to phenotypic evolution of open-habitat chats, or, indeed, introgression played a limited role in these songbirds' phenotypic evolution.
Finally, we may need to invoke novel mutations to explain at least part of the observed convergent evolution, because phenotypic similarities are found between rather divergent species and inferred instances of high ILS and introgression cannot easily explain them. Many if not most phenotypic similarities in open-habitat chats are found in the rather distant major phylogenetic clades that diverged around 5 Ma (for instance the examples provided at the entry of the discussion). The time tree suggests that the relevant split events did not occur within short evolutionary time scales. Accordingly, levels of ILS are rather low for at least one of the relevant nodes ( fig. 1). Although gene tree heterogeneity was nonnegligible for the larger of the two major wheatear clades, gene trees were mostly concordant for the root nodes of the wheatear clade including the hispanica complex and the Oenanthe oenanthe and O. isabellina clades ( fig. 1). Moreover, the phenotypically similar species occur in geographically well separated ranges and introgression between them is thus rather unexpected. In conclusion, unless the approaches used here to detect the ILS and introgression are underpowered, the indirect evidence provided by our results suggests that many incidences of convergent evolution at such time scale may have involved independent novel mutations.

Conclusion
In the present study we set out to probe gene tree variation for footprints of ILS and introgression with the goal of understanding how ILS and introgression may have contributed to convergent evolution in open-habitat chats. Our results reveal a complex speciation history and provide conclusive evidence for abundant convergent evolution in open-habitat chats. While we cannot conclude on the involvement of specific processes in the evolution of specific convergent evolution, the indirect evidence gained from the structure of the species tree and inferred levels of ILS and introgression suggest that convergent evolution in open-habitat chats likely occurred via all three possible processes, namely ILS, introgression, and novel mutations. Thereby, our results contribute to a growing body of evidence that evolution makes use and re-use of all resources it has at hand, including both standing (ancestral or heterospecific) as well as novel genetic variation.
Finally, the approach applied here based predominantly on a characterization of gene tree heterogeneity outlines an avenue to probe the processes governing convergent evolution in a wide range of systems. Even though the evidence for the involvement of these processes is indirect, ultimately, at a comparative scale this evidence may provide valuable insights into the relative contributions of ILS, introgression, and novel mutations to convergent evolution.  (Gill et al. 2020 (Sangster et al. 2010;Zuccon and Ericson 2010), was included as an outgroup to root the open-habitat chat species tree. We followed the taxonomy of the IOC World Bird List (v12.1) (Gill et al. 2020) except for the picata complex, where we treat subspecies picata, capistrata, and opistholeuca separately, following Panov (2005).

Materials and Methods
We extracted DNA from blood stored in ≥96% ethanol or Queen's Lysis buffer or tissues stored in 96% ethanol for taxa for which fresh material was available, or from toepads or dried skin from skin-preparation sutures for taxa for which only museum samples were available (supplementary table S1, Supplementary Material online). From blood and tissue samples DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen) or the MagAttract HMW DNA kit (Qiagen) following the manufacturer's protocol with exception of an adapted digestion of blood samples as reported in Lutgen et al. (2020). DNA from toepads and dried skin was extracted using the QIAamp DNA Micro Kit (Qiagen) with an adapted digestion protocol that ensures high quantities of DNA (dx.doi.org/10.17504/ protocols.io.bm4mk8u6). DNA concentrations were quantified on a Qubit fluorometer (dsDNA BR assay, Thermo Fisher Scientific) and DNA integrity was evaluated on a TapeStation (Agilent). We prepared sequencing libraries using the ThruPLEX DNA-Seq Kit (Takara), the Illumina DNA Prep Kit, the Illumina DNA PCR-free Kit, or the Chromium Genome Library kit (10X Genomics) for intact DNA, or for fragmented DNA with the ACCEL-NGS 1S DNA Library Prep Kit (Swift Biosciences) (supplementary table S1, Supplementary Material online). All libraries were sequenced (150 bp paired-end) on Illumina NovaSeq6000 instruments with a target coverage of ca. 15x.

Adapter Trimming and Mapping of Resequencing Data
Prior to further analysis, for all but the linked-read sequencing data, we trimmed adapters and merged overlapping paired-end reads using fastp 0.20.0 (Chen et al. 2018). For linked-read sequences, we trimmed the first 22 bp on the R1 read to eliminate the 10X indexes. We then mapped the reads to the reference genome assembly of Oenanthe melanoleuca (Peona et al. 2022) using BWA 0.7.17 (Li 2013) and marked duplications with PicardTools 2.9.1 (http://broadinstitute.github.io/picard). After excluding duplicates, the average sequencing coverage per individual ranged from 4.6x to 40.6x (mean and median 12.2, standard deviation 6.20) (supplementary table S1, Supplementary Material online).

Base Quality Score Recalibration (BQSR), SNP Calling, and SNP Genotyping
Data preparation followed the GATK 4.1.4.1 (McKenna et al. 2010) best practices pipeline. First, to prepare a list of high-confidence SNPs for BQSR, we ran HaplotypeCaller to generate gvcf files for each sample and then merged gvcf files of all samples with CombineGVCFs before genotyping SNPs using GenotypeGVCFs. To retain only highconfidence SNPs in the SNP-exclude set for BQSR, we retained only SNPs that fulfilled the following criteria: mapping quality > 40, Fisher strand (FS) phred-scaled P-value < 60, SNP quality score > 20, mapping quality ranksum value > -12.5, read pos rank-sum test value > -8.0 and quality by depth > 2. We retained only biallelic SNPs with at least one homozygous reference and one homozygous alternative genotype or with at least three observations of reference and alternative alleles. We excluded the resulting set of SNPs from BQSR in GATK. Following BQSR, we ran HaplotypeCaller on base-score-recalibrated bam files. The resulting gvcf files of all samples where merged (CombineGVCFs) and variant and invariant sites genotyped using the "include-non-variant sites" flag in GenotypeGVCFs. For all subsequent analyses we based genotypes on genotype likelihoods. This resulted in 871,428,254 unfiltered sites when the outgroup was included and 872,152,150 unfiltered sites without the outgroup.
In phylogenomic data sets, which are based on mapping of resequencing data to a reference genome, data of species more divergent from the reference genome may risk mapping at a lower percentage. To check for such mapping-related biases in our dataset, we estimated the average number of nucleotide differences (d XY ) between Oenanthe melanoleuca (reference genome) and all other species using pixy 0.95.02 (Korunes and Samuk 2021). We then estimated the mapping percentage for all species using SAMtools (Li et al. 2009) and tested whether there was a correlation between d XY and mapping success.

Data Filtering
Before data analysis, we removed all repeat regions from the multi-sample VCF file using the repeat mask reported in Peona et al. (2022). Then we used BCFtools 1.11 (Li 2011) to remove indels, sites close to indels (up to 10 bp) and all the sites at which exclusively alternative alleles were called. For analyses requiring variant sites only, we removed all SNPs with more than 20% missing data and all invariant sites using BCFtools and retained only SNPs with a minimum read depth of five. To ensure linkage-disequilibrium (LD) among SNPs, we LD-pruned SNPs in VCFtools 0.1.16 (Danecek et al. 2011) such as to only retain SNPs with a minimum distance of 1 kb between them. This physical distance is expected to remove most LD between SNPs, as e.g. in flycatchers LD breaks down in most genomic regions after 1 kb (Ellegren et al. 2012). After this filtering, we genotyped based on genotype likelihoods and retained 994,150 multiallelic SNPs. In addition, for analyses that require biallelic SNPs exclusively, we removed all multiallelic SNPs from the VCF file after the above filtering, using BCFtools.
For phylogenomic analyses requiring sequence data including both variant and invariant sites, we followed two MBE strategies. First, we defined 10 kb non-overlapping windows across the genome. Henceforth, we refer to the windowed data as "loci" and to phylogenetic trees inferred therefrom as "gene trees". Second, we inferred BUSCO, using BUSCO 5.0.0 (Simão et al. 2015). Similar to ultraconserved elements, UCE (Faircloth et al. 2012), BUSCO feature a high degree of conservation and moreover are present in single copies, circumventing issues with paralogs in phylogenomic reconstructions (Roy 2009). BUSCO are readily identified in whole-genome resequencing data sets, not requiring genome alignments, and are increasingly deployed for phylogenomic reconstructions (Kallal et al. 2021;Van Damme et al. 2022).
To make sure that the adopted filtering strategy did not affect our results, we generated four sets of fasta alignments using different filter settings for minimum read depth (DP), minimum percentage of the window covered by data (PW), and MD for both the 10 kb loci and the BUSCO data set: 1) DP = 1, PW = 50%, MD = 15%, 2) DP = 5, PW = 50%, MD = 15%, 3) DP = 1, PW = 50%, MD = 5%, and 4) DP = 1, PW = 80%, MD = 10% (supplementary Table S2, Supplementary Material online). These four filtering strategies yielded the same species tree and concatenated tree for 10 kb loci as well as for BUSCO. For these analyses, we therefore exclusively report the results based on the most stringent filtering on read depth (ii, DP = 5, PW = 50%, MD = 15%) (supplementary Table S2, Supplementary Material online). For gene tree heterogeneity analyses, on the other hand, we aimed to include the broadest representation of the genome and to this end retained all loci (N = 29,730) that fulfilled less stringent filtering criteria (i, DP = 1, PW = 50%, MD = 15%) (supplementary Table S2, Supplementary Material online).
Finally, for analyses making assumptions on intra-and inter-locus recombination (such as species tree reconstructions) we made sure to include only loci with no intra-locus but free inter-locus recombination (supplementary Table S2, Supplementary Material online). To this end, we excluded all loci with recombination signals (P ≤ 0.05) as inferred from the pairwise homoplasy index Phi (Φw) estimated in PhiPack 1.1 program (Bruen et al. 2006). The criterion P ≤ 0.05 does not account for multiple testing, but we preferred to conservatively exclude loci with evidence for intra-locus recombination. To possibly retain only loci among which free recombination occurs, we ensured a minimum distance of 10 kb by including no two consecutive loci. At this distance, no LD occurs in flycatchers (Ellegren et al. 2012).

Inference of BUSCO Sequences
Phylogenomic analyses based on the mapping of resequencing data to a reference genome, especially when including species well diverged from the latter, may be affected by several biases. For species more divergent from the reference genome, data from faster evolving genomic regions 1) risks not being mapped, if these regions are too diverged from the reference sequence, or 2) may map to paralogs, if the species experienced different duplication histories (Chakrabarty et al. 2017;Fitz-Gibbon et al. 2017). These biases are expected to be least important in slowly evolving regions of the genome, especially in BUSCO, that are conserved and by definition present in single copies in most species. To minimize mapping-related biases in our phylogenomic reconstructions, especially on rooting and placements of the most divergent species, we therefore extracted the intervals in which avian BUSCO (aves_odb10) are situated in our reference genome using BUSCO 5.0.0 (Simão et al. 2015).

BUSCO-Based Rooting of the Open-Habitat Chat Species Tree
First, to establish the root within open-habitat chats, we applied both concatenation and multispecies coalescentbased methods on BUSCO sequences, including the outgroup. First, we used all BUSCO (N = 7,335) to estimate the ML tree in IQ-TREE 2.1.2 (Minh et al. 2020b) based on the concatenated BUSCO, with one partition for each BUSCO and a GTR + I + G substitution model for all partitions (Abadi et al. 2019). One thousand bootstrap replicates were run using the ultrafast bootstrap approximation (Hoang et al. 2018). Second, we estimated the species tree under the multispecies coalescent using ASTRAL-III (Zhang et al. 2018) based on BUSCO without recombination signals and free inter-locus recombination (N = 2,091). To this end, we inferred BUSCO gene trees in IQ-TREE 2.1.2 using a GTR + I + G substitution model and one thousand ultrafast bootstrap approximations. To ensure that species tree inferences were not affected by inaccurately estimated gene trees (Zhang et al. 2018), we collapsed branches with bootstrap support inferior to 80% using Newick Utilities 1.6 (Junier and Zdobnov 2010). Reconstructing the species tree by including all BUSCO not considering intra-and inter-locus recombination (N = 7,335) did not affect the result.

Phylogenomic and Multispecies Coalescent Analyses Based on Full Evidence
To reconstruct the concatenated tree and species tree based on full evidence data, that is, data from the maximal possible fraction of the genome, and to study gene tree heterogeneity along the genome, we excluded the Saxicola outgroup. Instead, we rooted the trees with the clade that is the outgroup to all other open-habitat chats (sub-Saharan clade, supplementary fig. S1, Supplementary Material online). Excluding Saxicola ensured that analyses were not biased by mapping issues caused by this outgroup's divergence.
To estimate the concatenated tree using ML in IQ-TREE 2.1.2 we used all loci with a GTR + I + G substitution model and 1,000 ultrafast bootstrap approximations. To estimate the species tree under the multispecies coalescent using ASTRAL-III, we at first estimated ML gene trees using IQ-TREE 2.1.2 with a GTR + I + G substitution model and MBE one thousand ultrafast bootstrap approximations. Based on these gene trees (pruned for within-locus recombination and assuring free recombination between loci), we inferred the species tree using ASTRAL-III. Because ASTRAL relies on accurately estimated gene trees, we collapsed branches with bootstrap support inferior to 80% using Newick Utilities 1.6.
To find regions of the species tree that represent "anomaly zones" where the frequency of one of the alternative quartets is higher than that of the topology in agreement the species tree, we estimated local quartet supports for the main topology and its two alternatives in ASTRAL-III (Degnan and Rosenberg 2006). We used the anomaly_finder.py script to search for anomaly zones in our species tree (Linkem et al. 2016). To test if the gene tree discordance could be explained by polytomies instead of bifurcating nodes, we carried out a quartet-based polytomy test as implemented in ASTRAL-III.
To see whether the SNP-based species tree could confirm the sequence-based species tree, we used the unlinked multiallelic SNPs to the multispecies coalescent model implemented in SVDQuartets (Chifman and Kubatko 2014) in PAUP* 4 (Swofford 2003). We ran this with 1000 bootstrap replicates and summarized the result in a 50% majority-rule consensus tree.

Phylogenetic Relationships of Mitogenomes
We were interested in whether previously inferred relationships based predominantly on single mitochondrial genes Schweizer and Shirihai 2013;Alaei Kakhki et al. 2016) were supported by full mitogenomes and in how the mitogenomic relationships compare with the ones inferred from nuclear loci. To this end, we extracted and assembled mitochondrial genomes from the genomic data of all open-habitat chats using MitoFinder 1.2 (Allio et al. 2020). We used the published Isabelline Wheatear (Oenanthe isabellina) mitochondrial genome as a reference (Genbank accession number: NC_040290.1) and annotated the mitochondrial genome using the annotation pipeline integrated in MitoFinder. Finally, we aligned the 13 mitochondrial protein-coding gene sequences using the automatic alignment strategy in MAFFT 7.471 (Katoh and Standley 2013). We checked the alignments in AliView 1.26 (Larsson 2014) and removed stop codons within the coding sequences or indels for downstream analyses. We determined the best partition scheme using the Akaike information criterion (AIC) implemented in PartitionFinder 2.1.1 (Lanfear et al. 2017) and used the GTR + G + I model for all partitions. Then we constructed the ML tree from the concatenated supermatrix of all 13 genes in IQ-TREE 2.1.2 using the ultrafast bootstrap approximations with 1,000 replicates.

Dating Analyses
Beside species' relationships we were interested in estimating the divergence time in open-habitat chats. Because there are no appropriate fossils for calibration, we first ran BEAST 2.6.6 (Bouckaert et al. 2019) for 13 mitochondrial protein-coding genes to estimate a time-calibrated mitochondrial phylogeny. We included the mitochondrial genome sequence of S. maurus (GenBank accession number: MN356403.1) as an outgroup in these analyses. Substitution models were inferred during the MCMS analyses with bModelTest (Bouckaert and Drummond 2017) implemented as a package in BEAST 2.6.6. Published substitution rates for each mitochondrial gene (Lerner et al. 2011) were implemented as means of the clock rates in real space of lognormal distribution with standard deviations of 0.005. We defined a Yule speciation process for the tree prior and an uncorrelated lognormal relaxed clock model. Three independent MCMC chains were run for 50 million generations, each with sampling every 5,000 generations. Effective sample sizes (ESS) for all parameters and appropriate numbers of burn-in generations were checked with Tracer 1.5 (Rambaut and Drummond 2009). The three independent runs were combined using LogCombiner 2.6.6 (Bouckaert et al. 2019). We used TreeAnnotator 2.6.6 (Bouckaert et al. 2019) to calculate a maximum clade credibility tree and the 95% HPD distributions of each estimated node.
We then used the divergence time of the sub-Saharan clade from wheatears estimated from mitochondrial data as time constraint in dating analyses based on nuclear data using RelTime-ML implemented in MEGA 11 (Tamura et al. 2021). For this analysis, we provided the topology with branch length estimated in IQ-TREE based on concatenated BUSCO data retained after the most stringent filtering (ii, DP = 5, PW = 50%, MD = 15%), along with high-confidence BUSCO alignments. The latter consisted of BUSCO data filtered for DP = 5, MD = 5% and length of each BUSCO alignments longer than 1 kb. We used the same filtering to get the 10 kb non-overlapping windows across the genome and used the concatenated tree retained after most stringent filtering (ii, DP = 5, PW = 50%, MD = 15%) to repeat the analyses based on loci across the genome. To ensure that the differences in divergence times between mitochondrial and nuclear data were not due to the different dating approaches, we re-estimated the mitochondrial divergence times in RelTime-ML using the same approach as for the nuclear datasets.

Inference of the Levels of Gene Tree Variation
To investigate gene tree heterogeneity across the genome, we used gene trees inferred from less stringent filtering criteria (i, DP = 1, PW = 50%, MD = 15%) as described above. To infer how many gene trees reflect the species tree topology, we used the script 'findCommonTrees.py' (Edelman et al. 2019). To characterize the levels of gene tree heterogeneity across open-habitat chats, we compared the gene trees with the species tree. Specifically, we estimated "ICA" and the "gCF". ICA quantifies the amount of gene tree MBE heterogeneity for each internode of the species tree by calculating the number of all most prevalent conflicting bipartitions. It takes values ranging from −1 to 1, with values around zero indicating strong conflict; values toward 1 indicate robust concordance of gene trees with the species tree in the bipartition of interest; and negative values indicate discordance between the bipartition of interest and one or more bipartitions with a higher frequency (Salichos et al. 2014). While ICA thus represents the degree of conflict on each node of a species tree, gCF better reflects the gene tree heterogeneity around each branch, and is the percentage of gene trees supporting the two alternative topologies for each branch (Minh et al. 2020a). We estimated ICA and gCF with PhyParts 0.0.1 (Smith et al. 2015) and IQ-TREE 2.1.2, respectively.

Tests of an ILS Model
Next, we were interested in understanding whether ILS can sufficiently explain the level of gene tree heterogeneity observed at the level of the whole species tree. To this end, we applied the TICR test (Stenz et al. 2015) implemented in the Phylolm R package. This test evaluates whether the multispecies coalescent adequately explains gene tree heterogeneity across the species tree with no hybridization edges. TICR requires posterior distributions of gene tree topologies inferred through Bayesian inference of gene trees. Therefore, we first estimated posterior distributions of individual gene trees with MrBayes 3.2.7 (Ronquist et al. 2012). MrBayes analyses ran using three independent runs of 20 million generations each, sampling every 20,000th generation using a GTR + I + G model. We estimated the length of burn-in using Tracer 1.5 (Rambaut and Drummond 2009) to ensure that our sampling of the posterior distribution had reached sufficient ESS (ESS > 200) for parameter estimation. We then ran BUCKy (Ané et al. 2007;Larget et al. 2010) using the posterior distribution of gene trees after discarding 25% as burn-in to estimate the concordance factors (CFs) for the three possible splits of all quartets. The inferred CF values were then tested against those expected under a coalescent model that takes ILS but not hybridization into account (χ 2 test).
We then tested for each branch in the species tree whether the gene tree heterogeneity reflected in gCF can be sufficiently explained by a model incorporating ILS alone. Under ILS alone-assuming sorting of variation occurs by random genetic drift-proportions of alternative gene trees for a rooted triplet are expected to be approximately equal Hibbins and Hahn 2022), and the concordant tree topology (the topology in agreement with the species tree) should be at least as frequent as the two discordant topologies Hibbins and Hahn 2022). In contrast, introgression between non-sister taxa results in asymmetric proportions of gene trees in the rooted triplet (Green et al. 2010;Durand et al. 2012). Therefore, we performed a χ 2 tests comparing the number of gene trees supporting the two discordant topologies. Under ILS, these two alternative topologies are expected to be equally frequent among gene trees (He et al. 2020). For all these analyses we accounted for uncertainty in gene tree topologies by collapsing branches with bootstrap support <80%.

Inferring Footprints of Introgression
To infer footprints of introgression across the entire species tree, we estimated Patterson's D (Durand et al. 2011) and related statistics in Dsuite (Malinsky et al. 2021) based on 58,963,109 biallelic SNPs. D and f4 statistics were estimated across all possible combinations of trios in our 38 wheatear taxa. We used Dtrios to calculate the sums of three different patterns (BABA, BBAA, and ABBA) and D and f4 ratio statistics for all 8,437 possible trios. Dsuite uses the standard block-jackknife procedure to assess the significance of the D statistic. Due to the large number of D-statistics comparisons and difficulties disentangling false positives that may arise due to ancient gene flow, we performed the f-branch test (fb) implemented in Dsuite to assign gene flow to specific internal branches on the species tree. Then, we visualized the output using Dsuite's dtools.py script.
We then aimed to obtain further support for the footprints of introgression that were suggested in lugens, picata, and hispanica complex by the above approach based on the D-statistics. To this end, for these three complexes, we estimated phylogenetic networks from ML trees generated from BUSCO using the pseudolikelihood (InferNetwork_ MPL) (Yu and Nakhleh 2015) and likelihood (CalGTProb) (Yu et al. 2014) approaches implemented in phyloNet 3.6.9 (Than et al. 2008). Due to the high computational demands, analyses were run for each of the clades containing signals of introgression in earlier analyses separately, namely for the lugens, picata, and hispanica complexes. Furthermore, we only included BUSCO loci that had data available for all taxa of the respective complex. Outgroup species for each complex were selected based on the species tree. Analyses included 7,323 rooted gene trees for the lugens complex, 7,310 rooted gene trees for the picata complex, and 7,335 rooted gene trees for the hispanica complex. For each complex, we allowed for one to five reticulation events, with the starting tree corresponding to the species tree topology (-s), 0.9 bootstrap threshold for gene trees (-b), and 1,000 iterations (-x). To ensure convergence, the network searches were repeated 10 times. Then we estimated the likelihood by fixing the topology of the focal clade for the species tree (without any reticulation) and for each of the five networks (with different numbers of introgression edges) and calculated their likelihood scores. We determined the optimal network by calculating the BIC from the ML scores, the number of gene trees, the number of branch length being estimated, plus the number of admixture edges in each model (supplementary Table S3, Supplementary Material online). We used the browserbased tree viewer IcyTree (Vaughan 2017) to visualize the estimated networks.

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.