The Mitogenome Relationships and Phylogeography of Barn Swallows (Hirundo rustica)

Abstract The barn swallow (Hirundo rustica) poses a number of fascinating scientific questions, including the taxonomic status of postulated subspecies. Here, we obtained and assessed the sequence variation of 411 complete mitogenomes, mainly from the European H. r. rustica, but other subspecies as well. In almost every case, we observed subspecies-specific haplogroups, which we employed together with estimated radiation times to postulate a model for the geographical and temporal worldwide spread of the species. The female barn swallow carrying the Hirundo rustica ancestral mitogenome left Africa (or its vicinity) around 280 thousand years ago (kya), and her descendants expanded first into Eurasia and then, at least 51 kya, into the Americas, from where a relatively recent (<20 kya) back migration to Asia took place. The exception to the haplogroup subspecies specificity is represented by the sedentary Levantine H. r. transitiva that extensively shares haplogroup A with the migratory European H. r. rustica and, to a lesser extent, haplogroup B with the Egyptian H. r. savignii. Our data indicate that rustica and transitiva most likely derive from a sedentary Levantine population source that split at the end of the Younger Dryas (YD) (11.7 kya). Since then, however, transitiva received genetic inputs from and admixed with both the closely related rustica and the adjacent savignii. Demographic analyses confirm this species’ strong link with climate fluctuations and human activities making it an excellent indicator for monitoring and assessing the impact of current global changes on wildlife.


Introduction
The barn swallow (Hirundo rustica) is one of the most widely distributed bird species (Turner and Rose 2010), possibly due to the switch from natural nesting sites, especially caves, to nesting in human-made structures (Zink et al. 2006). This commensal and iconic species for numerous human groups and cultures is portrayed in art, myths, legends, and poetry for millennia (Green 1988) and comprises at least six subspecies, all with breeding ranges in the Holarctic (but see Areta et al. 2021). The subspecies differ in several morphometric characteristics, such as body size, length of outer tail streamers, ventral coloration, and extent of the dark breast band (Turner 2006). The subspecies include H. r. rustica (Europe, North Africa and Western Asia), H. r. savignii (Egypt), H. r. transitiva (Israel, Lebanon, Jordan and Syria), H. r. tytleri (southern-central Siberia, Mongolia), H. r. gutturalis (central-eastern China, Japan), and H. r. erythrogaster (North America). Additional subspecies such as H. r. saturata and H. r. mandshurica have been postulated in north-eastern Asia, but their distinct subspecies status relative to the other Asian subspecies is debated (Cheng 1987;Brown and Brown, 1999;Dickinson and Dekker, 2001;Dickinson et al. 2002;Turner 2006;Liu et al. 2020). While the Hirundo rustica species complex is not endangered, local populations or even subspecies show declines due to specific threats, mostly related to agricultural intensification (Ambrosini et al. 2012;Møller 2019). Most subspecies are migratory, and their wintering grounds cover much of the southern hemisphere as far south as central Argentina, the Cape province of South Africa, and northern Australia (Turner 2006;Hobson et al. 2015;Liechti et al. 2015;Winkler et al. 2017). Adult swallows are highly philopatric (Møller 1994), whereas natal dispersal is relatively large, with some individuals, especially females, dispersing up to several hundreds of kilometers from their natal site (Turner 2006;Balbontín et al. 2009;Scandolara et al. 2014). However, H. r. savignii and H. r. transitiva are sedentary throughout the year (Shirihai et al. 1996;Turner 2006;Turner and Rose 2010), or make short-distance movements during the non-breeding period (Kiat, unpublished data).
The earliest study of barn swallow nuclear DNA variation (MUSK gene) did not detect a genetic structure within the species, suggesting a rather recent subspecies differentiation (Zink et al. 2006). More recent and extensive surveys of microsatellite and double digest Restriction-site Associated DNA (ddRAD) sequence data in H. r. rustica revealed a lack of population structure among breeding populations from Sweden, Germany, and Switzerland with no evidence of genomic selection between phenotypic migratory types (Santure et al. 2010;von Rönn et al. 2016). In contrast, genotyping of over 9,000 Single-Nucleotide Polymorphisms (SNPs) in 350 barn swallows from four subspecies revealed genome-wide clustering that generally corresponds with the subspecies, a certain level of differentiation of the UK population (H. r. rustica) from eastern European and Turkish populations of the same subspecies and genomic covariance of the latter H. r. rustica populations with non-migratory H. r. transitiva specimens from Israel . With a similar approach, molecular evidence of hybridization between subspecies was also obtained (Scordato et al. 2017;2020).
In the last few years, whole genome sequencing data have been reported for a few subspecies (H. r. erythrogaster, H. r. savignii) Smith et al. 2018), including the first reference genome draft (H. r. rustica) (Formenti et al. 2019). Recently, in the framework of the Vertebrate Genomes Project, an effort to generate complete and accurate genome assemblies for all vertebrate species, a new reference genome for H. r. rustica as well as the first pangenome for the species was released. This allowed the assessment of the extent of conservation and acceleration in the barn swallow genome and the identification of a catalogue of genetic markers and candidate genomic regions under selection (G. Formenti, data not shown).
So far, however, most genetic studies concerning the relationships between barn swallow subspecies have focused on the maternally transmitted and fast-evolving mitochondrial DNA (mtDNA), particularly on the sequence variation of single mitochondrial genes, such as ND2 and CYB (Sheldon et al. 2005;Zink et al. 2006;Dor et al. 2010Dor et al. , 2012Malaitad et al. 2016). They confirmed that the barn swallow species complex is monophyletic, and revealed that the different subspecies cluster into two major phylogenetic branches, which might have diverged approximately 100 thousand years ago (kya) and geographically correspond to Europe-Middle East and Asia-America (Zink et al. 2006;Dor et al. 2010), thus substantially predating human agriculture and the new nesting opportunities provided by human settlements. Moreover, the close relationship between one of the Asian subspecies (H. r. tytleri) and the American one (H. r. erythrogaster) has raised the possibility of a secondary dispersal event, possibly about 27 kya, from North America back into Asia (Zink et al. 2006). Finally, the potential lack of differentiation between the migratory H. r. rustica and the sedentary H. r. transitiva was also observed with the fast-evolving mtDNA (Dor et al. 2010), suggesting intermingling between the two subspecies.
Despite the valuable genetic insights provided by these studies, the assessment of only a rather short segment of the barn swallow mtDNA limits their phylogenetic resolution and the understanding of this species' origin and spread. A finer phylogenetic resolution can be achieved by sequencing the entire mitogenome, an approach that has been employed for humans and many other animal species (Achilli et al. 2008(Achilli et al. , 2012Behar et al. 2012;Lombardo et al. · https://doi.org/10.1093/molbev/msac113 MBE Miao et al. 2013Morin et al. 2015;Battaglia et al. 2016;Barth et al. 2017;Peng et al. 2018;Cole et al. 2019;de Manuel et al. 2020;Niedziałkowska et al. 2021) and recently pursued also in H. rustica . Here, we exploited next generation sequencing (NGS) to obtain 411 complete mitogenomes, mainly from the European H. r. rustica, but also from other subspecies. Phylogenetic and Bayesian analyses allowed us to 1) obtain a highresolution mitogenome phylogeny of the species, 2) better define the matrilineal relationships and links between subspecies and their divergence times, and 3) assess demography through time.

Results and Discussion
Organization of the Barn Swallow Mitogenome Our first complete mitogenome was obtained from a H. r. rustica specimen (no. 20) from Italy (supplementary table S1, Supplementary Material online). This mitogenome (MZ905359), employed as H. r. rustica Reference Sequence (HrrRS), was Sanger sequenced together with four additional H. r. rustica mitogenomes from Italy (nos. 1,35,136,and 302). The mitogenome is 18.143 bps in length and harbors 37 genes: 13 proteincoding, 22 tRNA, and two rRNA genes, as well as two noncoding regions, CR1 and CR2, following the GO-II gene order (Mackiewicz et al. 2019;Urantówka et al. 2020 -14,859, nps 16,068-16,740, nps 18,075-18,143). On average, 32.8 + 1.0 nucleotide differences were found between any two coding-region sequences. The average π value for the 411 entire mitogenomes is 0.226% (+0.018%) with the highest variability in the two control regions (Supplementary fig. S3, Supplementary Material online). A total of 1102 synonymous and 156 non-synonymous mutations were identified in the 13 protein-coding genes (PCGs) (supplementary fig. S4, Supplementary Material online). As expected, all loci harbor more synonymous than non-synonymous mutations indicating the action of purifying selection (Stewart et al. 2008).

The Phylogeography of Barn Swallow Mitogenomes and Haplogroup Ages
Phylogenetic analyses reveal that all 411 Hirundo rustica mitogenomes cluster into four main branches that we named haplogroups A, B, C, and D ( fig. 1; supplementary fig. S5, Supplementary Material online). These mitogenomes derive from a common female ancestor that harbored the H. rustica ancestral mitogenome (HrAM). Consistent with previous results, the four haplogroups are included in two primary branches (Zink et al. 2006;Dor et al. 2010) resulting from the first split in the phylogeny. One of the branches includes haplogroups A and B, and the other encompasses haplogroups C and D. We thus named them AB and CD, respectively. As previously noted (Dor et al. 2010), this division is supported by a plumage trait, the dark breast band, which is broad and complete in the subspecies clustering within the AB branch (H. r. rustica, H. r. transitiva, H. r. savignii), and narrow or incomplete in those with CD mitogenomes (H. r. gutturalis, H. r. erythrogaster, H. r. tytleri).
For all nodes in the phylogeny and the derived haplogroups and sub-haplogroups, we obtained age estimates both with maximum likelihood (ML) and Bayesian approaches. The estimates obtained with the two methods are very similar (supplementary table S3, Supplementary Material online). Thus, for brevity we report here only the Bayesian ages.
According to our data, the female barn swallow carrying the HrAM lived 276.9 + 24.3 kya, an almost three-fold age increase relative to earlier estimates (Zink et al. 2006;Dor et al. 2010). A result of this type was not unexpected. Indeed, by improving the molecular and phylogenetic resolution of mtDNA to the level of entire (or almost entire) mitogenomes, important age changes for the most recent common female ancestor were reported in different species (Achilli et al. 2012), including humans (Torroni et al. 2006;Behar et al. 2012).
Of the four main haplogroups, haplogroup A is by far the best represented (n = 388) in our sample (figs. 1 and 2). It began to radiate 57.1 + 6.4 kya and comprises all mitogenomes from Europe and Algeria (H. r. rustica) as well as 46 of the 50 H. r. transitiva mitogenomes from Israel and one from H. r. gutturalis (no. 258) sampled in China (Nujiang Prefecture, Yunnan Province). Three sub-haplogroups originated from its initial split, the largely predominant A1 and the rare A2 and A3, with the former harboring two very common sub-branches detected in all European locations as well as in Algeria ( fig. 1), with mitogenomes from each location generally scattered and intermingled with those from the other European locations. Furthermore, we observed that the 46 mitogenomes from H. r. transitiva belonging to A (black dots in fig. 2) are also scattered among the H. r. rustica mitogenomes.
These observations tend to confirm the rather poor genetic differentiation of H. r. rustica populations at a high level of molecular and phylogenetic resolution and of H. . Three possible explanations can be envisioned for the extensive mtDNA overlap between rustica and transitiva. First, the two adjacent subspecies derive from the same ancestral source in which A was the only (or predominant) haplogroup and was already differentiated into sub-haplogroups at the time of the initial rusticatransitiva split. Alternatively, rustica and transitiva maternal lineages underwent gene flow, possibly continuously over time. Finally, rustica and transitiva derive from the same ancestral population, but also admix; a process that is still going on, despite the (growing) differences in migratory behavior, moult strategy (Kiat et al. 2019) and morphology, when migrant rustica individuals pass through the transitiva breeding areas at the main time of their breeding season.
Nevertheless, because of the abundance of haplogroup A mitogenomes in our collection, we also detected a certain amount of genetic differentiation among populations. Indeed, a number of subclades harbor rather localized geographic distributions and appear to be population-specific. These subclades are not uncommon and sometimes they are relatively ancient: four were found in Spain (2-3 haplotypes each) with the oldest (A1a1b3b) dating 11.4 ky; 20 in Italy (2-5 haplotypes each) with the oldest (A1a2g) dating 11.6 ky; one (A1a2d1, 2 haplotypes) in Switzerland dating 8.0 ky; two (2 haplotypes each) in Ukraine with the oldest subclade (A1a2e1a2a2) dating 7.6 ky; and one (3 haplotypes) in Poland (A1a2e1a1a2a1b) dating 6.0 ky. This feature is not exclusive to H. r. rustica, but it characterizes also H. r. transitiva: two subclades (2-3 haplotypes) with the oldest (A1a2e1a3) dating 11.6 ky (supplementary table S3, Supplementary Material online).
With a lower degree of specificity, some geographic clustering characterizes also a few more common and sometimes older branches. For example, sub-haplogroups A1a1a1a ( 19.5 ky), A1a2e1a1a5 ( 11.5 ky), and At the other extreme, we also observed a couple of instances in which specimens sampled at very distant locations harbored the same haplotype (no. 177 from Denmark and no. 178 from Italy; no. 200 from Poland and no. 201 from Italy). They suggest that long-distance dispersal between populations occurs, in agreement with observations concerning the behavioral flexibility and adaptability of the species (Mead 2002;Turner 2006;Romano et al. 2017;Teglhøj 2020).
While limited by the relatively small size of our population samples and restricted to haplogroup A mitogenomes, the complete or partial clustering of some sub-haplogroups of A would fit with the generally reported short-distance dispersal of offspring from natal to breeding sites, although this feature is less extreme in The Barn Swallow Mitogenome Relationships · https://doi.org/10.1093/molbev/msac113 MBE females compared to males (Balbontín et al. 2009;Scandolara et al. 2014), thus less detectable in terms of mtDNA. On the other hand, the general overall sharing of the haplogroup A branches among H. r. rustica populations and between H. r. rustica and H. r. transitiva can be at least in part explained when considering that even shortdistance dispersal can lead to extensive and long-distance gene flow over the course of generations. Moreover, if the instances of long-distance dispersal from natal to breeding sites are confirmed, even at a low percentage, they would further speed up the loss of genetic structure in European barn swallow populations.
As for the remaining three major haplogroups, B, C, and D ( fig. 1; supplementary fig. S5, Supplementary Material online), the former encompasses only the four H. r. transitiva mitogenomes already mentioned above and is dated at 18.9 + 3.9 kya. It shares an ancestral node (AB; 115.6 + 13.3 kya) with the sister haplogroup A, which is approximately 40 ky younger than the CD node (156.4 + 18.0 kya) from which C and D derive.
Haplogroup C includes only H. r. gutturalis samples, four of the five sampled in China and is dated at 31.1 + 5.7 kya, while the fifth is a member of haplogroup A. The detection of haplogroup A in the gutturalis individual might indicate past or present admixture with rustica, especially when considering that it was collected in the westernmost (Nujiang Prefecture) of the sampling locations in China, the closest to the breeding range of H. r. rustica.
Finally, haplogroup D, dated at 51.1 + 7.9 kya, characterizes all 15 H. r. erythrogaster specimens from North America (USA, Nebraska), in either one or the other of its sub-haplogroups (D1 and D2). Haplogroup D age provides a minimum time for the spread of H. rustica from Asia to the Americas and indicates that North America was most likely the nesting ground of the ancestors of H. r. erythrogaster since at least 51 ky ago.

Subspecies Specificity of the Major Haplogroups
To gain a broader view of the haplogroup distribution in the different subspecies, including some not included in our study, we compared the combined ND2 and CYB gene variation of our mitogenomes with that reported in 119 barn swallow mtDNAs available from previous studies (Dor et al. 2010(Dor et al. , 2012Liu et al. 2015, direct submission;Keepers et al. 2016, direct submission;Smith et al. 2018;Feng et al. 2020;Carter et al. 2020) (fig. 3).
The phylogeny of fig. 3 confirms that haplogroup A is typical of both H. r. rustica and H. r. transitiva, with H. r. transitiva mitogenomes scattered among those of H. r. rustica in virtually all sub-haplogroups of A. Moreover, it reveals that the four haplogroup B mitogenomes observed in H. r. transitiva form a clade that is defined by the transitions at nps 14,235 and 14,243 in CYB. This branch encompasses also an additional H. r. transitiva specimen (Dor et al. 2012) and nine of eleven H. r. savignii (Dor et al. 2010;Smith et al. 2018). This high frequency of haplogroup B in H. r. savignii indicates that haplogroup B is typical of the sedentary Egyptian FIG. 3. MP phylogeny of Hirundo rustica ND2 and CYB gene sequences. This tree includes 155 barn swallows from different subspecies for which both ND2 and CYB gene sequences were available. A total of 119 are from the literature and the remaining were selected from our mitogenome dataset as follows: the first five mitogenomes that we obtained from H. r. rustica (nos. 1, 20, 35, 136, 302)  and CYB gene ranges were not included, as well as sequences that harbored gaps at informative nucleotides. The two mtDNAs forming the rather long sub-branch (6 mutations) within A1a2, one from H. r. savignii, and one from H. r. transitiva   . 3) appears to indicate that gene flow of maternal lineages is not restricted to transitiva and rustica, but it also occurs between transitiva and savignii, and possibly also between rustica and savignii. These and other alternative scenarios cannot be distinguished without nuclear genome data.
As for haplogroup C (n = 10), the phylogeny confirms instead its complete subspecies specificity. It includes only H. r. gutturalis specimens: the four from China of this study (nos. 393-396), one from Japan, three from Russia, one from Mongolia (Dor et al. 2010), and one of an undefined Asian origin (Liu et al. 2015, direct submission).
A more complex situation concerns haplogroup D. The phylogeny of fig. 3 supports the exclusive affiliation of all H. r. erythrogaster specimens (n = 30) to haplogroup D: the 15 from Nebraska of this study (nos. 397-411), additional 14 from the USA (Dor et al. 2010;Keepers et al. 2016, direct submission;Smith et al. 2018), and one from Argentina (Dor et al. 2010). As already shown by the phylogeny of entire mitogenomes, they all belong to either subhaplogroups D1 or D2, whose ages are estimated at 19.7 + 3.9 kya and 20.6 + 3.4 kya, respectively (supplementary table S3, Supplementary Material online).
However, ND2 and CYB sequences are available also for three H. r. tytleri specimens (Dor et al. 2010;Carter et al. 2020), an Asian subspecies that was not included in our survey of entire mitogenomes. The three H. r. tytleri partial mtDNA sequences appear to form a private subhaplogroup within D1, which we named D1 tytleri (fig. 3). It is a sister branch to the D1 branches of H. r. erythrogaster, thus supporting the previously proposed close relationship between H. r. tytleri and H. r. erythrogaster as well as an American origin of the ancestral mitogenomes of H. r. tytleri (Zink et al. 2006;Dor et al. 2010Dor et al. , 2012. Moreover, taking into consideration that D1 arose approximately 20 kya, we have now a maximum age boundary for the back migration from North America: the ancestors of H. r. tytleri did not move to Asia earlier than 20,000 years ago. As for the minimum boundary for this event, it will be defined only by sequencing H. r. tytleri mitogenomes.

The Demography of Barn Swallows Over Time
The Bayesian skyline plot of fig. 4 shows changes in the effective population size over time for haplogroup A, which is typical of western Eurasia and by far the most represented in our survey encompassing all H. r. rustica mitogenomes and 92% of those from H. r. transitiva. It underwent two population growth events. The first, very sharp increase occurred 30 kya, prior to the last glacial maximum (LGM). This was followed by a plateau throughout the LGM and up to the YD (12.9-11.7 kya), when the second increase began, lasting until the end of the Early Holocene Optimum (EHO) 6 kya (Baker et al. 2017).
Population expansion has been documented during postglacial periods of many other bird populations, in parallel with and thanks to their northward range expansion (Milá et al. 2006;Zink and Gardner 2017). Migratory behavior might have both resulted from and played a role in this population expansion. Glacial cycles act as switches for the evolutionarily labile migratory behavior. Lacking a suitable habitat, species would retreat to their wintering ranges during glacial maxima and revert back to longdistance migration during interglacial periods (Zink and Gardner 2017). Our results on haplogroup A mitogenomes are consistent with H. r. rustica ancestors expanding northward from the eastern Mediterranean basin, which might have acted as a refugium during the LGM and the YD. H. r. transitiva would then have mainly derived from specimens/populations that maintained their sedentary behavior, while H. r. rustica would descend from those that differentiated and re-acquired a long-distance migratory behavior while expanding northward at the end ( 11.7 kya) of the YD. These climatic changes, and possibly the increase in energy consumption associated with the re-acquisition of the long-distance migratory behavior, appear to strongly affect the extent to which selection modulates the evolution of mitochondrial PCGs (supplementary fig. S4, Supplementary Material online). Taking the end of the YD ( 11.7 kya) as a cut-off in the phylogeny, it is evident that the ratio of divergence at nonsynonymous and synonymous sites (dN/dS) is much higher when considering only variants accumulated after the YD (0.19 vs. 0.08, Fisher exact test P-value = 0.0001). This is particularly evident for genes encoding subunits of OXPHOS complexes I and V, thus supporting scenarios linking heat production and avian flight ability with mitogenome variation (Shen et al. 2009;Zhong et al. 2020).
Such a scenario would explain the sharing of haplogroup A by rustica and transitiva and many of its subbranches and the "intermingling" of their haplotypes within these clades ( fig. 2). However, it would also explain the detection of A sub-haplogroups within localized populations in Europe (H. r. rustica). The oldest are in the Mediterranean area, A1a2g in Italy and A1a1b3b in Spain ( fig. 2), with ages of 11.6 ky and 11.4 ky, respectively. Thus, they arose shortly after the end of the YD. In contrast, the population-specific sub-haplogroups detected further north in Europe arose later with a south to north time profile: A1a2d1 in Switzerland ( 8.0 ky), A1a2e1a2a2 in Ukraine ( 7.6 ky) and A1a2e1a1a2a1b in Poland ( 6.0 ky). Their ages suggest that they arose in situ when these different European regions became suitable as nesting grounds. There is also a Levantine counterpart to the European-specific sub-haplogroups. This is represented by A1a2e1a3, the oldest transitiva-specific sub-haplogroup, which is dated at 11.6 kya, again immediately after the YD, underscoring its role in the differentiation of rustica and transitiva.
The chronological gradient from south to north suggests a history of northward expansion from the Near East or the Mediterranean basin at large. This is supported The Barn Swallow Mitogenome Relationships · https://doi.org/10.1093/molbev/msac113 MBE by the negative correlation between nucleotide diversity and latitude that we observed in H. r. rustica and H. r. transitiva populations, which encompass most of our dataset and were more densely sampled, when they were grouped into the following macro-groups: South (Algeria, Spain, South Italy and Israel), Center (North Italy and Switzerland), and North (Poland, Ukraine and Denmark) (supplementary fig. S7, Supplementary Material online). For haplogroup A1, a correlation (P-value ,0.05) close to 1 was detected, thus confirming the overall reduction of mitogenome variation from South to North, as expected in models envisioning a more recent origin of central and northern European populations.
Recolonization of Europe from refugia following glacial retreat has been documented in a variety of species (Hewitt 1999(Hewitt , 2000Hansson et al. 2008). The pattern of ever younger population-specific sub-haplogroups suggests a post-glacial expansion without major loss of genetic diversity and supports a relatively slow northward spreading -the so-called "phalanx-model" of colonization, as opposed to a "pioneer model" (Nichols and Hewitt 1994;Excoffier et al. 2009). Such a slow expansion is a feature of species with short dispersal, strict requirements for habitat, and/or dependency on other species (Hewitt 2004). Barn swallows preferentially nest in human structures and are closely associated with human agriculture. Their association with slow-moving agriculturalists might explain the age gradient from south to north Europe that we observed for the population-specific subclades. The first evidence of human built structures dates to around 12-15 kya (Potts 2012), and the expansion of agriculture from the Middle East might have begun as early as 12-13 kya (Salamini et al. 2002;Arranz-Otaegui et al. 2016). Thus, the second population increase is compatible with a role for rising temperatures at the beginning of the Holocene 12 kya (Taberlet et al. 1998) as well as for the association with human settlements (Smith et al. 2018).

On the Origin of Barn Swallows
Previous comparisons of mtDNA variation in barn swallows, along with that seen in their closest relatives (Dor et al. 2010;Carter et al. 2020), have suggested that the ancestor of all Hirundo species most likely originated in Africa, as most of them have African distributions, including H. aethiopica and H. angolensis that are the closest relatives to H. rustica .
We further assessed this issue by adding the mitogenomes from other Hirundo species to the phylogeny of our barn swallows.

Conclusions
Over the course of years, numerous studies have shown that the information contained in mtDNA is phylogenetically best exploited when the sequence variation of the entire (or almost entire) mitogenome is assessed and the sequencing survey is carried out on numerous specimens sampled throughout the species distribution range. Here we employed this "magnifying glass" approach to reconstruct the genetic history of an iconic species, the barn swallow. Mitogenome data allowed us to build a detailed phylogeny for the species, to determine its coalescence time as well as the ages of its haplogroups, and to better define the matrilineal relationships between subspecies.
According to our data, the female barn swallow carrying HrAM lived 276.9 + 24.3 kya, which is much earlier than previously thought (Zink et al. 2006;Dor et al. 2010;Scordato et al. 2017). Considering that, due to its reduced effective population size, mtDNA is much more prone to lineage loss and founder events than its autosomal counterpart, an almost 300 ky of mtDNA divergence implies an even older divergence time for most nuclear genes. This allows for plenty of polymorphism in the species complex and, probably, a rather extensive differentiation of its nuclear genome, thus explaining the observed flexibility and adaptability.
In most cases, we observed complete subspecies and geographic specificity of mitogenome haplogroups, arising at different times and different places, which we employed together with estimated radiation times to postulate an overall model for the geographical and temporal spread of barn swallows ( fig. 5). According to the mtDNA data, this species left Africa (or its vicinity) almost 300 kya, expanded first into Eurasia and then, at least 51 kya, into the Americas, from where a relatively recent (,20 kya) back migration to Asia took place. Subspecies differentiation occurred in parallel to the species dispersal, usually much earlier than previously suggested (Smith et al. 2018). The Barn Swallow Mitogenome Relationships · https://doi.org/10.1093/molbev/msac113 MBE The notable exception to the haplogroup subspecies specificity is represented by the sedentary Levantine H. r. transitiva that extensively shares haplogroup A with the migratory H. r. rustica and haplogroup B, to a lesser extent, with the Egyptian H. r. savignii. We propose that rustica and transitiva derive from the same population source, which was located in the Levant and had adapted to sedentarism during the LGM. Our data indicate that the two subspecies began to split rather recently, shortly after 11.7 kya at the end of the YD. H. r. rustica would descend from individuals that re-acquired the long-distance migratory behavior while expanding northward to regions that were then becoming suitable as nesting grounds. In contrast, H. r. transitiva would derive from the Levantine component that remained in situ and maintained its sedentary behavior. Since then, however, transitiva did not remain genetically isolated, receiving genetic inputs and admixing with migratory rustica populations, as shown by the absence of significant correlations between genetic and geographic distances when assessing the shared haplogroups (supplementary fig. S9, Supplementary Material online), as well as the adjacent savignii.
This scenario, which is compatible with the presence of some haplogroup B mitogenomes in transitiva as well as its behavioral phenotype, is also supported by field and phenotypical observations (Ambrosini et al. 2009;Reiner Brodetzki et al. 2021), including the expression in H. r. transitiva of both elongated tail streamers and dark ventral coloration. The first feature is shared with rustica, but not its function as a sexual signal, and the second is shared with savignii (Vortman et al. 2011). Genetic admixture is also a plausible explanation for the detection of haplogroup A in one of the five H. r. gutturalis specimens from China.
Finally, Hirundo rustica has been strongly affected by climatic changes in the past. At the beginning of the Holocene, its population size began to grow extensively in parallel with temperature increases, and this growth was probably facilitated by the concomitant spread of agriculture and human built structures. It is also evident that climatic changes occurring during the LGM and the YD, and the possible resulting changes in migratory behavior, significantly affected the extent to which selection modulates gene sequence evolution, to a degree that is comparable with that reported in Neolithic animal domestication (Colli et al. 2015). The strong link of this widespread species with climate fluctuations and human activities makes it an excellent indicator for monitoring and assessing the impact of current global changes on wildlife.

Samples Analysed for Mitogenome Variation
We completely sequenced a total of 410 barn swallow mitogenomes. An additional H. r. rustica sample from Italy (no. 151) was extrapolated from We extracted DNA either from tissues (muscle or liver) of specimens found dead (nos. 1, 20, 35, 58, 136, 302) or blood (all remaining samples). We obtained blood samples, under license according to national guidelines and legislation, by capturing individuals with mist-nets at the barns and cowsheds where barn swallows spend the nights during breeding. Venipuncture of the brachial vein, a minimally invasive technique (Arctander 1988), was performed to draw blood. Blood samples were collected and stored either in heparinized glass capillaries or dehydrated in ethanol. Blood samples from Spain and Nebraska arrived in Sodium-EDTA buffer. Detailed information on the barn swallow samples and their mitogenomes is provided in supplementary table S1, Supplementary Material online.
At the time when the analyses were performed, there were nine entire H. rustica mitogenomes in GenBank. However, none of these could be included in our analyses of entire mitogenomes due to the following sequence issues: KX398931 (H. r. erythrogaster, Keepers et al. 2016 Thirouin et al. 2020, direct submission), large insertion in RNR2, many gaps throughout the sequence, two large NUMTs in ND3 and ND5; and KP148840 (H. r. gutturalis, Liu et al. 2015, direct submission), numerous NUMTs throughout the sequence. In addition, we extracted 16 low-coverage mitogenomes (eight from H. r. savignii and eight from H. r. erythrogaster) from the PRJNA323498 BioProject (Smith et al. 2018). They also harbored many gaps. However, most of these samples (n = 22) could be employed in the phylogenetic analyses of ND2 and CYB gene sequences (see below).

DNA Extraction
We obtained genomic DNA extracted from muscle or liver with the ReliaPrep™ (Promega Madison, WI, USA) gDNA Tissue kit, using the standard protocol for animal tissue. We extracted and purified blood samples using phenolchloroform. These samples were prepared by breaking 1-2 cm of the glass capillary containing the blood ( 4 μl) and placing it overnight at 56°C in a 2 ml tube containing: lysis buffer B (400 mM Tris-HCl, pH 8.0; 100 mM EthyleneDiamine Tetra-Acetic acid [EDTA], pH 8.0; 1% Sodium Dodecyl Sulfate [SDS]), 250 µl of TBS buffer (20 mM Tris-HCl, pH 7.5; 150 mM NaCl), and 40 µl of Proteinase K (20 mg/ml). Samples from Spain and Nebraska were instead extracted with magnetic beads on Lombardo et al. · https://doi.org/10.1093/molbev/msac113 MBE a 16 Maxwell® RSC 16 instrument using the dedicated Blood DNA Kit (Promega) and employing the "Blood DNA" protocol. Sample preparation in this case was performed by adding 1-2 μl of blood to 300 μl of lysis buffer and 30 μl Proteinase K and incubating overnight at 56°C. Genomic DNAs were eluted into TE buffer (10 mM Tris-Cl, 1 mM EDTA) or elution buffer (Promega).

NGS Sequencing
We obtained 405 additional barn swallow mitogenomes by NGS sequencing and extrapolated one more from Formenti et al. (2019). We designed a set of three oligonucleotide pairs with similar T m ( 60°C) and length (20 nt) (supplementary table S7, Supplementary Material online) to amplify the entire mtDNA molecule in three overlapping, long range PCR fragments of comparable lengths ( 6400 bps). Each fragment overlapped the next one by about 300-500 bps. PCRs were carried out in 50 µl reaction mix containing 1x GoTaq® Long PCR Master Mix (Promega), 0.3 μM of each primer and 150 ng of DNA template, using the following 2 step PCR thermal profile: 94°C (2 min); 20 cycles at 94°C (30 s), 58°C (30 s), 65°C (7 min), followed by 10 cycles at 94°C (30 s), 55°C (30 s), 65°C (7 min), and a final extension at 72°C (10 min). PCR products were checked by electrophoresis on 1% agarose gels. PCR purification was performed with the membrane method, Presto™ 96 Well PCR Cleanup Kit (Geneaid). Amplicons were quantified with the Quantus™ fluorometer (Promega) using the QuantiFluor® ONE dsDNA system. One nanogram for each of the three amplicons were combined for library preparation. The sequencing library was prepared with the Nextera™ DNA Flex Library Prep, following the manufacturer's protocol steps: tagmentation of input DNA, amplification of tagmented DNA with the addition of premixed dual-indexed adapters (IDT® for Illumina Nextera UD indexes or Nextera™ DNA CD Indexes), and PCR clean-up. Libraries were then checked on a 2% agarose gel, quantified using the Quantus™ fluorometer (Promega), normalized and pooled together. We then run the pooled normalized library on the 4150 TapeStation System (Agilent) and diluted to 4 nM using RSB resuspension buffer. Five microliter of pooled libraries were denatured using 5 μl of freshly prepared NaOH (0.2N) and diluted to the loading concentration of 6 pM (600 μl final volume) using HT1 hybridization buffer. This was finally sequenced on a MiSeq system (Illumina) using paired-end sequencing either with the MiSeq Reagent Kit v2, (2 × 150 or 2 × 250 cycles) or the MiSeq Reagent Kit v2 Nano (2 × 150 cycles). We also NGS sequenced mitogenome no. 20 (HrrRS) and the other four samples already Sanger sequenced (nos. 1, 35, 136 and 302), fully confirming the initial Sanger sequences.

Analysis of Mitogenome Sequence Data
The raw MiSeq sequencer data were output in BCL format, demultiplexed, and converted to FASTQ format with the Illumina® bash package, bcl2fastq2 Conversion Software v2.20, and trimmed with Trim Galore! v 0.6.4 (Krueger 2012) to remove low-quality reads and adapters. We checked the quality of paired end reads by using FastQC v 0.11.9 (Andrews 2010). Files were subsequently converted from FASTQ to BAM by aligning and mapping the reads to the mitogenome no. 20 (HrrRS) using BWA v0.7.17 (Li 2013, direct submission). BAM files were analysed with Geneious 8.1.5 (Biomatters, Kearse et al. 2012). Variant calling was performed by setting the threshold for heteroplasmies at 30% of reads and considering only mutations in the range 30%-70% as heteroplasmic in phylogenetic analyses. Mitogenomes employed in the phylogenetic analyses were completely sequenced with an average depth of 459X. Finally, samples were exported in the standard FASTA format.

Phylogenetic Analyses and Age Estimates of MtDNA Haplogroups
We built a maximum parsimony (MP) tree by manually programming the software mtPhyl v 5.003 (Eltsov and Volodko 2014) for the analysis of barn swallow The Barn Swallow Mitogenome Relationships · https://doi.org/10.1093/molbev/msac113 MBE mitogenomes (the modified.txt files are available on request). By comparing the mtDNA FASTA sequences to the HrrRS, the software allowed for the reconstruction of a coding-region (15,601 bps; nps 1-14,859, nps 16,068-16,740, nps 18,075-18,143) MP tree with detailed information concerning mutations (supplementary fig. S5, Supplementary Material online). We did not consider indels for tree construction. The tree was rooted using the available Hirundo angolensis (NC_050287) and Hirundo aethiopica (NC_050293) reference mitogenomes, and its topology was also confirmed by using the software Molecular Evolutionary Genetics Analysis X (MEGAX) (figs. 1 and 2) (Kumar et al. 2018).
We estimated ML and Bayesian ages of haplogroups and sub-haplogroups by using all barn swallow (n = 411) coding-region sequences (15,601 bps). We performed ML estimations using the BaseML package present in the PAMLX 1.3.1 software (Xu and Yang 2013) assuming the HKY85 (Hasegawa et al. 1985) mutation model with gamma-distributed (32 categories) rates (plus invariant sites) and 17 partitions (13 for protein coding genes, 1 for tRNAs, 1 for each rRNA gene, and 1 for intergenic regions) using the predefined tree obtained by the MP approach. We converted ML mutational distances into years by assuming an estimated split time between Progne (not shown) and Hirundo at 9.34 Mya (95% CI: 5.8-13.2 Mya) (Moyle et al. 2016), thus employing the standard approach that does not include the error on the calibration point.
We performed Bayesian estimations using the software Bayesian Evolutionary Analysis Sampling Trees (BEAST) 2.6.0 (Bouckaert et al. 2019) under the HKY substitution model (gamma-distributed rates plus invariant sites) with a relaxed clock (log normal). We entered as prior the clock value of 2.45 × 10 −8 base substitution per nucleotide per year (or one mutation every 2616 years), derived from the rate calculated in the ML method. The chain length was established at 100,000,000 iterations, with samples drawn every 1,000 Markov chain Monte Carlo steps after a discarded burn-in of 10,000,000 steps (Olivieri et al. 2017). We performed the demographic analysis on the BEAST results using Tracer v1.7.1 (Rambaut et al. 2018) and Excel using a generation time of one year. We constructed Bayesian skyline plots (BSPs) using the output tree file and a stepwise constant.
To assess the subspecies specificity of major haplogroups in a wider sample encompassing also previously published sequences, we built a MP tree based on ND2 (1,017 bps; nps 3980-4996) and CYB (1,058 bps; nps 13696-14753) gene sequences ( fig. 3), for a total of 2075 bps. We aligned sequences with MEGAX and rooted the tree using the H. aethiopica (NC_050293) and H. angolensis (NC_050287) reference mitogenomes.

Mitogenome Diversity and Latitude
To identify a potential correlation between mitogenome diversity (entire mitogenomes) and latitude, we measured both haplotype diversity (HD) and nucleotide diversity (Pi) for the most represented haplogroups (A1 as a whole as well as its major sub-haplogroups A1a1 and A1a2) using DNAsp 6.12.03 (Rozas et al. 2017). Both indices were compared with the average of the latitudes among the samples of each population using the software Tableau 2021.4 (https://www.tableau.com/).

Isolation by Distance
To assess isolation by distance (IBD), genetic distance matrices were created based on PhiSt, which was computed using the R package haplotypes (Aktas 2020), while pairwise geographic distances were calculated with the geodist R package (Padgham and Sumner 2021), applying the geodesic methods provided in (Karney 2013). We tested the correlations between genetic and geographic matrices for the most represented haplogroups (A1 as a whole as well as its major sub-haplogroups A1a1 and A1a2) using a Mantel test and simple linear regression, to also account for possible type II errors (Teske et al. 2018). The Mantel test was performed using the R package ade4 (Dray and Dufour 2007) with 999 permutations. This program tests for significant IBD by comparing the observed correlation with a histogram of simulated correlation categories and their frequency under the assumption of no IBD. Simple linear regression was performed in the stats package (R Core Team 2021). We generated plots in R using the package ggplot2 (Wickham 2016).

Data Availability
The sequence data for the 411 H. rustica mitogenomes are available in GenBank under accession numbers MZ905359, OK539050-OK539458, and OK624420. The raw NGS sequence data (fastq and bam files) are available under the ENA accession number PRJEB51610.