-
PDF
- Split View
-
Views
-
Cite
Cite
Caroline Duffie Judy, Gary R Graves, John E McCormack, Katherine Faust Stryjewski, Robb T Brumfield, Speciation with gene flow in an island endemic hummingbird, PNAS Nexus, Volume 4, Issue 4, April 2025, pgaf095, https://doi.org/10.1093/pnasnexus/pgaf095
- Share Icon Share
Abstract
We examined speciation in streamertail hummingbirds (Trochilus polytmus and Trochilus scitulus), Jamaican endemic taxa that challenge the rule that bird speciation cannot progress in situ on small islands. Our analysis shows that divergent selection acting on male bill color, a sexual ornament that is red in polytmus and black in scitulus, acts as a key reproductive barrier. We conducted a population-level analysis of genomic and phenotypic patterns to determine the traits that contribute the most to speciation despite ongoing gene flow across a narrow hybrid zone. We characterized genomic patterns using 6,451 single-nucleotide polymorphisms and a segment of the mitochondrial control region. Our analyses revealed high diversity within species, and low divergence between them, consistent with a recent speciation event or extensive gene flow following secondary contact. We observed narrow clines in two phenotypic traits and several SNP loci. The cline width for male bill color is only 2.3 km, marking it as one of the narrowest phenotypic clines documented in an avian hybrid zone. The coincidence of estimated cline centers with the Rio Grande Valley suggests that this landscape feature may contribute to hybrid zone stability. However, given that streamertails are highly mobile, it is unlikely that such a narrow river acts as a physical barrier to dispersal. The limited genomic divergence across scanned regions of the genome offers little support for postmating reproductive barriers. Instead, our findings point to strong premating selection acting on bill color as the primary driver of streamertail speciation.
The streamertail hummingbirds of Jamaica represent an intriguing exception to the rule that volant birds cannot speciate in situ on small, oceanic islands. This comprehensive genomic and phenotypic analysis of the exceptionally narrow hybrid zone provides a snapshot of speciation in progress. Evidence suggests that female selection for male bill color is the most likely driver of divergence. Given the high mobility of hummingbirds, the narrow river valley coincident with the hybrid zone is unlikely to be an effective barrier to dispersal and gene flow.
Introduction
Some of the most dramatic examples of avian diversification occur in archipelagos where small population sizes and oceanic barriers to dispersal facilitate divergence and speciation between island populations (1–5). The spectacular radiations of Hawaiian honeycreepers (5) and, most famously, Darwin’s finches (6) are thought to have arisen through multiple instances of allopatric speciation on different islands followed by secondary colonization (7). In contrast, avian examples of in situ speciation within islands smaller than Madagascar (592,800 km2) remain scarce (8, but see 9–11). Opportunities for geographic isolation are limited on smaller islands, and selection pressures associated with environmental gradients are typically insufficient to cause speciation in the face of gene flow (12).
Streamertail hummingbirds (Trochilus polytmus and Trochilus scitulus), endemic to the Caribbean island of Jamaica (10,991 km2), represent a potential exception to the rule that avian speciation cannot progress on small islands (13, 14). The two streamertail hummingbirds are sister to the clade of emerald hummingbirds (Trochilini) that includes the genera Leucolia and Saucerottia (15) from mainland Middle and South America. A time-calibrated molecular phylogeny of all hummingbirds suggests that the Trochilus lineage likely originated in South America, diverging from its sister taxon 5–7 million years ago (16). We interpret the split date as an upper bound to the timing of colonization but cannot determine when within the last 5–7 million years it occurred. Multiple divergence events may have taken place, with some lineages subsequently going extinct. Further in situ diversification within the Trochilus lineage gave rise to two phenotypically distinct populations, the red-billed streamertail (T. polytmus) and the black-billed streamertail (T. scitulus). These taxa have been variously considered as species or subspecies (14).
Streamertails are parapatrically distributed on Jamaica with polytmus, Jamaica’s national bird, widespread in humid forest across most of the island (13, 14, 17, 18), while scitulus is restricted to the extreme east. Both species are extremely abundant in their respective ranges (19). As their common names suggest, the taxa are distinguished by male bill color, which is bright red in polytmus and jet black in scitulus. The red bill of polytmus is also significantly wider than the black bill of polytmus (20). Both taxa are polygynous, and males perform courtship displays that emphasize their bills, emerald gorgets, elongated tail streamers, and bicornuate crests (14).
Streamertail hummingbirds meet at a narrow hybrid zone where their parapatric ranges overlap in the Rio Grande Valley in northeastern Jamaica (13, 14, 17) (Fig. 1). This hybrid zone was first documented by James Bond (17) in the 1950s. Bill color is a reliable marker of hybridization, with hybrid individuals exhibiting bills that are intermediate in both color and width (13, 14, 21). A second contact zone occurs along the southern coast that follows the Morant River to the ocean, but hybrids are unknown from this locality (13) and this region remains poorly explored. Gill et al. (13) mapped the geographical extent of the northern hybrid zone some 20 years after it was described by James Bond. They estimated that it was <16.1 km at its widest point and perhaps as narrow as 4.8 km in some locations. They concluded that the hybrid zone resulted from differentiation during a period of geographic isolation followed by secondary contact facilitated by anthropogenic habitat change in the Rio Grande Valley. More extensive field investigations conducted from 2000 to 2006 (14, 19, 21–23) confirmed that the northern hybrid zone had neither widened nor shifted significantly during the subsequent half century. The minimum distance between populations with wholly red bills and those with wholly black bills was 3.2 km (14). The corresponding cline in bill color across the hybrid zone is, to the best of our knowledge, one of the steepest phenotypic gradients ever documented in an avian hybrid zone.

Streamertail sample localities 1–12 across Jamaica, West Indies. The geographic extent of the hybrid zone (shaded region) lies between the Blue Mountains and the John Crow Mountains. See inset map for westernmost locality 1. Pie charts show the proportion of polytmus, scitulus, and phenotypic hybrids captured at each locality.
In this study, we characterized genomic and phenotypic patterns of introgression across the Trochilus hybrid zone to determine which trait(s), if any, contribute to speciation. We analyzed a total of 186 Trochilus specimens from 11 sample localities that span the hybrid zone plus an additional reference locality in western Jamaica. We generated 6,451 anonymous single-nucleotide polymorphisms (SNPs) using genotyping-by-sequencing (GBS) technology and amplified a segment of the mitochondrial control region (MtCR). We mapped the SNPs to a reference genome for Anna’s hummingbird (Calypte anna) to gain insights into the genomic architecture of divergence between the streamertail relatives. Equilibrium cline models were employed to characterize the cline structure of genomic and phenotypic traits and to identify traits experiencing strong natural selection that opposes gene flow.
Results
Bioinformatics processing
Running the initial UNEAK pipeline yielded 97,432 SNPs. Filtering steps in the UNKEAK pipeline reduced the total number of called SNPs from 97,432 to 32,949. Collapsing reverse complements and performing additional filtering steps reduced the total number of SNPs to a final call set of 6,451. Two individuals (G.R.G. 635683 and 636151) had ambiguous calls (“N”’s) for the majority of SNPs (>90%) and were removed from the dataset. In addition, 14 outliers in the multidimensional scaling and principal component analysis (PCA) plots were pruned because they were potentially biased by sequencing error: 633563, 633565, 633590, 633591, 633600, 633603, 633623, 635683, 635691, 636149, 636157, 636159, 636163, and 63617. The final dataset contains 6,451 high-quality SNPs in 84 scitulus, 46 polytmus, and 15 hybrids, for a total of 145 individuals (Table S1).
Population genomics
The PCA of the GBS dataset (n = 6,451 SNPs) revealed a modest signal for species-level divergence along PC1, which explained 22% of the total variance in 145 sequenced individuals (Fig. 2A). None of the 6,451 SNPs were fixed for alternate alleles between parental populations. The discriminant analysis of principal components (DAPC) showed stronger, though still imperfect, discrimination at the species level (Fig. 2B). Examination of the DAPC allelic loadings revealed a single SNP with allelic loadings >0.005 and 22 additional SNPs with allelic loadings above 0.001 (Fig. S2), an arbitrary threshold defined as “ancestry-informative” for the purposes of this study (see Materials and Methods). The DAPC performed using only parental polytmus and scitulus individuals (excluding hybrids) showed similar results (Fig. S3).

A) PCA of genomic variation from 6,451 SNPs in 145 individual streamertail hummingbirds. B) DAPC based on 22 retained principal components. C) Structure plot based on variation in 23 ancestry-informative SNPs in 145 individuals (K = 2). The localities are indicated numerically (1–12) above, and the corresponding phenotype (polytmus = red, scitulus = black, and hybrid = teal) below each ancestry coefficient.
The “structure” analysis based on 23 ancestry-informative SNPs revealed clinal variation in ancestry coefficients across the hybrid zone (Fig. 2C). The polytmus individuals assigned more strongly to the polytmus cluster (M = 0.85 and SD = 0.14), while scitulus individuals assigned more strongly to the scitulus cluster (M = 0.80 and SD = 0.17). Hybrids showed a stronger assignment to the polytmus cluster, but with a lower overall average and greater variation in assignment (M = 0.69 and SD = 0.23). Higher values of K did not capture additional structure in the data (results not shown). The “faststructure” analysis performed on the entire dataset (n = 6,451) revealed little variation among individual ancestry coefficients which all assigned strongly (>0.999) to one of the two clusters (Fig. S4).
An analysis of molecular variance (AMOVA) performed on MtCR haplotypes revealed high genetic diversity within morphogroups (98.6%) and low, but significant, structure (1.40%, ɸST = 0.01, P = 0.04) among them (Table S3). The minimum-spanning network revealed extensive haplotype sharing among polytmus, scitulus, and hybrids (Fig. S8).
Genomic mapping
A total of 6,205 out of 6,451 SNPs successfully mapped to the reference genome for Anna’s hummingbird (C. anna). Of these, 208 SNPs mapped to the Z-chromosome. Among the 23 ancestry-informative SNPs, 21 mapped successfully (Table S5). Seven of these mapped to the Z-chromosome, which is more than expected by chance (P = 2.26E−7) (Fig. S6). SNP 10, the most informative SNP for interspecific divergence (Fig. S2), mapped to chromosome 20, along with two other ancestry-informative SNPs (SNP 7 and SNP 16). Functional annotation using the Panther classification system revealed that SNP 10 resides in a gene that is orthologous to Gene NCOA6, the nuclear receptor coactivator 6 in Gallus gallus (Panther: http://www.pantherdb.org/panther/category.do?categoryAcc=GO:0030374), with a biological function of regulating transcription by RNA polymerase II.
Geographic cline analysis
The maximum-likelihood cline for bill color, as estimated by “hzar,” had a cline width of 2.3 km (Table 1 and Fig. 3A and D). Under neutrality, such a narrow cline would have formed only 10 years ago based on the most conservative dispersal estimate (0.5 km) and even more recently using higher dispersal estimates (Table S7). Using that same conservative estimate (0.5 km), the maximum-likelihood cline for the bill width trait had an estimated cline width of 14.3 km (Table 1 and Fig. 3A and D), which, under neutrality, would have formed 286 years ago. However, using the highest dispersal estimate (1.5 km), the cline would have formed just 31 years ago (Table S7), which is more recent than the youngest known age of the hybrid zone (17).

A) Maximum-likelihood geographic clines with observed population means for bill color (purple) and bill width (yellow). The 95% credible cline regions are shaded in corresponding purple or yellow. B) Maximum-likelihood geographic clines and observed population frequencies for five Z-linked SNPs (blue), three autosomal SNPs (green), and the population mean ancestry coefficients (black) generated by a clustering analysis of the 23 ancestry-informative SNPs implemented in the program “structure.” The 95% credible cline region for the ancestry coefficient cline is shaded in gray. C) Estimated cline centers (c) plotted with their 95% CI for each trait, sorted west to east. D) Estimated cline widths (w) plotted with their 95% CI for each trait, sorted narrowest to widest.
Parameter estimates for clines fitted to (A) quantitative and (B) molecular traits using the program “hzar.”
A. Quantitative . | Model . | Cline width (km) . | Cline center (km) . | δ . | τ . |
---|---|---|---|---|---|
Ancestry | V | 1.4 (0.4‒3.4) | 133.0 (132.6‒133.8) | 0.0 (0.0‒0.9) | 0.1 (0.0‒0.3) |
Bill color | I | 2.3 (0.9‒4.1) | 134.0 (133.4‒134.4) | n.a. | n.a. |
Bill width | I | 14.3 (9.4‒21.5) | 130.6 (128.5‒132.3) | n.a. | n.a. |
A. Quantitative . | Model . | Cline width (km) . | Cline center (km) . | δ . | τ . |
---|---|---|---|---|---|
Ancestry | V | 1.4 (0.4‒3.4) | 133.0 (132.6‒133.8) | 0.0 (0.0‒0.9) | 0.1 (0.0‒0.3) |
Bill color | I | 2.3 (0.9‒4.1) | 134.0 (133.4‒134.4) | n.a. | n.a. |
Bill width | I | 14.3 (9.4‒21.5) | 130.6 (128.5‒132.3) | n.a. | n.a. |
B. Molecular . | Model . | Cline width (km) . | Cline center (km) . | pmin . | pmax . |
---|---|---|---|---|---|
SNP 2 | II | 3.8 (0‒17.6) | 132.6 (128.7‒135.6) | 0.5 (0.5‒0.6) | 0.8 (0.7‒0.9) |
SNP 7 | II | 1.7 (0‒16.7) | 135.4 (133‒138.4) | 0.1 (0‒0.1) | 0.2 (0.2‒0.3) |
SNP 8 | II | 1.2 (0.1‒6.8) | 134.2 (130.1‒134.9) | 0.5 (0.5‒0.6) | 0.7 (0.7‒0.8) |
SNP 9 | I | 26.9 (16.2‒57) | 130.9 (123.3‒134.4) | ||
SNP 10 | II | 1.2 (0‒5.4) | 135.2 (134.4‒136.6) | 0.2 (0.1‒0.2) | 0.6 (0.5‒0.7) |
SNP 12 | II | 0.3 (0.1‒5.2) | 138.9 (136.8‒139.4) | 0.6 (0.5‒0.6) | 0.8 (0.8‒0.9) |
SNP 16 | I | 6.4 (0.2‒22.3) | 138.3 (135.2‒146.7) | 0.6 (0‒0.7) | 0.9 (0.8‒1) |
SNP 23 | I | 16.6 (10.5‒28) | 132.6 (129.3‒134.8) |
B. Molecular . | Model . | Cline width (km) . | Cline center (km) . | pmin . | pmax . |
---|---|---|---|---|---|
SNP 2 | II | 3.8 (0‒17.6) | 132.6 (128.7‒135.6) | 0.5 (0.5‒0.6) | 0.8 (0.7‒0.9) |
SNP 7 | II | 1.7 (0‒16.7) | 135.4 (133‒138.4) | 0.1 (0‒0.1) | 0.2 (0.2‒0.3) |
SNP 8 | II | 1.2 (0.1‒6.8) | 134.2 (130.1‒134.9) | 0.5 (0.5‒0.6) | 0.7 (0.7‒0.8) |
SNP 9 | I | 26.9 (16.2‒57) | 130.9 (123.3‒134.4) | ||
SNP 10 | II | 1.2 (0‒5.4) | 135.2 (134.4‒136.6) | 0.2 (0.1‒0.2) | 0.6 (0.5‒0.7) |
SNP 12 | II | 0.3 (0.1‒5.2) | 138.9 (136.8‒139.4) | 0.6 (0.5‒0.6) | 0.8 (0.8‒0.9) |
SNP 16 | I | 6.4 (0.2‒22.3) | 138.3 (135.2‒146.7) | 0.6 (0‒0.7) | 0.9 (0.8‒1) |
SNP 23 | I | 16.6 (10.5‒28) | 132.6 (129.3‒134.8) |
Two log-likelihood unit support limits are presented in parentheses. Cline center is measured as distance (km) from locality 1. For the molecular traits, parameter pmin is the minimum estimated frequency at the west end and pmax is the maximum estimated frequency at the east end of the transect. The exponential decay curves (tails) shape parameters, represented as δ and τ, are reported where estimated.
Parameter estimates for clines fitted to (A) quantitative and (B) molecular traits using the program “hzar.”
A. Quantitative . | Model . | Cline width (km) . | Cline center (km) . | δ . | τ . |
---|---|---|---|---|---|
Ancestry | V | 1.4 (0.4‒3.4) | 133.0 (132.6‒133.8) | 0.0 (0.0‒0.9) | 0.1 (0.0‒0.3) |
Bill color | I | 2.3 (0.9‒4.1) | 134.0 (133.4‒134.4) | n.a. | n.a. |
Bill width | I | 14.3 (9.4‒21.5) | 130.6 (128.5‒132.3) | n.a. | n.a. |
A. Quantitative . | Model . | Cline width (km) . | Cline center (km) . | δ . | τ . |
---|---|---|---|---|---|
Ancestry | V | 1.4 (0.4‒3.4) | 133.0 (132.6‒133.8) | 0.0 (0.0‒0.9) | 0.1 (0.0‒0.3) |
Bill color | I | 2.3 (0.9‒4.1) | 134.0 (133.4‒134.4) | n.a. | n.a. |
Bill width | I | 14.3 (9.4‒21.5) | 130.6 (128.5‒132.3) | n.a. | n.a. |
B. Molecular . | Model . | Cline width (km) . | Cline center (km) . | pmin . | pmax . |
---|---|---|---|---|---|
SNP 2 | II | 3.8 (0‒17.6) | 132.6 (128.7‒135.6) | 0.5 (0.5‒0.6) | 0.8 (0.7‒0.9) |
SNP 7 | II | 1.7 (0‒16.7) | 135.4 (133‒138.4) | 0.1 (0‒0.1) | 0.2 (0.2‒0.3) |
SNP 8 | II | 1.2 (0.1‒6.8) | 134.2 (130.1‒134.9) | 0.5 (0.5‒0.6) | 0.7 (0.7‒0.8) |
SNP 9 | I | 26.9 (16.2‒57) | 130.9 (123.3‒134.4) | ||
SNP 10 | II | 1.2 (0‒5.4) | 135.2 (134.4‒136.6) | 0.2 (0.1‒0.2) | 0.6 (0.5‒0.7) |
SNP 12 | II | 0.3 (0.1‒5.2) | 138.9 (136.8‒139.4) | 0.6 (0.5‒0.6) | 0.8 (0.8‒0.9) |
SNP 16 | I | 6.4 (0.2‒22.3) | 138.3 (135.2‒146.7) | 0.6 (0‒0.7) | 0.9 (0.8‒1) |
SNP 23 | I | 16.6 (10.5‒28) | 132.6 (129.3‒134.8) |
B. Molecular . | Model . | Cline width (km) . | Cline center (km) . | pmin . | pmax . |
---|---|---|---|---|---|
SNP 2 | II | 3.8 (0‒17.6) | 132.6 (128.7‒135.6) | 0.5 (0.5‒0.6) | 0.8 (0.7‒0.9) |
SNP 7 | II | 1.7 (0‒16.7) | 135.4 (133‒138.4) | 0.1 (0‒0.1) | 0.2 (0.2‒0.3) |
SNP 8 | II | 1.2 (0.1‒6.8) | 134.2 (130.1‒134.9) | 0.5 (0.5‒0.6) | 0.7 (0.7‒0.8) |
SNP 9 | I | 26.9 (16.2‒57) | 130.9 (123.3‒134.4) | ||
SNP 10 | II | 1.2 (0‒5.4) | 135.2 (134.4‒136.6) | 0.2 (0.1‒0.2) | 0.6 (0.5‒0.7) |
SNP 12 | II | 0.3 (0.1‒5.2) | 138.9 (136.8‒139.4) | 0.6 (0.5‒0.6) | 0.8 (0.8‒0.9) |
SNP 16 | I | 6.4 (0.2‒22.3) | 138.3 (135.2‒146.7) | 0.6 (0‒0.7) | 0.9 (0.8‒1) |
SNP 23 | I | 16.6 (10.5‒28) | 132.6 (129.3‒134.8) |
Two log-likelihood unit support limits are presented in parentheses. Cline center is measured as distance (km) from locality 1. For the molecular traits, parameter pmin is the minimum estimated frequency at the west end and pmax is the maximum estimated frequency at the east end of the transect. The exponential decay curves (tails) shape parameters, represented as δ and τ, are reported where estimated.
Several genomic traits also exhibited exceptionally narrow cline widths relative to neutral expectations. The maximum-likelihood cline for ancestry coefficients (sexes pooled) based on the 23 informative SNPs had an estimated cline width of 1.4 km (Table 1 and Fig. 3B and D), which would have formed <10 years ago using the most conservative estimate for dispersal (Table S7).
Eight ancestry-informative SNPs had narrow joint CIs for cline width and cline center compared with the median value for each parameter and were selected for further analysis. These eight ancestry-informative SNPs had cline widths that ranged from 0.4 to 27 km (Table 1 and Fig. 3B and D). Under neutrality, six of these eight clines would have to be younger than the known age of the hybrid zone using the most conservative dispersal estimate (0.5 km), one SNP would be younger than the known age of the hybrid zone using higher dispersal estimates (1.0, 1.5 km), and one (SNP 9) would not be younger than the known age of the hybrid zone using any tested dispersal estimate (Table S7).
Finally, while the geographic cline centers for all modeled phenotypic and genomic traits were clustered near the Rio Grande Valley (Table 1 and Fig. 3A–C), small but significant differences in the location of cline centers were detected among the individual traits. For instance, the cline for bill color is centered approximately 4 km east of the cline center for bill width (Table 1 and Fig. 3A and C).
Discussion
We present evidence for speciation occurring at an unusually small spatial scale, one of the rare cases of avian speciation on a small oceanic island where the entire island is well within the potential daily cruising range of the diverging taxa. The weight of evidence suggests that streamertail speciation is driven primarily by selection on a single phenotypic trait—bill color, which likely acts as a premating reproductive barrier. The low level of genomic divergence is consistent with a demographic model of recent speciation within the last 10,000–100,000 years or extensive gene flow following secondary contact. These contrasting demographic scenarios can generate similar patterns of genetic diversity that can be difficult to distinguish. We were unable to rule out either scenario. Opportunities for allopatric speciation may have arisen in the Pleistocene when climate- and sea-level fluctuations caused a dramatic shift in vegetation from mesic to xeric habitats and profoundly impacted biotic distribution in the Caribbean (24). The Pleistocene climactic fluctuations are thought to have promoted diversification of various vertebrates on Jamaica (24) including Anolis lizards (25).
Diagnostic SNPs, or SNPs that are fixed for alternative alleles between parental populations, may have gone undetected in our scan if they are few or restricted to narrow regions of strong differentiation in the genome. Such “islands” of divergence (26, 27) have frequently been detected in closely related but phenotypically differentiated avian taxa (e.g. crows (28), wood warblers (29), munias (30), and seedeaters (31)). We did identify a narrow subset of 23 ancestry-informative SNPs that drive most of the modest signal for divergence in our dataset. Among these, the signal was heavily skewed toward a single SNP, SNP 10. These 23 SNPs mapped to the Z-chromosome more frequently than expected by chance alone. Sex chromosomes (ZW in birds) typically show stronger divergence than autosomal chromosomes early in speciation and are thought to play an important role in avian speciation due to two general patterns: their disproportionately large effect on hybrid fitness (large-Z effect) and potential for faster evolution of the Z-chromosome due to its smaller effective population size (fast-Z effect) (reviewed in 32). However, SNP 10 and two other ancestry-informative SNPs mapped to chromosome 20, a gene-rich microchromosome known to house genes relevant for melanistic plumage traits in wood warblers (29) and munias (30). Functional annotation of the SNP 10 locus revealed that it is orthologous to gene NCOA6, the nuclear receptor coactivator 6 in G. gallus whose biological function is a hormone receptor. These finding should direct future studies aimed at uncovering the genomic basis for divergent bill phenotypes.
The narrow geographic clines detected for male bill color (2.3 km) and several of the ancestry-informative SNPs, including SNP 10, offer compelling evidence that strong selection is maintaining the hybrid zone. While the nature of selection acting on the bill color trait is unknown, sexual selection is a compelling hypothesis. Males of both taxa use the bill prominently in the courtship display (33), making it a likely target of sexual selection. We theorize that adult hybrid males, with mottled bicolored bills, are discriminated against by females because their bills closely resemble those of immature male polytmus. This suggests that females of both bill color types are selecting against juvenile age classes of the red-billed streamertail. This would promote assortative mating and contribute to the hybrid zone’s remarkable stability. However, we do not rule out the possibility that other forms of sexual selection could be operational, such as male competition, or differences in song (34, 35). Field-based studies of behavior are needed to determine the nature of selection operating on the bill color trait.
Narrow hybrid zones have been reported in a number of differentiated avian species pairs including crows, grackles, warblers, manakins, and others (discussed in 14), with a few approaching the narrow width of the streamertail clines. Brumfield et al. (36) estimated a cline width of 3.0 km for belly color, a secondary sexual character, across the hybrid zone between white- and golden-collared manakins (Aves, Pipridae) located in western Panama. Morales-Rozo et al. (37) estimated a cline width of 6.7 km across a hybrid zone between tanagers in the genus Ramphocelus (Aves, Thraupidae) located in southwestern Columbia. Delahaie et al. (11) report a narrow 5.1-km cline for head color in lowland forms of Réunion gray white eyes (Aves, Zosteropidae). Sexual selection likely influences all three cases though other factors also contribute (11, 36, 37). For instance, small geographic barriers such as rivers or lava fields are thought to hinder dispersal of Réunion gray white eyes (11). In contrast, streamertail hummingbirds are highly vagile, and small geographic barriers do not appear to impact their demography to such an extent.
The convergence of both genomic and phenotypic clines in a narrow zone in the Rio Grande Valley raises questions about the role of local landscape characteristics in stabilizing the position of the streamertail hummingbird hybrid zone. However, there is no evidence that the Rio Grande Valley serves as a dispersal barrier to hummingbirds or that regional variation in habitat quality has created a population density trough. Small differences among cline centers for individual traits within the Rio Grande Valley may indicate some dynamism in the pattern, or they could reflect a sampling artifact (Appendix). The valley was originally vegetated with humid lowland and lower montane forest but has since undergone significant anthropogenic disturbance since at least the early 18th century (38) and perhaps much earlier (39). While the floodplain and lower montane slopes have been heavily modified, the higher slopes and ridges of the flanking mountains are still cloaked with old-growth humid montane and elfin forests. The current habitat from Millbank (150-m elevation), at the head of the valley, to the river’s mouth (20 km by air) is a patchwork of scrubby woodland, small agricultural plots, dooryard gardens, pasture, banana plantations, bamboo (Bambusa vulgaris), and small forested plots. The coastal plain of Portland Parish east and west of the Rio Grande River has experienced a similar level of anthropogenic disturbance. Despite historical reports of higher streamertail densities in the mountains (17), a recent population survey found no evidence for a density trough in the valley (19). In fact, introduced nectar plants associated with mixed-use agriculture may have benefited nectivorous birds and increased streamertail density in the lower regions of the Rio Grande Valley, making the enduring isolation of the streamertail taxa all the more remarkable.
Materials and methods
Specimen collection and preparation
Specimens (n = 186) analyzed in this study were collected in 2003–2014 from 11 localities in or near the Rio Grande valley hybrid zone with a 12th reference population of polytmus from western Jamaica (Figs. 1 and S5 and Table S1). Altitude (above sea level) of specimen collection sites in eastern Jamaica varied from 22 to 368 m (x = 122 ± 25 m). Collections were made with permission and licenses issued by Jamaica’s National Environment and Planning Agency (NEPA), Jamaica Conservation and Development Trust, and private landowners. Collecting protocols were governed by the Smithsonian Institutional Animal Care and Use Committee (IACUC 2005-9 and 2013-11). Specimens are deposited in the research collections of the National Museum of Natural History, Smithsonian Institution, Louisiana State University Museum of Natural Science (LSUMNS), and the Institute of Jamaica. Specimens were weighed using a digital scale (0.01 g), sexed internally, and prepared as round museum skins with partial skeletons (lacking rostrum). Streamertails have sexually dimorphic plumage and can be sexed by appearance as early as the nestling stage. Two independent digital photographs were taken of the bill in dorsal aspect next to a millimeter scale (14).
Bill color is the sole phenotypic trait that can unequivocally differentiate the taxa. Reddish bill coloration is apparent in fledging polytmus (21). In newly fledged males, the proximal two-thirds of the lower mandible is pinkish red and the maxillary rhamphotheca exhibits a suffusion of pink below the nasal operculum. Streamertails in juvenal plumage exhibit striations on the maxillary rhamphotheca, indicative of immaturity (40). Bill striations and melanin deposits in the rhamphotheca disappear with age so that males in first basic plumage have smooth, predominately red bills. Bill color of scitulus is black from the nestling stage through adulthood (21). The ontogeny of bill color, bill striations, and plumage sequences in females has not been studied in depth but it is thought to parallel the patterns observed in males.
Laboratory methods
Total DNA was extracted from ∼25 mg of muscle tissue from 168 individuals using a DNeasy tissue kit (Qiagen, Valencia, CA, USA). A segment of the MtCR was amplified using a standard protocol for PCR (primers: H1251, 5′-TCTTGGCATCTTCAGTGCCRTGC-3′ (41) and TrochL-CR1: 5′-TGGCTAACGCGGAGCATACAATCTC-3′). Sequencing products were purified with Sephadex and loaded on an ABI Prism 3100 Genetic Analyzer. Sequences were aligned in Sequencher 4.1 (Gene Codes, Ann Arbor, MI, USA). When direct sequencing revealed more than one heterozygous site within a sequence, haplotypes were resolved probabilistically using PHASE (42, 43).
We sent 0.4–3.2 µg of DNA from 159 samples to the Cornell Institute of Genomic Diversity for GBS (44). GBS is a streamlined workflow for generating reduced representation libraries for Illumina sequencing, similar to other forms of RAD-Seq (45, 46). Details of the laboratory methods can be found in (44). Briefly, DNA was digested with PstI (CTGCAG), and the resulting fragments were ligated to sample-specific adapters and common adapters. Fragments were then pooled, cleaned, and amplified using an 18-cycle PCR. Final libraries were sequenced as single-end, 100-bp reads on an Illumina HiSeq 2000 (Illumina, San Diego, CA, USA).
Bioinformatics processing
The Cornell Institute of Genomic Diversity processed the raw sequence reads using the reference-free SNP calling pipeline, UNEAK, an extension of the Java program TASSEL 3.0 (47). Briefly, UNEAK retains all reads with a barcode, cut site, and no missing data in the first 64 bp of the sequence following the barcode. Reads were clustered into tags by 100% identity, tags were aligned pairwise, and any tag pairs differing by 1 bp were called as potential SNPs. To remove potential sequencing errors, alleles represented by fewer than five reads or a frequency of <5% were filtered out. Following processing with the UNEAK pipeline, reverse complement tag pairs were collapsed, and genotypes were recalled using the method of (48) as implemented in custom perl scripts obtained from White (49) and available at https://github.com/mgharvey/GBS_process_Tom_White. Potential paralogs were filtered out by removing SNPs with heterozygosity >0.75. Finally, SNPs for which genotype calls were missing from more than 20% of the individuals were removed. Custom python scripts (available at https://github.com/mgharvey/misc_Python) were then used to generate input files for further analysis.
Molecular analysis
SNP data were analyzed using nonparametric and model-based approaches to clustering. Multivariate methods such as PCA have the benefit of being able to detect genetic structure in large datasets without making assumptions about the underlying population genetic model (50). We performed a PCA using the “adegenet::dudi.pca” function in the R programming language, and the first five axes were retained in the analysis. As a prior step, the data were centered and scaled, and missing data were replaced with mean allele frequencies. Population genetic statistics were estimated using the R packages “adegenet” v. 2.1.3 (51) and “hierfstat” v. 0.5-7 (52).
A DAPC was then performed to identify the SNPs that differentiate parental polytmus, parental scitulus, and hybrids. A formal optimization procedure, or cross-validation, “adegenet::xvalDAPC” guided selection of the number of principal components (PCs) to retain to prevent overfitting the model. The cross-validation was performed using 30 repetitions and showed that the lowest root mean square error was associated with 30 of 144 retained PCs (Fig. S1). The variable contributions (loadings) to the first discriminant function in the DAPC were examined to identify which SNPs best differentiate among the three a priori groups. SNPs whose allelic loadings were >0.001, an arbitrary threshold, were considered informative for ancestry, hereafter “ancestry-informative” SNPs. A separate DAPC analysis was performed on just the parental taxa, excluding hybrids as an a priori group.
Model-based clustering analysis was performed on the entire SNP dataset using the program “faststructure 1.0” (53), an algorithm for inferring structure from large SNP databases. The number of populations was set to K = 2, and a simple prior was used. A second analysis was performed using just the ancestry-informative SNPs (n = 23) implemented in the program “structure” v. 2.3.3 (54). The number of populations was set to K = 2, and default priors were used. The results for both analyses were visualized using the program “distruct” v. 2.2 (55).
The MtCR haplotypes were analyzed using an AMOVA (56) to determine the distribution of genetic diversity within and among polytmus, scitulus, and hybrid individuals. A median-joining network (57) was used to visualize the relationships among haplotypes. The median-joining network was constructed setting α = 0 and was implemented in “popart” (58).
Genomic mapping
Streamertail hummingbirds, such as many nonmodel organisms, lack a reference genome. However, the highly conserved nature of avian genomes (59) enabled the mapping of the anonymous SNP loci to the reference genome for Anna’s hummingbird (C. anna; bCalAnn1_v1.p) (60). Alignments were performed using the BLAST+ algorithm (61) with a low E-value (E = 3). The results were annotated using Blast2Go (62). Code for these steps is given in Table S4. We pursued annotation of genes of interest using the Panther Classification system (Panther v. 14.0, http://www.pantherdb.org).
Following methods in (63), we looked for an association between the 23 ancestry-informative SNPs and the Z chromosome. All 6,451 SNPs in the dataset were binned as “Z,” “autosome,” “unplaced scaffold,” or “unknown” based on the mapping results. The data were then parsed to include only SNPs that mapped successfully to a known chromosomal locality. A χ2 test was employed to determine whether the 21 ancestry-informative SNPs that mapped to a known chromosomal locality were located on the Z chromosome more often than expected by chance. The results of these tests were visualized using the mosaic plots function in the R package “vcd” v. 1.4-8 (64).
Phenotypic analysis
Bill color was measured from digital photographs of specimens following methods in (21). Various hybrid indices have been developed to describe bill color in Trochilus (13, 20, 21). We used a five-category index developed by G.R.G. that yields repeatable classification from photographs (14). “Category 1” corresponds to entirely black bills, and “category 5” corresponds to bills that are bright red and narrowly tipped in black. Separate hybrid indices were used for males and females due to sex-specific differences in the brightness and amount of red coloration in the rhamphotheca (Fig. 4). Individuals scored as 2–3 were defined as phenotypic hybrids.

A) Red-billed streamertail (Trochilus polytmus) male in definitive basic plumage (photo credit: Sam Woods). B) Five-category hybrid index based on bill color in dorsal aspect. Categories IV and V correspond to polytmus, categories II and III correspond to hybrids, and category I corresponds to scitulus. C) Bill width (mm) variation in males (left panel) and females (right panel).
Bill width was measured at the anterior extension of nasal feathers to the nearest 0.01 mm from digital photographs of specimens. Independent measurements of bill width were made from each of the two photographs for each specimen. The average of the two measurements is reported. The use of digital photographs circumvents distortion issues that arise during specimen preparation and the process of drying (14, 21). Bill distortion can be especially pronounced in hummingbirds because the ventral bar of the maxilla is fragile and can flex when tied shut (65). Finally, juvenile status was assessed by examining the digital photographs of specimens for obvious striations (40).
Geographic cline fitting
Geographic cline models (Table S6) were fit to molecular and quantitative traits of interest using “hzar” v. 0.2-5 (66) a cline fitting program developed in the R programming language (67). All cline models were compared graphically and statistically to a null model in the post processing state. Three molecular models were fit to allele frequencies associated with the 23 ancestry-informative SNPs. These molecular models differ in trait interval used (pmin, pmax), and/or with respect to fitted tails: none (no tails fit), or both (two tails with independent parameters).
Five quantitative trait models were fit to ancestry coefficients returned by the clustering analysis implemented in “structure” of the 23 ancestry-informative SNPs (n = 145, sexes pooled), measurements of bill color (n = 118, adult males only), and bill width (n = 108, adult males only). The quantitative trait models all estimate cline center (distance from locality 1, c) and cline width (1/maximum slope, w), but varied with respect to how the exponential tails were fitted: none, both, mirror (two tails mirrored about the cline center), left (only one fitted tail on the left side of the cline), and right (only one fitted tail on the right side of the cline).
The “hzar” program requires data to be collected along a 1D transect (66). The geographic position of each sample locality was set as the centroid (i.e. average) latitudinal and longitudinal coordinates among all individuals assigned to that locality (Fig. S5). Locality 1 (Trelawny Parish) was used as the western anchor for the analysis. The transect for the study was created by drawing a straight line between locality 1 and locality 12. The geographic positions of the individual localities 2–12 were then transformed into a list of Euclidean distances from locality 1 using the “hzar functions “hzar.map.latLongSites” and “hzar.map.distanceFromSite” (Table S2). Separately, a sensitivity analysis was performed to determine whether estimated cline parameters are sensitive to the transect angle used or population assignment of individuals (Appendix).
Cline models were fit using the Metropolis–Hastings Markov chain Monte Carlo (MCMC) algorithm using default settings for chain length (1e5), burn-in period (1e4), and thinning (100). The amount of geographic space explored by the MCMC was limited to focus on the region where the data were collected (−30, 180 km). Because of the large number of free variables for model II (n = 7), the tune parameter was reduced from 1.5 to 1.2. Convergence was assessed by visually inspecting the MCMC traces for each parameter. Cline models were compared using the Akaike information criterion corrected for small sample size (AICc) and considered the best fitting model to be the one with the lowest AICc score.
The CIs for the cline width (w) and cline center (c) and parameters were strongly correlated (adjusted r2 = 0.70, P = 8.02E−8). Conservatively, traits with joint CIs larger than the median interval size for either parameter were excluded from the statistical analysis of concordance and coincidence (Fig. S7). Statistical differences in cline parameters were determined by visual observation of nonoverlapping 95% CIs.
Neutral diffusion
Endler’s neutral diffusion equation (68) was used to calculate the amount of time (in generations) that it would take for a neutrally expanding cline to reach the estimated cline width given a dispersal distance rate: T = 0.35 (w/σ)2, where T is the time in generations since secondary contact, w is the width of a neutral cline (1/max slope), and σ is the root mean square dispersal distance per generation, a measure of natal dispersal. A generation time of 1 year was used; this is a reasonable rate for hummingbirds (69).
Dispersal has not been studied in streamertail hummingbirds and is poorly known in other hummingbird species (70). Given this uncertainty, three different dispersal estimates were used in the calculations: 0.5 km (low), 1.0 km (medium), and 1.5 km (high). Jamaica is ∼234 km long and 80 km wide at its widest point. The age of each cline was then calculated using these three dispersal estimates. The hypothesis of neutral diffusion was rejected for clines that, under neutral conditions, would have to be much younger than 70 years even when using the smallest (most conservative) dispersal estimate (0.5 km).
Acknowledgments
The authors thank the people of Jamaica for allowing them to research their national bird. Brian Schmidt, Errol Frances (1950–2011), Eric McCurbin, Matthew Brady, Walter Gray, Junior Carson, and Kenard Wilson provided critical field support. Susan Koenig, Mike Schwartz (1947–2018), and Catherine Levy offered key logistical support. Fieldwork followed protocols of the IACUC, Smithsonian Institution, under research permits issued by the NEPA of Jamaica. The authors thank Ricardo Miller, Monique Curtis, and Andrea Donaldson (all NEPA) for administration of permits from 2003 to 2014. Local assistance was provided by Bowden Pen Farmers Association and Windsor Research Center. Matthew Kweskin, Vanessa González, Rebecca Dikow, and Mike Trizna offered bioinformatics support. Donna Dittmann, Tammie Jackson, Ellen Paul, and Christopher Milensky provided assistance with specimen import and loans. Jessica Eberhard and Noor White provided valuable feedback on an earlier draft of this manuscript.
Supplementary Material
Supplementary material is available at PNAS Nexus online.
Funding
C.D.J. was supported by the LSUMNS, a Buck Fellowship (Smithsonian Institution), and a Chapman Grant (American Museum of Natural History). G.R.G. was supported by the James Bond Fund and Alexander Wetmore Fund of the Smithsonian Institution. R.T.B. was supported by NSF DEB-0841729. Some of the computations in this paper were conducted on the Smithsonian High Performance Cluster (SI/HPC), Smithsonian Institution (https://doi.org/10.25572/SIHPC).
Author Contributions
C.D.J. was involved in conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, writing—original draft, project administration, and writing—review and editing. G.R.G. was involved in conceptualization, resources, supervision, funding acquisition, methodology, and writing—review and editing. J.E.M. was involved in project administration and writing—review and editing. K.F.S. was involved in formal analysis and writing—review and editing. R.T.B. was involved in conceptualization, resources, data curation, supervision, funding acquisition, project administration, and writing—review and editing.
Data Availability
Genomic data are available at https://doi.org/10.5061/dryad.pg4f4qrrq. Custom perl scripts for processing SNP data are available at https://github.com/mgharvey/GBS_process_Tom_White. Custom python scripts used to format SNP data for downstream analyses are available at https://github.com/mgharvey/misc_Python.
References
Author notes
Competing Interest: The authors declare no competing interests.