-
PDF
- Split View
-
Views
-
Cite
Cite
Paul M Oliver, Andrew F Hugall, Audrey Prasteya, Alex Slavenko, Sabin Zahirovic, Oligo-Miocene radiation within South-west Pacific arc terranes underpinned repeated upstream continental dispersals in pigeons (Columbiformes), Biological Journal of the Linnean Society, Volume 138, Issue 4, April 2023, Pages 437–452, https://doi.org/10.1093/biolinnean/blad003
- Share Icon Share
Abstract
Upstream colonizations from islands to continents have played an important role in two major global bird radiations: the oscine passerines and the pigeons. Here, we investigate insular diversification and upstream dispersal dynamics of pigeons (Columbiformes) within the Indo-Australian Archipelago using a supermatrix fossil-calibrated phylogeny and model-based biogeographical analyses. These analyses show that the islands of Melanesia, now centred on New Guinea and considered separately from Australia, have been a centre of pigeon diversification since the Eocene–Oligocene transition (~34 Mya). Geological reconstructions are concordant in suggesting that arc terranes and continental ribbon fragments that underpin the contemporary Melanesian region might have formed extensive archipelagos for much of the Oligocene and Miocene. These islands are also inferred to have functioned as a net source of pigeon lineages for Asia and especially Australia. Arboreal fruit-eating pigeons have colonized nearby continents on multiple occasions yet show little evidence of subsequent radiation. Insular terrestrial pigeons have been largely unable to colonize Asia, and a single Miocene colonization of Australia preceded an endemic radiation. Upstream dispersal may well be a frequent process in the history of the Indo-Australian Archipelago and surrounds, however ecological and environmental factors likely place strong constraints on its success and evolutionary outcomes.
INTRODUCTION
Theoretical predictions (MacArthur & Wilson, 1967) and empirical data (Crottini et al., 2012; Economo et al., 2015) suggest that islands are rarely a source of diversity for continents. This broad pattern is thought to be underpinned by varying combinations of the typically lower diversity of species on islands (Matthews et al., 2019), the tendency for insular taxa to lose adaptations for dispersal or competition (Bellemain & Ricklefs, 2008) and the putatively higher rates of extinction in less stable island systems (Whittaker et al., 2017). Given that biotic communities on larger continents are often more species rich, they are also predicted to generate higher numbers of outwards colonists (MacArthur & Wilson, 1967) and act as a strong ecological filter against upstream colonization (White et al., 2021). However, running counter to these broad predictions, phylogenetic analyses and ancestral state reconstructions have demonstrated that some continental lineages have insular origins (upstream colonization or reverse colonization) (Carine et al., 2004; Filardi & Moyle, 2005; Bocek & Bocak, 2019), especially in the Caribbean region (Nicholson et al., 2005; Bellemain & Ricklefs, 2008). Understanding where, when and how islands contribute diversity to continents has the potential to provide broader insights into the factors that shape biotic dispersal and colonization (Bellemain & Ricklefs, 2008).
The Indo-Australian Archipelago (IAA) spans from the Sundaland continental promontory to New Guinea and is also referred to as Malesia (as opposed to Melanesia). This region provides a prominent example of putative upstream colonization with subsequent radiation in the global radiation of oscine songbirds, which comprises > 4000 extant species, all hypothesized to be descended from ancestors that occurred on islands of the IAA (Jønsson et al., 2010; Moyle et al., 2016). Upstream colonization of Southeast Asia from the IAA has also been inferred recently in two insect lineages (Letsch et al., 2020; Bank et al., 2021). The original geographical source of these upstream colonizations has been linked variably to terranes accreted to the northern edge of the Australian continent, often referred to as ‘the proto-Papuan Archipelago’ (Jønsson et al., 2010), or to the islands of ‘Wallacea’ (Moyle et al., 2016) (see more discussion of the geology of these regions below). These results emphasize the potential significance of the islands of eastern IAA as a source for diversity on nearby continents. However, the dynamics of insular diversification and island–continent dispersal across this region remain poorly understood.
Pigeons and doves (Columbiformes) are a moderately diverse global radiation of fruit- and seed-eating birds (~350 nominal species) that also show a strong signature of insular diversification (Cibois et al., 2014, 2017) and upstream colonization (Lapiedra et al., 2021). The islands of the IAA, and especially New Guinea and surrounds, are a world centre of pigeon ecological and lineage diversity and were even more diverse before human settlement across the Pacific (Steadman, 2006; Steadman & Takano, 2020). Conversely, the oldest known Columbiform fossils are from early Miocene deposits in Australia (Worthy, 2012) and New Zealand (Worthy et al., 2009), and on this basis it has been suggested that pigeons might have originated in Australia or on nearby islands (Low, 2014). To gain a better understanding of the temporal and spatial dynamics of pigeon dispersal and diversification between islands and continents in the IAA, we assembled a supermatrix dated phylogeny for the pigeons, and from this we estimate the timing and geographical centres of insular pigeon diversification, with a particular focus on the potential role of island arc terranes in the IAA in the early radiation of the group, and the relative frequency and direction of dispersal events between major island groups and nearby continents.
MATERIAL AND METHODS
Geological context
The geological history of the IAA is complex. Here, we focus on summarizing geological inference on the history of three broad clusters of putative island systems that might have formed to the north of the Australian continent and that have been hypothesized to have a role in regional biotic diversification and dispersal through the Oligo-Miocene. These three broad clusters are as follows: (1) island arc and continental ribbon terranes that might underpin contemporary archipelagos spanning from the Philippines through to Melanesia (Oliver et al., 2018b); (2) the proto-Papuan Archipelago (Jønsson et al., 2010); and (3) the islands now clustered into the contemporary region of Wallacea (Moyle et al., 2016). Owing to the complex history of the IAA, some components of these clusters are not mutually exclusive. For example, the southward migrating Caroline Arc has accreted onto the northern edge of New Guinea (e.g. Zahirovic et al., 2016). Likewise, the Halmahera Arc has migrated west from the New Guinea region into the contemporary Wallacean region (e.g. Hill & Hall, 2003).
In this paper, when we refer to the region of Melanesia we include the island of New Guinea, reflecting the broader contemporary conceptualization of the region across biotic and cultural spheres. We note that this broad definition differs from some key literature on birds, which uses a more restrictive definition for Melanesia that excludes New Guinea (e.g. Mayr & Diamond, 2001).
For the broad history of island terranes in the IAA, we focus here on plate reconstructions by Zahirovic et al. (2016) but recast into a pure palaeomagnetic reference frame from the study by Merdith et al. (2021) using GPlates (www.gplates.org). This allows us to preserve relative plate motions but remain true to ‘true’ geographical palaeolatitudes that are relevant to climatically sensitive analyses, such as those required for palaeobiogeography. The distributions of shallow marine, emergent land and elevated topographic regions are modified from palaeogeographical reconstructions presented by Cao et al. (2017). Geological reconstructions can also be accessed at: doi:10.5061/dryad.nk98sf7x7
Island arc terranes and continental ‘ribbon terranes’
Continental ribbon terranes probably had their origin in the convergence zone between the Eurasian, Indo-Australian and Pacific tectonic plates. The oldest parts of the Philippine Archipelago are likely to have formed 160 Mya along the northern margin of Australia (Deng et al., 2015; Zahirovic et al., 2016). In addition to these Australia-derived terranes, younger volcanic arcs formed from ongoing subduction further to the north, including the Halmahera–Caroline, the Izu–Bonin–Mariana and the Melanesian arc systems. For a broad period spanning from ~45 to 25 Mya, tectonic reconstructions suggest that the Philippine, Caroline and Melanesian arc systems might have broadly aligned to form a chain extending thousands of kilometres across the tropical South-west Pacific (Fig. 1). Within this chain, the Philippine arc terrane was most proximal to Sundaland and, potentially, provided a pathway for eastward colonization of the more distant islands of the Vitiaz Arc (itself composed of the eastward continuation of the Melanesian and Caroline arc systems).

Tectonic reconstructions for the Indo-Australian Archipelago and South-west Pacific region from the paper by Zahirovic et al. (2016). Orange demarcates putative mountain building, yellow subaerial land, light blue shallow sea (flooded continental shelf and volcanic plateau crust) and dark blue deep sea. Note the composite nature and changing configuration of Wallacea, the greatly varying position of key ribbon arc terranes and the consistent inference of an isolated proto-Papuan Archipelago across all time slices.
The proto-Papuan Archipelago
Within the IAA, a persistent area of geological uncertainty in biogeographical analyses has been around the historical extent of islands in the present-day region of New Guinea and whether these now-accreted fragments were contiguous or separated from the main Australian continent by marine barriers. Addressing these questions is complicated further by geologically long- and medium-term sea-level fluctuations of up to ~250 m since the Cretaceous (Haq et al., 1987) that might periodically isolate or connect these landmasses. Geological constructions suggest that the Sepik terrane is likely to have collided with the Australian continental margin ~50–30 Mya and generated uplift and topographic relief over the collisional time frame (Schellart & Spakman, 2015; Zahirovic et al., 2016; Mahoney et al., 2019). The Caroline Arc is likely to have collided between 15 and 5 Mya (Zahirovic et al., 2016; Mahoney et al., 2019). Progenitor islands uplifted by these respective collisions were probably either partly or entirely isolated from the Australian mainland by a shallow sea over most of the last ~50 Myr (Norvick, 2003; Golonka et al., 2006; Harrington et al., 2017). This flooding of the northern Australian continental margin has been ascribed to the role of sinking tectonic plates (Harrington et al., 2017), causing ‘dynamic subsidence’ of the margin and north-eastward tilting of the Australian continent. The proto-Papuan archipelago itself was emergent only owing to the collisions (e.g. the Sepik and Caroline arc accretions) occurring on this continental margin (Mahoney et al., 2019), leading to crustal shortening and uplift on the leading edge of the advancing Australian continent (Fig. 1).
Wallacea
This region denotes the collection of islands bound to the west by Wallace’s Line and to the east by Lydekker’s Line, including the Lesser Sundas, Sulawesi and Maluku (Fig. 1). It forms the contemporary geological and biotic bridge between the Sundaland (Eurasian) and Sahul (Gondwanan) bioregions. The islands of Wallacea are a collage of continental fragments, volcanic island arcs related to subduction and, possibly, also accreted volcanic plateaus from mantle hotspots (Zahirovic et al., 2016). Their complex tectonic evolution is evident in the network of suture zones (Zahirovic et al., 2016), indicating the consumption of ancient ocean gateways during the main phase of equatorial ocean gateway closure since the drift of Australia northwards from Antarctica (Whittaker et al., 2013). The islands between the Bird’s Head Peninsula in north-western New Guinea and the east/south-east arms of Sulawesi represent the ‘Sula Spur’, which is a continental promontory attached to the Australian continent. The arrival of these blocks in the near-equatorial region by 20 Mya marks the final closure of the Indonesian oceanic gateway and the putative onset of the main phase of biological exchange between Australia and Asia. Most of the concentration and uplift of islands in Wallacea has probably occurred subsequent to this closure (in the last 20 Myr) (Zahirovic et al., 2016). The Lesser Sunda Islands are thought to have become emergent in the last ~10 Myr, owing to the interplay of the Australia–Sundaland collision and the opening of the Banda Sea (Hinschberger et al., 2005). Parts of Maluku and Sulawesi are also built on older Australian continental crust, forming the Sula Spur (Audley-Charles et al., 1979), but have experienced extensive uplift and reconfiguration in the last 20 Myr.
Phylogenetic supermatrix assembly
We compiled a species-level supermatrix of genetic data for the Columbidae plus selected outgroups using sequence data downloaded from the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/). This used specified exemplar reference gene sequences drawn from the key Columbiformes phylogenetic references, with BLASTn+organism NCBI database searches. The emphasis was on widely sampled loci used in published phylogenetic studies with a reasonable proportion of taxa and major lineages. The key references are provided in the Supporting Information (Supplementary File), and genes and GenBank accessions are listed in the Supporting Information (Table S1). Species-level taxonomy followed the International Ornithology Council World Bird List (IOC) v.9.2 (https://www.worldbirdnames.org/new/). Recent phylogenomic analyses of the neoaves (Jarvis et al., 2014; Prum et al., 2015; Reddy et al., 2017) indicated the Pteroclidae and Mesitornithidae as outgroups to the Columbidae. Given that there are patchy gene sequences for these taxa, we pooled data for each lineage to create composite family representatives, along with a composite Cuculiformes taxon as a further outgroup. Genetic data was aligned with MAFFT (v.7.245; Katoh & Standley, 2013) using the local-pair (L-INS-i) algorithm, alignments assembled into a custom Microsoft Excel database and nomenclature rationalized to IOC9.2 (with the help of cross-referencing via Wikipedia using common names). Gene trees were inferred by IQ-TREE (Nguyen et al., 2015) ultrafast bootstrap consensus (Hoang et al., 2018), using models of sequence evolution identified by ModelFinder as implemented in IQ-TREE (Kalyaanamoorthy et al., 2017). These trees were then scanned for non-monophyletic genera and species (using the custom script GTREER5), and the database was updated by excluding aberrant accessions or, in some cases, revising nomenclature. Where necessary, sequence sets were then realigned (as above).
Some long genes are routinely sequenced in fragments (e.g. RAG1 and COI); therefore, in order to maximize data for COI, we also used a consensus method whereby the alignment was reduced to a single consensus sequence per species, based on the most common base per site (with ties scored as ambiguous). In effect, this picks the most commonly sequenced sub-lineages and is a simple way to combine data and discount aberrant sequences (wrong loci/taxa etc.). These consensus alignments were then subjected to the same procedure of gene tree and genera monophyly scans as above.
We also used mitogenome data as follows. Given that gene order is preserved across the relevant taxa, we first aligned the entire mitogenomes, then excised the set of commonly used genes and added the sequences to their respective alignments. The remainder (referred to as mtg-block) was kept as a separate alignment, after deleting the non-coding D-loop region. Concatenated supermatrix sequence data then used the best (longest accepted) single exemplar sequence per gene per species. These gene alignments were then compacted by removing regions with few or no data (< 10% of taxa per gene) and ambiguous alignment regions (via GTREER5, ALISCORE v.2.2; Misof & Misof, 2009).
Two versions of the supermatrix (with and without the mtg-block) were analysed; the latter to avoid distorting the result owing to biased mitogenome sampling (missing from several key groups) and nucleotide saturation effects on relative divergence (especially for deeper outgroup lineages). Final analyses used the supermatrix without the mtg-block (because the six genes add enough well-sampled mitochondrial DNA, and results were very similar empirically). This final supermatrix comprised 247 of 344 recognized pigeon species (72% complete), including sections of four nuclear and six mitochondrial gene loci, amounting to 11 100 sites, 39% data-complete, with 1 125 420 defined bases in 1262 sequences (including 63 COI consensus) from 1527 accessions out of a total database of 3639 accessions (with 37 rejected). Of 49 IOC9.2 Columbidae genera, only three (all monotypic) were missing (Cryptophaps, Microgoura and Starnoenas).
Phylogenetic inference
Phylogenetic resolution was first explored using IQ-TREE ultrafast bootstrap consensus with the multipartition sequence evolution model selected via ModelFinder (see Supporting Information, Table S2). This was done for both datasets (with and without mtg-block) and a cut-down version of nuclear data for only 94 taxa with more than two nuclear genes. To investigate the consistency of support among genes for primary lineages, we used ASTRAL consensus species tree from a gene trees approach (rather than site-support methods, owing to the patchwork supermatrix data). This used a cut-down dataset of 59 taxa for eight genes.
Bayesian relaxed-clock analyses were performed in BEAST v.2.4 (Drummond et al., 2006; Bouckaert et al., 2014), with a lognormal clock and Yule speciation prior (each with 1/X parameterization), and using a seven-partition sequence evolution model optimized by ModelFinder (nuclear gene codon positions, intron+12S and mitochondrial codon positions; Supporting Information, Table S2). Dating calibrations comprised two Columbidae fossil constraints and two secondary outgroup priors. Rupephaps, a fossil ptilinopine pigeon from New Zealand, was applied as exponential (offset=22.0, mean=3.0) on the stem subtending Hemiphaga, Lopholaimus and Gymnophaps (Worthy et al., 2009). Primophaps, a phabine pigeon from northern Australia, was applied as exponential (offset=19.0, mean=3.0) on the main Australian radiation of terrestrial seed-eating pigeons (stem subtending Leucosarcia to Phaps) (Worthy, 2012). Two normal secondary priors, based on the study by Prum et al. (2015), were placed on the root (mean=63.0, sd=4.3) and Columbimorphs (mean=58.0, sd=4.8). Preliminary tests indicated that a birth–death speciation prior was not required above the simpler Yule prior.
We ran two independent Bayesian Markov chain Monte Carlo analyses of 50 million steps, sampling every 5000, with a burn-in of 20%, that returned all parameters with effective sample size > 500 (Rambaut et al., 2014), with near-identical maximum clade credibility (MCC) consensus tree topology, node support and dating (for the BEAST2 .xml file, see doi:10.5061/dryad.nk98sf7x7). Runs were then pooled. For the purposes of biogeographical analyses, in addition to an MCC tree, a subset of 100 trees were drawn from the BEAST posterior sample to provide a reasonable representation of the topological and branch length variation of the total posterior. Outgroup taxa were then removed (phytools R package; Revell, 2012).
Defining biogeographical regions
To define a practical biogeographical categorization of regions, we balanced capturing the complexity of our focal regions (i.e. the islands between Australia and continental Southeast Asia) with limiting biogeographical model complexity (i.e. minimizing states for which there are insufficient data to estimate parameters). We also structured regions to allow comparisons between the major continents and islands with long histories of isolation from continental landmasses and to allow some investigation of biogeographical patterns within the islands of the geologically complex IAA.
To begin with, we used 21 sub-regions or areas to describe insular and continental region pigeon diversity (Supporting Information, Table S3). Geographical range data were summarized initially from the Handbook of the Birds of the World (Baptista et al., 1997), then cross-referenced and updated based on two online databases (https://www.iucnredlist.org/ and https://www.worldbirdnames.org/new/). Grouping of areas was explored initially using turnover of phylogenetic diversity (phylobeta; Graham & Fine, 2008; Leprieur et al., 2012). Phylogenetic beta-diversity was estimated using Simpson’s phylobeta index (pβsim) using the BEAST MCC tree (after removal of outgroups) and compiled into a pairwise pβsim distance matrix (using custom script GTREER5; see doi:10.5061/dryad.nk98sf7x). Using this distance matrix, we visualized differences between regions using both hierarchical clustering (average = UPGMA) and a multidimensional scaling (MDS) two-dimensional plot (hclust and cmdscale, ‘stats’ R package; R Core Team, 2016)). Five taxa that occur in more than eight sub-regions were removed from estimation of pβsim to limit emphasis on recent connections at the expense of underlying older endemicity, which is the focus of the present study.
We then assigned taxa to broad regional categories (hereafter referred to as regions) based on modification of the pre-existing historical regionalizations (e.g. Jønsson et al., 2010, and also including the IOC9.2 breeding range zones) with information from both phylogenetic turnover (Supporting Information, Fig. S1) and geological models for the history and formation of islands around the Indo-west Pacific (see above and Zahirovic et al., 2016). This resulted in an eight-region scheme: (1) New World (North, South and Middle America); (2) Old World (Europe, Africa and mainland Asia combined); (3) Philippines; (4) Wallacea (Sulawesi and the Lesser Sundas); (5) West Melanesia (New Guinea plus the islands of Maluku); (6) Indian Ocean Islands, especially the Mascarenes and Madagascar; (7) Australia (not including New Guinea); and (8) Pacific Ocean (the islands of the South-west Pacific, including East Melanesia, New Caledonia and New Zealand). Assignment of phylogenetically sampled species to these eight regions is listed in the Supporting Information (Table S4). We note that our configuration of Wallacea does not include Maluku. This is because our phylogenetic turnover analyses indicate that much of Maluku is more closely allied to New Guinea (Supporting Information, Fig. S1A). Geological (Hill & Hall, 2003) and biological (Oliver et al., 2022) data also indicate a close relationship between the islands of Maluku and New Guinea. We separated Australia from West Melanesia in our analyses because phylobeta analyses indicate that the dominant pigeon faunas of the two regions are relatively discrete and because geological models indicate that New Guinea has been separated from the main Australian landmass by at least a shallow sea for most the late Cenozoic (see geological context above). Species that occur across regions were scored as such and allowed to have multiple states in analyses.
To investigate the role of islands in pigeon diversification and dispersal, we collated regions into broader categories of islands (a composite of the Wallacea, West Melanesia, Philippines and Pacific Ocean regions) vs. continents (Americas, Old World and Australia). This was done by post hoc combining of states and state changes inferred in the full eight-region analyses into summary results (see below).
Historical biogeography modelling
We estimated ancestral states across the pigeon radiation using two approaches: (1) a discrete-state Markov rate-based maximum likelihood (Mk-ML) model (Pagel, 1999; Huelsenbeck et al., 2003), as developed from the study by O’Hara et al. (2019); and (2) a suite of Bayesian biogeographical models as implemented in BioGeoBEARS (Matzke, 2013).
Markov maximum likelihood ancestral states inference
We inferred ancestral states using the rayDISC function in the R package corHMM (Beaulieu et al., 2013), with node.states=“marginal”, state.recon=“subsequently” and root.p=“maddfitz”. This algorithm allows multistate tip input and infers marginal likelihoods (probabilities) for each state at each node including tips, the latter being, in effect, an implied state of the common ancestor of the multistate tip (Felsenstein, 2004: 255). Options for interpreting this post-analysis tip state marginal are as follows: (1) to use these values; (2) to recode as equal probabilities to account, in part, for terminal dispersals within the tip lineage; or (3) explicitly to count all implied terminal dispersals. In our eight-region scheme, 86% of species are endemic to a single region, hence the moderate proportion (14%) of multistate species is a tolerable level for this type of ancestral state method to accommodate.
In order to select an optimal maximum likelihood (ML) model structure, we applied an approximation intended to achieve a practical fair evaluation of what can be a prohibitively large number of possible rate models (Huelsenbeck et al., 2003; Felsenstein, 2004). Using the single consensus tree, an average of equal rate (ER), symmetric (SYM) and all rate different (ARD) model rates (in order to buffer against potentially miss-specified extreme values) were rank ordered into a serially increasing number of rate category models from the simplest single rate (ER), evaluated by the Bayesian information criterion (Supporting Information, Table S5). We selected a four-rate model structure representing a balance of complexity and support, which was then used for all subsequent analyses (Supporting Information, Table S6).
Rather than produce a single best ancestral state result, we focused on broad biogeographical inferences summarizing both the model ancestral state and the phylogenetic uncertainty. To do this, we applied stochastic node state mapping, using 200 resamplings of node state marginal likelihoods, to each of 100 BEAST tree samples. Randomly sampling node states according to marginal likelihoods selects an explicit node state, and multiple resampling then represents the marginal likelihood fairly (Huelsenbeck et al., 2003; Revell, 2012; O’Hara et al., 2019). Doing this for a set of tree samples then includes phylogenetic uncertainty in the whole inference.
Transition events are summarized according to parent-to-daughter node (including tip) states; cases where the state remains the same are referred to as endemic cladogenesis. These results can be divided up by divergence age into time bins, assigned by daughter node age. For counting state lineages, branches spanning a time point can be assigned to the closest parent or daughter node state. These procedures for summarizing an explicitly resolved set of ancestral states are straightforward (Revell, 2012) and can be applied to any ancestral state method, including Dispersal Extinction Cladogenesis (DEC) models (Bribiesca-Contreras et al., 2019).
This entire procedure can be repeated on multiple trees and the entire set of transition events and state lineages per time bin combined into a final matrix integrating the tree and model variation (Bribiesca-Contreras et al., 2019). Such summary results of events per time bin are best interpreted as a summary of the relative frequency or probability density of such events rather than an explicit count. Results can be integrated further by post-hoc combination of several individual states and changes into larger summary categories, such as continental vs. islands. By summarizing the results across the whole marginal likelihood, the method accounts for multiple solutions; hence, in effect, it accounts to some degree for multistate solutions, particularly for combined region summaries.
Analyses were executed, and summarized into 2 Myr time bin profile plots, in R using a custom script, ASSMR2. For phylogenetic visualizations onto the single MCC consensus tree, we averaged marginal likelihoods across all 100 sample trees plus the consensus tree, for all nodes in common with the consensus tree (identical taxon bipartitions) and the linked parent node. Nodes (hence branches) with a maximum state probability < 67% were then marked as indecisive. Except for inclusion of the parent node, these procedures are often used (Matzke, 2013; O’Hara et al., 2019).
BioGeoBEARS
The probability of ancestral ranges was also estimated using BioGeoBEARS (Matzke, 2013) in R (R Core Team, 2016). Pigeon species were assigned to the same eight regions as the Markov ML inference. The maximum range size was set to eight regions, because some species are widespread across all regions. Other settings were left as default. We used the standard six different biogeographical models tested by BioGeoBEARS: BAYAREA, which does not allow dispersal at cladogenesis; DIVA, which allows vicariance [i.e. if parent lineage occurs in (x,y), then one daughter lineage can occur in only (x), whereas the other can occur in only (y)] but disallows subset speciation, in which one daughter lineage inherits a subset of the distribution of the parent [i.e. if parent lineage occurs in (x,y), then one daughter lineage can occur in only (x) or only (y), whereas the other can occur in (x,y)]; DEC, which allows both vicariance and subset speciation; and BAYAREA+J, DIVA+J and DEC+J, which are equivalent to the previous three models but also allow jump dispersal through founder effects at cladogenesis (Matzke, 2013). The BioGeoBEARS analysis was run using two separate phylogenetic inputs: (1) using the BEAST MCC tree; and (2) using a subset of 100 BEAST trees. Results of the BioGeoBEARS run across multiple trees were averaged. Repeated analyses were run to account for possible different topologies and poorly supported nodes in the phylogeny. In both the 100 individual trees and the consensus tree, DEC+J was the best-fitting model (weighted Akaike information criterion for the 100 trees = 1.00; Supporting Information, Table S7).
RESULTS
Our Columbidae supermatrix phylogeny (Fig. 2A; Supporting Information, Fig. S2) most closely resembles that of Lapiedra et al. (2021), which has the most data of published columbiform trees. The tree was well resolved, with the majority of nodes significantly supported [84% of nodes had ultrafast bootstrap (BS) > 95%, and 73% of nodes had posterior probability (PP) > 0.95]. However, the phylogeny shows a striking feature of relatively low resolution among the basal primary lineages. This is evident in both ML and time-calibrated Bayesian inferences (Supporting Information, Fig. S2) and when using nuclear data alone or in the low consistency among individual gene trees (Supporting Information, Fig. S3). This is exacerbated by the long stem divergence to the closest living outgroups. Including the extra mtg-block does not make any material change to the topology. The root (and implied American) origins for pigeons are therefore weakly supported, despite considerable data (all of these nodes are informed by at least nine of ten genes) and a high level of resolution elsewhere in tree. These basal nodes are associated with relatively short internodes (BS and PP with Pearson product–moment correlation coefficient to branch length > 0.5), and the shape of the overall pigeon phylogeny describes a long stem followed by rapid early radiation.

A, phylogeny and ancestral state estimations for the global radiation of pigeons based on eight biogeographical regions. Contemporary taxa that occur on continents (C) or islands and continents (IC) within the fruit-specialist Ptilinopini are numbered as follows: (1) Ptilinopus reginae (IC); (2) Ptilinopus superbus (IC); (3) Ptilinopus porphyreus (IC); (4) Ptilinopus melanospilus (IC); (5) Ptilinopus jambu (C); (6) Ptilinopus magnificus (IC); (7) Ducula bicolor/luctuosa complex (IC); (8) Ducula aenea complex (IC); (9) Ducula badia (C); and (10) Lopholaimus antarticus (C). Nodes subtending potential older upstream colonizations in other clades are indicated with question marks. The Supporting Information (Fig. S4) shows the full tree, with tip names and ancestral state reconstructions. B, C, lineage accumulation (B) and endemic cladogenesis (C) plots, for the eight biogeographical regions, with the y-axis indicating the logarithm of median values and with 50% confidence intervals indicated by coloured shading (drawn from 200 node marginal resamplings, each of 100 posterior sample trees, summarized into 20 time bins, each lasting 2 Myr).
Results on general patterns of ancestral state estimation are consistent and complementary across the Mk-ML (Fig. 2; Supporting Information, Fig. S4) and DEC+J (Supporting Information, Figs S5, S6) methods. The main difference is that Mk-ML infers greater uncertainty about the geography of early basal nodes, which is likely to be a product of more transition-rate parameters. Given the age depth, long branches, short internodes and low resolution, this is likely to be a better reflection of unclear biogeography history. Accordingly, in the subsequent summary and discussion of patterns of inference, we focus on the Mk-ML results.
All models and analyses support lineage accumulation (Fig. 2B) and endemic cladogensis (Fig. 2C) within the islands of the IAA through the Oligo-Miocene (Supporting Information, Fig. S7). Endemic cladogensis summaries of combined islands vs. combined continental regions (Fig. 3A–C) indicate that the island systems have been continuously generating in situ diversity since this time, including a peak relative to continents after a rapid accumulation of lineages in the early Oligocene. Three major and strongly supported clades (Ptilonopini, Raphini and Phabini) are each largely endemic to or centred on the eastern IAA and group together with weak support into a large insular superclade in Bayesian analyses (Fig. 2A). The cuckoo doves (Macropygia and allied genera within the Columbini) also diversified on IAA islands (Supporting Information, Fig. S4), but are much younger.

Columbidae continental (Australia, Old World and Americas) vs. insular (Indian Ocean, Wallacea, Melanesia, Pacific Ocean and Philippines) biogeographical model summaries. A, insular (green) vs. continental (red) endemic cladogensis. B, continental vs. insular lineage accumulation. C, inferred upstream (insular to continental) vs. downstream (continental to insular) dispersals. D, colonizations into and out of Australia, juxtaposed against estimates of endemic cladogensis in Australia (dashed line). The y-axes in A, B are as for Figure 2. The y-axes in C, D show the mean and SD, and endemic cladogenesis in D is scaled arbitrarily to fit the figure.
West Melanesia (comprising New Guinea and Maluku) is the most frequently inferred insular state throughout the Oligo-Miocene, and is also the most frequent regional state for the entire Columbiformes through the Oligocene (Fig. 2B). The Rhaphini and Ptilinopini reconstruct to West Melanesia, whereas the ancestral state for the Phabini is split across Wallacea, the Pacific and West Melanesia (Fig. 2). Three endemic pigeon radiations on the Philippines are inferred to be of Miocene age and largely derived from other insular lineages (Fig. 2). The Pacific and Indian Oceans show no endemic clades older than the Miocene.
Insular, Old World and Australian pigeon faunas are relatively discrete; however, we inferred a number of exchanges between continental and insular regions. Upstream colonizations are more frequent than downstream colonization until the late Miocene (Fig. 3B). In the Oligocene, many reconstructions infer an initial upstream shift into the Old World (Eurasia and Africa) at the base of weakly supported clade spanning Treron–Turtur (Fig. 2A). An alternative hypothetical reconstruction has the Treron and Oena/Turtur ancestral lineages independently colonizing the Old World, contributing a separate mid-Miocene peak in inferred upstream shifts. In the early Miocene, the mainly terrestrial Phabines colonized Australia from islands (Fig. 3D). From the mid-Miocene onwards, at least ten lineages within the Ptilinopini are inferred to have colonized Australia or Sundaland independently (Fig. 2A). A majority of these dispersals are inferred in widespread and vagile species or species complexes that occur across both continental and insular regions.
In contrast, there is a negligible signal of downstream colonization until the late Miocene (Fig. 3C), when the ancestor of the cuckoo doves (Macropygia and allied genera) shifted from continental to insular regions (Fig. 2A); whether this colonization occurred from the Old or New World is ambiguous. Through the late Plio-Pleistocene, the overall number of inferred colonization events increases, and downstream events begin to outnumber upstream events, with many shifts linked to widespread species or species complexes that occur across both continental and insular regions in genera such as Macropygia, Ptilinopus, Streptopelia and Treron.
DISCUSSION
Before discussing our results in detail, we note some important caveats for biogeographical inference. Limitations of our phylogeny (70% sampling and low resolution at basal nodes) and historical reconstruction methods mean that we can give an accurate summary of trends, but not detailed reconstructions of all events. Extinction also has the potential to significantly confound inference of historical patterns from contemporary taxa (Sanmartin & Mesegeur, 2016; Louca & Pennell, 2020). In Columbiformes, the very long stem branch strongly alludes to a large amount of extinction around the mid-Cenozoic (see below). The more recent and node-dense basal radiation and subsequent lineage diversifications do not immediately suggest (hidden) extinction, and BEAST estimation of extinction in a birth–death model was very low (< 0.2; not significantly different from zero; results not shown). However, some of the ostensibly implied quixotic long-distance dispersals (see below) might be understood better if we infer hidden extinct intermediates, especially if these were to be associated with relatively small and unstable island systems (e.g. early dispersals involving South America or the colonization of the IAA by the cuckoo doves). Accordingly, in the following discussion we place the greatest weight on historical inferences that are strongly supported across all analyses and that show concordance with independent environmental or geological reconstructions or biogeographical signatures that emerge from other taxa.
Upstream colonization from the Indo-Australian Archipelago
Although there is strong evidence that upstream colonization from islands to continents does occur, it has, until recently, been considered an anomalous trajectory (Bellemain & Ricklefs, 2008). The islands of the IAA appear to be the source area for at least one global radiation of birds (Jønsson et al., 2010). Recent macroevolutionary analyses of Columbiformes have also emphasized the importance of islands in their early evolutionary history, but did not focus on spatial or temporal patterns of dispersal. Conversely, other data indicate that the biota of continental Asia serves as a strong filter on upstream colonization either by blocking successful colonization completely (Oliver et al., 2018a; White et al., 2021) or by strongly filtering taxa by ecology (Letsch et al., 2020).
In congruence with the work of Lapiedra et al. (2021), we found that in the extant radiation of Columbiformes upstream overwater dispersals from the IAA dominated downstream colonizations through much of the Oligo-Miocene. The inferred importance of dispersal from islands in the IAA is particularly striking given their comparatively small areal extent and more turbulent geological history when compared with nearby continents (Figs 1, 2). Many lineages of pigeons are very good flyers and capable of long-distance overwater dispersal (Baptista et al., 1997), possibly in conjunction with small founder population size effects. The importance of long-distance dispersal in pigeon biogeography is also supported by the much higher likelihood of the DEC model with added J parameter (jump dispersal). This also makes the DEC-J model more like the Mk-ML model, which allows point transition from one ancestral state to another.
The regional geography and ecological context of the earliest inferred upstream shifts into the Old World are unclear, with the alternative hypotheses being one or two shifts from islands into the broader ‘Old World’ in a weakly supported clade spanning Treron–Turtur. The Oligo-Miocene timing and direction of these colonizations are potentially comparable with upstream colonization of the Oscine passerines (Jønsson et al., 2010; Moyle et al., 2016), although the outcomes in terms of diversification differ starkly (< 40 species vs. > 4000). Overlooked extinction could explain some uncertainty around these early inferred upstream dispersal events. Ideally, a more resolved fossil record spanning both continents and islands would be needed to shed more light.
The geographical, temporal and ecological context of more recent inferred upstream colonizations in Columbiformes is clearer. The specialist fruit-eating arboreal pigeons (fruit doves) of the Ptilinopini unequivocally reconstruct as having insular origins and have colonized nearby continents on at least ten separate occasions (and probably more if missing taxa, such as Ptilinopus alligator, and intraspecific populations are added to the phylogeny). Ecological niche shifts to arboreality in the ancestor of the Ptilinopini underpinned extensive radiation in the IAA and Pacific (~100 species) and, potentially, also predisposed them to colonize nearby continents repeatedly (Lapiedra et al., 2021). In contrast, speciation within continental regions appears to have been very limited. Indeed, around half the inferred upstream shifts involve species-level taxa that occur across islands and continents. Furthermore, most Ptilinopini in continental areas (and especially Asia) remain closely associated with coastal habitats and islands (e.g. Ducula aenea, Ptilinopus jambu and Ptilinopus melanospilus) or other habitats with often less biodiverse communities, such as montane areas (e.g. Ptilinopus porphyreus and Ducula badia) (Baptista et al., 1997). This conforms with the hypothesis that upstream colonization is challenging (White et al., 2021), and even when frequent, ecological filtering might limit new arrivals to comparatively species-poor environments on ‘mainlands’ (Mayr & Diamond, 2001).
Upstream colonization by terrestrial-feeding Raphinini and Phabini has been even more limited, again suggesting ecological filtering at continental margins. Raphini are entirely restricted to islands, despite their evident history of dispersing across vast distances into the Indian and Pacific oceans. The one striking outlier is provided by the Phabini, which are inferred to have colonized Australia from islands in the early Miocene and subsequently radiated across the continent. The relative success of this upstream colonization could have been mediated by major climatic changes (aridification) and the concomitant opening up of niches and expansion of key food sources for terrestrial taxa, especially grasses (Toon et al., 2015). Like the Raphinini, the Phabini have been largely unable to colonize continental Asia. The only exception is in the genus Geopelia, which originated in Australia, but has probably colonized the expanding savannahs of the Sundaland only recently, during the Plio-Pleistocene. Thus, our data support the hypothesis that islands of the IAA have been a net source area for continents (Lapiedra et al., 2021), while simultaneously indicating that continents (and especially the lowland rainforests of Asia) present a formidable ecological filter to upstream colonization (White et al., 2021).
Mid-Cenozoic insular radiation in the Indo-Australian Archipelago
Temporal and spatial patterns of insular diversification within the IAA are poorly understood and have, at times, been contentious (Jønsson et al., 2010; Moyle et al., 2016). A major challenge is that although geological reconstructions suggest the position of key geological terranes, it is often unclear whether these were above or below water (Cao et al., 2017). A further issue for biogeographical analysis is that the contemporary proximity of geological features differs greatly from palaeogeographical configurations. Differing geological, biogeographical and cultural delineations of key regions have potentially also confounded analyses and comparison. For example, New Guinea is often not considered part of Melanesia (Mayr & Diamond, 2001) and is sometimes even lumped together with Australia as a single landmass of Sahul. Additionally, high rates of extinction, turnover and dispersal in lineages associated with smaller and unstable islands have the potential to mask deep insular origins further. For the early basal nodes, the Mk-ML model infers greater uncertainty than DEC-J, at least better reflecting the uncertainties in inferring this biogeographical history.
However, even after considering the caveats above, our analyses emphasize that islands in the east of the IAA, here considered to be part of the broader Melanesian region, have been a hotspot of pigeon evolution and speciation since the Oligocene (Fig. 2; Supporting Information, Figs S6, S7). Similar deep (mid-Cenozoic) insular origins have now been inferred for several other radiations centred on Melanesia, strongly indicating that key terrane complexes (Vitiaz Arc and Sepik Arc) and/or the proto-Papuan region have probably been shaping diversity across the IAA since the mid-Cenozoic (Aggerbeck et al., 2014; Oliver et al., 2018b; Bank et al., 2021; Supporting Information, Table S8). Wallacea to the west also appears to be a potentially important source of upstream colonists for pigeons, although this smaller region shows younger and weaker signals of lineage accumulation and endemic cladogenesis (Fig. 2B, C), and geological evidence suggests more recent arrival and uplift of key geological features (Zahirovic et al., 2016). To the north-west, the Philippine pigeon fauna (Supporting Information, Fig. S7B) also appears to be relatively young and derived when compared with that of Melanesia. In this context, we suggest that terranes now underpinning the contemporary Melanesian region might have been the centre of early pigeon diversification in the IAA. However, in light of the extreme geological dynamism of the region and the mobility of pigeons, more detailed speculation is likely to be unrewarding.
Although the importance of Melanesian islands in the early diversification of pigeons is strongly supported, the ultimate source from which this region was colonized is obscured by the long stem lineage, the apparent rapid radiation at the base of the Columbiformes and the sparse fossil record. This contrasts with most other old insular radiations in the IAA, for which biogeographical analyses typically point clearly to either Australian or Asian origins (Supporting Information, Table S8). One intriguing potential scenario for pigeons is South America, with westward dispersal to island arcs, leading to populations on quasi-continental fragments (Melanesia, Australia/Zealandia and Philippines) and later dispersal to Old World. This is speculative, but we suggest that is as consistent with the current information on distribution, diversity, phylogeny, dispersal ability, biogeography and the fossil record of pigeons as any other inference. At least one other Pacific lineage also shows evidence of deep ‘out-of-South America’ origins (Malone et al., 2017), showing that long-distance eastward migration across the Pacific is possible, if not common.
The rapid early radiation of pigeons around the mid-Cenozoic also mirrors the estimated timing of initial diversification in two lizard radiations centred in a similar manner on island arcs in the IAA (Oliver et al., 2018b; Slavenko et al., 2022). This concordance might reflect the time frame when island arcs were especially conducive to biotic colonization and diversification; for example, the putative Philippine–Caroline–Vitiaz arcs (Zahirovic et al., 2016). An alternative, or even complementary, explanation is a shared legacy of global climatic shifts linked to changes in Southern Ocean circulation patterns (Zachos et al., 2001). Analyses of plant (Nge et al., 2020) and reptile (Oliver & Hugall, 2017) lineage diversification patterns in Australia, patterns of lineage diversity in mammal faunas globally (Stadler, 2011) and substantial turnover in marine and terrestrial fossil records (McGowran et al., 2004; Sun et al., 2014) all suggest the early Oligocene as a time of profound biotic turnover.
Challenging the cohesiveness of ‘Sahul’
Growing evidence for the biogeographical significance of terranes in the IAA also emphasizes that considering Australia and New Guinea together as ‘Sahul’ might often mask two distinct mid-Cenozoic centres of diversification with independent histories of isolation. One probably consisted of tropical and mountainous island arcs and/or the proto-Papuan Archipelago to the north. Many taxa with poor overwater dispersal abilities were probably absent (e.g. most terrestrial mammals), potentially setting the stage for new patterns of ecological release and diversification in pigeons (Lapiedra et al., 2021) and in other lineages that were able to colonize (Aggerbeck et al., 2014; Oliver et al., 2014; Tallowin et al., 2020; Roycroft et al., 2022). The other potential centre of diversification is the main subaerial portion of the Australian continent to the south, which was probably more arid and more temperate. Historical differentiation across these regions might be reflected in the contemporary diversity of Australian pigeons. On the one hand, the Australian phabin radiation appears be associated with the second region and is centred on temperate forests, woodlands and deserts. On the other hand, the tropical rainforests of Australia are dominated by lineages of Ptilinopini that are largely nested within insular radiations, suggesting a recently assembled fauna dominated by upstream colonists.
This growing evidence that a cohesive ‘Sahul’ might be a relatively recent geological feature has important implications for biogeographical analyses. First, analyses of ‘Sunda–Sahul biotic’ exchange should account for the probability that for much of the last 35 Myr or more, the contemporary geological features that compose ‘Sahul’ were probably spread across two environmentally and geographically discrete regions separated by sea barriers. This would explain the often striking differentiation between the Australian and Melanesian biotas, especially in taxa with tropical or Asian origins (Joyce et al., 2021b; Oliver et al., 2022). Second, considering Melanesia, and especially New Guinea, as separate from Australia might also set the stage for a more holistic understanding of the longer-term role that island-to-continent dispersal has played in the assembly of the Australian biota, and especially the Australian rainforest biota. As a case in point, pigeons are key dispersers of rainforest fruit in the rainforests of Australia (Crome, 1975), but here we provide evidence that the Australian fruit-specialist pigeon fauna is relatively young and derived from Melanesia (Fig. 3D). If fruit-specialist pigeons are indeed recent colonists, this might have important implications for understanding the assembly and evolutionary dynamics of the Australian rainforest flora, especially the recent influx of many tropical rainforest plant species from the north (Sniderman & Jordan, 2011; Joyce et al., 2021a).
ACKNOWLEDGEMENTS
We thank Trevor Worthy for his advice on the age and placement of key pigeon fossils. We also thank the two anonymous reviewers for their constructive comments. This work was supported by funding from the Centre for Biodiversity Research at the Australian National University and by Broken Hill Proprietary (BHP) as part of Project Digital Infrastructure Growth (DIG) at the Queensland Museum and by Broken Hill Proprietary (BHP) as part of Project Digital Infrastructure Growth (DIG) at the Queensland Museum. S.Z. was supported by Australian Research Council grant DE210100084 and Alfred P. Sloan grants G-2017-9997 and G-2018-11296. GPlates development is funded by the AuScope National Collaborative Research Infrastructure System (NCRIS) program. The authors declare no competing interests.
DATA AVAILABILITY
All customs scripts (GTREER5.sh and ASSMR2.R), the supermatrix data and trees, and the geological data are available in the Dryad depository (doi:10.5061/dryad.nk98sf7x7).
SUPPORTING INFORMATION
Additional supporting information may be found in the online version of this article on the publisher's website.
Table S1. Pigeon 250-taxon supermatrix loci and GenBank accessions.
Table S2.ModelFinder sequence evolution optimal partition model.
Table S3. Areas used in initial biogeographical clustering analyses.
Table S4. All sampled pigeon taxa and the areas in which they occur.
Table S5. Markov maximum likelihood ancestral state biogeographical model selection.
Table S6. Rate matrix model structure.
Table S7. BioGeoBEARS ancestral state reconstruction models.
Table S8. Ages and distributions for old crown radiations identified in the South-west Pacific.
Figure S1. Twenty-one-area phylobeta (pβsim) clustering analyses.
Figure S2. Columbidae supermatrix IQ-TREE and BEAST maximum clade credibility phylogenies.
Figure S3. Columbidae supermatrix phylogeny major lineage support.
Figure S4. Columbidae eight-region Markov rate-based maximum likelihood ancestral state tree.
Figure S5. Columbidae eight-region BioGeoBEARS analysis.
Figure S6. DEC ancestral state diversity profiles.
Figure S7. Region-specific colonization summaries.