Abstract

The bushbuck (Tragelaphus scriptus) is an antelope widely distributed throughout sub-Saharan Africa that varies greatly in body size, horn length and coat patterning. Recent taxonomic work has suggested reclassification as two or more species. Molecular studies have shown a deep mitochondrial divergence between north-western (scriptus) and south-eastern (sylvaticus) bushbuck populations, and morphological assessments, based largely on coat patterning, have divided the bushbuck into as many as eight species and 32 subspecies. Molecular and morphological approaches have, to date, not been brought together in a single integrative study. Using 43 museum cranial specimens, we sequenced complete mitochondrial genomes and used three-dimensional geometric morphometrics to compare genetic and phenotypic divergence between the two main subclades of the T. scriptus species complex. We found a significant but weak relationship between phenotypic and genetic distances. Mean body size differed between males of each subclade, but morphological differences attributable to sexual dimorphism within subclades were greater than differences between subclades. Despite a mitochondrial divergence of nearly 3 Myr, cranial morphological differences between subclades were minor and could be accounted for largely by minor size differences between males (size–shape allometry). These results support the recognition of a single bushbuck species, as indicated by nuclear genetic data.

INTRODUCTION

For millennia, morphological techniques have served as the primary methodological approach for testing taxonomic hypotheses. In only a few decades, increasingly inexpensive and accessible DNA sequencing techniques have enabled molecular systematics largely to supplant morphological analyses in taxonomic and phylogenetic research (at least on extant or recently extinct taxa). In particular, high-throughput sequencing and improvements in the successful isolation of ancient DNA material have created new opportunities for researchers working with molecular data from specimens in historical collections. However, the last decades have also seen the development of modern multivariate statistical methods in the analysis of morphology, with increases in computational power and the recent development and distribution of hardware and software for handling three-dimensional (3D) modelling and data. Three-dimensional geometric morphometric methods have reinstated quantitative morphological methods in modern zoological studies. Considering molecules and morphology, each of these methodological frameworks presents its own biases and limitations to the evolutionary biologist (Padial et al. 2010; Schlick-Steiner et al. 2010, Catalano et al. 2014). In order to overcome the difficulties of molecular or morphological research alone, researchers are now turning to integrative taxonomic methods.

Integrative taxonomy can provide systematics research with a more nuanced perspective of complex evolutionary histories (e.g. hybridization, the identification of cryptic species, reconciliation of gene discordance) than individual approaches alone (Chaplin et al. 2020). Technological and methodological advances have allowed new combinations of previously incompatible data types (e.g. shape data in geometric morphometrics with haplogroup data from mitogenomics; Gray et al. 2022). The production of 3D digital models and whole-genome sequencing (and the large quantities of data that these methods produce) have become easier to manage and analyse. Recent studies (e.g. Heckeberg 2020, Petzold and Hassanin 2020) have combined multiple lines of evidence successfully to investigate questions of species delimitation and patterns of evolutionary history.

In this study, we use integrative taxonomic methods to combine morphological and molecular data: 3D landmarks from surface scans and mitogenomes assembled via state-of-the-art molecular approaches, both from the same set of historical specimens. We address patterns of variation in different data types and recent evolutionary history of a widespread antelope species complex to resolve a long history of taxonomic conflict.

The bushbuck [Tragelaphus scriptus Pallas, 1766] is a medium-sized antelope that lives across most of sub-Saharan Africa and has adapted to a wide variety of habitats across its range. These adaptations have produced much variation in phenotypic characteristics, such as horn length and shape, body size, and coat patterning (Kingdon 1982). This phenotypic variation has motivated more than a century of morphological research on the taxonomy of the species complex. These studies have produced myriad classification systems suggesting as few as one species (Plumptre and Wronski 2013) and up to eight (Castelló 2016), or as many as 27 subspecies (Allen 1939).

Many of these morphological studies, regardless of the relative granularity of their taxonomic classifications, have observed a phenotypic split [defined either as separate species (Groves and Grubb 2011, Castelló 2016) or subspecies (Plumptre and Wronski 2013)] based on variations in coat patterning among populations in north-west and south-east Africa, with the presentation of two distinct phenotypes split roughly across the East African Rift Valley. Kingdon (1982) described both ‘harnessed’ (‘T. s. scriptus’) stock in the north-west and ‘sylvan’ (‘T. s. sylvaticus’) in the south-east, in addition to six other subtypes, but also noted the potential for much phenotypic variation even at a single location. Castelló (2016) elevates eight types of T. scriptus to species status: four are classified in the ‘scriptus’ group and the remainder in the ‘sylvaticus’ group. Drawing on a combination of data including coat patterning, linear cranial measurements and mitochondrial DNA (mtDNA) studies, Groves and Grubb (2011) describe eight species (three in a ‘T. scriptus cluster’; five in a ‘T. sylvaticus cluster’).

Interestingly, recent molecular studies of the bushbuck have described a similar north-west–south-east continental division between mitochondrial sequences. Indeed, two subclades of bushbuck haplogroups are divergent in mitochondrial phylogenies of the genus in that they sort as non-monophyletic clades, separated from one another by other tragelaphine species (Willows-Munro et al. 2005, Moodley et al. 2009). These mitochondrial subclades are also roughly divided by the Rift Valley into north-western and south-eastern populations (Moodley and Bruford 2007, Hassanin et al. 2018). In this study, we refer to the north-western mitochondrial subclade as the scriptus subclade and the south-eastern as the sylvaticus subclade (after Moodley and Bruford 2007).

Investigating this mitochondrial division, Hassanin et al. (2018) hypothesized that the mtDNA of the scriptus subclade was acquired by introgression after hybridization in the Early Pleistocene between T. scriptus and an extinct species closely related to Tragelaphus angasii Angas, 1849Angas 1848. More recently, a study combining mitochondrial and nuclear sequence data (Rakotoarivelo et al. 2019a) showed that this rift in the mitochondrial data is also present in the nuclear phylogeny, although to a much lesser extent. Notably, however, the nuclear data in that study support the monophyly of T. scriptus, despite the differing number of chromosomes between the two subclades [cited by Hassanin et al. (2018) as evidence for two species]. For Rakotoarivelo et al. (2019a), the hybridization event hypothesized by Hassanin et al. (2018) reconciles the discordance between mitochondrial and nuclear phylogenies and supports the hypothesis of a single species, with many zones of ongoing secondary contact between the two subclades.

Despite the apparent biogeographical correlation between phenotypic and genetic variation in the bushbuck, no study to date has integrated morphological and molecular data from the same specimens. Prior morphological taxonomic research on the bushbuck has been restricted to comparison of coat patterning and a few linear measurements (Groves and Grubb 2011); molecular studies of both mitochondrial and nuclear DNA have used only short segments. Here, we combine morphological and molecular data (3D cranial surface landmark data and whole mitochondrial genome sequences) drawn from the same set of specimens to determine whether patterns of variation in cranial morphology reflect the deep divergence observed in the mitochondrial subclades. We took advantage of improvements in ancient DNA extraction techniques and state-of-the-art molecular methods to enrich and sequence whole mitochondrial genomes from the same historical specimens from which we collected morphological data. Leveraging historical collections to study patterns of both genetic and phenotypic diversity using the same specimens, we highlight the importance of historical museum collections as repositories of biodiversity across space and time. This publication also serves as a case study for contemporary integrative taxonomic work, combining current morphological and molecular methods to clarify the complex evolutionary history and taxonomic status of the bushbuck. Using these integrative methods, we tested for correlation between phenotypic and genetic variation and determined that the two mitochondrial subclades of bushbuck cannot be distinguished by their cranial morphology.

MATERIALS AND METHODS

Geometric morphometrics

Specimens

We performed 3D geometric morphometric analyses on 43 adult bushbuck crania (27 males and 16 females) from the zoological collections of the Museum für Naturkunde, Berlin (Table 1). Thirty-nine of the specimens had location information (Fig. 1), broadly representing the modern range of the bushbuck. Twenty-two specimens had collection dates, which ranged from 1897 to 1938.

Table 1.

Bushbuck (Tragelaphus scriptus) cranial specimens used in the analyses. Specimens without GenBank accession numbers were not sequenced or did not yield sufficient mitochondrial DNA; NA subcade refers to these specimens. Subclade taxonomy was determined by molecular data (see Fig. 4). DRC refers to the Democratic Republic of the Congo.

IDSpecimen numberSexCountryYearGenBank accession no.Subclade
1ZMB Mam 106581MaleNA
2ZMB Mam 106591MaleCameroonON474903Scriptus
3ZMB Mam 106592MaleAngola1903NA
4ZMB Mam 106597MaleCameroonON474904Scriptus
5ZMB Mam 106604MaleGhanaON474905Scriptus
6ZMB Mam 106608MaleMozambiqueNA
7ZMB Mam 106609MaleON474906Sylvaticus
8ZMB Mam 106620MaleAngola1902ON474907Sylvaticus
9ZMB Mam 106625MaleCameroon1909NA
10ZMB Mam 106626MaleSouth AfricaON474908Sylvaticus
11ZMB Mam 106659Male1909NA
12ZMB Mam 106686MaleCameroon1909NA
13ZMB Mam 106691MaleTanzania1909ON474909Scriptus
14ZMB Mam 107602FemaleEquatorial Guinea1909ON474910Scriptus
15ZMB Mam 107662FemaleCameroon1912ON474911Scriptus
16ZMB Mam 107664FemaleDRCON474912Sylvaticus
17ZMB Mam 107665FemaleTanzania1897ON474913Sylvaticus
18ZMB Mam 107667FemaleEthiopia1900ON474914Sylvaticus
19ZMB Mam 107680FemaleCameroonON474915Scriptus
20ZMB Mam 107683FemaleAngolaON474916Scriptus
21ZMB Mam 107722FemaleAngolaON474918Scriptus
22ZMB Mam 107732FemaleTanzania1909ON474919Sylvaticus
23ZMB Mam 107735FemaleCameroon1910ON474920Scriptus
24ZMB Mam 107740FemaleCameroon1904ON474921Scriptus
25ZMB Mam 107751MaleCameroonON474922Scriptus
26ZMB Mam 107796FemaleSouth SudanNA
27ZMB Mam 107812FemaleCameroonNA
28ZMB Mam 16105FemaleTogo1912ON474890Scriptus
29ZMB Mam 2169MaleSouth AfricaNA
30ZMB Mam 31590MaleTanzaniaNA
31ZMB Mam 32456MaleEthiopiaON474891Sylvaticus
32ZMB Mam 37177MaleDRCON474892Sylvaticus
33ZMB Mam 37178MaleDRCON474893Sylvaticus
34ZMB Mam 40389MaleAngola1924ON474894Scriptus
35ZMB Mam 40543MaleEthiopiaON474895Sylvaticus
36ZMB Mam 41189FemaleSenegal1929ON474896Scriptus
37ZMB Mam 41729MaleKenya1925ON474897Sylvaticus
38ZMB Mam 45680MaleSouth Africa1933ON474898Sylvaticus
39ZMB Mam 48268MaleNigeria1910ON474899Scriptus
40ZMB Mam 4931MaleON474889Scriptus
41ZMB Mam 60992MaleTanzania1938ON474900Sylvaticus
42ZMB Mam 60995MaleTanzania1938ON474901Sylvaticus
43ZMB Mam 90802FemaleTanzania1913ON474902Sylvaticus
IDSpecimen numberSexCountryYearGenBank accession no.Subclade
1ZMB Mam 106581MaleNA
2ZMB Mam 106591MaleCameroonON474903Scriptus
3ZMB Mam 106592MaleAngola1903NA
4ZMB Mam 106597MaleCameroonON474904Scriptus
5ZMB Mam 106604MaleGhanaON474905Scriptus
6ZMB Mam 106608MaleMozambiqueNA
7ZMB Mam 106609MaleON474906Sylvaticus
8ZMB Mam 106620MaleAngola1902ON474907Sylvaticus
9ZMB Mam 106625MaleCameroon1909NA
10ZMB Mam 106626MaleSouth AfricaON474908Sylvaticus
11ZMB Mam 106659Male1909NA
12ZMB Mam 106686MaleCameroon1909NA
13ZMB Mam 106691MaleTanzania1909ON474909Scriptus
14ZMB Mam 107602FemaleEquatorial Guinea1909ON474910Scriptus
15ZMB Mam 107662FemaleCameroon1912ON474911Scriptus
16ZMB Mam 107664FemaleDRCON474912Sylvaticus
17ZMB Mam 107665FemaleTanzania1897ON474913Sylvaticus
18ZMB Mam 107667FemaleEthiopia1900ON474914Sylvaticus
19ZMB Mam 107680FemaleCameroonON474915Scriptus
20ZMB Mam 107683FemaleAngolaON474916Scriptus
21ZMB Mam 107722FemaleAngolaON474918Scriptus
22ZMB Mam 107732FemaleTanzania1909ON474919Sylvaticus
23ZMB Mam 107735FemaleCameroon1910ON474920Scriptus
24ZMB Mam 107740FemaleCameroon1904ON474921Scriptus
25ZMB Mam 107751MaleCameroonON474922Scriptus
26ZMB Mam 107796FemaleSouth SudanNA
27ZMB Mam 107812FemaleCameroonNA
28ZMB Mam 16105FemaleTogo1912ON474890Scriptus
29ZMB Mam 2169MaleSouth AfricaNA
30ZMB Mam 31590MaleTanzaniaNA
31ZMB Mam 32456MaleEthiopiaON474891Sylvaticus
32ZMB Mam 37177MaleDRCON474892Sylvaticus
33ZMB Mam 37178MaleDRCON474893Sylvaticus
34ZMB Mam 40389MaleAngola1924ON474894Scriptus
35ZMB Mam 40543MaleEthiopiaON474895Sylvaticus
36ZMB Mam 41189FemaleSenegal1929ON474896Scriptus
37ZMB Mam 41729MaleKenya1925ON474897Sylvaticus
38ZMB Mam 45680MaleSouth Africa1933ON474898Sylvaticus
39ZMB Mam 48268MaleNigeria1910ON474899Scriptus
40ZMB Mam 4931MaleON474889Scriptus
41ZMB Mam 60992MaleTanzania1938ON474900Sylvaticus
42ZMB Mam 60995MaleTanzania1938ON474901Sylvaticus
43ZMB Mam 90802FemaleTanzania1913ON474902Sylvaticus
Table 1.

Bushbuck (Tragelaphus scriptus) cranial specimens used in the analyses. Specimens without GenBank accession numbers were not sequenced or did not yield sufficient mitochondrial DNA; NA subcade refers to these specimens. Subclade taxonomy was determined by molecular data (see Fig. 4). DRC refers to the Democratic Republic of the Congo.

IDSpecimen numberSexCountryYearGenBank accession no.Subclade
1ZMB Mam 106581MaleNA
2ZMB Mam 106591MaleCameroonON474903Scriptus
3ZMB Mam 106592MaleAngola1903NA
4ZMB Mam 106597MaleCameroonON474904Scriptus
5ZMB Mam 106604MaleGhanaON474905Scriptus
6ZMB Mam 106608MaleMozambiqueNA
7ZMB Mam 106609MaleON474906Sylvaticus
8ZMB Mam 106620MaleAngola1902ON474907Sylvaticus
9ZMB Mam 106625MaleCameroon1909NA
10ZMB Mam 106626MaleSouth AfricaON474908Sylvaticus
11ZMB Mam 106659Male1909NA
12ZMB Mam 106686MaleCameroon1909NA
13ZMB Mam 106691MaleTanzania1909ON474909Scriptus
14ZMB Mam 107602FemaleEquatorial Guinea1909ON474910Scriptus
15ZMB Mam 107662FemaleCameroon1912ON474911Scriptus
16ZMB Mam 107664FemaleDRCON474912Sylvaticus
17ZMB Mam 107665FemaleTanzania1897ON474913Sylvaticus
18ZMB Mam 107667FemaleEthiopia1900ON474914Sylvaticus
19ZMB Mam 107680FemaleCameroonON474915Scriptus
20ZMB Mam 107683FemaleAngolaON474916Scriptus
21ZMB Mam 107722FemaleAngolaON474918Scriptus
22ZMB Mam 107732FemaleTanzania1909ON474919Sylvaticus
23ZMB Mam 107735FemaleCameroon1910ON474920Scriptus
24ZMB Mam 107740FemaleCameroon1904ON474921Scriptus
25ZMB Mam 107751MaleCameroonON474922Scriptus
26ZMB Mam 107796FemaleSouth SudanNA
27ZMB Mam 107812FemaleCameroonNA
28ZMB Mam 16105FemaleTogo1912ON474890Scriptus
29ZMB Mam 2169MaleSouth AfricaNA
30ZMB Mam 31590MaleTanzaniaNA
31ZMB Mam 32456MaleEthiopiaON474891Sylvaticus
32ZMB Mam 37177MaleDRCON474892Sylvaticus
33ZMB Mam 37178MaleDRCON474893Sylvaticus
34ZMB Mam 40389MaleAngola1924ON474894Scriptus
35ZMB Mam 40543MaleEthiopiaON474895Sylvaticus
36ZMB Mam 41189FemaleSenegal1929ON474896Scriptus
37ZMB Mam 41729MaleKenya1925ON474897Sylvaticus
38ZMB Mam 45680MaleSouth Africa1933ON474898Sylvaticus
39ZMB Mam 48268MaleNigeria1910ON474899Scriptus
40ZMB Mam 4931MaleON474889Scriptus
41ZMB Mam 60992MaleTanzania1938ON474900Sylvaticus
42ZMB Mam 60995MaleTanzania1938ON474901Sylvaticus
43ZMB Mam 90802FemaleTanzania1913ON474902Sylvaticus
IDSpecimen numberSexCountryYearGenBank accession no.Subclade
1ZMB Mam 106581MaleNA
2ZMB Mam 106591MaleCameroonON474903Scriptus
3ZMB Mam 106592MaleAngola1903NA
4ZMB Mam 106597MaleCameroonON474904Scriptus
5ZMB Mam 106604MaleGhanaON474905Scriptus
6ZMB Mam 106608MaleMozambiqueNA
7ZMB Mam 106609MaleON474906Sylvaticus
8ZMB Mam 106620MaleAngola1902ON474907Sylvaticus
9ZMB Mam 106625MaleCameroon1909NA
10ZMB Mam 106626MaleSouth AfricaON474908Sylvaticus
11ZMB Mam 106659Male1909NA
12ZMB Mam 106686MaleCameroon1909NA
13ZMB Mam 106691MaleTanzania1909ON474909Scriptus
14ZMB Mam 107602FemaleEquatorial Guinea1909ON474910Scriptus
15ZMB Mam 107662FemaleCameroon1912ON474911Scriptus
16ZMB Mam 107664FemaleDRCON474912Sylvaticus
17ZMB Mam 107665FemaleTanzania1897ON474913Sylvaticus
18ZMB Mam 107667FemaleEthiopia1900ON474914Sylvaticus
19ZMB Mam 107680FemaleCameroonON474915Scriptus
20ZMB Mam 107683FemaleAngolaON474916Scriptus
21ZMB Mam 107722FemaleAngolaON474918Scriptus
22ZMB Mam 107732FemaleTanzania1909ON474919Sylvaticus
23ZMB Mam 107735FemaleCameroon1910ON474920Scriptus
24ZMB Mam 107740FemaleCameroon1904ON474921Scriptus
25ZMB Mam 107751MaleCameroonON474922Scriptus
26ZMB Mam 107796FemaleSouth SudanNA
27ZMB Mam 107812FemaleCameroonNA
28ZMB Mam 16105FemaleTogo1912ON474890Scriptus
29ZMB Mam 2169MaleSouth AfricaNA
30ZMB Mam 31590MaleTanzaniaNA
31ZMB Mam 32456MaleEthiopiaON474891Sylvaticus
32ZMB Mam 37177MaleDRCON474892Sylvaticus
33ZMB Mam 37178MaleDRCON474893Sylvaticus
34ZMB Mam 40389MaleAngola1924ON474894Scriptus
35ZMB Mam 40543MaleEthiopiaON474895Sylvaticus
36ZMB Mam 41189FemaleSenegal1929ON474896Scriptus
37ZMB Mam 41729MaleKenya1925ON474897Sylvaticus
38ZMB Mam 45680MaleSouth Africa1933ON474898Sylvaticus
39ZMB Mam 48268MaleNigeria1910ON474899Scriptus
40ZMB Mam 4931MaleON474889Scriptus
41ZMB Mam 60992MaleTanzania1938ON474900Sylvaticus
42ZMB Mam 60995MaleTanzania1938ON474901Sylvaticus
43ZMB Mam 90802FemaleTanzania1913ON474902Sylvaticus
Map of specimens with location information used in this study (39 individuals; four additional specimens did not have location information and are not shown here). Mitochondrial subclade designations (scriptus and sylvaticus) are based on the molecular phylogeny, and are designated NA (grey) where this could not be determined. Location coordinates are based on locality information on historical specimens and were estimated as closely as possible. The modern range of the bushbuck is shown in grey (IUCN SSC Antelope Specialist Group 2016). See Supporting Information, Fig. S2 for a map labelled with specimen numbers and countries.
Figure 1.

Map of specimens with location information used in this study (39 individuals; four additional specimens did not have location information and are not shown here). Mitochondrial subclade designations (scriptus and sylvaticus) are based on the molecular phylogeny, and are designated NA (grey) where this could not be determined. Location coordinates are based on locality information on historical specimens and were estimated as closely as possible. The modern range of the bushbuck is shown in grey (IUCN SSC Antelope Specialist Group 2016). See Supporting Information, Fig. S2 for a map labelled with specimen numbers and countries.

Digitization, landmarking and principal component analysis

We created 3D surface models of all crania using a 3D surface scanner (Artec Spider) and Artec Studio Professional software (v.13). We scanned all external surfaces of the crania, then aligned and registered the partial scans, created a fast fusion to check for missing surfaces (repeating the scanning, alignment and registration steps when necessary), removed outliers and non-cranial elements, created a sharp fusion, applied texture, simplified the fusion to one million vertices, and exported the model in ply format with vertex colour. We used default parameters except where otherwise specified.

To characterize cranial shape, we used a set of 53 fixed 3D landmarks placed on the surface of each cranial model at anatomically homologous points chosen to capture morphological variation in bovid crania (Bibi and Tyler 2022). The Supporting Information (Table S1) details the landmark numbers and anatomical positions, which are illustrated in Figure 2. We performed all landmarking and geometric morphometric analyses in RStudio (v.1.4.1103; RStudio Team 2019) and the R statistical environment (v.4.0.2; R Core Team 2020). Functions refer to the geomorph package (v.4.0.0; Adams and Otárola-Castillo 2013) except where otherwise stated. We placed landmarks on the centred surface models using the digit.fixed() function, then Procrustes-transformed the raw landmark coordinates using both the gpagen() function, in order to extract centroid size, and the function bilat.symmetry(), to account for bilateral symmetry reflected in the left and right landmarks (Supporting Information, Table S1). Finally, we performed a principal component (PC) analysis with the function gm.prcomp() on the symmetric component of shape variation from this latter set of transformed landmarks.

Surface model of Tragelaphus scriptus cranium in dorsal, lateral and ventral views, with points representing 53 digitized landmarks used to characterize cranial shape (22 blue landmarks mirrored on left and right sides; 9 orange medial landmarks). See the Supporting Information (Table S1) for anatomical descriptions of the locations of landmarks.
Figure 2.

Surface model of Tragelaphus scriptus cranium in dorsal, lateral and ventral views, with points representing 53 digitized landmarks used to characterize cranial shape (22 blue landmarks mirrored on left and right sides; 9 orange medial landmarks). See the Supporting Information (Table S1) for anatomical descriptions of the locations of landmarks.

In order to visualize the effects on cranial shape attributable to size and PCs, we selected one specimen close to the mean shape (ZMB Mam 41189) and warped it to the mean shape calculated with the function warpRefMesh(). We then warped this mean shape mesh into models representing the extremes of centroid size in our sample set, in addition to the minima and maxima of the first three PCs (Supporting Information, Fig. S1).

We used ordinary least squares ANOVA with randomized permutation to examine the relationships between cranial shape and three variables: mitochondrial subclade (scriptus vs. sylvaticus), sex and size. We tested the comparison shape ~ subclade for significant morphological differences between the two bushbuck subclades. Specimens were assigned to a subclade by the mitochondrial phylogeny (see Results). We used the comparison shape ~ sex to examine the significance of sexual dimorphism and compare it to differences between subclades (sexual determination in bushbuck crania is relatively straightforward, because females are hornless). Finally, we used the comparison shape ~ size to assess the significance of size allometry in the generation of morphological differences between subclades and to determine whether the two subclades follow similar allometric trajectories (slopes). Here, we used the natural log of centroid size as a proxy for body mass. For models without clade as a variable, we included all T. scriptus specimens (N = 42); otherwise, only specimens for which we obtained mitochondrial sequence information were included (N = 33). All comparisons used the procD.lm() function in geomorph (Adams and Otárola-Castillo 2013).

We visualized the relationship between phylogeny and morphospace using the phylomorphospace() function in the phytools package (v.0.7-70; Revell 2012), including only those bushbuck specimens for which we obtained both morphological and sequence data (N = 33). We used the function physignal() (multivariate version of the K-statistic; Blomberg et al. 2003, Adams 2014) to estimate the degree of phylogenetic signal present in the Procrustes-transformed shape coordinates of specimens for which we had sequence data.

Mitogenome reconstruction

Sampling

Cardini (2020) discussed the impact of autocorrelation in museum collections (specimens often come from the same locality and can often be closely related individuals), which can make them poor representatives of within-species variation. To counteract this, we have included specimens from many sampling locations across the species range, and the results of the phylogenetic analysis indicate that very few individuals in this study have an extremely close relationship (e.g. the individuals from Cameroon in Figs 4, 7).

Plots of size allometry examined via ln-transformed centroid size and principal component 1 (PC1) (A); or regression of shape (B). Note the strong influence of size on shape. The key is the same as that shown in Figure 1; labels refer to specimens in Table 1.
Figure 3.

Plots of size allometry examined via ln-transformed centroid size and principal component 1 (PC1) (A); or regression of shape (B). Note the strong influence of size on shape. The key is the same as that shown in Figure 1; labels refer to specimens in Table 1.

Maximum likelihood phylogeny of all sequenced bushbuck specimens with other tragelaphines. Node numbers indicate approximate likelihood ratio test (aLRT) support values; only nodes with > 95% support are shown. Nodes with < 80% support were collapsed. Numbers refer to specimens in Table 1 or GenBank reference specimens in Table 2. Dots are coloured by country and correspond to those in Figure 7.
Figure 4.

Maximum likelihood phylogeny of all sequenced bushbuck specimens with other tragelaphines. Node numbers indicate approximate likelihood ratio test (aLRT) support values; only nodes with > 95% support are shown. Nodes with < 80% support were collapsed. Numbers refer to specimens in Table 1 or GenBank reference specimens in Table 2. Dots are coloured by country and correspond to those in Figure 7.

Laboratory procedures

To determine the most productive, least destructive skeletal element source of DNA for this set of specimens, we first prepared ~30 mg of ground bone material sampled from five different locations on the cranium of one specimen (ZMB Mam 106608), then extracted the DNA using a QIAamp DNA Investigator Kit following the protocol described by Rohland and Hofreiter (2007) and modified by Dabney et al. (2013). After extraction, we determined the degree of DNA degradation using an Agilent D1000 DNA ScreenTape assay and TapeStation Analysis Software v.3.1.1. We measured DNA concentration with a Qubit 2.0 Fluorometer and dsDNA HS Assay Kit (Thermo Fisher Scientific). Only the sample from the nasal turbinate bone produced any detectable concentration of DNA (1.06 ng/µL). For the remaining specimens, we prepared 30–400 mg tissue samples from the nasal turbinate bones and tissue, extracted the DNA, and checked the degradation and concentration following the protocol outlined above (Supporting Information, Table S2).

We prepared multiplexed DNA libraries for paired-end Illumina sequencing using a NEXTFLEX Rapid DNA-Seq Kit 2.0 and Bundle with NEXTflex-HT Barcodes 1–96 (Bioo Scientific/PerkinElmer). We followed the manufacturer’s instructions for ‘Option1 – no size selection’, with the following exceptions: DNA was not sheared before library preparation; Clean-Up Bead Volume was set to 45 µL instead of 35 µL after Adapter Ligation, to avoid losing small target-fragments; and Resuspension Buffer was decreased to 25 µL to concentrate amplified libraries. Finally, library size, quality and quantity were determined as above. To obtain a sufficient quantity for subsequent enrichment, several libraries were re-amplified.

After this, we enriched the libraries for targeted sequencing using myBaits (Arbor Biosciences, USA). We designed two different biotinylated RNA baits, complementary to sequence targets for sylvaticus and scriptus subclades (Kit A and B), using existing mitogenomes from GenBank (Table 2). Assuming a high amount of non-target templates, we followed the recommendations for ‘Enriching low-copy, degraded and/or contaminated DNA Libraries’, using the Arbor Bioscience myBaits protocol for ‘Dual Round Enrichment’ and the myBaits manual v.4. In total, we enriched 44 individuals in pools of two to four libraries per enrichment reaction in the following groups: 16 individuals/libraries with Kit A (sylvaticus); 17 individuals/libraries with Kit B (scriptus); 6 ‘Unknown’ individuals/libraries, each sample with both kits; and 3 ‘Unknown’ individuals/libraries with Kit A (sylvaticus). Enrichment pools were checked for quality and quantity as above.

Table 2.

Ten additional Tragelaphus mitogenomic sequences (nine from GenBank and one from this study) included in the molecular analyses. These include two reference sequences (Tragelaphus scriptus) used in the bioinformatics workflow and eight sequences representing the remaining tragelaphine species.

TaxonGenBank ID/ Specimen No.Source
T. scriptusJN632707.1 (sylvaticus reference)Hassanin et al. (2012)
T. scriptusJN632705.1 (scriptus reference)Hassanin et al. (2012)
T. buxtoniNC_038064.1Li et al. (2018)
T. oryxNC_020750.1Hassanin et al. (2012)
T. strepsicerosNC_020752.1Hassanin et al. (2012)
T. eurycerusNC_020749.1Hassanin et al. (2012)
T. spekiiNC_020620.1Hassanin et al. (2012)
T. angasiiNC_020748.1Hassanin et al. (2012)
T. imberbisNC_020619.1Hassanin et al. (2012)
T. spekiiON474917 (ZMB Mam 107720)This study
TaxonGenBank ID/ Specimen No.Source
T. scriptusJN632707.1 (sylvaticus reference)Hassanin et al. (2012)
T. scriptusJN632705.1 (scriptus reference)Hassanin et al. (2012)
T. buxtoniNC_038064.1Li et al. (2018)
T. oryxNC_020750.1Hassanin et al. (2012)
T. strepsicerosNC_020752.1Hassanin et al. (2012)
T. eurycerusNC_020749.1Hassanin et al. (2012)
T. spekiiNC_020620.1Hassanin et al. (2012)
T. angasiiNC_020748.1Hassanin et al. (2012)
T. imberbisNC_020619.1Hassanin et al. (2012)
T. spekiiON474917 (ZMB Mam 107720)This study
Table 2.

Ten additional Tragelaphus mitogenomic sequences (nine from GenBank and one from this study) included in the molecular analyses. These include two reference sequences (Tragelaphus scriptus) used in the bioinformatics workflow and eight sequences representing the remaining tragelaphine species.

TaxonGenBank ID/ Specimen No.Source
T. scriptusJN632707.1 (sylvaticus reference)Hassanin et al. (2012)
T. scriptusJN632705.1 (scriptus reference)Hassanin et al. (2012)
T. buxtoniNC_038064.1Li et al. (2018)
T. oryxNC_020750.1Hassanin et al. (2012)
T. strepsicerosNC_020752.1Hassanin et al. (2012)
T. eurycerusNC_020749.1Hassanin et al. (2012)
T. spekiiNC_020620.1Hassanin et al. (2012)
T. angasiiNC_020748.1Hassanin et al. (2012)
T. imberbisNC_020619.1Hassanin et al. (2012)
T. spekiiON474917 (ZMB Mam 107720)This study
TaxonGenBank ID/ Specimen No.Source
T. scriptusJN632707.1 (sylvaticus reference)Hassanin et al. (2012)
T. scriptusJN632705.1 (scriptus reference)Hassanin et al. (2012)
T. buxtoniNC_038064.1Li et al. (2018)
T. oryxNC_020750.1Hassanin et al. (2012)
T. strepsicerosNC_020752.1Hassanin et al. (2012)
T. eurycerusNC_020749.1Hassanin et al. (2012)
T. spekiiNC_020620.1Hassanin et al. (2012)
T. angasiiNC_020748.1Hassanin et al. (2012)
T. imberbisNC_020619.1Hassanin et al. (2012)
T. spekiiON474917 (ZMB Mam 107720)This study

Sequencing was carried out at the Berlin Center for Genomics in Biodiversity Research (BeGenDiv/DE-Berlin). After test sequencing on an Illumina MiSeq, sequences were run in two batches on an Illumina NextSeq using paired-end reactions, the first with 26 enriched libraries and the second with 22 enriched libraries and one control, representing a total of 34 specimens: 33 bushbucks (T. scriptus) and one sitatunga (Tragelaphus spekii Speke, 1863).

Read cleaning and mitochondrial reconstruction

After sequencing, we cleaned the raw reads using a custom-designed workflow for museum specimens (see https://github.com/MozesBlom/NGSdata_tools). This workflow (see Supporting Information, Fig. S4) consists of: (i) removal of duplicates using Super deduper (v.1.0.0; Petersen et al. 2015); (ii) adapter trimming using Trimmomatic (v.0.36; Bolger et al. 2014); (iii) merging of overlapping read pairs with PEAR (v.0.9.10; Zhang et al. 2013); (iv) quality trimming with Trimmomatic (v.0.36; Bolger et al. 2014); and (v) removal of low-complexity reads using a Python script developed in house. Before and after cleaning, we inspected the quality of the reads visually using FastQC (v.0.11.5; Andrews 2010).

Previous studies have reported a deep mitochondrial split between T. scriptus and T. sylvaticus Sparrman, 1780 (see Introduction), and reference selection can have an impact on downstream evolutionary analyses (Prasad et al. 2022). To remove potential biases arising from unequal phylogenetic distance between focal specimens and the reference, we mapped each individual twice and used two reference genomes from GenBank (Table 2); one belonging to each of the two mitochondrial subclades (JN632705.1 for the scriptus subclade; JN632707.1 for sylvaticus). By mapping each individual against both references in parallel and comparing the obtained datasets, we were able to identify differing sites and thus mitigate reference bias.

We mapped the polished reads to each reference using the BWA-MEM algorithm (Li 2013) and used SAMtools (v.0.1.19; Li et al. 2009) as follows: (i) to convert the resultant SAM files into BAM format; (ii) to remove reads that did not pair properly with the reference; (iii) to sort and index the BAM files; and (iv) to merge the library-specific BAM files by individual. Next, we used FreeBayes (v.1.0.2; Garrison and Marth 2012) to call variants while considering only bases with a phred quality score > 20 from reads with a mapping quality score > 20. Subsequently, we filtered the variants with VCFtools (v.0.1.15; Danecek et al. 2011) and vcflib (Garrison et al. 2022) as follows: (i) by eliminating individual genotypes with a depth less than eight reads; (ii) by decomposing complex variants; and (iii) by removing indels. Finally, we created a consensus sequence for each individual and each reference by using BCFtools (v.1.3.1; Danecek et al. 2021). During variant calling, we ran FreeBayes deliberately under a fixed ploidy of two, requiring at least two reads supporting the alternative allele. However, mtDNA is haploid, and consistent support for alternative alleles could be the result of contamination. Therefore, we incorporated the reference allele in heterozygous sites only when calling the draft consensus sequences using BCFtools.

In order to reduce the risk of incorporating non-endogenous sequence data from other museum specimens, we carried out an additional, unconventional filtering procedure. This step consisted of a second mapping and variant calling iteration. We repeated the mapping and variant calling pipeline using the same settings as described above but used the consensus sequences called for each of the two original reference genomes instead. Ideally, reads mapped against their own consensus sequence should align perfectly, and spurious genetic variants could indicate DNA contamination or mapping errors. Therefore, we masked all positions at which alternative alleles persisted. After this second iteration, we also masked sites with a read depth below eight to avoid the incorporation of reference alleles in low-coverage regions. Finally, we compared the two consensus sequences for each individual, masked sites where they differed and used the resulting individual consensus sequence for downstream phylogenetic analyses.

In addition to the 33 bushbuck specimens successfully sequenced in this study, the two reference sequences, and the T. spekii sequence (ZMB Mam 107720), we also included in our phylogenetic analyses eight complete mitogenome sequences (from GenBank) representing all other tragelaphine species (Table 2). We aligned these sequences using the FFT-NS-2, progressive method in MAFFT (v.7.453; Katoh et al. 2002, Katoh and Frith 2012) and checked the alignment by eye in Mesquite (v.3.61; Maddison and Maddison 2019).

We then determined the best-fitting site model of gene evolution for this alignment (N = 49) using Bayesian information criterion in ModelFinder (Kalyaanamoorthy et al. 2017). We created a maximum likelihood (ML) tree using the desktop version of IQ-TREE (v.1.6.12; Nguyen et al. 2015) and calculated branch support using the Shimodaira–Hasegawa approximate likelihood ratio test (SH-aLRT; Guindon et al. 2010). We also produced a Bayesian tree in BEAST (v.2.6.7; Bouckaert et al. 2019; see Supporting Information, Fig. S3), which corroborated the results of the ML tree (Supporting Information, Fig. S4). To visualize the relationships among haplotypes, we used MEGA11 (v.11.0.8; Tamura et al. 2021) to remove all sites with ambiguous bases or gaps in one or more of the sequences in the alignment, then constructed a median-joining network (Bandelt et al. 1999) in PopART (v.1.7; Leigh and Bryant 2015). Given that most sequences were of similar length, sequences with many gaps were removed from the alignment before removing these sites and constructing the median-joining network. To calculate phylogenetic signal in the morphological data, we used the function physignal() to calculate a multivariate K-statistic and performed a Mantel test between the molecular and morphological distance matrices.

We used the following R packages for data manipulation and visualization: abind (v.1.4-5; Plate and Heiberger 2016), ade4 (Dray et al. 2007), ggpubr (v.0.4.0; Kassambara 2017), ggrepel (v.0.9.1; Slowikowski 2018), ips (v.0.0.11; Heibl et al. 2019), RColorBrewer (v.1.1-2; Neuwirth 2014), scales (v.1.1.1; Wickham and Seidel 2020), sf (v.0.9-8; Pebesma 2018) and tidyverse (v.1.3.1; Wickham et al. 2019).

RESULTS

Geometric morphometrics

Subclade, sex and size all have a significant effect on shape, although size and, especially, sex explain higher proportions of shape than does subclade (Table 3). After accounting for sex and size (shape ~ sex + size + subclade), differences in shape between scriptus and sylvaticus subclades are significant but small (Table 3). Although size and subclade both have a significant effect on shape, the interaction between size and subclade (shape ~ size * subclade) is non-significant (N = 33, P = 0.556, Z = −0.1414) in explaining shape, indicating similar size allometry in both scriptus and sylvaticus subclades. The interaction between size and sex in explaining shape (shape ~ size * sex) is also non-significant (P = 0.062, Z = 1.6077), indicating similar size allometry between males and females (despite the sexually dimorphic presence/absence of horns). The relationship between ln-transformed centroid size and the regression of size on shape is shown in Figure 3; a pairwise test of slopes confirms no significant differences between the two subclades (P = 0.0978).

Table 3.

Results of the ANOVA of sex, size and subclade on shape. Models that include the subclade variable were calculated using the subset of bushbuck specimens for which we obtained sequence data (N = 33); all other models included all bushbuck specimens (N = 43). The table includes significant results only; see the Results section for discussion of non-significant results.

Shape ~Residual d.f.d.f.RSSSSMSRsqFZPr(>F)
Size4110.0960.0150.0150.1396.5933.7260.001
Sex4110.0920.0200.0200.1778.8434.5860.001
Size + sex400.0880.0240.0150.2145.4374.5550.001
 Size10.0150.0150.1397.0473.7930.001
 Sex10.0080.0080.0753.8263.9770.001
 Subclade3110.0760.0090.0090.1073.7102.8610.002
Size + sex + subclade290.0580.0270.0090.3184.5084.2190.001
 Size10.0130.0130.1496.3463.6250.001
 Sex10.0080.0080.0903.8423.4880.001
 Subclade10.0070.0070.0783.3353.2650.001
Size + subclade300.0650.0200.0100.2344.5803.8400.001
 Size10.0130.0130.1495.8443.5420.001
 Subclade10.0070.0070.0853.3163.0350.001
Shape ~Residual d.f.d.f.RSSSSMSRsqFZPr(>F)
Size4110.0960.0150.0150.1396.5933.7260.001
Sex4110.0920.0200.0200.1778.8434.5860.001
Size + sex400.0880.0240.0150.2145.4374.5550.001
 Size10.0150.0150.1397.0473.7930.001
 Sex10.0080.0080.0753.8263.9770.001
 Subclade3110.0760.0090.0090.1073.7102.8610.002
Size + sex + subclade290.0580.0270.0090.3184.5084.2190.001
 Size10.0130.0130.1496.3463.6250.001
 Sex10.0080.0080.0903.8423.4880.001
 Subclade10.0070.0070.0783.3353.2650.001
Size + subclade300.0650.0200.0100.2344.5803.8400.001
 Size10.0130.0130.1495.8443.5420.001
 Subclade10.0070.0070.0853.3163.0350.001

Abbreviations: MS, mean squares; Pr(>F) = P-value for F-value; Rsq, coefficient of shape variation attributed to each covariate; RSS, residual sum of squares; SS, sum of squares; Z, effect size (based on F-values).

Table 3.

Results of the ANOVA of sex, size and subclade on shape. Models that include the subclade variable were calculated using the subset of bushbuck specimens for which we obtained sequence data (N = 33); all other models included all bushbuck specimens (N = 43). The table includes significant results only; see the Results section for discussion of non-significant results.

Shape ~Residual d.f.d.f.RSSSSMSRsqFZPr(>F)
Size4110.0960.0150.0150.1396.5933.7260.001
Sex4110.0920.0200.0200.1778.8434.5860.001
Size + sex400.0880.0240.0150.2145.4374.5550.001
 Size10.0150.0150.1397.0473.7930.001
 Sex10.0080.0080.0753.8263.9770.001
 Subclade3110.0760.0090.0090.1073.7102.8610.002
Size + sex + subclade290.0580.0270.0090.3184.5084.2190.001
 Size10.0130.0130.1496.3463.6250.001
 Sex10.0080.0080.0903.8423.4880.001
 Subclade10.0070.0070.0783.3353.2650.001
Size + subclade300.0650.0200.0100.2344.5803.8400.001
 Size10.0130.0130.1495.8443.5420.001
 Subclade10.0070.0070.0853.3163.0350.001
Shape ~Residual d.f.d.f.RSSSSMSRsqFZPr(>F)
Size4110.0960.0150.0150.1396.5933.7260.001
Sex4110.0920.0200.0200.1778.8434.5860.001
Size + sex400.0880.0240.0150.2145.4374.5550.001
 Size10.0150.0150.1397.0473.7930.001
 Sex10.0080.0080.0753.8263.9770.001
 Subclade3110.0760.0090.0090.1073.7102.8610.002
Size + sex + subclade290.0580.0270.0090.3184.5084.2190.001
 Size10.0130.0130.1496.3463.6250.001
 Sex10.0080.0080.0903.8423.4880.001
 Subclade10.0070.0070.0783.3353.2650.001
Size + subclade300.0650.0200.0100.2344.5803.8400.001
 Size10.0130.0130.1495.8443.5420.001
 Subclade10.0070.0070.0853.3163.0350.001

Abbreviations: MS, mean squares; Pr(>F) = P-value for F-value; Rsq, coefficient of shape variation attributed to each covariate; RSS, residual sum of squares; SS, sum of squares; Z, effect size (based on F-values).

A, plot of all specimens along the first two principal components (PCs) of shape. B, cranial outlines transformed via thin-plate spline from the mean shape to the extremes of PCs 1 and 2. Note the separation of subclades along PC1, which largely reflects sexual dimorphism and size allometry (see Fig. 3). The key is the same as that shown in Figure 1; labels refer to specimens in Table 1.
Figure 5.

A, plot of all specimens along the first two principal components (PCs) of shape. B, cranial outlines transformed via thin-plate spline from the mean shape to the extremes of PCs 1 and 2. Note the separation of subclades along PC1, which largely reflects sexual dimorphism and size allometry (see Fig. 3). The key is the same as that shown in Figure 1; labels refer to specimens in Table 1.

Differences in size between scriptus and sylvaticus subclades are non-significant (P = 0.243, Z = 0.7417). Considering males and females separately, differences in size (size ~ subclade) are also not significant (males: N = 19, P = 0.202, Z = 0.8839; and females: N = 14, P = 0.307, Z = 0.5835). Interestingly, however, the relationship between shape and subclade (shape ~ subclade) is small but significant (Table 3). When sexes are analysed separately, this is significant only for males (P = 0.004, Z = 2.9857), not for females (P = 0.06, Z = 1.6139).

Principal component analysis

In Figure 5, we plot all specimens along the first two PCs of shape, which describe 24% and 10% of the variation, respectively. The remaining 41 components each describe < 9% of the variation.

Although the morphological differences between all extremes of shape and size are subtle, the largest skull (Centroid Size Max) resembles the minimum PC1 skull, and the smallest skull (Centroid Size Min) resembles the maximum PC1 skull (Fig. 3A). This correlation between PC1 and size is unsurprising, because the first PC of geometric morphometric analyses often loosely reflects size–shape allometry.

Principal component 1 describes 24.4% of the total shape variation and relates to both sexual dimorphism and size, with differences in cranial shape relating to the length and robustness of the cranium. The differences in shape variation described by the PC1 are similar to those of the models of minimum and maximum centroid size (Supporting Information, Fig. S1). Principal component 2 describes 10.2% of the total variation in shape and is related to the length of the snout and roundness of the braincase. Principal component 3 describes 8.1% of the total variation in shape and is related to the medial curvature of the ventral skull.

Phylogenetic inference

Our molecular phylogeny agrees with the deep divergence between scriptus and sylvaticus subclades found by other mtDNA studies (Hassanin et al. 2018). We also find a high level of structure in the sylvaticus subclade compared with scriptus, which might be attributable to the smaller sample size of scriptus individuals in our study or might be because of collection bias (Cardini 2020). However, this could also represent a real pattern of higher genetic structure and diversity in the sylvaticus group (additional data on the genetic structure of the sample and the results of the molecular experiments are given in the Supporting Information, Supplemental material; Table S3).

Phylogenetic signal

Sylvaticus and scriptus clades can be distinguished readily along PC1 (primarily representing size allometry), but within each subclade there is no clearly discernible relationship between phylogeny and morphology (Fig. 6). There is no significant evidence for phylogenetic signal in the Procrustes-transformed shape data (K ≈ 0, P = 0.6355, Z = −0.3522). However, we do find a small but significant positive correlation between molecular and morphological distance (r = 0.226, P = 0.0013; for additional distance comparisons subdivided by sex and clade, see Supporting Information, Fig. S5; Supplemental material).

Phylomorphospace shows poor correspondence between mitogenomic relationships and morphological differences. Axes refer to the first and second principal components (PC1 and PC2) of the geometric morphometric analysis. The key is the same as that shown in Figure 1; labels refer to specimens in Table 1. Note that this figure includes only specimens for which mitochondrial DNA was extracted.
Figure 6.

Phylomorphospace shows poor correspondence between mitogenomic relationships and morphological differences. Axes refer to the first and second principal components (PC1 and PC2) of the geometric morphometric analysis. The key is the same as that shown in Figure 1; labels refer to specimens in Table 1. Note that this figure includes only specimens for which mitochondrial DNA was extracted.

DISCUSSION

Morphology

Although we find significant differences in shape between the scriptus and sylvaticus subclades, these are only between males and not females. The observed differences can also be accounted for by minor (non-significant) size differences, particularly given that males of the sylvaticus subclade are larger, on average, and attain a larger maximum size (with high overlap). The amount of cranial shape variation explained by differences between subclades is far less than that explained by sexual dimorphism or size differences (Table 3), indicating minimal morphological divergence. Furthermore, although phylogenetic relationships differ along PC1, which largely reflects size–shape allometry (Figs 3, 5, 6), and although molecular and morphological distances are positively correlated, the effect is small, and there is no phylogenetic signal in total cranial shape.

Changes with increasing size (and more characteristic of sylvaticus) include a deeper face, a taller skull, and higher and more laterally set orbits (Supporting Information, Fig. S1). However, whether such differences could be detected consistently by eye at the individual level is not at all clear. The small magnitude of these differences and their high range of overlap indicate that the two bushbuck subclades do not exhibit strong morphological differentiation, at least as far their cranial shape is concerned. Given that the specimens used in this study represent nearly the full continental range of the species, this low amount of difference in cranial shape between the two subclades, which are separated both geographically and temporally and which exhibit remarkable biogeographical variation in coat patterning (Kingdon 1982), is striking. Despite a deep mitochondrial divergence, our geometric morphometric analyses found no significant evidence for morphological divergence between the scriptus and sylvaticus subclades.

Mitogenomics

Our mitochondrial phylogeny and haplotype network (Figs 4, 7) agree with previous mitochondrial phylogenies of bushbuck, confirming the presence of two deeply divergent and geographically distinct subclades within the species.

Median-joining network of 28 sequences from bushbuck specimens, with missing and ambiguous sites removed. [Three sequences with many gaps (ZMB Mam 32456, ZMB Mam 40389 and ZMB Mam 107683) were removed before the analysis.] Branches are labelled with mutation distances; unlabelled branches have a distance of one. Note the structure, the long branch separating clades and the Tanzania/Nigeria haplogroup. One reference sequence was included from both clades (Table 2).
Figure 7.

Median-joining network of 28 sequences from bushbuck specimens, with missing and ambiguous sites removed. [Three sequences with many gaps (ZMB Mam 32456, ZMB Mam 40389 and ZMB Mam 107683) were removed before the analysis.] Branches are labelled with mutation distances; unlabelled branches have a distance of one. Note the structure, the long branch separating clades and the Tanzania/Nigeria haplogroup. One reference sequence was included from both clades (Table 2).

Using mitochondrial fragment data, Hassanin et al. (2018) dated the most recent common ancestor of sylvaticus to 2.61 [95% CI: 2.1–3.1] Mya and the most recent common ancestor of scriptus to 2.31 [1.7–3.0] Mya (largely in agreement with Moodley and Bruford 2007). Their tree based on the concatenation of nuclear genes corroborated this older origin of the sylvaticus clade: 0.78 [1.1–0.5] Mya for the most recent common ancestor of sylvaticus and 0.39 [0.6–0.2] Mya for that of scriptus. Rakotoarivelo et al. (2019b) reported similar findings and hypothesized that the bushbuck probably originated in East Africa, with scriptus dispersing north-west and sylvaticus dispersing south-east via a series of vicariance events. They described 23 phylogenetically distinct haplogroups (eight in scriptus; 15 in sylvaticus), showing that much of this structural variation aligns with habitat. This older age of origination in sylvaticus and the diversity of ecoregions in south-eastern Africa (owing, in part, to the greater span of latitude that sylvaticus inhabits) explain the higher level of phylogenetic structure in the sylvaticus subclade, something also observed in our findings (Fig. 8).

Distribution of bushbuck crania according to size (log centroid size; A) and along the first three principal components of shape (B–D). Differences in size are non-significant between subclades and when assessed separately for males and females, despite the attainment of a larger maximum size in sylvaticus males. Differences in shape between subclades are significant, but only for males, not females.
Figure 8.

Distribution of bushbuck crania according to size (log centroid size; A) and along the first three principal components of shape (B–D). Differences in size are non-significant between subclades and when assessed separately for males and females, despite the attainment of a larger maximum size in sylvaticus males. Differences in shape between subclades are significant, but only for males, not females.

Additionally, the advantages of our approach, which targeted morphology and DNA sequences from the same museum individuals, is highlighted by our finding of at least three specimens with mismatching geography and subclade. Based on their provenance, ZMB Mam 106620 from Angola (34 and 21 in Fig. 1) and ZMB Mam 106691 from central Tanzania (41 in Fig. 1) would have been assumed to belong to the sylvaticus subclade but appear in the scriptus group in Figure 4. However, their mitochondrial sequences place them securely within the scriptus clade. It is most likely that the location information associated with these specimens is misleading. Although it is possible that this incongruence indicates that dispersal and gene flow across the sylvaticus–scriptus divide can cross substantial distances, it is extremely unlikely that this is the case for specimen ZMB Mam 106691 (the locality given for this specimen is Dodoma, the capital of Tanzania, which might merely indicate that the specimen was acquired there, perhaps at a market). Interestingly, female bushbucks often spend their lives near their birthplace, whereas young male bushbucks tend to leave their home range at maturity (Kingdon 1982), which could promote the rapid formation of phylogeographical structure at the mitogenomic level, but not at the nuclear level.

CONCLUSION

Until recently, the incorporation of phylogenomic and morphological data had been restricted computationally and methodologically. Here, we took advantage of advances in integrative methods and incorporated 3D shape data with mitochondrial phylogenetics to investigate patterns of variation in mtDNA and cranial morphology in the bushbuck species complex. The lack of phylogenetic signal in cranial morphology suggests that the bushbuck should remain classified as a single species, in agreement with nuclear DNA analyses (Hassanin et al. 2018, Rakotoarivelo et al. 2019a, b). The success of this approach in a species delimitation problem supports the use and development of integrative methods in taxonomic studies and the further incorporation of morphological, phylogenetic, biogeographical and ecological datasets.

SUPPLEMENTARY DATA

Supplementary data is available at Zoological Journal of the Linnean Society online.

Table S1. Identification numbers and anatomical descriptions of the 53 landmarks used in the geometric morphometric analyses.

Table S2. Results of lab enrichment procedures.

Table S3. Coverage and DNA read statistics.

Figure S1. Models representing the extremes of centroid size in our sample set, as well as the minima and maxima of the first three principal components.

Figure S2. Map of specimens with specimen number and country of origin.

Figure S3. Bayesian phylogeny of all sequenced bushbuck specimens with other tragelaphines.

Figure S4. Visualisation of pipeline used for read cleaning.

Figure S5. Pairwise comparisons between molecular and morphological distance for all specimens with sequence data.

CONFLICT OF INTEREST

None declared.

ACKNOWLEDGEMENTS

We are grateful for the editorial staff of the Journal, who worked hard to facilitate the publication process during a time of transition; two anonymous referees for their time, suggestions, and comments, which contributed greatly to the development of the manuscript; and for support provided by the Abgeordnetenhaus of Berlin and the Museum für Naturkunde's Integrative Taxonomy funds. We also thank E. Amson, I. Baird, D. Breyer, C. Funk, N. Heckeberg, E. Hempel, I. Lazagabaster, K. Mahlow, F. Mayer, S. Mbedi, A. Petzold, T. Ramm, K. Spitzer and D. Willborn.

AUTHOR CONTRIBUTIONS

C.N.B. and F.B. designed the study and secured funding. C.N.B. scanned and landmarked all crania, with input from F.B. I.W. and C.N.B. isolated the DNA, and I.W. developed methods for DNA extraction and enrichment and prepared the libraries for sequencing. M.E. and M.P.K.B. conducted bioinformatics work and generated the sequence alignments. M.E., I.W., M.P.K.B. and F.B. provided support with the analyses and methodology. C.N.B. conducted all morphological and phylogenetic analyses with input from F.B. C.N.B. led the writing of the manuscript, and all authors contributed to revision of the manuscript.

FUNDING

This work was supported by the Taxonomy Fund of the Museum für Naturkunde; and the Studienstiftung des Abgeordnetenhauses von Berlin to C.N.B. The Berlin Center for Genomics in Biodiversity Research team was partially funded by the German Federal Ministry of Education and Research (BMBF, Förderkennzeichen 033W034A).

DATA AVAILABILITY

DNA sequences are available through GenBank. Three-dimensional surface models and an R script to reproduce all results are provided as Supplementary materials (https://doi.naturkundemuseum.berlin/data/10.7479/abew-e365).

REFERENCES

Adams
DC.
A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data
.
Systematic Biology
2014
;
63
:
685
97
. https://doi.org/10.1093/sysbio/syu030

Adams
DC
,
Otárola-Castillo
E.
geomorph: an R package for the collection and analysis of geometric morphometric shape data
.
Methods in Ecology and Evolution
2013
;
4
:
393
9
. https://doi.org/10.1111/2041-210X.12035

Allen
GM.
A checklist of African mammals
.
Bulletin of the Museum of Comparative Zoology at Harvard College
1939
;
83
:
1
763
.

Andrews
S.
FastQC: A Quality Control Tool for High Throughput Sequence Data
.
2010
.
Available at
: www.bioinformatics.babraham.ac.uk/projects/fastqc/. (
20 August 2022, date last accessed
).

Bandelt
H
,
Forster
P
,
Röhl
A.
Median-joining networks for inferring intraspecific phylogenies
.
Molecular Biology and Evolution
1999
;
16
:
37
48
. https://doi.org/10.1093/oxfordjournals.molbev.a026036

Bibi
F
,
Tyler
J.
Evolution of the bovid cranium: morphological diversification under allometric constraint
.
Communications Biology
2022
;
5
:
69
. https://doi.org/10.1038/s42003-021-02877-6

Blomberg
SP
,
Garland
T
Jr
,
Ives
AR.
Testing for phylogenetic signal in comparative data: behavioral traits are more labile
.
Evolution
2003
;
57
:
717
45
. https://doi.org/10.1111/j.0014-3820.2003.tb00285.x

Bolger
AM
,
Lohse
M
,
Usadel
B.
Trimmomatic: a flexible trimmer for Illumina sequence data
.
Bioinformatics
2014
;
30
:
2114
20
. https://doi.org/10.1093/bioinformatics/btu170

Bouckaert
R
,
Vaughan
TG
,
Barido-Sottani
J
et al. .
BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis
.
PLoS Computational Biology
2019
;
15
:
e1006650
. https://doi.org/10.1371/journal.pcbi.1006650

Cardini
A.
Modern morphometrics and the study of population differences: good data behind clever analyses and cool pictures
?
The Anatomical Record
2020
;
303
:
2747
65
. https://doi.org/10.1002/ar.24397

Castelló
JR.
Bovids of the World: Antelopes, Gazelles, Cattle, Goats, Sheep, and Relatives.
Princeton
:
Princeton University Press
,
2016
.

Catalano
SA
,
Ercoli
MD
,
Prevosti
FJ.
The more, the better: the use of multiple landmark configurations to solve the phylogenetic relationships in musteloids
.
Systematic Biology
2014
;
64
:
294
306
. doi:https://doi.org/10.1093/sysbio/syu107

Chaplin
K
,
Sumner
J
,
Hipsley
C
et al. .
An integrative approach using phylogenomics and high-resolution X-ray computed tomography (CT) for species delimitation in cryptic taxa
.
Systematic Biology
2020
;
69
:
294
307
. https://doi.org/10.1093/sysbio/syz048

Dabney
J
,
Knapp
M
,
Glocke
I
et al. .
Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments
.
Proceedings of the National Academy of Sciences of the United States of America
2013
;
110
:
15758
63
. https://doi.org/10.1073/pnas.1314445110

Danecek
P
,
Auton
A
,
Abecasis
G
et al. .;
Genomes Project Analysis Group
.
The variant call format and VCFtools
.
Bioinformatics
2011
;
27
:
2156
8
. https://doi.org/10.1093/bioinformatics/btr330

Danecek
P
,
Bonfield
JK
,
Liddle
J
et al. .
Twelve years of SAMtools and BCFtools
.
GigaScience
2021
;
10
:
giab008
. https://doi.org/10.1093/gigascience/giab008

Dray
S
,
Dufour
AB
,
Chessel
D.
The ade4 package–II: two-table and K-table methods
.
R news
2007
;
7
:
47
52
. http://cran.fhcrc.org/doc/Rnews/Rnews_2007-2.pdf#page=47 (
1 August 2022, date last accessed
).

Garrison
E
,
Kronenberg
ZN
,
Dawson
ET
et al. .
A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar
.
PLoS Computational Biology
2022
;
18
(
5
):
e1009123
. https://doi.org/10.1371/journal.pcbi.1009123

Garrison
E
,
Marth
G.
Haplotype-based variant detection from short-read sequencing
.
arXiv
https://doi.org/10.48550/arXiv.1207.3907,
2012
, preprint: not peer reviewed.

Gray
H
,
van Waerebeek
K
,
Owen
J
et al. .
Evolutionary drivers of morphological differentiation among three bottlenose dolphin lineages, Tursiops spp. (Delphinidae), in the northwest Indian Ocean utilizing linear and geometric morphometric techniques
.
Biological Journal of the Linnean Society
2022
;
135
:
610
29
. https://doi.org/10.1093/biolinnean/blab133

Groves
C
,
Grubb
P.
Ungulate Taxonomy
.
Baltimore
:
Johns Hopkins University Press
,
2011
.

Guindon
S
,
Dufayard
J-F
,
Lefort
V
et al. .
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
.
Systematic Biology
2010
;
59
:
307
21
. doi:https://doi.org/10.1093/sysbio/syq010

Hassanin
A
,
Delsuc
F
,
Ropiquet
A
et al. .
Pattern and timing of diversification of Cetartiodactyla (Mammalia, Laurasiatheria), as revealed by a comprehensive analysis of mitochondrial genomes
.
Comptes Rendus Biologies
2012
;
335
:
32
50
. https://doi.org/10.1016/j.crvi.2011.11.002

Hassanin
A
,
Houck
ML
,
Tshikung
D
et al. .
Multi-locus phylogeny of the tribe Tragelaphini (Mammalia, Bovidae) and species delimitation in bushbuck: evidence for chromosomal speciation mediated by interspecific hybridization
.
Molecular Phylogenetics and Evolution
2018
;
129
:
96
105
. https://doi.org/10.1016/j.ympev.2018.08.006

Heckeberg
NS.
The systematics of the Cervidae: a total evidence approach
.
PeerJ
2020
;
8
:
e8114
. https://doi.org/10.7717/peerj.8114

Heibl
C
,
Cusimano
N
,
Krah
FS.
ips: Interfaces to Phylogenetic Software in R
.
R package version 0.0.11
.
2019
. https://cran.r-project.org/web/packages/ips/index.html.

IUCN SSC Antelope Specialist Group
.
Tragelaphus scriptus (errata version published in 2017)
.
The IUCN Red List of Threatened Species 2016
, e.T22051A115165242.
2016
. https://dx.doi.org/10.2305/IUCN.UK.2016-3.RLTS.T22051A50196111.en (
8 December 2021, date last accessed
).

Kalyaanamoorthy
S
,
Minh
BQ
,
Wong
TK
et al. .
ModelFinder: fast model selection for accurate phylogenetic estimates
.
Nature Methods
2017
;
14
:
587
9
. https://doi.org/10.1038/nmeth.4285

Kassambara
A.
ggpubr: ‘ggplot2’ based publication ready plots
.
R package version 0.4.0
.
2017
. https://rpkgs.datanovia.com/ggpubr/

Katoh
K
,
Frith
MC.
Adding unaligned sequences into an existing alignment using MAFFT and LAST
.
Bioinformatics
2012
;
28
:
3144
6
. https://doi.org/10.1093/bioinformatics/bts578

Katoh
K
,
Misawa
K
,
Kuma
K-i
et al. .
MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
.
Nucleic Acids Research
2002
;
30
:
3059
66
. https://doi.org/10.1093/nar/gkf436

Kingdon
J.
East African Mammals, Vol. III.C (Bovids)
.
London
:
Academic Press
,
1982
.

Leigh
JW
,
Bryant
D.
PopART: full-feature software for haplotype network construction
.
Methods in Ecology and Evolution
2015
;
6
:
1110
6
. https://doi.org/10.1111/2041-210X.12410

Li
H.
Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM
.
arXiv
, https://doi.org/10.48550/arXiv.1303.3997,
2013
, preprint: not peer reviewed.

Li
H
,
Handsaker
B
,
Wysoker
A
et al. .;
Genome Project Data Processing Subgroup
.
The Sequence Alignment/Map format and SAMtools
.
Bioinformatics
2009
;
25
:
2078
9
. https://doi.org/10.1093/bioinformatics/btp352

Li
Z
,
Yang
J
,
Wang
W
et al. .
Complete mitochondrial genome sequence of the mountain nyala (Tragelaphus buxtoni)
.
Conservation Genetics Resources
2018
;
10
:
547
50
. https://doi.org/10.1007/s12686-018-0988-1

Maddison
WP
,
Maddison
DR.
Mesquite: a modular system for evolutionary analysis
.
2019
.
Version 3.61
. http://www.mesquiteproject.org.
20 August 2022, date last accessed
).

Moodley
Y
,
Bruford
MW.
Molecular biogeography: towards an integrated framework for conserving pan-African biodiversity
.
PLoS One
2007
;
2
:
e454
. https://doi.org/10.1371/journal.pone.0000454

Moodley
Y
,
Bruford
MW
,
Bleidorn
C
et al. .
Analysis of mitochondrial DNA data reveals non-monophyly in the bushbuck (Tragelaphus scriptus) complex
.
Mammalian Biology
2009
;
74
:
418
22
. https://doi.org/10.1016/j.mambio.2008.05.003

Neuwirth
E.
RColorBrewer: ColorBrewer Palettes
.
R package version 1.1-2
.
2014
. https://cran.r-project.org/web/packages/RColorBrewer/index.html. (
20 August 2022, date last accessed
).

Nguyen
L-T
,
Schmidt
HA
,
Von Haeseler
A
et al. .
IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
.
Molecular Biology and Evolution
2015
;
32
:
268
74
. https://doi.org/10.1093/molbev/msu300

Padial
JM
,
Miralles
A
,
De la Riva
I
et al. .
The integrative future of taxonomy
.
Frontiers in Zoology
2010
;
7
:
16
. https://doi.org/10.1186/1742-9994-7-16

Pebesma
E.
Simple features for R: standardized support for spatial vector data
.
The R Journal
2018
;
10
:
439
46
. https://doi.org/10.32614/RJ-2018-009

Petersen
KR
,
Streett
DA
,
Gerritsen
AT
,
Hunter
SS
,
Settles
ML.
Super deduper, fast PCR duplicate detection in fastq files
. In: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, BCB ’15,
491
492
. New York:
Association for Computing Machinery
.
2015
. https://doi.org/10.1145/2808719.2811568.

Petzold
A
,
Hassanin
A.
A comparative approach for species delimitation based on multiple methods of multi-locus DNA sequence analysis: a case study of the genus Giraffa (Mammalia, Cetartiodactyla)
.
PLoS One
2020
;
15
:
e0217956
. https://doi.org/10.1371/journal.pone.0217956

Plate
T
,
Heiberger
R.
abind: combine multidimensional arrays
.
R package version 1.4-5
.
2016
. https://CRAN.R-project.org/package=abind. (
20 August 2022, date last accessed
).

Plumptre
AJ
,
Wronski
T.
Tragelaphus scriptus Bushbuck
. In:
Kingdon
J
,
Hoffmann
M
, eds,
Mammals of Africa
, Vol.
VI
.
London
:
Bloomsbury
,
2013
,
163
172
.

Prasad
A
,
Lorenzen
ED
,
Westbury
MV.
Evaluating the role of reference-genome phylogenetic distance on evolutionary inference
.
Molecular Ecology Resources
2022
;
22
:
45
55
. https://doi.org/10.1111/1755-0998.13457

R Core Team
.
R: a Language and Environment for Statistical Computing
.
Vienna
:
R Foundation for Statistical Computing
,
2020
.

Rakotoarivelo
AR
,
O’Donoghue
P
,
Bruford
MW
et al. .
An ancient hybridization event reconciles mito-nuclear discordance among spiral-horned antelopes
.
Journal of Mammalogy
2019a
;
100
:
1144
55
. https://doi.org/10.1093/jmammal/gyz089

Rakotoarivelo
AR
,
O’Donoghue
P
,
Bruford
MW
et al. .
Rapid ecological specialization despite constant population sizes
.
PeerJ
2019b
;
7
:
e6476
. https://doi.org/10.7717/peerj.6476

Revell
LJ.
phytools: an R package for phylogenetic comparative biology (and other things)
.
Methods in Ecology and Evolution
2012
;
3
:
217
23
. https://doi.org/10.1111/j.2041-210X.2011.00169.x

Rohland
N
,
Hofreiter
M.
Ancient DNA extraction from bones and teeth
.
Nature Protocols
2007
;
2
:
1756
62
. doi:https://doi.org/10.1038/nprot.2007.247

RStudio Team
.
RStudio: Integrated Development Environment for R
.
Boston
:
RStudio
,
2019
.

Schlick-Steiner
BC
,
Steiner
FM
,
Seifert
B
et al. .
Integrative taxonomy: a multisource approach to exploring biodiversity
.
Annual Review of Entomology
2010
;
55
:
421
38
. doi:https://doi.org/10.1146/annurev-ento-112408-085432

Slowikowski
K.
ggrepel: automatically position non-overlapping text labels with ‘ggplot2’
.
R package version 0.9.1
.
2018
. https://github.com/slowkow/ggrepel. (
20 August 2022, date last accessed
).

Tamura
K
,
Stecher
G
,
Kumar
S.
MEGA11: molecular evolutionary genetics analysis version 11
.
Molecular Biology and Evolution
2021
;
38
:
3022
7
. https://doi.org/10.1093/molbev/msab120

Wickham
H
,
Averick
M
,
Bryan
J
et al. .
Welcome to the tidyverse
.
Journal of Open Source Software
2019
;
4
:
1686
. https://doi.org/10.21105/joss.01686

Wickham
H
,
Seidl
DP.
scales: scale functions for visualization
.
R package version 1.1.1
.
2020
. https://cran.r-project.org/web/packages/scales/index.html. (
20 August 2022, date last accessed
).

Willows-Munro
S
,
Robinson
TJ
,
Matthee
CA.
Utility of nuclear DNA intron markers at lower taxonomic levels: Phylogenetic resolution among nine Tragelaphus spp
.
Molecular Phylogenetics and Evolution
2005
;
35
:
624
36
. https://doi.org/10.1016/j.ympev.2005.01.018

Zhang
J
,
Kobert
K
,
Flouri
T
et al. .
PEAR: a fast and accurate Illumina Paired-End reAd mergeR
.
Bioinformatics
2013
;
30
:
614
20
. https://doi.org/10.1093/bioinformatics/btt593

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]