An Ancient Baboon Genome Demonstrates Long-Term Population Continuity in Southern Africa

Abstract Baboons are one of the most abundant large nonhuman primates and are widely studied in biomedical, behavioral, and anthropological research. Despite this, our knowledge of their evolutionary and demographic history remains incomplete. Here, we report a 0.9-fold coverage genome sequence from a 5800-year-old baboon from the site of Ha Makotoko in Lesotho. The ancient baboon is closely related to present-day Papio ursinus individuals from southern Africa—indicating a high degree of continuity in the southern African baboon population. This level of population continuity is rare in recent human populations but may provide a good model for the evolution of Homo and other large primates over similar timespans in structured populations throughout Africa.


Introduction
Baboons (genus Papio) are Old World Monkeys, widely distributed throughout Africa and the Arabian Peninsula. The six extant species of baboon occupy largely independent geographic ranges (Jolly 1993;Zinner et al. 2013) but readily hybridize in contact regions (Nagel 1973;Samuels and Altmann 1986;Jolly 1993;Jolly et al. 2011). The oldest splits among them date to 1.5-2 Ma, between Northern (Papio hamadryas, Papio anubis, and Papio papio) and Southern (Papio ursinus and Papio cynocephalus) clades (Zinner, Groeneveld, et al. 2009;Zinner et al. 2013;Rogers et al. 2019). The southernmost species (P. ursinus) has two deeply diverged subspecies (ursinus and grisepes), whose history and distribution may have been shaped by historical changes in range driven by aridification cycles (Sithaldeen et al. 2009;Sithaldeen et al. 2015). Thus, the boundary between the P. ursinus subspecies, as well as between P. ursinus and other species, may have shifted over time. Here, we test this by analyzing the genome of a 5800-year-old baboon from close to the present-day ursinus/grisepes contact zone. As well as illuminating this phylogeographic question, our results show more broadly the usefulness of ancient DNA for understanding the history, evolution and paleoenvironmental context of African primates.

Results
We extracted and sequenced DNA from a baboon proximal phalanx excavated at the archaeological site of Ha Makotoko, western Lesotho, and directly dated to $5,800 calBP ( fig. 1A and B, supplementary table 1 and methods, Supplementary Material online). We generated, sequenced and aligned 13 libraries (supplementary table 2, Supplementary Material online) to the Panu_2.0 (P. anubis) nuclear genome, and to the P. ursinus mitochondrial genome. For these libraries, we estimated an average endogenous DNA content of 8.5% and obtained a total mean mapped autosomal coverage of 0.93Â and mitochondrial coverage of 36.6Â. The Panu_2.0 reference genome does not contain a Y chromosome, but comparison of coverage on the X chromosome (0.47) to the autosomes (mean 0.93, range 0.84-1.04) indicates that the phalanx belonged to a male. Fragment lengths (supplementary fig. 1, Supplementary Material online) and damage patterns are consistent with authentic ancient DNA (Dabney et al. 2013) with C>T transitions at 5 0 ends affecting $15% of bases in the last position (supplementary fig. 2, Supplementary Material online). Low mitochondrial (1.3%) and X chromosome (0.47%) consensus mismatch at nonreference, nondamage sites indicates that contamination (from other baboons) is low. We restricted our analysis to reads with evidence of cytosine deamination, characteristic of authentic ancient DNA (Skoglund et al. 2014), and find results consistent with the unrestricted data (supplementary table 3,  Supplementary Material online).
We compared the ancient baboon with mitochondrial data from 66 present-day baboons (Zinner, Groeneveld, et al. 2009), including Guinea (P. papio), olive (P. anubis), hamadryas (P. hamadryas), Kinda (P. kindae), yellow (P. cynocephalus), and chacma (P. ursinus) baboons. We included geladas (Theropithecus gelada) as an outgroup. These publicly available data include the complete coding sequence (CDS) of CYTB, part of the CDS of NADH5 and the complete tRNA-His, tRNA-Ser, and tRNA-Leu sequences. The resulting tree ( fig. 1C) shows that the ancient baboon has haplogroup U13/U14 and clusters with southern P. ursinus (i.e., Cape chacma, P. ursinus ursinus). Ha Makotoko is at the eastern end of the range of this subspecies, which extends to the south and west of the Kalahari Desert (Sithaldeen et al. 2015). The most closely related specimens come from the Giant's Castle Game Reserve and the Goegap Nature Reserve ( fig. 1A). We also compared the complete ancient mitochondrial genome (36Â coverage) with complete mitochondrial genomes from 10 present-day baboons (Zinner et al. 2013), and to the 16 present-day baboons from the baboon genome project diversity panel (Rogers et al. 2019), confirming that it is closely related to present-day southern P. ursinus (supplementary fig. 3, Supplementary Material online).
Next, we analyzed the autosomal genome together with 14 present-day baboons sequenced as part of the baboon genome project diversity panel (Rogers et al. 2019). Heterozygosity at sites polymorphic in P. cynocephalus is similar in the ancient baboon (8.5%) to present-day P. ursinus (8.7%), suggesting a relatively constant level of genetic diversity. In principal component analysis (PCA), the ancient baboon falls closest to the two P. ursinus individuals ( fig. 2A). D statistics (Patterson et al. 2012) suggest that the ancient baboon might carry some ancestry related to Northern clade subspecies such as P. anubis. In particular, D(T. gelada, P. anubis, P. ursinus, Ancient) has a Z score of 12.1 suggesting that the ancient baboon shares significant drift with P. anubis to the exclusion of P. ursinus. However, this is also consistent with differential attraction to the reference genome (generated from a P. anubis individual)-a common source of bias in ancient DNA studies (Cahill et al. 2018;Gunther and Nettelblad 2019;Sheng et al. 2019). This interpretation is supported by the D statistic D(T. gelada, Ancient, P. anubis, papAnu2) that has a Z score of 18.6 indicating that the ancient baboon shares more drift with the reference than other P. anubis individuals.
To further investigate the effect of reference bias, we constructed admixture graphs using TreeMix (Pickrell and Pritchard 2012) (Patterson et al. 2012) (f 3 (P. ursinus; Ancient; P. cynocephalus); Z ¼ 15.9) so it may be an artefact of the graph fitting or reflect structure within the P. ursinus population rather than admixture.
The Y chromosomal TSPY locus can distinguish between subspecies (Tosi et al. 2003;Jolly et al. 2011). We aligned reads to the T. gelada TSPY sequence (Tosi et al. 2003) and compared with seven reported sequences (Tosi et al. 2003;Zinner, Arnold, et al. 2009;Jolly et al. 2011) (supplementary table 4, Supplementary Material online). High coverage at this locus (because of multiple TSPY copies) confirms the male sex determination. The ancient baboon haplotype appears consistent with that reported for P. ursinus griseipes in the ursinus/kindae hybrid zone in Zambia (Jolly et al. 2011). At least one other P. ursinus ursinus individual from Southern Africa (Zinner, Arnold, et al. 2009) has a different haplotype (supplementary table 4, Supplementary Material online), raising the possibility of discordant mitochondrial and Y chromosome phylogenies, consistent with female philopatry and male dispersal in P. ursinus (Kopp et al. 2014).
We investigated the metrics of the phalanx (supplementary fig. 6 respectively are within the range of variation of an identified male P. ursinus (UCT/87/09) (n ¼ 7, mean ¼ 28.3 and 6.3 mm, SD ¼ 5.1 and 0.8 mm) and a larger sample of n ¼ 16 males represented by one third phalanx per individual (mean ¼ 28.5 and 7.2 mm, SD ¼ 2.4 and 0.7 mm) (Vernon 2013). However, the medio-lateral breadth at the base (the proximal joint) is 12.5 mm larger and outside the range of variation of both comparative samples (mean ¼ 10.5 and 10.5 mm, SD ¼ 0.9 and 0.8 mm). This suggests that it was more robust than a modern animal. Because the genetic data show that it is not a hybrid, this could reflect the effect of altitude (Bergman 1847; Sayers 2014). Carbon and nitrogen isotope values (d 13 C ¼ -16.99 6 0.19& and d 15 N ¼ 4.46 6 0.18& and a C:N ratio of 3.3) reflect a predominantly C 3 diet, typical of P. ursinus, although the low value may reflect a diet more heavily focused on browse or fleshy fruits. The nitrogen values reflect low trophic level species and a largely vegetarian diet.

Discussion
Our data demonstrate that P. ursinus ursinus persisted in the foothills of the Maloti-Drakensberg Mountains throughout the past 6,000 years. This is despite the fact that paleoenvironmental proxies predict drier conditions across the summer rainfall zone which may, for example, have limited human occupation across all of southeastern Southern Africa from 6.0 to 3.5 ka (Stewart and Mitchell 2018). It remains to be seen how far back in time this continuity extends, but more ancient genomes would address that question. Additional nuclear genomes from ancient and present-day baboons will allow us to estimate the extent and timing of gene flow between the ursinus and grisepes subspecies.
These data demonstrate that it is possible to extract and sequence high quality ancient genomes from southern Africa. This 0.9Â genome is $3,500 years older than the oldest human shotgun genome from the region (Schlebusch et al. 2017) and demonstrates that it should be possible to obtain much older human genomes, but also genomes from the different baboon species and other nonhuman primates. In particular, the temporal resolution provided by ancient DNA allows precise comparisons with paleoclimate data, allowing tests of specific hypotheses about the relationship between climatic variation and phylogeography. Although the vast majority of effort in ancient DNA is geared toward humans or domesticated species, this study underscores the utility of ancient DNA for understanding the history and evolution of nonhuman species under natural conditions.
We assessed the authenticity of the ancient DNA by confirming damage patterns characteristic of ancient samples, that is, DNA fragmentation and increased C>T transitions at the 5 0 ends of DNA molecules. We used bamdamage (v20140602) from the bammds package (Malaspinas et al. 2014). The distribution of read lengths shows a peak around 40 bp. C>T transitions show high rates at 5 0 ends, affecting 15% of cytosines (supplementary fig. 1, Supplementary Material online), consistent with the presence of authentic ancient DNA. For some analyses, we restricted to reads with evidence of damage using pmdtools (Skoglund et al. 2014) with the option -threshold ¼ 3.
We tested for contamination by counting the proportion of reads that mapped to the mitochondria and did not match the majority call at sites where the majority call was nonreference. Across all sites we found that 3.6% of reads did not match, and at sites where a mismatch could not be the result of deamination, 1.3% of reads did not match. Because it is possible that potential contaminants share nonreference variants with the ancient individual, this is not a direct estimate of contamination, but nevertheless supports the authenticity of the data. We repeated the same analysis for the X chromosome, finding that 0.63% of all sites and 0.47% of nondeamination sites did not match.

Mitochondrial Analysis
To call the mitochondrial genome of our ancient baboon we skipped the removal of nonuniquely aligned reads, as NUMTs in the nuclear genome result in missed coverage in the mitochondrial genome. We required each site to be covered by 10 or more reads with base qualities > 30 and that at least 80% consensus. For the partial mitochondrial genomes of Zinner, Groeneveld, et al. (2009) we aligned the data using Mafft v7.305b (Katoh and Standley 2013) and built a tree using

Autosomal Analysis
We generated pseudohaploid calls by picking a random base from all reads covering each site in the genome. We obtained baboon genome project diversity panel SNP calls from Rogers et al. (2019), merged with the pseudohaploid ancient calls, and restricted to transversions that were polymorphic in present-day baboons for all analysis. We ran TreeMix v1.13 (Pickrell and Pritchard 2012) with the "-root" option to use T. gelada as an outgroup, and the option "-noss" to turn off sample size correction. We ran qpGraph v6065 (Patterson et al. 2012), starting with the tree inferred by TreeMix and manually adding admixture edges until the absolute value of the worst D statistic Z score was <3.
We estimated conditional nucleotide diversity (CND) by restricting to sites that were polymorphic in a single P. cynocephalus individual, and counted how many were heterozygous in present-day P. ursinus. For the ancient baboon, we counted total n i and alternative k i allele counts at each SNP i, restricted to the N SNPs where n i > 1 and then estimated CND ¼ À 2=N Á P N i¼1 ½n 2 i À k 2 i À n i À k i ð Þ 2 =½n i ðn i À 1Þ n o . We averaged the results obtained from ascertaining sites in each of the two P. cynocephalus individuals, which were very similar.
To analyze TSPY we aligned reads from the ancient baboon that had not aligned to the reference to the T. gelada TSPY sequence; GenBank: AF284278.2 (Tosi et al. 2003). We also obtained the P. hamadryas sequence from the same reference, and four other partial sequences (Zinner, Arnold, et al. 2009) which we aligned to the T. gelada sequence to identify differences.

Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.