Abstract

The study of Y chromosome variation has helped reconstruct demographic events associated with the spread of languages, agriculture, and pastoralism in sub-Saharan Africa, but little attention has been given to the early history of the continent. To address this gap, we carried out a phylogeographic analysis of haplogroups A and B in a broad data set of sub-Saharan populations. These two lineages are particularly suitable for this objective because they are the two most deeply rooted branches of the Y chromosome genealogy. Their distribution is almost exclusively restricted to sub-Saharan Africa, where their frequency peaks at 65% in groups of foragers. Combining high-resolution single nucleotide polymorphism analysis with short tandem repeat variation of their subclades reveals strong geographic and population structure for both haplogroups. This has allowed us to identify specific lineages related to regional preagricultural dynamics in different areas of sub-Saharan Africa. In addition, we observed signatures of relatively recent contact, both among Pygmies and between them and Khoisan speaker groups from southern Africa, thus contributing to the understanding of the complex evolutionary relationships among African hunter-gatherers. Finally, by revising the phylogeography of the very early human Y chromosome lineages, we have obtained support for the role of southern Africa as a sink, rather than a source, of the first migrations of modern humans from eastern and central parts of the continent. These results open new perspectives on the early history of Homo sapiens in Africa, with particular attention to areas of the continent where human fossil remains and archaeological data are scant.

Introduction

In the last few decades, the analysis of genetic variation in human populations has increased exponentially and has provided significant insights into the history of our species (Destro-Bisol et al. 2010; Renfrew 2010). One of the most frequently replicated results has been the support of the “Recent Out of Africa” model, initially based on mitochondrial DNA (mtDNA; Cann et al. 1987) and later gaining support from other genomic regions (Underhill et al. 2001; Rosenberg et al. 2002; Li et al. 2008). Systematic investigation of the genetic diversity in African populations focusing on mtDNA (Salas et al. 2002; Behar et al. 2008), Y chromosomes (Underhill et al. 2001; Cruciani et al. 2002; Tishkoff et al. 2007), and autosomal regions (Tishkoff et al. 2009) has started to provide insights into African-specific demographic events. However, although mtDNA variation has been thoroughly investigated by detailed dissection of the most informative lineages (Salas et al. 2002; Gonder et al. 2007; Behar et al. 2008), and, more recently, autosomal variation has begun to be explored in detail (Tishkoff et al. 2009), such a level of resolution has been only partially applied to African Y chromosome haplogroups. Sub-Saharan African Y chromosome diversity is represented by five main haplogroups (hgs): A, B, E, J, and R (Underhill et al. 2001; Cruciani et al. 2002; Tishkoff et al. 2007). Hgs J and R are geographically restricted to eastern and central Africa, respectively, whereas hg E shows a wider continental distribution (see also Berniell-Lee et al. 2009; Cruciani et al. 2010). Although the phylogeographic dissection of hg E is still ongoing, it has been suggested that this clade might be linked, at least in part, with the diffusion of agriculture and pastoralism in the continent during the last 4,000–5,000 years, as initially indicated by its parallel distribution to Bantu-speaking communities (Underhill et al. 2001; Henn et al. 2008).
The other two lineages, A and B, represent the most basal branches within the human Y chromosome genealogy and are dispersed across different geographic areas and populations, with considerably high frequencies in hunter-gatherer populations. These hgs have been related to demographic dynamics that are independent of the recent introduction of practices for active food production mentioned above, thus suggesting an association with complex and potentially more ancient demographic events (Underhill et al. 2001; Cruciani et al. 2002; Tishkoff et al. 2007; Berniell-Lee et al. 2009).

In this work, we present a detailed phylogeographic dissection of hgs A and B in a broad data set of sub-Saharan populations, with the aim of providing new insights into the complex and poorly investigated dynamics that characterize the preagricultural history of sub-Saharan Africa, with special attention given to the relationships among Pygmy and Khoisan-speaking populations from southern Africa. In addition, we aim to contribute to the debate on the geographic origin of Homo sapiens in Africa by testing whether the male-specific signals of early human origins are retained only among communities from eastern Africa (as suggested by fossil remains and mtDNA; White et al. 2003; McDougall et al. 2005; Behar et al. 2008) or whether they can also be found within groups from southern Africa (as indicated by genome-wide scans and early Y chromosome analyses; Hammer et al. 2001; Semino et al. 2002; Hellenthal et al. 2008; Tishkoff et al. 2009).

Materials and Methods

Single Nucleotide Polymorphisms and Short Tandem Repeat Genotyping

A database of 641 chromosomes (supplementary table S1, Supplementary Material online) was generated by collecting previously published data, analyzing novel samples, and extending the molecular analysis of previously genotyped samples. All DNA samples were obtained from blood, buccal swabs, or saliva samples and collected from unrelated healthy individuals who gave the appropriate informed consent.

Samples were genotyped with different sets of markers (supplementary table S1, Supplementary Material online). Single nucleotide polymorphism (SNP) scoring was carried out using minisequencing multiplex reactions and direct sequencing. A total of 33 markers were selected within haplogroups A and B according to the most updated Y chromosome genealogy presented in Karafet et al. 2008. These were divided among four different single base extension (SBE) assays, here referred to as MAI, MAII, MB, and MB2b (see supplementary table S2, Supplementary Material online). Primers for multiplex PCR amplification were designed using Primer3Plus software (Untergasser et al. 2007) and are presented in supplementary tables S3 and S4 (Supplementary Material online). Self- and cross-compatibility among all primer pairs included in the same reaction were tested with the software Autodimer (see Web resources in Acknowledgments). Y chromosome specificity of each primer was tested using BlastN (basic local alignment search tool).

The Qiagen Multiplex PCR kit and conditions specified by the producer were applied with primer concentrations ranging between 0.15 and 0.8 μM. PCR products (1.5 μl) were cleaned using 1.5 μl of ExoSAP-IT (USB Corporation) for 15 min at 37 °C followed by 15 min at 80 °C.

Minisequencing SBE primers were selected using allele-specific primer extension tools in the National Institute of Standards and Technology (NIST) Online DNA Analysis Tools page (see Web resources in Acknowledgments), and nonspecific tails of different lengths were added to each in order to ensure complete capillary separation of SNaPshot products (supplementary tables S5 and S6, Supplementary Material online). The multiplex minisequencing assays were performed using 1 μl of purified product in a total volume of 5 μl using 2 μl of SNaPshot reaction mix (Applied Biosystems, Carlsbad, CA) according to the SNaPshot Kit protocol. Excess fluorescently labeled dideoxynucleotide triphosphates were inactivated, and 1 μl of cleaned multiplex extension product was run on an ABI PRISM 3130 Genetic Analyzer. Allele calling was performed using GeneMapper software (v. 3.7; Applied Biosystems, Carlsbad, CA).

Direct sequencing was used to screen markers P108 and P114. Primers for amplification are reported in supplementary table S3 (Supplementary Material online). Amplification of MSY2 was carried out according to Bao et al. 2000.

Short tandem repeat (STR) genotyping was conducted using commercially available STR kits (Krenke et al. 2005; Mulero et al. 2006) as well as multiplexes developed in-house (Beleza et al. 2003). All the samples included here were genotyped for ten STRs: DYS19, DYS389-I, DYS389-II (the allele reported in supplementary table S1, Supplementary Material online, has been obtained by subtracting the DYS389-I allele), DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, and DYS439. A subset of the samples was tested for an additional five loci (DYS448, DYS456, DYS458, DYS635, and Y-GATA-H4). In the statistical analyses, specific loci (DYS385, DYS389-II, DYS390, DYS448, and DYS635) were excluded due to allelic homoplasy as reported in the NIST Y-STR Fact Sheets (see Web resources in Acknowledgments). After this exclusion, eight STR loci were used in both phylogeographic and intralineage analyses in order to maintain broad population coverage.
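The locus bookkeeping described above can be sketched as follows. This is only an illustration: the function and variable names are hypothetical, the allele values in the usage note are invented, and the sketch assumes the kit reports DYS389-II as the combined repeat count that includes DYS389-I.

```python
# Ten STRs typed in all samples, in the order listed in the text.
GENOTYPED = [
    "DYS19", "DYS389-I", "DYS389-II", "DYS390", "DYS391",
    "DYS392", "DYS393", "DYS437", "DYS438", "DYS439",
]

# Loci excluded from the statistical analyses (as listed in the text).
EXCLUDED = {"DYS385", "DYS389-II", "DYS390", "DYS448", "DYS635"}


def reported_dys389_ii(raw_dys389_ii, dys389_i):
    """DYS389-II allele after subtracting the DYS389-I allele.

    Assumption: the raw value is the combined repeat count.
    """
    return raw_dys389_ii - dys389_i


# Loci that remain for the phylogeographic and intralineage analyses.
ANALYSIS_LOCI = [locus for locus in GENOTYPED if locus not in EXCLUDED]
```

With invented values, `reported_dys389_ii(30, 13)` gives 17, and `ANALYSIS_LOCI` contains the eight loci DYS19, DYS389-I, DYS391, DYS392, DYS393, DYS437, DYS438, and DYS439.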

Network Reconstruction and Diversity Estimation

Median-Joining networks (Bandelt et al. 1999) of both SNP and STR haplotypes were constructed using Network 4.5 (see Web resources in Acknowledgments). Weights were estimated using the inverse of the within-clade variances of individual STR loci. SNPs were weighted according to their hierarchical position in the genealogy identified in the present paper (see supplementary fig. S2c and d, Supplementary Material online). Within-hg diversity was investigated using Arlequin 3.0 (Excoffier et al. 2005). The variance was estimated as the within-locus mean allele variance averaged across all loci. Confidence intervals (CIs) were based on 10,000 resamplings performed across individuals. Samples showing missing data at any locus were not considered in the calculation of intralineage variation parameters.
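As an illustration of the variance statistic and its resampling-based CI described above, a minimal sketch is given below. It is not the code used for the analyses: the function names are hypothetical, haplotypes are assumed to be tuples of repeat counts with no missing data, the unbiased sample variance is assumed, and a simple percentile interval is assumed for the CI.

```python
import random
from statistics import mean, variance


def mean_locus_variance(haplotypes):
    """Allele-size variance computed within each locus, then averaged across loci."""
    loci = zip(*haplotypes)  # transpose: one tuple of alleles per locus
    return mean(variance(alleles) for alleles in loci)


def bootstrap_ci(haplotypes, n_resamples=10_000, seed=0):
    """Percentile 2.5-97.5% interval from resampling individuals with replacement."""
    rng = random.Random(seed)
    n = len(haplotypes)
    estimates = sorted(
        mean_locus_variance([haplotypes[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return lower, upper
```

Resampling whole individuals, rather than single loci, keeps the alleles of each haplotype together, which matches the statement that resampling was performed across individuals.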

Dating

The between- and within-lineage date estimates were obtained by using the model-free statistic average squared distance (ASD; Goldstein et al. 1995a, 1995b). An indication of the time of lineage split can be obtained using ASD calculated between lineages (ASD = 2μT; Goldstein et al. 1995a, 1995b). ASD is based on a strict single stepwise mutation model, and in the presence of multistep mutational events the squaring process is expected to heavily influence the distance estimation, distorting the linearity with time. In order to take into account such occurrences and avoid the impact of multistep mutations, we calculated the expected ASD asymptotic value (Goldstein et al. 1995a) as an indication of the maximum expected ASD value per locus comparison. These values were used as locus-specific thresholds to identify and remove STR markers potentially showing between-lineage multistep mutational events. Mutation rate is a critical factor influencing the extent of ASD linearity with time. To control for this, for each interlineage comparison we selected the set of eight markers, among those available after multistep removal, that showed the lowest mutation rate (based on the data presented on the Y-STR haplotype reference database (YHRD) webpage, release 33; Willuweit et al. 2007; see also supplementary table S7, Supplementary Material online). In order to compare inter- and intralineage estimates, we used the same number of STRs (eight) for the within-lineage estimates (see below). The upper limit of ASD linearity with time can be estimated as described in Goldstein et al. (1995a). Simulations have shown that the expected values tend to overestimate the range of linearity and only provide a broad indication of the upper limit of ASD linearity with time (Goldstein et al. 1995a). We used these values as reference thresholds to ensure that all the between-lineage estimates reported in table 1a do not cross these boundaries.
The starting set of markers comprised the 8 STRs used for Network analysis and diversity estimates and was extended to 11 by including the DYS456, DYS458, and Y-GATA-H4 loci. Due to multistep correction, different sets of STRs were used (supplementary table S7, Supplementary Material online), and the average mutation rate was estimated using locus-specific values (YHRD, release 33; Willuweit et al. 2007). The reported 95% CIs were estimated by averaging across the locus-specific upper and lower mutation rate estimates (YHRD, release 33; Willuweit et al. 2007). Given the limitation related to ASD saturation, some potentially interesting interlineage comparisons were beyond the resolution of the STRs we used; an example is the ASD between the A and B clades, which is expected to give an estimate of the time to the most recent common ancestor (TMRCA) of the entire human Y chromosome genealogy. In order to provide an independent estimate for the TMRCA of a pair of lineages, we also used a Bayesian approach as described in Walsh (2001) and implemented in the software ASHEs (Tofanelli et al. 2009). In brief, this approach calculates the likelihood distribution of the TMRCA for each haplotype–haplotype comparison across n generations. In our analysis, the following parameters were used: λ(1/Ne) = 0.0002 (Walsh 2001), 10,000 generations, and the same set of STRs/mutation rates as for the corresponding ASD calculations. The maximum likelihood estimates of the number of generations to the most recent common ancestor were collected for each run, and the average of these values was used to obtain an indication of lineage separation. To calculate the CI, the same procedure was repeated using the average upper and lower estimates for the locus-specific mutation rate, as was also done for the ASD-based estimates.
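The between-lineage relationship ASD = 2μT can be illustrated with a short sketch. It is a simplified reading of the procedure, not the implementation used here: the function names are hypothetical, haplotypes are assumed to be tuples of repeat counts over the same loci, multistep filtering is assumed to have been done beforehand, and the mutation rate in the usage note is an illustrative value within the range reported in this paper.

```python
def asd_between(lineage_a, lineage_b):
    """Average squared difference in repeat count between two lineages.

    Squared differences are averaged over all pairs of chromosomes
    (one from each lineage) within a locus, and then across loci.
    """
    n_loci = len(lineage_a[0])
    per_locus = []
    for i in range(n_loci):
        squared = [(a[i] - b[i]) ** 2 for a in lineage_a for b in lineage_b]
        per_locus.append(sum(squared) / len(squared))
    return sum(per_locus) / n_loci


def split_time_years(asd, mu, generation_years=31):
    """Invert ASD = 2 * mu * T (T in generations) and convert to years."""
    return asd / (2 * mu) * generation_years
```

For example, with two toy lineages `[(14, 10), (15, 10)]` and `[(16, 11)]`, the ASD is 1.75; with an assumed average rate of 0.002 mutations per locus per generation and 31 years per generation, this corresponds to about 13,560 years.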

Table 1.

Between-Lineage (a) and Within-Lineage (b) TMRCA estimates based on ASD and maximum likelihood (ML).

Lineage(s)                           N            Years BP (95% CI)
(a) TMRCAs among lineages
    B2b2 versus B2b3 (ASD)           7 versus 7   10,695 (3,534–17,143)
    B2b2 versus B2b3 (ML)                         10,478 (6,882–16,523)
    B2b2 versus B2b4b (ASD)          7 versus 6   14,322 (9,300–22,909)
    B2b2 versus B2b4b (ML)                        15,221 (10,013–23,932)
    A2 SKHO versus WPYG (ASD)        5 versus 2   2,883 (1,891–4,619)
    A2 SKHO versus WPYG (ML)                      3,379 (2,201–5,363)
    B2b4 SKHO versus WPYG (ASD)      3 versus 3   3,627 (2,356–5,766)
    B2b4 SKHO versus WPYG (ML)                    4,371 (2,821–6,913)
(b) TMRCAs within lineages (ASD with modal)
    A1-M31*                          19           10,540 (4,185–23,684)
    A1-M31 West Africa only*         12           8,091 (3,100–19,437)
    A2-South                         15           6,200 (2,232–14,198)
    A3b1                             22           10,261 (4,464–23,095)
    A3b2                             93           9,083 (3,720–20,274)
    B2a                              233          6,107 (2,263–14,012)
    B2b3                             10           1,984 (372–6,510)
    B2b4b                            11           713 (31–3,906)
    B2b2                                          3,131 (868–8,990)

NOTE.—Generation time has been considered as 31 years (Helgason et al. 2003). Loci showing multistep mutational behavior were removed, and mutation rate per locus has been estimated as in the YHRD, release 33 (Willuweit et al. 2007; see supplementary table S7, Supplementary Material online, for details). For the clades indicated with (*), only seven STRs have been used for dating (see Materials and Methods for details). For population group abbreviations, refer to legend of figure 1 and supplementary table S8 (Supplementary Material online). N, number of chromosomes included in the calculation; BP, before present; SKHO, southern Khoisan speakers; WPYG, western Pygmies.

The TMRCA of a clade was estimated by calculating the ASD between all chromosomes in a lineage and the founder haplotype that we reconstructed by combining the modal alleles at single loci (Thomas et al. 1998). ASD estimated in this way has an expected value of μT, where μ is the average effective mutation rate at the loci and T is the separation time expressed in number of generations. This approach is expected to underestimate the age of the clade if the reconstructed founder haplotype differs from the true one. The 95% CIs were estimated using the software Ytime, based on a constant-size demographic model (Behar et al. 2003). The locus-specific mutation rate was estimated using data from the YHRD, release 33 (Willuweit et al. 2007). We focused on the same set of eight STRs used in the Network analyses. For estimates within hg A1, we removed the locus DYS438 due to its multistep behavior within this lineage and performed estimates on the remaining seven STRs. It should also be noted that many of these lineages are particularly rare and that the within-clade variation might have been only partially surveyed, a condition that may bias current estimates toward the lower bound of the real genealogical depth (see Petraglia et al. 2010). For all estimates, a generation time of 31 years was used (Helgason et al. 2003). The average mutation rate used for the dating estimates ranges between 1.6 and 2.2 × 10⁻³ mutations per locus per generation, depending on which set of STR markers was used (supplementary table S7, Supplementary Material online). These values are not substantially different from other estimates based on pedigree data and are approximately two to three times faster than the more general and non-locus-specific “evolutionary” rate (6.9 × 10⁻⁴ mutations per locus per generation; Zhivotovsky et al. 2004; see also Ravid-Amir and Rosset 2010).
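The within-lineage calculation (ASD to a modal founder haplotype, with expected value μT) can be sketched as below. The names are hypothetical, haplotypes are assumed to be tuples of repeat counts, ties for the modal allele are resolved in favor of the allele seen first (an arbitrary choice made for this sketch), and the CI step performed with Ytime is not reproduced.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Founder haplotype built from the most frequent allele at each locus."""
    return tuple(
        Counter(alleles).most_common(1)[0][0]  # ties: first allele encountered
        for alleles in zip(*haplotypes)
    )


def asd_to_founder(haplotypes, founder):
    """Mean squared distance to the founder, averaged within and then across loci."""
    per_locus = [
        sum((h[i] - founder[i]) ** 2 for h in haplotypes) / len(haplotypes)
        for i in range(len(founder))
    ]
    return sum(per_locus) / len(per_locus)


def tmrca_years(asd, mu, generation_years=31):
    """Invert ASD = mu * T (T in generations) and convert to years."""
    return asd / mu * generation_years
```

For the toy clade `[(14, 10), (14, 11), (15, 10), (14, 10)]`, the modal haplotype is `(14, 10)` and the ASD is 0.25; with an assumed rate of 0.002 mutations per locus per generation, this corresponds to about 3,875 years.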

Results

Hg Distribution and Variation

We genotyped both novel and previously partially investigated samples and surveyed literature data for a total of ∼10,000 males from more than 180 populations (supplementary table S8, Supplementary Material online), collecting data for 184 hg A and 457 hg B Y chromosomes (supplementary table S1, Supplementary Material online). Outside Africa, these clades have been sporadically found in Europe and the Americas, probably as a result of recent migration (Semino et al. 2000; Luis et al. 2004; Capelli et al. 2006; Hammer et al. 2006; King et al. 2007). Hg A is rarely found in North, West, and Central Africa, whereas it is more frequent in the eastern and southern parts of the continent (fig. 1). Hg B is rare in both northern and western Africa; its distribution in the rest of the continent can be described by that of its two main subclades, B2a and B2b (fig. 1). The former appears to be associated with food-producing communities and populations in contact with them, as also previously observed (Beleza et al. 2005; Berniell-Lee et al. 2009; Gomes et al. 2010), and it is present at low frequencies in all sub-Saharan areas. In contrast, B2b is mostly present in foraging communities in eastern and central Africa. The different geographic distributions of hgs A and B2b are mirrored at the population level (fig. 1 and supplementary table S8, Supplementary Material online). Little or no hg A is present in Pygmies and eastern African (EA) Khoisan speakers (for the use of the word Khoisan, issues with population classification in southern Africa, and the case of eastern Khoisan speakers, see Mitchell 2010), whereas B2b is commonly found in these populations. On the other hand, hg A is more frequent than B among southern African (SA) Khoisan speakers (∼40%), with B2b representing ∼16% of the Y chromosome types present in these populations (fig. 1 and supplementary table S8, Supplementary Material online).

FIG. 1.

Frequencies of haplogroups A (yellow), B2a (light blue), and B2b (dark blue) in Africa. For details on specific populations included in these groups, please refer to the column “Group code” in supplementary table S8 (Supplementary Material online). NFPR, northern food producers; WFPR, western food producers; WPYG, western Pygmies; CFPR, central food producers; EPYG, eastern Pygmies; EKHO, eastern Khoisan speakers; EFPR, eastern food producers; SKHO, southern Khoisan speakers; SFPR, southern food producers.

Diversity indices are shown in table 2. Overall, hg A shows higher diversity than B and, within the latter, B2b is more variable than B2a. Network analysis based on eight STR haplotypes shows substantial phylogeographic patterns for A and B2b hgs (data not shown), whereas hg B2a reveals no clear population/geographic structure and a high level of reticulation, which is expected for lineages with a relatively short evolutionary history, associated with recent demographic expansions (table 1b and supplementary fig. S1, Supplementary Material online; see also Beleza et al. 2005; Berniell-Lee et al. 2009; Gomes et al. 2010). These results, together with the virtual absence of B2a in foraging populations, support our decision to focus the phylogeographic analysis on hgs A and B2b only, in order to address questions related to the early history of sub-Saharan Africa. The evolutionary relationships among haplotypes within these hgs, based on both SNPs and STRs, are shown in figure 2.

Table 2.

Diversity Indices for hg A and B, Including Subhaplogroups B2a and B2b, based on eight STRs (DYS19, DYS389I, DYS391, DYS392, DYS393, DYS437, DYS438, and DYS439).

Haplogroup   N     k/N     Haplotype Diversity (SD)   Variance (CI 2.5–97.5%)
A            180   0.589   0.988 (0.003)              1.099 (0.955–1.217)
B            443   0.400   0.987 (0.002)              0.562 (0.523–0.594)
B2a          233   0.373   0.965 (0.005)              0.294 (0.264–0.328)
B2b          184   0.451   0.980 (0.003)              0.743 (0.689–0.784)

NOTE.—Only samples with all the eight STRs available were included. N, number of chromosomes included in the calculation; k, number of different haplotypes; SD, standard deviation.
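For readers who wish to see how the k/N and haplotype diversity columns can be obtained from a set of STR haplotypes, a minimal sketch follows. It assumes the standard unbiased gene diversity estimator, H = n/(n − 1) × (1 − Σp²), where p is the frequency of each distinct haplotype; whether this matches the Arlequin output exactly is an assumption, the function names are hypothetical, and the SD is not computed.

```python
from collections import Counter


def k_over_n(haplotypes):
    """Number of distinct haplotypes (k) divided by the number of chromosomes (N)."""
    return len(set(haplotypes)) / len(haplotypes)


def haplotype_diversity(haplotypes):
    """Unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq)
```

For four toy chromosomes carrying three distinct haplotypes, `[(14, 10), (14, 10), (15, 10), (16, 12)]`, k/N is 0.75 and the diversity is about 0.833.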

FIG. 2.

Evolutionary relationships among A and B chromosomes. (a) Haplogroup A network, combined STR and SNP haplotypes; (b) haplogroup B2b network, combined STR and SNP haplotypes; (c) haplogroups A and B, SNP-based haplotypes. Haplotypes are colored according to the key in the figure, and circle size is proportional to the number of haplotypes, with the smallest representing n = 1. STR loci used in the present analysis are DYS19, DYS389-I, DYS391, DYS392, DYS393, DYS437, DYS438, and DYS439. The star represents the root of the Y chromosome tree as inferred from Karafet et al. 2008. For population group abbreviations, refer to legend of figure 1 and supplementary table S8 (Supplementary Material online). ‘!’ indicates back-mutation.

Of the A subclades, A1 is found only in western and central Africa, whereas A3b1 and A3b2 are southern and central/eastern African specific, respectively. Hg A2 is mostly represented by SA samples with only a few central African haplotypes (fig. 2a and c). Similarly, B2b1/B2b4a and B2b2 are geographically restricted to southern and eastern Africa, respectively, whereas B2b3, B2b4b, and B2b4* (as well as the previously undescribed MSY2* lineage; supplementary fig. S2d, Supplementary Material online) are specific to central Africa, albeit with few B2b4* SA haplotypes (fig. 2b and c). A prevalence of EA chromosomes is observed within B2b* together with considerable variation at the haplotype level, suggesting the possibility of yet undetected SNP-defined subclades within this group (fig. 2b and c). The geographically structured distribution within the B2b clade is shaped by the presence of population-specific lineages (fig. 2b and c). In fact, whereas B2b3, B2b4b, and B2b4* are almost exclusively found among western Pygmies, B2b2 and B2b1-B2b4a are found only in eastern Pygmies and SA Khoisan speakers, respectively. Similarly, the majority of the A3b1 and A2 types are found among SA Khoisan speakers, with hg A2 also present in western Pygmies (fig. 2a and c). Pygmies and SA Khoisan speakers also share evolutionarily closely related lineages within the B2b4 clade (fig. 2b and c; see also Wood et al. 2005).

A and B Genealogies

Our extensive survey of SNP variation in hgs A and B Y chromosomes enabled us to detect genealogical incompatibilities and propose some refinements within the recently proposed topology (see supplementary fig. S2, Supplementary Material online, for a comparison with the trees by Karafet et al. 2008). The PK1 marker, originally thought to be associated with the A2 lineage only, was found to cluster both A2 and A3 chromosomes. Similarly, new A2 lineages have been identified (supplementary fig. S2c, Supplementary Material online). M190 had been indicated as A3b-specific (Karafet et al. 2008); however, our analysis showed that it is derived in all A3 lineages. Within hg B, P7 appears to be basal to most of the B2b lineages and, within the P7-derived chromosomes, the MSY2 marker clusters lineages defined by M211, M115/M169, M30/M129 variants (supplementary fig. S2d, Supplementary Material online). The identification of two chromosomes derived at MSY2 and M30 (one of which is also derived for M129) but not for P7 suggests that this polymorphism might be prone to recurrent mutations (see fig. 2c). The physical proximity of P7 to P25 makes Y–Y gene conversion a possible explanation for this finding (see Adams et al. 2006). For simplicity, we have retained the same nomenclature as recently described in Karafet et al. 2008 (with the exception of MSY2*, see above), but renaming will be necessary as more data become available.

Discussion

Insights into the Male Genetic History of Sub-Saharan Hunter-Gatherers

Pygmy Groups

We dated the Eastern–Western Pygmy separation using the divergence between the B2b2 and B2b4b/B2b3 clades (table 1a). These estimates span similar time intervals and suggest a separation time of 10–15 thousand years ago (Kya), broadly overlapping with the generally more ancient estimates provided by mtDNA and autosomal data (Destro-Bisol et al. 2004; Patin et al. 2009; Batini et al. 2011). In particular, the youngest date suggested by B2b2 versus B2b3 (10.7 [3.5–17.1] Kya; 10.5 [6.8–16.5] Kya) might indicate post–Last Glacial Maximum (LGM; 19–26.5 Kya; Clark et al. 2009) male-mediated contacts between the two groups. This could account for the contrast between the lack of shared recent mitochondrial ancestry among Pygmy populations (Batini et al. 2011) and the quite intense post-LGM gene flow suggested by autosomal loci (Patin et al. 2009). However, the uncertainty related to STR choice and their time-linearity means that older scenarios cannot be excluded (Busby G, Capelli C, personal communication). We also note that the within-clade diversity/antiquity is extremely reduced for these Pygmy-specific lineages, suggesting a bottleneck in the relatively recent demographic history of these groups, as has been observed for other loci (see table 1b; Weiss and von Haeseler 1998; Excoffier and Schneider 1999; Patin et al. 2009; Batini et al. 2011).

Pygmies and San

We identified evolutionary links between western Pygmies and San in both A and B clades, developing the initial findings presented in Wood et al. (2005). Hg A2, found among SA Khoisan speakers at 25–45% (Wood et al. 2005; supplementary table S9, Supplementary Material online), was detected for the first time in the present work at nontrivial frequency (5%) among the Baka Pygmies from Cameroon and Gabon. On the other hand, B2b4 was present at 6–7% among Khoisan speakers but reached 45–67% in both Biaka and Baka Pygmies (Wood et al. 2005; supplementary table S9, Supplementary Material online). We dated the TMRCA among the western Pygmy- and San-specific subclades of these two haplogroups to between 3 and 4 Kya (CI 1.9–4.6 and 2.2–5.4 Kya for A2; 2.3–5.8 and 2.8–6.9 Kya for B2b4; table 1a). It should be pointed out that the large number of mutations specific to the Khoisan A2 lineage (see fig. 2a and c) is probably the result of the SNP discovery process, which included Khoisan but not Western Pygmy A2 chromosomes (Underhill et al. 2001), thus making the use of STRs for dating the most obvious choice. Evidence for a Pygmy/San link has also been provided by recent genome-wide studies. In the work presented by Hellenthal et al. (2008), the first genetic link to emerge among human populations was indeed between the San and the western Pygmies. Furthermore, a shared ancestry between the San and the eastern Pygmies has been observed recently, and more generally, between the western Pygmies and the Hadza from Tanzania (Tishkoff et al. 2009), even though this has been interpreted as the result of a possibly more ancient common genetic background than the one suggested by our results. Intriguingly, the genetic link seems to be paralleled by the sharing of cultural traits such as those found in the rock art geometric designs produced by Pygmies from the Ituri forest and the Khoe-speaking groups from southern Africa (Smith 1995, 1997, 2006; Smith and Ouzman 2004). 
According to this model, Khoe-speaking pastoralists would have moved from an area in Central-South Africa bringing pastoralism into southern Africa before the Bantu dispersion in the region, having previously experienced cultural and genetic exchanges with central and EA populations (Henn et al. 2008; Rocha 2010).

Genetic Evidence for the Peopling of Sub-Saharan Africa before the Diffusion of Agriculture

West Africa

Haplogroup A in western Africa is represented only by the A1a lineage. The variation within this clade dates back to 10.5 (4.2–23.7) Kya and to 8 (3.1–19.4) Kya when only western African haplotypes are considered (see table 1b), which is in agreement with the archaeological and linguistic evidence related to the peopling of this region. The Ounanian culture has in fact been recorded in Mali as far back as 9–10 Kya (Clark 1980; Raimbault 1990; Mac Donald 1998), and the lithic and ceramic assemblages from Ounjougou date back to 12 Kya (Huysecom et al. 2004; Huysecom et al. 2009). Similarly, the origin of the early Niger-Congo Atlantic branch has been placed at least 8 Kya (Ehret 2000; Blench 2006). The detection of a specific genetic signal associated with early human presence in this area is of interest given the homogeneity between western and central African populations that has been observed so far for genome-wide analysis (Cruciani et al. 2002; Wood et al. 2005; Tishkoff et al. 2007; Li et al. 2008; Tishkoff et al. 2009).

South Africa

We dated variation in SA hgs A2 and A3b1 to 6.2 (2.2–14.1) Kya and 10.2 (4.4–23) Kya, respectively (table 1b). These dates do not extend beyond the LGM, which contrasts with the early human presence in southern Africa suggested by fossil and archaeological remains (McBrearty and Brooks 2000; White et al. 2003; Lewin and Foley 2004; McDougall et al. 2005; Marean et al. 2007). This could possibly be due to our partial population coverage, as suggested by extensive population surveys (Quintana-Murci et al. 2010; Marks S, Capelli C, unpublished data), as well as to past lineage extinctions (see Petraglia et al. 2010) that followed the significant demographic changes during the Marine Isotope Stage 3 (25–60 Kya) and the LGM (Mitchell 2008). Moreover, the possible limitation of available STRs in exploring events dating further back in time may also have had an effect (Busby G, Capelli C, personal communication). It is also worth considering the possibility that A2 and A3b1 retain signatures of two independent pre-Bantu dispersal events in the region. This scenario is also supported by the different geographic distribution of these two clades: A3b1 is present across all of southern Africa, whereas A2 is almost exclusively associated with populations in south-western Africa or those originally from this area (supplementary tables S1 and S9, Supplementary Material online; see also table 2 in De Filippo et al. 2010 and unpublished data from Lesotho and additional South African populations, where A3b1 but not A2 chromosomes were found; Marks S, Capelli C, personal communication). The A2 distribution broadly overlaps that of Khoe-speakers and could potentially represent a genetic signature of the contacts/migrations of the Khoe-speaking pastoralist societies from the northern Botswana, southern Angola, and western Zambia area, ∼2 Kya (see also above; Mitchell and Whitelaw 2005).

South-East Africa

Chromosomes belonging to hg B2b4*, a lineage mainly shared with Baka Pygmies from Cameroon, were present in the Mozambican samples. The low frequency of these chromosomes in the SA and EA populations, together with the lack of appropriate evidence of a link between early inhabitants of these regions and western Pygmies, leaves the issue difficult to disentangle and calls for more detailed and focused investigation. In this sense, a scenario worth exploring could be based on the presence of this lineage in pre-Bantu populations already settled in the regions, which could either have been absorbed by the incoming agro-pastoralist groups (Sikora et al. 2010), or reflect the broader network of contacts around central-southern Africa (see above).

East Africa

The subclade A3b2 is present at high frequencies in EA populations, in particular among Nilo-Saharan speakers. Based on the analysis of this lineage in Uganda, Gomes et al. (2010) proposed its association with this linguistic phylum. Our estimates of A3b2 antiquity (9 Kya; CI 3.7–20.2 Kya) do not refute this hypothesis, as they are broadly in agreement with the initial date for the spread of the Nilo-Saharan phylum, approximately between 12 and 18 Kya (Ehret 2000; Blench 2006).
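The reasoning behind "do not refute" can be made explicit with a small sketch: the point estimate for A3b2 lies outside the proposed window for the Nilo-Saharan spread, but its confidence interval overlaps that window. The numbers are those quoted above; the helper function and variable names are hypothetical, and an overlap of intervals is only a compatibility check, not a formal test.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (low, high) and b = (low, high)
    share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Values in Kya, as quoted in the text.
a3b2_point_estimate = 9.0
a3b2_ci = (3.7, 20.2)
nilo_saharan_spread = (12.0, 18.0)

point_inside = nilo_saharan_spread[0] <= a3b2_point_estimate <= nilo_saharan_spread[1]
ci_compatible = intervals_overlap(a3b2_ci, nilo_saharan_spread)

print("point estimate inside window:", point_inside)
print("confidence interval overlaps window:", ci_compatible)
```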

B2a as a Marker of the Bantu Expansion?

Although B2a has not been investigated with the same resolution as the A and B2b hgs, our data support its association with Bantu-speaking populations, as previously reported (see supplementary table S1, Supplementary Material online; Beleza et al. 2005; Berniell-Lee et al. 2009). Within-clade variation suggests a more recent origin for B2a than B2b, whereas network analysis did not reveal population-specific or geographically localized STR-based clusters (supplementary fig. S1, Supplementary Material online). However, the relatively deep within-clade dating (6.1 [2.2–14] Kya) suggests a scenario possibly pre-dating the diffusion of Bantu languages, in line with what has been observed for some subclades of hg E (Montano V, Destro-Bisol G, Comas D, personal communication). Deeper phylogenetic resolution within the B2a clade, coupled with additional population sampling, may help to clarify the demographic dynamics associated with its dispersal.

The Emergence of Modern Humans

Whereas the dissection of single Y-chromosomal clades or subclades has helped define the relationships between specific populations/groups, as well as reconstruct the demographic impact of migratory and cultural events, a wider and exhaustive phylogeographic analysis may indicate areas of the African continent where the extant human Y chromosome diversity first originated. Haplogroups A and B are ideal candidates for this task, given their distribution in Africa and the fact that they represent the earliest lineages to branch off within the Y chromosome genealogy. Previous analysis of the Y chromosome variation pointed to an SA/EA origin following the identification of hg A3b and, to a lesser extent, B types in populations from these areas (Hammer et al. 2001; Semino et al. 2002). However, our results clearly indicate that A3b branched later within hg A, making it uninformative on the origin of the early human Y lineages. Hg A is divided into two branches: A1, represented by western and central African types, and A2-A3, containing SA and EA chromosomes, with a few from central Africa. Hg A2 is mostly composed of southern African types; however, an early branch in A2 is found in central Africa. Within hg A3, A3b1, the southern African clade, is a sister clade to A3b2, common in eastern Africa, whereas A3a is only found among EAs (fig. 2c). In hg B, B2a and B2b are two sister clades, whereas B*(xB2) aggregates a number of chromosomes from central Africa that were ancestral for the set of SNPs we tested. B2a has a very wide distribution and is mainly present in Bantu-speaking populations. Within hg B2b, B2b* contains samples from eastern, south-eastern, and central Africa, with P6-derived chromosomes from South Africa and P7 types mainly from hunter-gatherer populations from central, eastern, and southern Africa (see fig. 2c).
These results seem to indicate that southern Africa was an early destination of ancient human migrations from other regions rather than the original source, a conclusion that does not support the hypothesis presented in a recent large-scale study of autosomal loci (Tishkoff et al. 2009). With respect to the roles of eastern and central Africa, the data set presented here, although tentatively pointing toward a wide-scale preservation of ancient lineages in central Africa, is still compatible with a primary role for eastern Africa, in agreement with hypotheses generated from both mtDNA analysis and the study of the earliest Homo sapiens fossil remains (White et al. 2003; McDougall et al. 2005; Behar et al. 2008).

Concluding Remarks

Detailed phylogeographic analysis of human Y chromosome hgs A and B, combined with a large population survey and extensive sublineage characterization, has allowed us to gain new insights into the processes that shaped the preagricultural peopling of the African continent. Our results provide a male-specific perspective on some key aspects of the genetic history of sub-Saharan Africa and form the basis for future research.

We have shown evidence for further complexity in the evolutionary relationships among African hunter-gatherers. Phylogeographic analyses of mtDNA point to an ancient separation among ancestral populations, with limited or no subsequent gene flow after the split (see Salas et al. 2002; Destro-Bisol et al. 2004; Batini et al. 2007; Behar et al. 2008; Quintana-Murci et al. 2008; Batini et al. 2011). Conversely, the analysis of autosomal loci suggests a common, and possibly more recent, genetic background (see Tishkoff et al. 2009), with contrasting evidence concerning the reciprocal relationships among Pygmies and San (see Hellenthal et al. 2008; Li et al. 2008; Tishkoff et al. 2009), although this lacks a well-defined temporal context. Our extensive phylogeographic and dating approach has provided evidence for relatively recent contact both among Pygmies and between them and San groups from southern Africa. Our current estimates for the coalescent time between Eastern and Western Pygmy-specific Y chromosome clades (10–15 Kya) are compatible with post-LGM contact between the two groups, with evidence for recent bottlenecks in the demographic histories of the two groups (see also Patin et al. 2009; Batini et al. 2011). By contrast, the very recent common ancestry detected between Western Pygmies and San (3–4 Kya) suggests that this could be the signature of Khoe-speaking pastoralist-mediated contact between the two groups, rather than resulting from retention of ancient traits.

Lastly, the peopling of sub-Saharan Africa has been studied from linguistic, archaeological, and genetic perspectives in the last decade, but its most ancient period is not yet well understood (see Campbell and Tishkoff 2010; Scheinfeldt et al. 2010). We have highlighted some signatures of preagricultural peopling undetected by previous research. In fact, West, East, and South African populations show specific clades whose TMRCAs are compatible with a differentiation pre-dating the arrival of Bantu-speaking people and farming in the area. Intriguingly, even B2a, which has been mainly found in Bantu-speaking communities, has been dated (6 [2–14] Kya) before the supposed time of diffusion of Bantu languages. A novel link between Pygmy hunter-gatherers from west-central Africa and farmers from Mozambique has been identified, pointing to a shared genetic legacy between these two geographically separate and anthropologically distinct population groups (see also Sikora et al. 2010).

Finally, our study contributes to the debate on the geographical origin of Homo sapiens in sub-Saharan Africa, providing evidence for the retention of early Y chromosome lineages in East and Central but not in Southern Africa. However, we note that the current absence of significant palaeo-anthropological investigation, together with the possibility of different fossil preservation conditions in central Africa, makes the extremely long human fossil record in eastern Africa inconclusive in solving this issue. The screening of Y-chromosomal variation at an increased level of resolution, combined with additional sampling from these regions, is expected to further elucidate the early steps of Homo sapiens in Africa.

Supplementary Material

Supplementary tables S1–S9 and figures S1 and S2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We would like to thank Sergio Tofanelli and Davide Merlitti for giving access to early versions of the ASHEs software; Jim Wilson, Fabio Verginelli, and Renato Mariani-Costantini for providing samples and unpublished data; Peter Mitchell for helpful discussions on the African archaeological record; Marco Giorgi and Isabel Mendizabal for providing scripts used during data analysis; and Mònica Vallés, Stéphanie Plaza, and Roger Anglada (Universitat Pompeu Fabra) and Milena Alu’ (Università di Modena e Reggio Emilia) for technical support. Finally, we would like to express our gratitude to all the people that have made this work possible by donating their DNA. The research presented was supported by the Dirección General de Investigación, Ministerio de Educación y Ciencia, Spain (CGL2007-61016), and Direcció General de Recerca, Generalitat de Catalunya (2009SGR1101). G.D.-B. and G.S. were supported by the University of Rome “La Sapienza” (C26A09EA9C/2009). J.R. was supported by the Fundação para a Ciência e a Tecnologia (PTDC/BIA-BDE/68999/2006). P.S.-D. is supported by the Isidro Parga Pondal program (Plan Galego de Investigación, Desenvolvemento e Innovación Tecnolóxica-INCITE [2006–2010] from Xunta de Galicia, Spain). C.C. is a Research Council UK Academic Fellow. C.B. and C.C. designed the research. C.B., G.F., D.C., and C.C. conceived and designed the experiments. G.D.-B., D.L., J.R., T.S., A.B., V.M., N.E.E., G.S., M.E.D.A., N.M., P.E., and D.C. provided the samples and part of the genotypings. C.B., G.F., F.B., and P.S.-D. performed the experiments. C.B. and C.C. analyzed the data. C.B. and C.C. wrote the paper with the contribution of G.D.-B. and D.C. All co-authors have reviewed the manuscript prior to submission. 
Web resources—Autodimer: http://cstl.nist.gov/; NIST Online DNA Analysis tools page: http://yellow.nist.gov:8444/dnaAnalysis/index.do; Y-STR Fact Sheets: http://www.cstl.nist.gov/strbase/ystr_fact.htm; Network 4.5: www.fluxus-engineering.com; ASHEs: http://ashes.codeplex.com/.

References

Adams SM, King TE, Bosch E, Jobling MA. The case of the unreliable SNP: recurrent back-mutation of Y-chromosomal marker P25 through gene conversion. Forensic Sci Int. 2006, vol. 159 (pg. 14-20).
Bandelt HJ, Forster P, Rohl A. Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol. 1999, vol. 16 (pg. 37-48).
Bao W, Zhu S, Pandya A, Zerjal T, Xu J, Shu Q, Du R, Yang H, Tyler-Smith C. MSY2: a slowly evolving minisatellite on the human Y chromosome which provides a useful polymorphic marker in Chinese populations. Gene. 2000, vol. 244 (pg. 29-33).
Batini C, Coia V, Battaggia C, Rocha J, Pilkington MM, Spedini G, Comas D, Destro-Bisol G, Calafell F. Phylogeography of the human mitochondrial L1c haplogroup: genetic signatures of the prehistory of Central Africa. Mol Phylogenet Evol. 2007, vol. 43 (pg. 635-644).
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. Insights into the demographic history of African Pygmies from complete mitochondrial genomes. Mol Biol Evol. 2011, vol. 28 (pg. 1099-1110).
Behar DM, Thomas MG, Skorecki K, et al. (12 co-authors). Multiple origins of Ashkenazi Levites: Y chromosome evidence for both near eastern and European ancestries. Am J Hum Genet. 2003, vol. 73 (pg. 768-779).
Behar DM, Villems R, Soodyall H, et al. (15 co-authors). The dawn of human matrilineal diversity. Am J Hum Genet. 2008, vol. 82 (pg. 1130-1140).
Beleza S, Alves C, Gonzalez-Neira A, Lareu M, Amorim A, Carracedo A, Gusmao L. Extending STR markers in Y chromosome haplotypes. Int J Legal Med. 2003, vol. 117 (pg. 27-33).
Beleza S, Gusmao L, Amorim A, Carracedo A, Salas A. The genetic legacy of western Bantu migrations. Hum Genet. 2005, vol. 117 (pg. 366-375).
Berniell-Lee G, Calafell F, Bosch E, Heyer E, Sica L, Mouguiama-Daouda P, van der Veen L, Hombert JM, Quintana-Murci L, Comas D. Genetic and demographic implications of the Bantu expansion: insights from human paternal lineages. Mol Biol Evol. 2009, vol. 26 (pg. 1581-1589).
Blench R. Archaeology, language, and the African past. 2006. Lanham (MD): AltaMira Press.
Campbell MC, Tishkoff SA. The evolution of human genetic and phenotypic variation in Africa. Curr Biol. 2010, vol. 20 (pg. R166-R173).
Cann RL, Stoneking M, Wilson AC. Mitochondrial DNA and human evolution. Nature. 1987, vol. 325 (pg. 31-36).
Capelli C, Redhead N, Romano V, et al. (18 co-authors). Population structure in the Mediterranean basin: a Y chromosome perspective. Ann Hum Genet. 2006, vol. 70 (pg. 207-225).
Clark JD. Human populations and cultural adaptations in the Sahara and the Nile during prehistoric times. In: William MAJ, Faure H, editors. The Sahara and the Nile: quaternary environments and prehistoric occupation on Northern Africa. 1980. Rotterdam (The Netherlands): Balkema. p. 527–582.
Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. The last glacial maximum. Science. 2009, vol. 325 (pg. 710-714).
Cruciani F, Santolamazza P, Shen P, et al. (16 co-authors). A back migration from Asia to sub-Saharan Africa is supported by high-resolution analysis of human Y-chromosome haplotypes. Am J Hum Genet. 2002, vol. 70 (pg. 1197-1214).
Cruciani F, Trombetta B, Sellitto D, Massaia A, Destro-Bisol G, Watson E, Beraud Colomb E, Dugoujon JM, Moral P, Scozzari R. Human Y chromosome haplogroup R-V88: a paternal genetic record of early mid Holocene trans-Saharan connections and the spread of Chadic languages. Eur J Hum Genet. 2010, vol. 18 (pg. 800-807).
De Filippo C, Heyn P, Barham L, Stoneking M, Pakendorf B. Genetic perspectives on forager-farmer interaction in the Luangwa valley of Zambia. Am J Phys Anthropol. 2010, vol. 141 (pg. 382-394).
Destro-Bisol G, Coia V, Boschi I, Verginelli F, Caglia A, Pascali V, Spedini G, Calafell F. The analysis of variation of mtDNA hypervariable region 1 suggests that eastern and western pygmies diverged before the Bantu expansion. Am Nat. 2004, vol. 163 (pg. 212-226).
Destro-Bisol G, Jobling MA, Rocha J, Novembre J, Richards MB, Mulligan C, Batini C, Manni F. Molecular anthropology in the genomic era. J Anthropol Sci. 2010, vol. 88 (pg. 93-112).
Ehret C. Language and history. In: Heine B, Nurse D, editors. African languages: an introduction. 2000. Cambridge: Cambridge University Press (pg. 272-297).
Excoffier L, Laval G, Schneider S. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 2005, vol. 1 (pg. 47-50).
Excoffier L, Schneider S. Why hunter-gatherer populations do not show signs of pleistocene demographic expansions. Proc Natl Acad Sci U S A. 1999, vol. 96 (pg. 10597-10602).
Goldstein DB, Ruiz Linares A, Cavalli-Sforza LL, Feldman MW. An evaluation of genetic distances for use with microsatellite loci. Genetics. 1995, vol. 139 (pg. 463-471).
Goldstein DB, Ruiz Linares A, Cavalli-Sforza LL, Feldman MW. Genetic absolute dating based on microsatellites and the origin of modern humans. Proc Natl Acad Sci U S A. 1995, vol. 92 (pg. 6723-6727).
Gomes V, Sanchez-Diz P, Amorim A, Carracedo A, Gusmao L. Digging deeper into east African human Y chromosome lineages. Hum Genet. 2010, vol. 127 (pg. 603-613).
Gonder MK, Mortensen HM, Reed FA, de Sousa A, Tishkoff SA. Whole-mtDNA genome sequence analysis of ancient African lineages. Mol Biol Evol. 2007, vol. 24 (pg. 757-768).
Hammer MF, Chamberlain VF, Kearney VF, Stover D, Zhang G, Karafet T, Walsh B, Redd AJ. Population structure of Y chromosome SNP haplogroups in the United States and forensic implications for constructing Y chromosome STR databases. Forensic Sci Int. 2006, vol. 164 (pg. 45-55).
Hammer MF, Karafet TM, Redd AJ, Jarjanazi H, Santachiara-Benerecetti S, Soodyall H, Zegura SL. Hierarchical patterns of global human Y-chromosome diversity. Mol Biol Evol. 2001, vol. 18 (pg. 1189-1203).
Helgason A, Hrafnkelsson B, Gulcher JR, Ward R, Stefansson K. A populationwide coalescent analysis of Icelandic matrilineal and patrilineal genealogies: evidence for a faster evolutionary rate of mtDNA lineages than Y chromosomes. Am J Hum Genet. 2003, vol. 72 (pg. 1370-1388).
Hellenthal G, Auton A, Falush D. Inferring human colonization history using a copying model. PLoS Genet. 2008, vol. 4, pg. e1000078.
Henn BM, Gignoux C, Lin AA, Oefner PJ, Shen P, Scozzari R, Cruciani F, Tishkoff SA, Mountain JL, Underhill PA. Y-chromosomal evidence of a pastoralist migration through Tanzania to southern Africa. Proc Natl Acad Sci U S A. 2008, vol. 105 (pg. 10693-10698).
Huysecom E, Ozainne S, Raeli F, Ballouche A, Rasse M, Stokes S. Ounjougou (Mali): a history of Holocene settlement at the southern edge of the Sahara. Antiquity. 2004, vol. 78 (pg. 579-593).
Huysecom E, Rasse M, Lespez L, Neumann K, Fahmy A, Ballouche A, Ozainne S, Maggetti M, Tribolo C, Soriano S. The emergence of pottery in Africa during the tenth millennium cal BC: new evidence from Ounjougou (Mali). Antiquity. 2009, vol. 83 (pg. 905-917).
Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Hammer MF. New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res. 2008, vol. 18 (pg. 830-838).
King TE, Parkin EJ, Swinfield G, Cruciani F, Scozzari R, Rosa A, Lim SK, Xue Y, Tyler-Smith C, Jobling MA. Africans in Yorkshire? The deepest-rooting clade of the Y phylogeny within an English genealogy. Eur J Hum Genet. 2007, vol. 15 (pg. 288-293).
Krenke BE, Viculis L, Richard ML, et al. (14 co-authors). Validation of male-specific, 12-locus fluorescent short tandem repeat (STR) multiplex. Forensic Sci Int. 2005, vol. 151 (pg. 111-124).
Lewin R, Foley R. Principles of human evolution. 2004. Oxford: Wiley-Blackwell.
Li JZ, Absher DM, Tang H, et al. (11 co-authors). Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008, vol. 319 (pg. 1100-1104).
Luis JR, Rowold DJ, Regueiro M, Caeiro B, Cinnioglu C, Roseman C, Underhill PA, Cavalli-Sforza LL, Herrera RJ. The Levant versus the horn of Africa: evidence for bidirectional corridors of human migrations. Am J Hum Genet. 2004, vol. 74 (pg. 532-544).
Mac Donald KC. Archaeology, language and the peopling of west Africa: a consideration of the evidence. In: Blench R, Spriggs M, editors. Archaeology and language II: correlating archaeological and linguistic hypotheses. 1998. London: Routledge. p. 33–66.
Marean CW, Bar-Matthews M, Bernatchez J, Fisher E, Goldberg P, Herries AIR, Jacobs Z, Jerardino A, Karkanas P, Minichillo T. Early human use of marine resources and pigment in South Africa during the middle pleistocene. Nature. 2007, vol. 449 (pg. 905-908).
McBrearty S, Brooks AS. The revolution that wasn't: a new interpretation of the origin of modern human behavior. J Hum Evol. 2000, vol. 39 (pg. 453-563).
McDougall I, Brown FH, Fleagle JG. Stratigraphic placement and age of modern humans from Kibish, Ethiopia. Nature. 2005, vol. 433 (pg. 733-736).
Mitchell P. Developing the archaeology of marine isotope stage 3. S Afr Archaeol Soc Goodwin. 2008, vol. 10 (pg. 52-65).
Mitchell P. Genetics and southern African prehistory: an archaeological view. J Anthropol Sci. 2010, vol. 88 (pg. 73-92).
Mitchell P, Whitelaw G. The archaeology of southernmost Africa from c. 2000 bp to the early 1800s: a review of recent research. J Afr Hist. 2005, vol. 46 (pg. 209-241).
Mulero JJ, Chang CW, Calandro LM, Green RL, Li Y, Johnson CL, Hennessy LK. Development and validation of the AmpFlSTR Yfiler PCR amplification kit: a male specific, single amplification 17 Y-STR multiplex system. J Forensic Sci. 2006, vol. 51 (pg. 64-75).
Patin E, Laval G, Barreiro LB, et al. (15 co-authors). Inferring the demographic history of African farmers and Pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 2009, vol. 5, pg. e1000448.
Petraglia MD, Haslam M, Fuller DQ, Boivin N, Clarkson C. Out of Africa: new hypotheses and evidence for the dispersal of Homo sapiens along the Indian Ocean rim. Ann Hum Biol. 2010, vol. 37 (pg. 288-311).
Quintana-Murci L, Harmant C, Quach H, Balanovsky O, Zaporozhchenko V, Bormans C, van Helden PD, Hoal EG, Behar DM. Strong maternal Khoisan contribution to the South African coloured population: a case of gender-biased admixture. Am J Hum Genet. 2010, vol. 86 (pg. 611-620).
Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 2008, vol. 105 (pg. 1596-1601).
Raimbault M. Pour une approche du néolithique du sahara malien. Trav du LAPMO. 1990, vol. 1990 (pg. 67-82).
Ravid-Amir O, Rosset S. Maximum likelihood estimation of locus-specific mutation rates in Y-chromosome short tandem repeats. Bioinformatics. 2010, vol. 26 (pg. i440-i445).
Renfrew C. Archaeogenetics—towards a 'new synthesis'? Curr Biol. 2010, vol. 20 (pg. R162-R165).
Rocha J. Bantu-Khoisan interactions at the edge of the Bantu expansions: insights from southern Angola. J Anthropol Sci. 2010, vol. 88 (pg. 5-8).
Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, Feldman MW. Genetic structure of human populations. Science. 2002, vol. 298 (pg. 2381-2385).
Salas A, Richards M, De la Fe T, Lareu MV, Sobrino B, Sanchez-Diz P, Macaulay V, Carracedo A. The making of the African mtDNA landscape. Am J Hum Genet. 2002, vol. 71 (pg. 1082-1111).
Scheinfeldt LB, Soi S, Tishkoff SA. Colloquium paper: working toward a synthesis of archaeological, linguistic, and genetic data for inferring African population history. Proc Natl Acad Sci U S A. 2010, vol. 107 Suppl 2 (pg. 8931-8938).
Semino O, Passarino G, Oefner PJ, et al. (17 co-authors). The genetic legacy of paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science. 2000, vol. 290 (pg. 1155-1159).
Semino O, Santachiara-Benerecetti AS, Falaschi F, Cavalli-Sforza LL, Underhill PA. Ethiopians and Khoisan share the deepest clades of the human Y-chromosome phylogeny. Am J Hum Genet. 2002, vol. 70 (pg. 265-268).
Sikora M, Laayouni H, Calafell F, Comas D, Bertranpetit J. A genomic analysis identifies a novel component in the genetic structure of sub-Saharan African populations. Eur J Hum Genet. 2010, vol. 19 (pg. 84-88).
Smith BW. Rock art in south-central Africa. 1995. [PhD thesis]. Cambridge: Department of Archaeology, University of Cambridge.
Smith BW. Zambia's ancient rock art: the painting of Kasama. 1997. Oxford: Nuffield Press for the National Heritage Conservation Commission of Zambia.
Smith BW. Reading rock art and writing genetic history: regionalism, ethnicity and the rock art of southern Africa. In: Soodyall H, editor. The prehistory of Africa: tracing the lineage of modern man. 2006. Cape Town: Jonathan Ball Publishers. p. 76–96.
Smith BW, Ouzman S. Taking stock: identifying Khoekhoen Herder rock art in southern Africa. Curr Anthropol. 2004, vol. 45 (pg. 499-526).
Thomas MG, Skorecki K, Ben-Ami H, Parfitt T, Bradman N, Goldstein DB. Origins of old testament priests. Nature. 1998, vol. 394 (pg. 138-140).
Tishkoff SA, Gonder MK, Henn BM, et al. (12 co-authors). History of click-speaking populations of Africa inferred from mtDNA and Y chromosome genetic variation. Mol Biol Evol. 2007, vol. 24 (pg. 2180-2195).
Tishkoff SA, Reed FA, Friedlaender FR, et al. (25 co-authors). The genetic structure and history of Africans and African Americans. Science. 2009, vol. 324 (pg. 1035-1044).
Tofanelli S, Bertoncini S, Castri L, Luiselli D, Calafell F, Donati G, Paoli G. On the origins and admixture of Malagasy: new evidence from high-resolution analyses of paternal and maternal lineages. Mol Biol Evol. 2009, vol. 26 (pg. 2109-2124).
Underhill PA, Passarino G, Lin AA, Shen P, Mirazón Lahr M, Foley RA, Oefner PJ, Cavalli-Sforza LL. The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum Genet. 2001, vol. 65 (pg. 43-62).
Untergasser A, Nijveen H, Rao X, Bisseling T, Geurts R, Leunissen JA. Primer3Plus, an enhanced web interface to Primer3. Nucleic Acids Res. 2007, vol. 35 (pg. W71-W74).
Walsh B. Estimating the time to the most recent common ancestor for the Y chromosome or mitochondrial DNA for a pair of individuals. Genetics. 2001, vol. 158 (pg. 897-912).
Weiss G, von Haeseler A. Inference of population history using a likelihood approach. Genetics. 1998, vol. 149 (pg. 1539-1546).
White TD, Asfaw B, DeGusta D, Gilbert H, Richards GD, Suwa G, Howell FC. Pleistocene Homo sapiens from Middle Awash, Ethiopia. Nature. 2003, vol. 423 (pg. 742-747).
Willuweit S, Roewer L, International Forensic Y Chromosome User Group. Y chromosome haplotype reference database (YHRD): update. Forensic Sci Int Genet. 2007, vol. 1 (pg. 83-87).
Wood ET, Stover DA, Ehret C, et al. (11 co-authors). Contrasting patterns of Y chromosome and mtDNA variation in Africa: evidence for sex-biased demographic processes. Eur J Hum Genet. 2005, vol. 13 (pg. 867-876).
Zhivotovsky LA, Underhill PA, Cinnioğlu C, et al. (18 co-authors). The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am J Hum Genet. 2004, vol. 74 (pg. 50-61).

Author notes

Present address: Department of Genetics, University of Leicester, Leicester, United Kingdom
Associate editor: Beth Shapiro