Abstract

Polymorphisms at di-, tri-, and tetranucleotide microsatellite loci have been analyzed in 14 worldwide populations. A statistical index of population expansion, denoted Sk, is introduced to detect historical changes in population size using the variation at the microsatellites. The index takes the value 0 at equilibrium with constant population size and is positive or negative according to whether the population is expanding or contracting, respectively. The use of Sk requires estimation of properties of the mutation distribution, for which we use both family data of Dib et al. for dinucleotide loci and our population data on tri- and tetranucleotide loci. Statistical estimates of the expansion index, as well as their confidence intervals from bootstrap resampling, are provided. In addition, a dynamical analysis of Sk is presented under various assumptions on population growth or decline. The studied populations are classified as having high, intermediate, or low values of Sk and genetic variation, and we use these to interpret the data in terms of possible population dynamics. Observed values of Sk for samples of di-, tri-, and tetranucleotide data are compatible with population expansion earlier than 60,000 years ago in Africa, Asia, and Europe if the initial population size before the expansion was on the order of 500. Larger initial population sizes force the lower bound for the time since expansion to be much earlier. We find it unlikely that bottlenecks occurred in Central African, East Asian, or European populations, and the estimated expansion times are rather similar for all of these populations. The analysis presented here suggests that modern human populations departed from Africa long before they began to expand in size. Subsequently, the major groups (the African, East Asian, and European groups) started to grow at approximately the same time. Populations of South America and Oceania show almost no growth. The Mbuti population from Zaire appears to have experienced a bottleneck during its expansion.

Introduction

Advances in DNA technology have facilitated the study of evolution of many species (Hoelzel 1998) and particularly the history of human populations. Most of these studies emphasize the unique role of the sub-Saharan African populations in the evolution of modern humans and strongly support the out-of-Africa hypothesis for the origin of modern humans (e.g., Cann, Stoneking, and Wilson 1987; Vigilant et al. 1991; Bowcock et al. 1994; Goldstein et al. 1995a, 1995b; Horai et al. 1995; Jorde et al. 1995; Stoneking 1997). Genetic data have also led to the suggestion that modern human populations expanded recently (e.g., Rogers and Harpending 1992; Sherry et al. 1994; Cavalli-Sforza, Menozzi, and Piazza 1996; Jorde et al. 1997; Stoneking et al. 1997; Kimmel et al. 1998; Reich and Goldstein 1998; Reich, Feldman, and Goldstein 1999). These studies suggest more ancient and more rapid growth of African populations than of non-African populations, including European and East Asian populations. However, many details of the historical demography of modern humans remain to be clarified and are subject to continuing controversy. The issues that remain to be resolved include the following: (1) Was the effective population size of the main human populations small or large prior to the emergence from Africa? (2) Did a population bottleneck occur in any of the main ancestral populations, in particular in the non-African populations? (3) Did African populations expand before the emergence of groups that left Africa or after, or was there an intervening bottleneck?

An alternative to this out-of-Africa hypothesis, the multiregional hypothesis, holds that the major human subgroups evolved in situ after an original expansion some 800,000 years ago, with the current similarity of human populations being due largely to gene flow (Wolpoff 1989; Frayer et al. 1993). This contrasts with the idea of an African origin for modern humans who replaced earlier populations between 30,000 and 100,000 years ago (Stringer and Andrews 1988).

The dynamics of genetic diversity depend on many population parameters that are difficult to estimate. In order to decide among the above alternatives, a large set of genetic markers is essential, as are statistics for the analysis of these data that are sensitive to some key parameters and insensitive to others. We also need a dynamic model to predict the behavior of these statistics in ways that allow choices from the above options to be made. Our use of the term “population expansion” refers to an increase in size rather than an increase in space.

Recently, Reich and Goldstein (1998) (see also Reich, Feldman, and Goldstein 1999) suggested two tests to determine whether population data for a set of microsatellite loci exhibit signs of population expansion. The first test examines the magnitude of a linear combination of the squared variance, the variance, and the fourth central moment of the repeat scores at each locus. If a sufficiently large number of values of this statistic across a set of loci are negative, then a history of population expansion is indicated. Their second test compares the variance across loci of the variance in repeat scores within a locus with its expectation under a single stepwise model as found by Zhivotovsky and Feldman (1995). The first test, a within-locus test, applied to the dinucleotide data of Bowcock et al. (1994) showed no sign of historical population expansion, although there was a suggestion of expansion from tetranucleotide polymorphisms in two southern African populations. The second test, an interlocus test, showed signs of expansion in the African data of Bowcock et al. (1994). This expansion is estimated to have occurred between 49,000 and 640,000 years ago.

In this paper, we present an alternative to the within-locus test, a statistic that is motivated by the expected value of the fourth central moment in a general symmetric stepwise mutation model for microsatellite evolution. This statistic, which we call Sk, is expected to be 0 at equilibrium with constant population size and is positive or negative according to whether the population is expanding or contracting, respectively. The use of Sk requires estimation of properties of the mutation distribution, and we use data obtained by Dib et al. (1996) for this purpose. Observed values of Sk for samples of di-, tri-, and tetranucleotide data are compatible with population expansion earlier than 60,000 years ago in Africa, Asia, and Europe if the initial population size before the expansion was on the order of 500. Larger initial population sizes force the lower bound for the time since expansion to be much earlier. We find it unlikely that bottlenecks occurred in Central African, East Asian, or European populations, and the estimated expansion times are rather similar for all of them. Populations of South America and Oceania show almost no growth in their history. Alternatively, they could have grown slowly but experienced a severe bottleneck. The Mbuti population from Zaire appears to have experienced a bottleneck during its expansion.

Materials and Methods

Data

We used data on variation at dinucleotide microsatellites (Bowcock et al. 1994), as well as tri- and tetranucleotide loci (unpublished data). The loci used are the following: 29 dinucleotide loci (CA repeats) on chromosomes 13 and 15, namely D13S270, D13S126, D13S119, D13S118, D13S125, D13S144, UTSW1523, ACTC, D15S171, D15S169, D13S133, D13S137, D13S227, GABRB3, D13S192, D13S193, HLIP, D15S98, D15S97, D15S100, D15S101, D13S115, D15S95, D15S108, D13S71, D15S102, D15S117, D15S148, and D15S11 (note that locus FES from the original data set by Bowcock et al. [1994] has been removed since it is a tetranucleotide); 22 unlinked trinucleotide loci (ATA repeats), namely D1S1589, D2S1353, D3S2409, D4S2394, D5S652, D6S1027, D7S2842, D8S1459, D9S910, D10S1223, D11S2362, D12S1045, D13S777, D14S597, D15S652, D16S748, D17S1297, D18S843, D19S583, D20S473, D21S1440, and D22S1045; and 21 unlinked tetranucleotide loci (GATA repeats), namely D1S1612, D2S1399, D3S1746, D4S1627, D5S816, D6S474, D7S820, D8S1179, D9S925, D10S1237, D11S1999, D13S317, D14S606, D15S659, D16S540, D17S1290, D18S535, D19S253, D20S470, D21S1436, and D22S691.

All data were obtained from the same samples, representing 14 regions around the world (see the map in Barbujani et al. 1997), including Mbuti Pygmies from Zaire (ZAI) and Biaka Pygmies from the Central African Republic (CAR). Another central African sample was from Lisongo Bantu (LIS). The other samples were from Europe (Italy [ITA] and Northern Europe [NEU]), East Asia (China [CHI], Japan [JAP], and Cambodia [CAM]), Oceania (Australia [AUS], Melanesia [MEL], and New Guinea [NGN]), Central America (Mayan [MAY]), and South America (Karitiana [KAR] and Surui [SUR]).

At least 8 chromosomes (up to 30) per sample were scored. The trinucleotide locus D15S652 was not represented in CAM, and locus D6S1027 was not in the SUR sample. The dinucleotide locus D15S98 was not represented in SUR. We omitted data for the dinucleotide locus GABRB3 in ITA and D15S101 in NGN because of small sample sizes (4 and 2, respectively).

Additionally, we analyzed the distribution of variances in allele size of the 14 samples across all loci for each type of microsatellite and found two outliers in the value of variance in CAR: the trinucleotide locus D10S1223 and the tetranucleotide locus D10S1237. These two observations were also omitted from the analysis. Nine outliers were found for dinucleotides, all from the same locus, D13S133. This locus was removed from all analyses, leaving the actual number of dinucleotide loci analyzed at 28.

Model

For this analysis, we expand our previous model of microsatellite variation (Zhivotovsky and Feldman 1995; Zhivotovsky, Feldman, and Grishechkin 1997) to include changes in population size. Consider a diploid population of size N0. At the initial time t = 0 the population suddenly changes its size to N1 (N1 may be greater or less than N0) and then remains at this size or starts to grow at time t0. If it grows, the population is assumed to follow the logistic law $N(t) = N_1 I e^{a(t - t_0)} / \left(I - 1 + e^{a(t - t_0)}\right)$, where a is the growth rate and I is the factor by which the population size ultimately will exceed N1. (In the case of no sudden change, t0 = 0 and N1 = N0.)
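The size trajectory described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, time is counted in generations, and the logistic term is written in the algebraically equivalent form N1·I / (1 + (I − 1)·exp(−a(t − t0))) so that large values of t do not overflow.

```python
import math


def population_size(t, n0, n1, growth_rate, fold_increase, t0=0.0):
    """Population size at generation t (hypothetical helper, not from the paper).

    Before t = 0 the size is n0; at t = 0 it jumps to n1; from t0 onward it
    grows logistically toward n1 * fold_increase.
    """
    if t < 0:
        return float(n0)
    if t < t0:
        return float(n1)
    # Equivalent to N1*I*exp(a(t-t0)) / (I - 1 + exp(a(t-t0))), but stable.
    decay = math.exp(-growth_rate * (t - t0))
    return n1 * fold_increase / (1.0 + (fold_increase - 1.0) * decay)


if __name__ == "__main__":
    # Parameters loosely following the figure 2 caption: N0 = N1 = 500, I = 1,000.
    for generation in (0, 500, 1000, 5000):
        print(generation, round(population_size(generation, 500, 500, 0.01, 1000)))
```

Setting `fold_increase` to 1 gives a population that stays at N1 after the sudden change, which corresponds to the "remains at this size" case in the text.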

The markers studied are subject to a stepwise mutation scheme in which we assume that mutations occur at rate μ without bias, so that the expected size of each allele is the same as that of its parental allele. Denote the variance and fourth central moment in the size of mutational changes (in the number of repeats) by $\sigma_m^2$ and $k_m$, respectively, and set $w = \mu\sigma_m^2$ and $k = \mu k_m$. We use the term “effective mutation rate” for the product $w = \mu\sigma_m^2$. For a one-step symmetric mutation process, $\sigma_m^2 = 1$ and $k_m = 1$, and in this case, w and k are simply the mutation rate. With multistep mutations, w and k are larger than the mutation rate and $k_m > \sigma_m^2$. The dynamics for the central moments of allele size in the population are derived in the appendix.
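As a small illustration of these definitions, the following Python sketch computes the variance and fourth central moment of a step-size distribution and the derived quantities w and k. The function names and the two-step example distribution are hypothetical and chosen only for illustration.

```python
def mutation_moments(step_probs):
    """Variance and fourth central moment of a mutational step distribution.

    step_probs maps a signed step size (in repeats) to its probability,
    conditional on a mutation having occurred.
    """
    total = sum(step_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("step probabilities must sum to 1")
    mean = sum(step * p for step, p in step_probs.items())
    variance = sum((step - mean) ** 2 * p for step, p in step_probs.items())
    fourth = sum((step - mean) ** 4 * p for step, p in step_probs.items())
    return variance, fourth


def effective_rates(mu, step_probs):
    """Return (w, k) = (mu * variance, mu * fourth central moment)."""
    variance, fourth = mutation_moments(step_probs)
    return mu * variance, mu * fourth


if __name__ == "__main__":
    one_step = {-1: 0.5, 1: 0.5}
    multi_step = {-2: 0.1, -1: 0.4, 1: 0.4, 2: 0.1}  # made-up example
    print(mutation_moments(one_step))    # both moments equal 1
    print(mutation_moments(multi_step))  # fourth moment exceeds the variance
    print(effective_rates(1e-3, multi_step))
```

For the one-step symmetric case both moments equal 1, so w and k reduce to the mutation rate, as stated in the text.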

Statistical Analysis

We used standard statistical methods described elsewhere. To obtain evolutionary stochastic errors and confidence intervals, bootstrapping over loci was used (Efron and Tibshirani 1993; Weir 1996).
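Bootstrapping over loci can be sketched as follows. This is a generic percentile bootstrap written under assumptions: the text does not specify the number of replicates or the interval construction, so `n_boot`, `alpha`, the percentile rule, and all names here are illustrative choices rather than the authors' procedure.

```python
import random


def mean_of(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def bootstrap_over_loci(per_locus_values, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Resample loci with replacement and summarize the resampled statistic.

    Returns (point estimate, bootstrap standard error, (lower, upper)),
    where the interval is a simple percentile interval.
    """
    rng = random.Random(seed)
    n_loci = len(per_locus_values)
    replicates = []
    for _ in range(n_boot):
        resample = [per_locus_values[rng.randrange(n_loci)] for _ in range(n_loci)]
        replicates.append(statistic(resample))
    replicates.sort()
    lower = replicates[int((alpha / 2) * n_boot)]
    upper = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    center = mean_of(replicates)
    std_err = (sum((r - center) ** 2 for r in replicates) / (n_boot - 1)) ** 0.5
    return statistic(per_locus_values), std_err, (lower, upper)


if __name__ == "__main__":
    # Made-up per-locus variances, purely for demonstration.
    variances = [2.1, 5.4, 3.3, 7.9, 4.2, 6.0, 1.8, 3.7]
    print(bootstrap_over_loci(variances, mean_of))
```

Any statistic computed from per-locus quantities can be passed in place of `mean_of`, which keeps the resampling unit (the locus) separate from the statistic of interest.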

Results

The Expansion Index

In order to test whether the data suggest that population expansion has occurred, we use the following statistic, which we call the expansion index. It is computed from the ratio of estimates of the kurtosis and the squared variance:  

formula
Here, $R_k = k_m/\sigma_m^2$, and K and V are the unnormalized kurtosis (the fourth central moment) and the variance of allele sizes in the population, estimated from a sample and corrected for sampling bias (see Cramer 1946), then averaged over loci (separately for di-, tri-, and tetranucleotide microsatellites). The index Sk is based on the relationship between the variance and the kurtosis in a population at mutation-drift equilibrium derived by Zhivotovsky and Feldman (1995, eq. 8). Since at equilibrium Sk is expected to be 0, the deviation of Sk from 0 can indicate that the population size has changed if the population was in equilibrium before the expansion. The dynamic analysis of Sk is given in terms of equation system (5) of the appendix. In particular, numerical analysis of system (5) shows that Sk is expected to be positive in a growing population and is expected to be negative if the effective population size is decreasing. Figure 1 demonstrates two trajectories for Sk, one (a) in which the population grows exponentially from the outset, and another (b) in which there is a sharp reduction in the population size followed by exponential growth. The former produces a positive Sk and the latter a markedly negative Sk through 10,000 generations.
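The two ingredients of the index, the bias-corrected variance and fourth central moment averaged over loci, can be sketched as below. The exact correction used in the paper is only cited, not spelled out, so this sketch assumes the standard unbiased estimators of the second and fourth central moments; the function names are hypothetical, and the combination of these averages into Sk through equation (1) is deliberately not reproduced.

```python
def unbiased_moments(allele_sizes):
    """Unbiased estimates of the variance and fourth central moment.

    allele_sizes: repeat counts of the sampled chromosomes at one locus.
    Requires at least four observations.
    """
    n = len(allele_sizes)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(allele_sizes) / n
    m2 = sum((x - mean) ** 2 for x in allele_sizes) / n  # biased 2nd moment
    m4 = sum((x - mean) ** 4 for x in allele_sizes) / n  # biased 4th moment
    variance = n * m2 / (n - 1)
    denom = (n - 1) * (n - 2) * (n - 3)
    fourth = (n * (n * n - 2 * n + 3) * m4 - 3 * n * (2 * n - 3) * m2 * m2) / denom
    return variance, fourth


def average_over_loci(samples_by_locus):
    """Average the per-locus (variance, fourth moment) estimates over loci."""
    moments = [unbiased_moments(sample) for sample in samples_by_locus]
    v_bar = sum(v for v, _ in moments) / len(moments)
    k_bar = sum(k for _, k in moments) / len(moments)
    return v_bar, k_bar


if __name__ == "__main__":
    # Made-up repeat counts at two loci, for demonstration only.
    loci = [[12, 13, 13, 14, 15, 15, 16, 18], [8, 8, 9, 9, 9, 10, 11, 11]]
    print(average_over_loci(loci))
```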

The statistic introduced by Reich and Goldstein (1998) to reveal the history of population growth is based on a function that is linear in the kurtosis and quadratic in the variance. Their approach allowed a test of the null hypothesis of constant population size against the alternative that there was change in population size. In our analysis, we use dynamics of the index Sk to infer properties of possible expansions and to estimate the initial time of population expansion. We study the variance, V, and Sk simultaneously because they can reveal different aspects of the population expansion history.

In using dynamical analysis of the statistics V and Sk to interpret their observed values, we assume that the population was at equilibrium in the initial generation prior to expansion, a common assumption in evolutionary history analyses. To this end, we calculate the expectation of the population trajectory with respect to the distribution in the population at the initial generation at equilibrium (see the appendix), which enables us to estimate the expansion index Sk by computing in equation (1) the statistics V and K averaged over loci carrying the same kind of microsatellites.

Estimates of Mutation Parameters

There are few experimental data that permit direct estimation of mutation parameters for microsatellites with tri- and tetranucleotide repeats. Dinucleotide markers are most appropriate for this, because there already exist data on more than 5,000 dinucleotide polymorphisms (Dib et al. 1996). Dib et al. (1996) analyzed mutation events (including those due to “double recombination”) and estimated the mutation rate as 6.2 × 10−4. They also found that about 68% of the mutations were represented by single-repeat changes. Using a truncated Poisson distribution, we estimate the variance of mutational events, $\sigma_m^2$, to be 2.45 and the ratio of the fourth central moment and the variance, $k_m/\sigma_m^2$, to be about 6.3 (table 1). (Using the same data and a geometric distribution of mutational events, Feldman, Kumm, and Pritchard (1999) estimated $\sigma_m^2$ to be 2.5.)
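One way to see how a truncated Poisson distribution links the 68% single-step fraction to the moments of the step distribution is sketched below. The fitting procedure is not described in the text, so this sketch makes its own assumptions: the absolute step size follows a zero-truncated Poisson distribution, steps are symmetric in sign, and the rate is chosen so that the probability of a one-repeat change equals 0.68. The function names are hypothetical. Under these assumptions the variance comes out close to the reported 2.45, while the fourth-moment ratio need not match the reported 6.3.

```python
import math


def truncated_poisson_pmf(lam, max_step=60):
    """P(|step| = j) for j = 1..max_step under a zero-truncated Poisson."""
    norm = 1.0 - math.exp(-lam)
    pmf = {}
    term = math.exp(-lam)  # untruncated Poisson P(0)
    for j in range(1, max_step + 1):
        term *= lam / j    # untruncated Poisson P(j)
        pmf[j] = term / norm
    return pmf


def fit_lambda(single_step_fraction, lo=1e-6, hi=20.0):
    """Bisect for lam such that P(|step| = 1) = lam / (exp(lam) - 1)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / math.expm1(mid) > single_step_fraction:
            lo = mid  # probability too high, so lam must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


def step_moments(lam):
    """Variance and fourth central moment of the symmetric signed step."""
    pmf = truncated_poisson_pmf(lam)
    variance = sum(j ** 2 * p for j, p in pmf.items())  # mean is 0 by symmetry
    fourth = sum(j ** 4 * p for j, p in pmf.items())
    return variance, fourth


if __name__ == "__main__":
    lam = fit_lambda(0.68)
    variance, fourth = step_moments(lam)
    print(lam, variance, fourth / variance)
```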

Properties of the parameters for the other kinds of microsatellites can be assessed from sample moments. It can be shown that the dynamics over time of the ratios of values of 1 − Sk are nearly independent of mutation parameters within a moderate range. Also, it follows from equation (6) in the appendix that the ratio of variances computed for two kinds of loci does not depend on time and equals the ratio of the corresponding values of w. Therefore, we suggest the following estimators of mutational parameters given values for dinucleotide loci:  

formula
where the subscripts D and T are used to distinguish values of the parameters for dinucleotides and those for tri- and tetranucleotides, respectively. The numerical estimates are presented in table 1 and suggest that the effective mutation rate for dinucleotides is almost twice as high as that for tri- and tetranucleotides, a value close to that inferred from genetic distances and variances (Feldman, Kumm, and Pritchard 1999). It should be noted that although the estimate of wT/wD does not depend on time, the estimate for $k_m/\sigma_m^2$ may do so. The form of equation (2) is taken from the equilibrium situation with constant population size as obtained by Zhivotovsky and Feldman (1995). Although Weber and Wong (1993) reported a higher mutation rate for a small set of tetranucleotides, this does not contradict our data since wT and wD in equation (2) refer to the effective mutation rate. (Our findings agree with Chakraborty et al. [1997], who estimated that dinucleotides have higher mutation rates.) Fortunately, the dynamics of the index Sk depend only weakly on $k_m/\sigma_m^2$ (data not shown), and therefore their values can be averaged over different kinds of microsatellite loci.
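The estimator for the effective mutation rate follows from the statement that the ratio of variances for two kinds of loci equals the ratio of their values of w. A sketch of that step is given below; the bar notation for averages over loci is an assumption made here, and the second estimator in equation (2), for the ratio of the fourth moment to the variance, is not reconstructed.

```latex
\frac{V_T}{V_D} = \frac{w_T}{w_D}
\quad\Longrightarrow\quad
\hat{w}_T = w_D \, \frac{\bar{V}_T}{\bar{V}_D}
```

As a rough check using the 14-population means of V reported in this paper (5.90 for dinucleotides, 3.14 for trinucleotides, and 3.44 for tetranucleotides), the ratios are about 3.14/5.90 ≈ 0.53 and 3.44/5.90 ≈ 0.58, in line with an effective mutation rate for dinucleotides almost twice that of the other two kinds of loci.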

Estimates of the Within-Population Variance

Previous studies of mtDNA and microsatellite polymorphisms have demonstrated that African populations have greater variation than non-African populations (Cann, Stoneking, and Wilson 1987; Vigilant et al. 1991; Bowcock et al. 1994; Horai et al. 1995; Armour et al. 1996; Tishkoff et al. 1996; Jorde et al. 1997; Stoneking et al. 1997). We averaged the variances over loci and populations (within and outside of Africa) and found significantly greater variance in allele size among Africans (table 2), in agreement with Jorde et al. (1997). More detailed information on the variances in the 14 populations is contained in tables 3 and 4. We conclude from these data that according to the size of the variance, V, the populations can be grouped into three major clusters: high (Africans), intermediate (Europeans and East Asians), and low (Amerindians and Oceanic) genetic variance (table 4).

Estimates of the Expansion Index

Generally, the increase in allele size variance correlates with the increase in the value of the expansion index Sk: the correlation coefficient between average V and average Sk in table 3 is 0.58, corresponding to a general finding from our analysis that both V and Sk are increasing in a growing population. Indeed, the values of genetic variance V in Central Africans, Europeans, and East Asians are highest, and these sets of populations show higher values of the expansion index than the other populations, especially Amerinds and Melanesians, whose 95% confidence intervals include zero (table 4). It is important to note that statistics for the different kinds of microsatellite loci are not highly correlated and show some differences. In particular, the pairwise correlation coefficients, di × tri, di × tetra, and tri × tetra, between di-, tri-, and tetranucleotide loci in the point estimates of Sk (table 3) are 0.39, −0.10, and 0.50, respectively. Also, the Italian sample shows a rather small estimate of Sk at dinucleotide loci compared to those at tri- and tetranucleotide loci, whereas the sample from Zaire shows the opposite tendency (table 3). This can be due to large standard errors for the estimates of Sk (table 3) or to different properties of different microsatellite loci. These differences might change once the amount of data on tri- and tetranucleotide loci is comparable with that available for microsatellites with dinucleotide repeats (Dib et al. 1996).

Although the expansion index and the variance are correlated, their correlation is moderate (0.58), and V and Sk may react differently to the changes in population size and have different patterns in different populations. In particular, the Mbuti population shows the highest genetic variation but a lower value for the expansion index compared with those for Central Africans, Europeans, and East Asians (table 4). According to the values of Sk, particularly the lower bound of the confidence intervals for Sk, the populations can be grouped into three clusters: high (Central Africans, Europeans, and East Asians), intermediate (Mbuti Pygmies and Sahulanders), and low (Amerindians and Melanesians) (table 4).

The Dynamics of the Variance and the Expansion Index

The dynamics of the genetic variance V and the expansion index Sk over time depend weakly on the rate of increase of population size and the final population size (data not shown). Only extreme differences in the rate of increase in population size produce substantial differences in the statistics (fig. 2).

Sk and V differ from each other in their dependence on the model parameters. In particular, the effective mutation rate w greatly affects the variance but not the expansion index if w does not change over time (fig. 3). This can be interpreted in terms of loci having different mutation rates: the values of Sk are similar for different kinds of loci having different effective mutation rates (e.g., for di-, tri-, or tetranucleotide types), unlike the values of V. Indeed, as can be seen in tables 2 and 3, the arithmetic mean of the genetic variance V computed over the 14 populations is greater for dinucleotide loci than for tri- and tetranucleotides (5.90 vs. 3.14 and 3.44, respectively), and this suggests a higher effective mutation rate for dinucleotides. At the same time, the arithmetic means of Sk are almost identical for these loci: 0.29, 0.28, and 0.31, respectively. However, if w changed when the population began to grow, both V and Sk would grow faster with larger values of w (data not shown).

Importantly, N0 has opposite effects on the dynamics of V and Sk: the lower N0 is, the slower the increase in V, whereas Sk increases faster with lower values of N0 (fig. 4). A bottleneck just before population growth may greatly affect both V and Sk (fig. 5). However, if a bottleneck occurs during population expansion, it may greatly influence Sk but affect V only slightly (fig. 6).

Using the average values of the mutation parameters given in table 1, estimates of expansion time can be computed from system (5) for a given initial population size N0 and an observed value of Sk, assuming logistic population growth (table 5). This table shows that with moderate values (N0 ≤ 2,000 and Sk ≤ 0.20), the estimates of expansion time do not exceed several tens of thousands of years, whereas higher values of Sk can only be explained by much earlier population expansion. Moreover, in the latter case, the estimates of expansion time are not robust with respect to Sk because, as seen from figure 4 and table 5, the estimate of expansion time is a steep function of Sk beyond 0.30–0.35.

Discussion

Our estimates of the expansion indices have rather wide confidence intervals (table 4). The upper bounds for these estimates cannot be used for any reliable conclusion. Indeed, all but one exceed 0.40 (table 4), and if the upper bound of Sk is greater than about 0.35, there is no satisfactory interpretation for the upper bound of the time since expansion occurred (table 5). To be reliable, estimates of the expansion time should have much narrower confidence intervals and, therefore, be based on hundreds of microsatellite loci, a conclusion already suggested for estimates of divergence time (Zhivotovsky and Feldman 1995; Goldstein et al. 1996; see also Jorde et al. 1997). We agree with Penny et al. (1995) that more attention should be paid to lower bounds for time estimates, and therefore we focus on the point estimates of the expansion time (plug-in estimates) and their lower confidence bounds.

The similarity in values of the expansion index Sk in Central African, European, and East Asian populations (table 4) might be due to the populations being genetically well established before they began to expand in size; that is, they were nearly at mutation-drift equilibrium. We cannot discard the possibility that each of these populations had a nearly constant effective population size for a very long time before their expansion. The first important conclusion that follows from table 4 is that the African, European, and East Asian populations started to grow at nearly the same time. This conclusion does not contradict the multiregional hypothesis. It is also compatible with the out-of-Africa hypothesis if it is assumed that separation of major continental groups preceded their expansion in size. Pritchard et al. (1999) also did not find evidence of major differences in the pattern of expansion between Asia, Africa, and Europe.

The second important conclusion that can be drawn from our results is that it is not necessary to assume a bottleneck prior to or soon after expansion in order to explain the observed pattern of variation in microsatellites in Central Africa, Europe, and East Asia. Indeed, the expansion index values for these populations are similar (table 4). If a bottleneck occurred in the non-African populations around the time at which expansion began but did not occur in Africans, a decrease in both V and Sk below the Central African values would be observed in non-African populations (fig. 5). The absence of a significant bottleneck has also been suggested by Ayala and Escalante (1996) from their studies of the DQB1 locus of the human major histocompatibility complex.

A third important conclusion from our findings is that early modern humans may have had a low effective population size, on the order of a few hundred. Indeed, since the point estimates of Sk for the Central Africans, Europeans, and East Asians are 0.38–0.40, with the lower 95% bound about 0.28 (table 4), it follows from table 5 that several hundred thousand years would be required to achieve such values of Sk if the effective population size were 5,000. In the framework of the multiregional hypothesis, our analysis suggests that initial effective population sizes prior to expansion would be 5,000 or larger. However, from the viewpoint of an African origin for modern humans who replaced earlier populations between 30,000 and 100,000 years ago (Stringer and Andrews 1988), even Ne = 2,000 is still too large to be consistent with the data (tables 4 and 5). Only an effective size as small as 500 is compatible with the high values of Sk observed in the main regions (tables 4 and 5). Also, as follows from tables 4 and 5, the lower 95% bound for the time of expansion of the Central Africans, Asians, and Europeans is about 60,000 years (assuming a generation time of 25 years) if their initial effective population size was 500 (table 6). If the size had been larger, the expansion time would have a larger lower bound (table 5). Our estimate of the initial effective population size prior to expansion, Ne = 500, is close to the lower bound of Rogers and Harpending (1992), Rogers (1995), and Rogers and Jorde (1995), who have estimated from mtDNA data that human populations expanded by more than 100-fold from an original size of between 1,500 and 7,000 breeding females. (A possible alternative explanation for high values of Sk in the Central Africans, Europeans, and East Asians might be an increase in mutation rates in these populations. Such an increase could have been caused by some global environmental change at the time of expansion, or by other sources. This scenario should not be immediately excluded, although it does not seem likely.)
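Because the model is formulated in generations, the conversion behind the 60,000-year figure can be written out explicitly. This is a worked arithmetic step under the stated 25-year generation time; the symbols $T_{\text{years}}$, $g$, and $t_{\text{gen}}$ are introduced here for illustration only.

```latex
t_{\text{gen}} = \frac{T_{\text{years}}}{g}
             = \frac{60{,}000\ \text{years}}{25\ \text{years per generation}}
             = 2{,}400\ \text{generations}
```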

An initial effective population size of 500 might be thought to be too small within the framework of the out-of-Africa hypothesis. However, this figure does not imply that each major group (Africans, East Asians, Europeans) was this small prior to its expansion. It is possible that a major group was divided into subpopulations (tribes), each of a few hundred, but only one of them became dominant and expanded, while the others went extinct. This scenario might be considered as a “subpopulation” version of the weak Garden of Eden hypothesis (Harpending et al. 1993). Under the multiregional hypothesis, tables 4 and 5 would suggest that larger initial populations, of 5,000 or more, expanded much earlier, perhaps several hundred thousand years ago, with a lower 95% bound of about 300,000 years.

The estimates of Sk in table 4 can be converted to estimates of expansion time for these particular populations. Table 6 shows the point estimates and their lower 95% bounds assuming an initial population size of 500 individuals and logistic population growth. We notice that greater initial population size and/or a bottleneck prior to or during expansion can only increase the estimates of time since expansion occurred, because these factors decrease the value of Sk. Recall that estimates of expansion time are not robust with respect to changes in Sk beyond 0.30–0.35. The lower-bound estimates should be more reliable, especially for interpretation of data on the major groups (Central African, European, and East Asian), which have large values of Sk (table 4).

The point estimates of the expansion time for the major groups (African, European, and East Asian) given in table 6 are large, more than 400,000 years, with a lower 95% bound of about 60,000 years. A possible scenario might be that the African populations began to grow much earlier, before the emergence from Africa, and that a possible decline in population size among the migrants was rapidly reversed after emergence. However, a striking finding is that while the European and East Asian populations have similar values of Sk, which are higher than those averaged over three African samples (Zaire and Central Africa; see table 4), both European and East Asian populations have similar values of V, which are lower than those in Africa. Because the establishment of variances requires some evolutionary time, the following scenario is consistent with our estimates of population variance and the expansion index (table 4) and their predicted behavior (see fig. 4): modern humans departed from Africa and migrated to East Asia and Europe with some reduction in effective population size, and then, much later, the African, East Asian, and European populations started to grow at approximately the same time.

The lower-bound estimates in table 6 suggest that the African, East Asian, and European groups started to grow about 60,000 years ago or earlier. This estimate corresponds well with the estimates of expansion time given by Rogers and Jorde (1995) based on mitochondrial DNA differences: 33,000–150,000 years ago. In contrast, Pritchard et al. (1999), using Y chromosome microsatellite polymorphisms, estimated that significant exponential population growth commenced much later, about 20,000 years ago. Note that our estimates of expansion time do not themselves reject the multiregional hypothesis, but even under that hypothesis, we would conclude that the times of expansion for the Africans, Europeans, and East Asians were similar. However, other genetic data, as well as archeological findings, strongly support the out-of-Africa hypothesis (Stringer and Andrews 1988; Stoneking 1997).

The substantially lower value of Sk in the Mbuti population of Zaire, with V the same as in the rest of the African populations, can be explained by a short-term bottleneck that could have occurred during the expansion (see fig. 6). Alternatively, the Mbuti population may have begun to grow much later than the other African populations (table 6). Low values of both V and Sk in the populations from South America and Oceania (tables 3 and 4) can be explained by their later expansion or by a combination of slow growth with a bottleneck. Moreover, the lower-bound estimates suggest no significant expansion in their sizes (table 6). These conclusions are consistent with a recent human invasion of Oceania and, especially, America (see Ward 1997).

Keith A. Crandall, Reviewing Editor

Keywords: human population growth, expansion index, microsatellites, bottleneck, effective size, stepwise mutation

Address for correspondence and reprints: Marcus W. Feldman, Department of Biological Sciences, Stanford University, Stanford, California 94305. E-mail: marc@charles.stanford.edu.

Table 1 Estimates of Mutation Parameters for Di-, Tri-, and Tetranucleotides

Table 2 The Difference in the Genetic Variance V Between African and Non-African Populations

Table 3 Estimates of the Expansion Index Sk in 14 Worldwide Populations

Table 4 Estimates of the Expansion Index Sk and the Genetic Variance V for Seven Regional Clusters

Table 5 Predicted Values of Expansion Time (in thousands of years ago) Based on Sk with Different Initial Population Sizes N0

Table 6 Estimates of the Time Since Expansion (in thousands of years ago) for Seven Regional Clusters

Fig. 1.—Examples of the dynamics of the expansion index Sk in a growing population and in a population after a fivefold population size reduction in the initial generation. Parameters are k = 5w, w = 0.001, and a = 0.002. Crosses: N0 = 5,000, N1 = 5,000, I = 1,000; diamonds: N0 = 5,000, N1 = 1,000, I = 1

Fig. 2.—Dynamics of the variance, V, the expansion index, Sk, and the population size for different population growth rates. k = 5w, w = 0.001, N0 = 500, N1 = 500, and I = 1,000. Crosses: a = 0.01; diamonds: a = 0.001; circles: a = 0.0005

Fig. 3.—Dynamics of the variance, V, and expansion index, Sk, for different values of the effective mutation rate, w. N0 = 5,000, N1 = 5,000, I = 1,000, a = 0.002, and k = 5w. Crosses: w = 0.0005; diamonds: w = 0.001; circles: w = 0.002

Fig. 4.—Dynamics of the variance, V, the expansion index, Sk, and the population size, with a different initial population size but the same final population size (5 × 10⁶). k = 5w, w = 0.001, and a = 0.002. Crosses: N0 = 500, N1 = 500, I = 10,000; diamonds: N0 = 2,000, N1 = 2,000, I = 2,500; circles: N0 = 5,000, N1 = 5,000, I = 1,000.
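
As a quick arithmetic check on the Fig. 4 parameter sets, the sketch below assumes that I denotes the total fold increase in population size, so that the final size is N1 × I. This reading is an interpretation of the caption parameters, not a definition quoted from the text, and the names FIG4_PARAMETER_SETS and final_population_size are purely illustrative.

```python
# Parameter sets taken from the Fig. 4 caption: (plot symbol, N1, I).
# Assumption: I is the total fold increase, so the final size is N1 * I.
FIG4_PARAMETER_SETS = [
    ("crosses", 500, 10_000),
    ("diamonds", 2_000, 2_500),
    ("circles", 5_000, 1_000),
]


def final_population_size(n1: int, fold_increase: int) -> int:
    """Return the final population size under the stated assumption."""
    return n1 * fold_increase


# Map each plot symbol to its implied final population size.
final_sizes = {
    label: final_population_size(n1, fold_increase)
    for label, n1, fold_increase in FIG4_PARAMETER_SETS
}

for label, size in final_sizes.items():
    print(f"{label}: final size = {size:,}")
```

Under this assumption all three parameter sets reach the same final size, which matches the common final population size of 5 × 10⁶ stated in the caption.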

Fig. 5.—Dynamics of the variance, V, the expansion index, Sk, and the population size under a 10-fold bottleneck. N0 = 5,000, k = 5w, w = 0.001; the population size during the bottleneck is 500, and that after the bottleneck is 5,000. It then grows with a = 0.002, I = 1,000. Crosses: no bottleneck; diamonds: the bottleneck lasts 100 generations; circles: the bottleneck lasts 500 generations

Fig. 6.—Dynamics of the variance, V, the expansion index, Sk, and the population size under a severe bottleneck during population expansion. The population grows from the initial generation t = 0 with N0 = 500, I = 1,000, a = 0.002, and then drops to N = 200 at generation t′ = 1,000. Then it remains constant until generation t0. After t0, it again grows with N1 = 1,000, I = 1,000, a = 0.002. (Mutational parameters are k = 5w, w = 0.001). Crosses: the length of the bottleneck, t0 − t′, is 1 generation; diamonds: t0 − t′ = 50 generations; circles: t0 − t′ = 200 generations

We are indebted to an anonymous reviewer for helpful suggestions on an earlier draft. This research was supported in part by NIH grants GM 28016, GM 28428, and 1 R03 TW00491-01.

Literature Cited

Armour, J. A. L., T. Anttinen, C. A. May, E. E. Vega, A. Sajantila, J. R. Kidd, K. K. Kidd, J. Bertranpetit, S. Paabo, and A. J. Jeffreys. 1996. Minisatellite diversity supports a recent African origin for modern humans. Nat. Genet. 13:154–160.
Ayala, F., and A. A. Escalante. 1996. The evolution of human populations: a molecular perspective. Mol. Phylogenet. Evol. 5:188–201.
Barbujani, G., A. Magagni, E. Minch, and L. L. Cavalli-Sforza. 1997. An apportionment of human DNA diversity. Proc. Natl. Acad. Sci. USA 94:4516–4519.
Bowcock, A. M., A. Ruiz-Linares, J. Tomfohrde, E. Minch, J. R. Kidd, and L. L. Cavalli-Sforza. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455–457.
Cann, R. L., M. Stoneking, and A. Wilson. 1987. Mitochondrial DNA and human evolution. Nature 325:31–36.
Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1996. The history and geography of human genes. Princeton University Press, Princeton, N.J.
Chakraborty, R., M. Kimmel, D. N. Stivers, J. Davidson, and R. Deka. 1997. Relative mutation rates at di-, tri- and tetranucleotide microsatellite loci. Proc. Natl. Acad. Sci. USA 94:1041–1046.
Cramer, H. 1946. Mathematical methods of statistics. 9th edition. Princeton University Press, Princeton, N.J.
Dib, C., S. Faure, C. Fizames et al. (14 co-authors). 1996. A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380:152–154.
Efron, B., and R. J. Tibshirani. 1993. An introduction to the bootstrap. Chapman and Hall, N.Y.
Feldman, M. W., J. Kumm, and J. K. Pritchard. 1999. Mutation and migration in models of microsatellite evolution. In D. B. Goldstein and C. Schlotterer, eds. Microsatellites: evolution and applications. Oxford University Press, Oxford, U.K.
Frayer, D. W., M. H. Wolpoff, A. G. Thorne, F. H. Smith, and G. G. Pope. 1993. Theories of modern human origins: the paleontological test. Am. Anthropol. 95:14–50.
Goldstein, D. B., A. R. Linares, L. L. Cavalli-Sforza, and M. W. Feldman. 1995a. An evaluation of genetic distances for use with microsatellite loci. Genetics 139:463–471.
———. 1995b. Genetic absolute dating based on microsatellites and the origin of modern humans. Proc. Natl. Acad. Sci. USA 92:6723–6727.
Goldstein, D. B., L. A. Zhivotovsky, K. Nayar, A. R. Linares, L. L. Cavalli-Sforza, and M. W. Feldman. 1996. Statistical properties of the variation at linked microsatellite loci: implications for the history of human Y chromosomes. Mol. Biol. Evol. 13:1213–1218.
Harpending, H. C., S. T. Sherry, A. R. Rogers, and M. Stoneking. 1993. The genetic structure of ancient human populations. Curr. Anthropol. 43:483–496.
Hoelzel, A. R., ed. 1998. Molecular genetic analysis of populations. 2nd edition. IRL Press at Oxford University Press, Oxford, England.
Horai, S., K. Hayasaka, R. Kondo, K. Tsugane, and N. Takahata. 1995. Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc. Natl. Acad. Sci. USA 92:532–536.
Jorde, L. B., M. Bamshad, W. S. Watkins, R. Zenger, A. E. Fraley, P. A. Krakowiak, K. D. Carpenter, H. Soodyall, T. Jenkins, and A. R. Rogers. 1995. Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57:523–538.
Jorde, L. B., A. R. Rogers, M. Bamshad, W. S. Watkins, P. Krakowiak, S. Sung, J. Kere, and H. Harpending. 1997. Microsatellite diversity and the demographic history of modern humans. Proc. Natl. Acad. Sci. USA 94:3100–3103.
Kimmel, M., R. Chakraborty, J. P. King, M. Bamshad, W. S. Watkins, and L. B. Jorde. 1998. Signatures of population expansion in microsatellite repeat data. Genetics 148:1921–1930.
Penny, D., M. Steel, P. J. Waddell, and M. D. Hendy. 1995. Improved analysis of human mtDNA sequences supported a recent African origin for Homo sapiens. Mol. Biol. Evol. 12:863–882.
Pritchard, J. K., and M. W. Feldman. 1996. Statistics for microsatellite variation based on coalescence. Theor. Popul. Biol. 45:265–270.
Pritchard, J. K., M. T. Seielstad, A. Perez-Lezaun, and M. W. Feldman. 1999. Population growth of human Y chromosome: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16:1791–1798.
Reich, D. E., M. W. Feldman, and D. B. Goldstein. 1999. Statistical properties of two tests that use multilocus data sets to detect population expansions. Mol. Biol. Evol. 16:453–466.
Reich, D. E., and D. B. Goldstein. 1998. Genetic evidence for a Paleolithic human population expansion in Africa. Proc. Natl. Acad. Sci. USA 95:8119–8123.
Rogers, A. R. 1995. Genetic evidence for a Pleistocene population explosion. Evolution 49:608–615.
Rogers, A. R., and H. C. Harpending. 1992. Population growth makes waves in the distribution of pairwise differences. Mol. Biol. Evol. 9:552–569.
Rogers, A. R., and L. B. Jorde. 1995. Genetic evidence on modern human origins. Hum. Biol. 67:1–36.
Sherry, S. T., A. R. Rogers, H. C. Harpending, H. Soodyall, T. Jenkins, and M. Stoneking. 1994. Mismatch distributions of mtDNA reveal recent human population expansion. Hum. Biol. 66:761–775.
Sokal, R. R., and F. J. Rohlf. 1995. Biometry. 3rd edition. Freeman, N.Y.
Stoneking, M. 1997. Recent African origin of human mitochondrial DNA: review of the evidence and current status of the hypothesis. Pp. 1–13 in P. Donnelly and S. Tavare, eds. Progress in population genetics and human evolution. Springer, N.Y.
Stoneking, M., J. J. Fontius, S. L. Clifford et al. (11 co-authors). 1997. Alu insertion polymorphisms and human evolution: evidence for a larger population size in Africa. Genome Res. 7:1061–1071.
Stringer, C. B., and P. Andrews. 1988. Genetic and fossil evidence for the origin of modern humans. Science 239:1263–1268.
Tishkoff, S. A., E. Dietzsch, W. Speed et al. (15 co-authors). 1996. Global pattern of linkage disequilibrium at the CD4 locus and modern human origins. Science 271:1380–1387.
Vigilant, L., M. Stoneking, H. Harpending, K. Hawkes, and A. C. Wilson. 1991. African populations and the evolution of human mitochondrial DNA. Science 253:1503–1507.
Ward, R. H. 1997. Phylogeography of human mtDNA: an Amerindian perspective. Pp. 33–53 in P. Donnelly and S. Tavare, eds. Progress in population genetics and human evolution. Springer, N.Y.
Weber, A. O. M., and C. Wong. 1993. Mutation of human short tandem repeats. Hum. Mol. Genet. 2:1123–1128.
Weir, B. 1996. Genetic data analysis II. Sinauer, Sunderland, Mass.
Wolfram, S. 1996. Mathematica. A system for doing mathematics by computer. 3rd edition. Wolfram Media/Cambridge University Press, N.Y.
Wolpoff, M. H. 1989. Multiregional evolution: the fossil alternative to Eden. Pp. 62–108 in P. Mellars and C. Stringer, eds. The human revolution: behavioural and biological perspectives on the origins of modern humans. Princeton University Press, Princeton, N.J.
Zhivotovsky, L. A., and M. W. Feldman. 1995. Microsatellite variability and genetic distances. Proc. Natl. Acad. Sci. USA 92:11549–11552.
Zhivotovsky, L. A., M. W. Feldman, and S. A. Grishechkin. 1997. Biased mutations and microsatellite variation. Mol. Biol. Evol. 14:926–933.