## Abstract

A number of statistical tests for detecting population growth are described. We compared the statistical power of these tests with that of others available in the literature. The tests evaluated fall into three categories: those based on the distribution of the mutation frequencies, on the haplotype distribution, and on the mismatch distribution. We found that, for an extensive variety of cases, the most powerful tests for detecting population growth are Fu's *F*_{S} test and the newly developed *R*_{2} test. The behavior of the *R*_{2} test is superior for small sample sizes, whereas *F*_{S} is better for large sample sizes. We also show that some popular statistics based on the mismatch distribution are very conservative.

## Introduction

Comparison of DNA sequences within and between species is a powerful approach not only for determining the evolutionary forces acting in specific gene regions but also for determining relevant aspects of the evolutionary history of the species (for reviews, see Takahata 1996; Rogers 1997; Harpending et al. 1998; Jorde, Bamshad, and Rogers 1998; Cann 2001). The coalescent theory (Kingman 1982*a*, 1982*b*; Hudson 1990; Donnelly and Tavaré 1995; Fu and Li 1999) is the most powerful theoretical approach for interpreting DNA sequence data. The coalescent is a population genetic model focused primarily on the neutral evolution of gene trees; this model provides the framework for the development of statistical tests and also provides very efficient computer simulation methods.

Tajima (1989*b*), Slatkin and Hudson (1991), and Rogers and Harpending (1992) pioneered the study of the effect of some demographic events on DNA sequence data. They have shown that a relatively recent demographic event, such as population growth, causes most of the coalescent events to occur before the expansion and, consequently, samples from these populations have gene genealogies stretched near the external nodes and compressed near the root (i.e., star genealogies). Thus, population size changes can leave a particular footprint that may eventually be detected in DNA sequence data. This theoretical framework prompted the development of statistical tests for detecting population expansion.

The analysis of the distribution of pairwise differences, or mismatch distribution (Slatkin and Hudson 1991; Rogers and Harpending 1992), provides a method for inferring such demographic events. These authors have shown that, for nonrecombining DNA regions, constant size populations present mismatch distributions whose shapes bear very little resemblance to those expected in growing populations. This prompted the development of some statistical tests for detecting expansion processes (Harpending et al. 1993; Harpending 1994; Eller and Harpending 1996; Rogers et al. 1996). One of the most frequently used tests is the raggedness statistic *rg* (Harpending et al. 1993). Although the distribution of the *rg* statistic is unknown, its confidence intervals can be obtained by computer simulations based on the coalescent algorithm. But because methods based on the mismatch distribution use little of the information accumulated in the data (Felsenstein 1992), tests based on the mismatch distribution should be very conservative.

In recent years, a number of authors have developed several methods of statistical inference and statistical tests using different approaches (e.g., Griffiths and Tavaré 1994; Bertorelle and Slatkin 1995; Rogers 1995; Aris-Brosou and Excoffier 1996; Fu 1996, 1997; Kuhner, Yamato, and Felsenstein 1998; Weiss and Von Haeseler 1998; Galtier, Depaulis, and Barton 2000; Furlong and Brookfield 2001). More recently, specific methods for detecting population expansions have also been developed for the analysis of microsatellite data (e.g., Kimmel et al. 1998; Reich and Goldstein 1998; Beaumont 1999; Reich, Feldman, and Goldstein 1999; King, Kimmel, and Chakraborty 2000).

Here we report the development of new statistical tests for detecting past population growth. We performed an extensive analysis of their statistical power against different alternative hypotheses, and we compared their relative performance with respect to others published in the literature. Although some authors (Braverman et al. 1995; Simonsen, Churchill, and Aquadro 1995; Fu 1996, 1997) have also investigated the power of some statistical tests against population growth and genetic hitchhiking (which leave similar footprints in DNA sequences), at present there is no exhaustive comparative analysis. The major population growth model investigated was the sudden (instantaneous) growth, although we also studied the power under the logistic model of population growth. The power of these tests was evaluated using random data sets generated by computer simulations based on the coalescent (Hudson 1990).

## Materials and Methods

We analyzed the performance of 17 statistical tests to distinguish specific models of population growth from the null hypothesis of a constant size population under the neutral model. Thus, we determined the power of these tests to reject the null hypothesis when the alternative hypothesis is really true. On the basis of the sequence information used, the test statistics evaluated have been classified into three major classes, namely classes I, II and III (see below). We developed several new statistical tests based on high-order moments (within classes I and III) because the distortion of the gene tree caused by the population growth would suggest that these types of tests could be more powerful than other tests available in the literature.

### Class I Statistics

Class I statistics use information on the mutation (segregating site) frequency. These statistics could be appropriate to distinguish population growth from constant size populations because the former generates an excess of mutations in external branches of the genealogy (i.e., recent mutations) and therefore an excess of singletons (substitutions present in only one sampled sequence) (Tajima 1989*a*, 1989*b*; Slatkin and Hudson 1991).

We studied the following test statistics: Tajima's *D*, and Fu and Li's *D**, *F**, *D* (named *D*^{F}) and *F* statistics (Tajima 1989*a*; Fu and Li 1993; see also Simonsen, Churchill, and Aquadro 1995). These tests are based on the difference between two alternative estimates of the mutational parameter 𝛉 = 2*Nu*, where *N* is the effective number of gene copies in the population (the number of females in the population for mtDNA regions or double the population size for an autosomal region) and *u* is the mutation rate. Tajima's *D* and Fu and Li's *D** and *F** statistics use information from only intraspecific data, whereas Fu and Li's *D*^{F} and *F* statistics use information from the number of recent mutations; the latter, therefore, require the presence of an outgroup to be computed.
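As an illustration of this class of statistics, Tajima's *D* contrasts two estimates of 𝛉, one based on the average number of pairwise differences (*k*) and one based on *S*. The following Python sketch uses the standard constants of Tajima (1989*a*); the function name and the representation of the sample as equal-length strings are assumptions made only for this example, and a site is simply counted as segregating when more than one state is observed.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of equal-length sequences (illustrative sketch)."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating sites (columns showing more than one state).
    S = sum(1 for j in range(length) if len({s[j] for s in seqs}) > 1)
    if S == 0:
        raise ValueError("D is undefined when there are no segregating sites")
    # k: average number of nucleotide differences between two sequences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants of Tajima (1989a).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimates of theta, divided by its
    # estimated standard deviation.
    return (k - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))
```

A sample in which every mutation is a singleton, as expected after a recent expansion, gives a negative value, whereas mutations at intermediate frequencies give a positive one.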

We developed a number of tests based on the difference between the number of singleton mutations and the average number of nucleotide differences. The *R*_{2} statistic is defined as

$$R_2 = \frac{\left( \sum_{i=1}^{n} \left( U_i - \frac{k}{2} \right)^2 / n \right)^{1/2}}{S} \qquad (1)$$

where *n* is the sample size, *S* the total number of segregating sites, *k* the average number of nucleotide differences between two sequences, and *U*_{i} the number of singleton mutations in sequence *i*. The rationale of this test is that the expected number of singletons on a genealogy branch after a recent severe population growth event is *k*/2; consequently, lower values of *R*_{2} are expected under this demographic scenario. The *R*_{2} statistic will be computed in the next version of the DnaSP (Rozas and Rozas 1999) software.
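The computation of *R*_{2} can be sketched in a few lines of Python. This is a minimal illustration, not the DnaSP implementation: the function name and the string representation of the sample are assumptions, and, because no outgroup is used, a singleton is taken here to be a variant carried by exactly one sampled sequence.

```python
import math
from itertools import combinations


def r2_statistic(seqs):
    """R2 for a list of equal-length sequences (illustrative sketch)."""
    n = len(seqs)
    length = len(seqs[0])
    U = [0] * n  # U[i]: number of singleton mutations in sequence i
    S = 0        # total number of segregating sites
    for j in range(length):
        column = [s[j] for s in seqs]
        states = set(column)
        if len(states) < 2:
            continue  # monomorphic site
        S += 1
        for state in states:
            # A state seen in exactly one sequence is a singleton of that
            # sequence (for n = 2 both sequences are credited).
            if column.count(state) == 1:
                U[column.index(state)] += 1
    if S == 0:
        raise ValueError("R2 is undefined when there are no segregating sites")
    # k: average number of nucleotide differences between two sequences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    return math.sqrt(sum((u - k / 2.0) ** 2 for u in U) / n) / S
```

For a perfectly star-like sample in which each sequence carries one private mutation, *U*_{i} = *k*/2 for every sequence and the statistic is 0, its minimum; mutations shared at intermediate frequency give larger values.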

We also built two *R*_{2}-related tests, namely *R*_{3} and *R*_{4}. These statistics differ from the *R*_{2} test in the exponent values; in *R*_{3} and *R*_{4}, the exponent values of 2 and 1/2 (eq. 1) are replaced by 3 and 1/3, and by 4 and 1/4, respectively.

We have constructed three additional test statistics (*R*_{2E}, *R*_{3E}, and *R*_{4E}) that use information on the number of mutations in external branches; thus, an outgroup will be required for their estimation. The *R*_{2E} test is defined as

$$R_{2E} = \frac{\left( \sum_{i=1}^{n} \left( V_i - \frac{k}{2} \right)^2 / n \right)^{1/2}}{S} \qquad (2)$$

where *V*_{i} is the number of external mutations in sequence *i*. The *R*_{3E} and *R*_{4E} tests differ from the *R*_{2E} test in the exponent values; the exponent values of 2 and 1/2 of equation (2) are replaced by 3 and 1/3, and by 4 and 1/4, respectively.

We have also developed two other tests (*Ch* and *Che*) based on the difference between the number of singleton (and also of recent) mutations and their expected value; in the definition of *Ch*, *U* is the total number of singleton mutations. The *Che* test is constructed in the same way but by using information on the external mutations.

### Class II Statistics

In class II, we include statistical tests that use information from the haplotype distribution. We have only studied Fu's *F*_{S} test statistic (Fu 1997) within this class. This statistic, which is based on Ewens' sampling distribution (Ewens 1972), takes low values when there is an excess of singleton mutations caused by the expansion.

### Class III Statistics

Class III statistical tests use information from the distribution of the pairwise sequence differences (or mismatch distribution). It has been shown that population expansions leave a particular signature in the distribution of the pairwise sequence differences (Slatkin and Hudson 1991; Rogers and Harpending 1992); therefore, statistics based on the mismatch distribution can be used to test for demographic events. We evaluated the following statistics. (1) The raggedness *rg* statistic (Harpending et al. 1993; Harpending 1994). The raggedness statistic, which measures the smoothness of the mismatch distribution, differs between constant size and growing populations: lower *rg* values are expected under the population growth model. (2) The mean absolute error (*MAE*) between the observed and the theoretical mismatch distribution (Rogers et al. 1996). (3) We also developed a new statistical test, the *ku* test, based on the fourth central moment (i.e., on the kurtosis) of the mismatch distribution. Given that population expansion generates more smoothly peaked distributions, this statistic can distinguish between constant size and growing populations. Let *d*, *n*_{c}, and *W*_{i} be the maximum number of differences in the mismatch distribution, the number of pairwise comparisons (= *n*(*n* − 1)/2), and the frequency of pairs of DNA sequences that differ by *i* mutations, respectively; the *ku* statistic is defined in terms of these quantities.
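The quantities *d*, *n*_{c}, and *W*_{i} can be obtained directly from a sample, as in the following Python sketch. The function names are hypothetical, and the raggedness is written in the form in which it is commonly given for Harpending's statistic, the sum of squared differences between the relative frequencies of neighboring mismatch classes; this form is an assumption of the example, not a formula quoted from the present text. The *ku* and *MAE* statistics are not implemented.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequencies W[0..d] of pairs differing by i mutations."""
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)]
    n_c = len(diffs)   # number of pairwise comparisons, n(n - 1)/2
    d = max(diffs)     # maximum number of differences observed
    counts = Counter(diffs)
    return [counts.get(i, 0) / n_c for i in range(d + 1)]


def raggedness(W):
    """Sum over i = 1..d+1 of (W[i] - W[i-1])**2, with W[d+1] taken as 0."""
    x = list(W) + [0.0]
    return sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))
```

For example, `mismatch_distribution(["11", "11", "00", "00"])` contains two pairs with no differences and four pairs with two differences, giving the frequencies 1/3, 0, and 2/3.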

### Empirical Distributions

We obtained the empirical distribution of each statistical test by Monte Carlo simulations based on the coalescent process for a neutral infinite-sites model, assuming a large population size (Kingman 1982*a*, 1982*b*; Hudson 1990). We also assumed that there is neither intragenic recombination nor migration and that the mutation rate is homogeneous across the DNA region. We performed the simulations conditional on the number of segregating sites (*S*); that is, randomly placing *S* mutations along the tree (the so-called fixed *S* method). Given that the actual value of 𝛉 is usually unknown, this method seems to be appropriate for testing purposes (Hudson 1993). The routine *ran1* (Press et al. 1992) was used as a random number generator. We conducted coalescent simulations for constant population size (null hypothesis) and for population growth (alternative hypothesis); the empirical distribution was estimated from 100,000 computer replicates for both the null and the alternative hypotheses.
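The fixed *S* procedure for the constant size model can be sketched as follows. This is a minimal illustration under stated assumptions rather than the program used for the analyses: the function name is hypothetical, Python's standard `random` module replaces *ran1*, time is scaled so that *j* lineages coalesce at rate *j*(*j* − 1)/2, and each mutation is assigned to a branch with probability proportional to the branch length.

```python
import random


def simulate_fixed_s(n, S, rng):
    """One replicate: n binary sequences with exactly S segregating sites."""
    # Each active lineage is (set of sampled sequences below it, start time).
    lineages = [(frozenset([i]), 0.0) for i in range(n)]
    branches = []  # (set of descendant sequences, branch length)
    t = 0.0
    while len(lineages) > 1:
        j = len(lineages)
        # Waiting time to the next coalescence among j lineages.
        t += rng.expovariate(j * (j - 1) / 2.0)
        a, b = rng.sample(range(j), 2)
        (set_a, start_a), (set_b, start_b) = lineages[a], lineages[b]
        branches.append((set_a, t - start_a))
        branches.append((set_b, t - start_b))
        lineages = [l for idx, l in enumerate(lineages) if idx not in (a, b)]
        lineages.append((set_a | set_b, t))
    # Place the S mutations on branches, proportionally to branch length.
    leaf_sets = [leaves for leaves, _ in branches]
    lengths = [length for _, length in branches]
    chosen = rng.choices(leaf_sets, weights=lengths, k=S)
    # A mutation is carried by every sequence descending from its branch.
    return ["".join("1" if i in leaves else "0" for leaves in chosen)
            for i in range(n)]
```

Because the lineage that reaches the root is never stored as a branch, every mutation is carried by at least one and at most *n* − 1 sequences, so each of the *S* sites is polymorphic in the sample.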

For the constant size model (null hypothesis), the samples were generated using conventional procedures (Hudson 1990); in this model only two parameters are required: the sample size and the number of segregating sites. The sudden population growth model (Rogers and Harpending 1992) considers a population that was formerly at equilibrium, but *t*_{e} generations before the present the population grew suddenly to the current size. Coalescent simulations under the sudden expansion model require four parameters: *n*, *S*, *t*_{e}, and *D*_{e}, where *D*_{e}, the degree of the expansion event, is

$$D_e = \frac{N_{max}}{N_0}$$

where *N*_{max} is the maximum population size (i.e., the current population size under the sudden expansion model) and *N*_{0} is the initial population size. For the simulations the *t*_{e} values were scaled in terms of *N*_{max} generations (denoted by *T*_{e}). Coalescent simulations under the sudden expansion model were performed by changing the times of the nodes from *T* to *T*_{1}, where *T* and *T*_{1} are the coalescence times (measured in *N*_{max} generations) under the constant size (i.e., the standard coalescent) and under the sudden expansion models, respectively (see Nordborg 2001). Under the latter scenario, we generated samples using an extensive set of values of the parameter space.
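The change of node times can be illustrated with a short Python sketch. The rescaling below is derived here from the piecewise-constant population size of the sudden expansion model and is offered as an assumption, not as a transcription of the published expression: more recently than *T*_{e} the population has size *N*_{max} and times are unchanged, whereas before the expansion the population is *D*_{e} times smaller, so the remaining time is compressed by a factor *D*_{e}. The function name is hypothetical.

```python
def rescale_sudden_expansion(T, T_e, D_e):
    """Map a node time T of the standard coalescent to the time T1 under a
    sudden expansion that occurred T_e time units ago (N_max generations)."""
    if T <= T_e:
        # The node is more recent than the expansion: N = N_max, no change.
        return T
    # Older than the expansion: N = N_max / D_e, so lineages coalesce
    # D_e times faster and the excess time shrinks accordingly.
    return T_e + (T - T_e) / D_e
```

With *T*_{e} = 0.1 and *D*_{e} = 10, for instance, a node at *T* = 1.1 is moved to *T*_{1} = 0.2, which compresses the deep part of the genealogy and produces the star-like shape expected after an expansion.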

We also conducted some coalescent simulations assuming that the population follows the logistic model of growth (eq. 6). In this model, *N*_{max} is the maximum population size, *N*_{T} the population size at time *T* (the time is measured in *N*_{max} generations), *r* the growth rate, *T*_{S} the elapsed time (measured in *N*_{max} generations) from the beginning of the growth event, and *c* represents the reflection point of the growth curve (see eq. 16 in Fu 1997). It should be noted that under this model the current population size could be equal to or lower than *N*_{max}.

Coalescent simulations under the logistic model of growth were generated by changing the times of the nodes according to the population size. In this transformation, *T* and *T*_{1} are the coalescence times (measured in *N*_{max} generations) of each node under the constant size and under the demographic models, respectively, *N*_{c} the current population size (i.e., at *T* = 0), which can be obtained from equation 6, and *N*(*x*) the population size at time *x* (eq. 6). Therefore, we will compare two empirical distributions (under the null and the alternative hypotheses) with the same population size (*N*_{c}) at the sampling time.

### Critical Values and the Power of the Tests

We determined the critical values of each statistical test from its empirical distribution. The power of each test, or the probability of rejecting the null hypothesis (constant size population) when the alternative hypothesis (population growth) is true, was estimated as the proportion of computer replicates generated under the alternative hypothesis for which the null hypothesis was rejected. For the analysis, we fixed a significance level of α = 0.05. Because the critical region for all alternative hypotheses would consist of only one side of the distribution, we conducted one-tailed tests. Specifically, all analyzed statistics except *Ch*, *Che*, and *ku* had lower values under the population growth model.

Given that under the null hypothesis the empirical distribution of some statistics presented a reduced number of points (e.g., the distribution of *D** statistic; see *Results*), the actual probability of rejecting the null hypothesis when it is true (i.e., the size of the test) could be lower than the nominal significance level of 0.05.
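The estimation of critical values, power, and actual test size from simulated replicates can be sketched as follows for a statistic that takes lower values under growth. The function names are hypothetical and the sketch only illustrates the logic described above; for statistics with higher values under growth (*Ch*, *Che*, and *ku*) the same functions can be applied to the negated values.

```python
from bisect import bisect_right


def lower_tail_critical_value(null_values, alpha=0.05):
    """Largest observed value c with P(statistic <= c | H0) <= alpha.

    Returns None when even the smallest value exceeds alpha, which can
    happen when the null distribution has few distinct points.
    """
    ordered = sorted(null_values)
    critical = None
    for value in sorted(set(ordered)):
        # Proportion of null replicates at or below this value.
        if bisect_right(ordered, value) / len(ordered) <= alpha:
            critical = value
        else:
            break
    return critical


def rejection_rate(values, critical):
    """Proportion of replicates falling in the lower-tail rejection region.

    Applied to replicates under H1 this estimates the power; applied to
    replicates under H0 it gives the actual size of the test.
    """
    if critical is None:
        return 0.0
    return sum(v <= critical for v in values) / len(values)
```

With a discrete null distribution in which 2% of the replicates take the lowest value and all others share the next one, the critical value is the lowest value and the actual size of the test is 0.02 rather than the nominal 0.05.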

## Results

### Sudden Population Growth Model

We studied the power of 17 statistical tests under different values of *n*, *S*, *D*_{e}, and *T*_{e}. Although we have examined the power for a wide range of the parameter space, we will show only the most relevant cases (additional results and figures are available from the authors). The parameters fixed for illustrating the power were *n* = 10 and *n* = 50, *S* = 10 and *S* = 50, *D*_{e} = 10 and *D*_{e} = 100, and *T*_{e} = 0.1 (time for the maximum power; see below). These values give a clear view of the statistical power under some realistic cases: for small and big sample sizes, for a low and high number of mutations and for reasonable population growth parameters. In all cases, the parameter sets were chosen to avoid saturation of the power curves.

The tests *R*_{3}, *R*_{4}, *R*_{3E}, and *R*_{4E} show a power similar to that of *R*_{2} and will not be presented here. Nevertheless, for some specific sets of parameters the *R*_{4} and *R*_{4E} tests presented a slightly higher power than *R*_{2}. Generally, the statistical tests that use interspecific data presented a power similar to that of their equivalent statistics using intraspecific information (figures not shown).

Figure 1 shows the effect of *T*_{e}, the time elapsed since the expansion event, on the statistical power of different statistical tests. It can be observed that *R*_{2} and Fu's *F*_{S} are the most powerful tests: the *R*_{2} test is the most powerful for small sample sizes, whereas the behavior of Fu's *F*_{S} is better for large samples. The power of Tajima's *D* and Fu and Li's *F** is lower than that of *R*_{2} and *F*_{S}. The results also indicate that some commonly used tests based on the mismatch distribution, *rg* and *MAE*, are among the least powerful. All statistical tests show a peak in the statistical power at intermediate values of *T*_{e} (*T*_{e} ∼ 0.1); thus, a population expansion is unlikely to be detected when *T*_{e} is too small or too large. This result agrees with that obtained by Simonsen, Churchill, and Aquadro (1995) and Fu (1997).

The results of the effect of *D*_{e} on the power to reject the constant size model are depicted in figure 2. All statistical tests, except those of class III, increase the power to reject the constant size model with increasing *D*_{e}; therefore, large samples will be needed to detect small population growth events. Again, tests based on the mismatch distribution are very insensitive in detecting population growth. The most powerful statistics are *R*_{2} and *F*_{S}. Tajima's *D*, Fu and Li's *F** and *D**, and *Ch* have comparatively less power.

Figures 3 and 4 show the effect of the sample size and the number of segregating sites on the power to reject the neutral constant size model under specific alternative hypotheses. As expected, both variables have a major effect on the statistical power: the larger the values of *n* or *S*, the greater the power of the tests. But the effect on the power is different for different statistics: for small sample sizes (and a small number of segregating sites) the *R*_{2} statistical test is the most powerful (figs. 3*A* and 4*A*), whereas for larger sample sizes *F*_{S} is the most powerful one. Moreover, for small sample sizes the power of *D*^{F} and *F* is better than that of the counterpart tests without outgroup, although they are not as powerful as *R*_{2} and *F*_{S} (figure not shown). The results also indicate that the statistical tests based on the mismatch distribution, *rg* and *MAE*, are among the least powerful. In fact, in some cases, their power decreases as the sample size increases.

It should be noted that the statistics *D** (figs. 3*A* and 4*B*) and *F*_{S} (fig. 4*A*) have an irregular behavior because they show some atypical power drops with increasing sample size or number of segregating sites. This unexpected pattern has two different explanations. In the case of Fu and Li's *D** statistic, the power drop is caused by a marked decrease in the actual significance level (i.e., the size of the test). In fact, the *D** empirical distribution has a reduced number of possible points causing, for some specific values, this level to drop to 0.02. The atypical pattern of the *F*_{S} test is due to the intrinsic structure of the statistic. In fact, the empirical distribution of *F*_{S} (under both the *H*_{0} and *H*_{1} hypotheses) presents pronounced changes at specific ranges of values. That pattern causes marked changes in the power when these values are within the rejection region (results not shown). Nevertheless, this irregular behavior is not present in coalescent simulations conditional on the value of 𝛉 (results not shown).

### Logistic Population Growth Model

We also conducted the analysis of power under a more realistic population growth scenario, the logistic population growth model. Using this model, we performed an explorative analysis of the most relevant cases to validate the conclusions of our work. We found that the assumption of the logistic population growth model does not change the major conclusions of the work. Even so, in comparison with the sudden growth model the maximum power of the tests is reached at higher values of the elapsed time; for instance, for the parameter sets used in Fu (1997) (*r* = 10, *c* = 1) the maximum power is at *T*_{S} ∼ 1.2. In general, as expected, (1) all statistical tests have less power under the logistic than under the sudden growth model, although the decrease in the power is relatively uniform for all statistical tests, and (2) the larger the value of *r*, the more power the tests have.

### Application to DNA Sequence Data

The present results have been applied to two published DNA data sets: the mtDNA variation analysis of a Turkish human population (Comas et al. 1996), and the survey of a human noncoding autosomal region (Alonso and Armour 2001). Comas et al. (1996) sequenced 360 base pairs of region I of the mtDNA D-loop in 45 individuals. From the mismatch distribution analysis the authors suggested that the Turkish population had expanded recently. We determined the power of the different tests to identify which is most powerful against population growth. For the total data (*n* = 45; *S* = 56) and considering that *D*_{e} = 100 and *T*_{e} = 0.4 (scaled in terms of *N* generations) most tests were powerful enough, and several of them could reject the null hypothesis of constant size. We also determined whether the tests could reject the null hypothesis for small sample sizes. For that, we reanalyzed a subset of 10 randomly chosen sequences from the data of Comas et al. (1996). Table 1 shows the estimates of the power and of the *P* values of some statistical tests. The results clearly illustrate that the constant size hypothesis can be rejected by the most powerful tests (Fu's *F*_{S} and *R*_{2}).

We also compared the *P* values and the power of *R*_{2} and some of the statistical tests used in Alonso and Armour (2001). These authors performed a nucleotide variation study in 100 chromosomes sampled from different African and Euroasiatic populations. Although the surveyed region is autosomal, the Alonso and Armour (2001) results suggested that recombination should be reduced. We analyzed the Japanese population (*n* = 20; *S* = 5) using the same values of the recombination parameter *R* (*R* = 2*Nρ*, where *ρ* is the recombination rate per generation) as the published ones; for that analysis we used Hudson's (1983) algorithm to generate DNA samples under the coalescent with recombination (results based on 10,000 replicates). For the power analysis we consider that *D*_{e} = 100 and *T*_{e} = 0.1. For *R* = 0 (no recombination) only the *F*_{S} test can reject the null hypothesis of constant size. But for increasing recombination values the power of the *R*_{2} and Tajima's *D* tests increases, whereas it decreases for *F*_{S} and *rg*. In fact, for *R* = 10 only *R*_{2} allows the null hypothesis to be rejected.

## Discussion

In this article, we have examined the power of several statistical tests to determine which are most powerful in different population growth scenarios. The analysis has been performed by using a coalescent-based approach. There are other alternative approaches (likelihood-based methods) to study a population expansion process: the maximum likelihood (e.g., Griffiths and Tavaré 1994; Kuhner, Yamato, and Felsenstein 1998; Weiss and Von Haeseler 1998) and the Bayesian approaches (see Stephens 2001). The likelihood provides a framework for testing hypotheses; specifically, tests based on the likelihood ratio test statistic, δ = −2 ln (*L*_{0}/*L*_{1}), where *L*_{0} and *L*_{1} are the maximum likelihood values under the null and the alternative hypothesis, can be used to discriminate between constant size and population growth. Unfortunately, the standard χ^{2} approximation for the distribution of δ might be inadequate. The empirical distribution of δ could be generated, however, by computer simulation and from that distribution the critical values could also be obtained; nevertheless, this method is computationally very intensive.
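As a small illustration of the simulation-based alternative to the χ^{2} approximation, the following Python sketch computes δ from two maximized log-likelihoods and an empirical *P* value from δ values simulated under the null hypothesis. The function names and the inputs are hypothetical; obtaining the maximized likelihoods themselves is the computationally intensive step and is not shown.

```python
def likelihood_ratio_statistic(log_l0, log_l1):
    """delta = -2 ln(L0 / L1), computed from the two log-likelihoods."""
    return 2.0 * (log_l1 - log_l0)


def empirical_p_value(observed_delta, null_deltas):
    """Proportion of null-simulated delta values at least as large as the
    observed one (large delta favours the alternative hypothesis)."""
    return sum(d >= observed_delta for d in null_deltas) / len(null_deltas)
```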

We have shown that tests based on the mismatch distribution have little power against population growth. The *MAE* test is the least powerful one; although *rg* is more powerful than *MAE*, it works less well than nearly all class I and class II tests examined. The newly developed class III test, *ku*, although better than *MAE* and *rg*, is clearly inferior to the class I and class II tests.

On the other hand, several class I and class II tests can detect population expansion even for small *D*_{e} values. We have shown that two of the surveyed tests (*R*_{2} and *F*_{S}) are the most powerful for a variety of different conditions. These tests should therefore be chosen to test constant population size versus population growth. In particular, we suggest using the *R*_{2} statistical test for small sample sizes and *F*_{S} for large ones. Nevertheless, because the *R*_{2} and *F*_{S} statistics use different kinds of information, discrepancies between these tests could provide information about the action of other evolutionary processes, for example intragenic recombination (see below).

Fu (1997) studied the power of some statistics under the logistic model of population growth. He conducted coalescent simulations fixing theta (𝛉 = 5, 𝛉 = 10) instead of fixing *S*. To check the behavior of *R*_{2} and of the mismatch-based statistics under these conditions, we performed some additional simulations conditional on 𝛉. We found that *R*_{2} and *F*_{S} are again the most powerful statistics (see an example in fig. 5). Interestingly, *rg* and *MAE* perform better when fixing 𝛉 than when fixing *S*.

### Intragenic Recombination

The results from the present analysis are appropriate for nonrecombining DNA regions (i.e., mitochondrial or Y-chromosomal DNA regions). It is expected, however, that intragenic recombination substantially affects the power of the statistical tests surveyed (Rozas et al. 1999; Wall 1999). Indeed, a loss of power is expected for those tests based on the haplotype distribution (class II tests; e.g., Fu's *F*_{S} test) or on the mismatch distribution (class III tests; e.g., the *rg* test). The reason is that recombination, by shuffling nucleotide variation among DNA sequences, (1) increases the number of haplotypes and (2) generates a much smoother (Poisson-like) mismatch distribution. Consequently, class II and class III tests could be inadequate in detecting the signature left by population growth on a recombining DNA region. Class I tests, on the contrary, should be less sensitive to intragenic recombination. To check our prediction, we conducted a few coalescent simulations using different values of the recombination parameter. Our preliminary results comparing the power of the *R*_{2} and *F*_{S} tests show that the behavior of the former is better than that of *F*_{S} for increasing levels of recombination (also see table 1).

### Coalescent Simulations Conditional on the Number of Segregating Sites

The present power analyses have been performed conducting coalescent simulations conditional on the number of segregating sites. Given that the actual value of 𝛉 is usually unknown, and that estimates of 𝛉 are usually obtained from DNA polymorphism data, the method seems to be appropriate (Hudson 1993; Depaulis, Mousset, and Veuille 2001; Wall and Hudson 2001). But Markovtsova, Marjoram, and Tavaré (2001) pointed out correctly that the power of coalescent-based tests is not independent of 𝛉 and, therefore, the statistical power might vary as a function of 𝛉 for a given *n* and *S*. To check that effect on *R*_{2} we performed a prospective analysis generating samples conditional on 𝛉 and *S* using the rejection algorithm of Tavaré et al. (1997). The results yield the same conclusions as those of Depaulis, Mousset, and Veuille (2001) and Wall and Hudson (2001), i.e., the fixed *S* method seems to be appropriate unless the actual value of 𝛉 is far from Watterson's (1975) estimate of 𝛉.
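For reference, Watterson's estimate of 𝛉 is the number of segregating sites divided by the harmonic sum 1 + 1/2 + ... + 1/(*n* − 1). A minimal Python sketch follows; the function name is hypothetical.

```python
def watterson_theta(S, n):
    """Watterson's estimate of theta from S segregating sites in n sequences."""
    a_n = sum(1.0 / i for i in range(1, n))
    return S / a_n
```

For the two sample sizes used to illustrate the power (*n* = 10 and *n* = 50), `watterson_theta(10, 10)` and `watterson_theta(50, 50)` give the values of 𝛉 around which the fixed *S* simulations are expected to behave well.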

### Competitive Alternative Hypotheses

It should be stressed that a significant result (a significant departure from the null hypothesis) should be interpreted cautiously: there are several putative alternative hypotheses to a single null hypothesis. Indeed, processes other than population expansion, such as genetic hitchhiking (Maynard Smith and Haigh 1974), could also produce similar genealogies (i.e., departures of the statistical tests in the same direction). Therefore, additional analyses could be necessary to discriminate between some competitive alternative hypotheses. For instance, because genetic hitchhiking in regions undergoing recombination will affect a relatively small fraction of the genome (close to the advantageous mutation), surveys at different gene regions across the genome could provide the opportunity to discriminate between population expansion and genetic hitchhiking (see Galtier, Depaulis, and Barton 2000).

To summarize, *F*_{S} and *R*_{2} are the best statistical tests for detecting population growth. The behavior of *R*_{2} is better for small sample sizes, whereas *F*_{S} is better for bigger sample sizes. Additionally, preliminary results also indicate that the behavior of *R*_{2} should be superior when intragenic recombination is considered. On the other hand, some popular statistics based on the mismatch distribution, *rg* and *MAE*, are very conservative.

Wolfgang Stephan, Reviewing Editor

Keywords: population growth, population expansion, coalescent simulations, neutrality tests

Address for correspondence and reprints: Julio Rozas, Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Diagonal 645, E-08071 Barcelona, Spain. E-mail: julio@bio.ub.es

We thank M. Aguadé, A. Navarro, H. Quesada, and C. Segarra for critical comments on the manuscript. This work was supported by grants PB97-0918 from the Dirección General de Investigación Científica y Técnica, Spain and 1999SGR-25 from the Comissió Interdepartamental de Recerca i Tecnologia, Catalonia, Spain, conferred on M. Aguadé.
