Inference of Historical Population-Size Changes with Allele-Frequency Data

With up to millions of nearly neutral polymorphisms now being routinely sampled in population-genomic surveys, it is possible to estimate the site-frequency spectrum of such sites with high precision. Each frequency class reflects a mixture of potentially unique demographic histories, which can be revealed using theory for the probability distributions of the starting and ending points of branch segments over all possible coalescence trees. Such distributions are completely independent of past population history, which only influences the segment lengths, providing the basis for estimating average population sizes separating tree-wide coalescence events. The history of population-size change experienced by a sample of polymorphisms can then be dissected in a model-flexible fashion, and extension of this theory allows estimation of the mean and full distribution of long-term effective population sizes and ages of alleles of specific frequencies. Here, we outline the basic theory underlying the conceptual approach, develop and test an efficient statistical procedure for parameter estimation, and apply this to multiple population-genomic datasets for the microcrustacean Daphnia pulex.

Because polymorphisms with different allele frequencies arise at different average times in the past, information on the amount of variation associated with different allele-frequency classes in a population sample can provide insight into the history of population-size change. This is especially true for neutral variants, whose temporal dynamics depend only on stochastic sampling effects. This simple idea has led to the development of several technical and computationally demanding approaches for estimating historical changes in the sizes of populations, either from patterns of segregating variation at the single-nucleotide level or from information on linkage disequilibrium between nucleotide sites (Strimmer and Pybus 2001;Hayes et al. 2003;Tenesa et al. 2007;Gutenkunst et al. 2009;Li and Durbin 2011;Bhaskar et al. 2015;Liu and Fu 2015;Gattepaille et al. 2016;Weissman and Hallatschek 2017). All of these methods make numerous assumptions, some of which can be difficult to validate (e.g., the negligible influence of nonneutral sites), are almost certainly violated (e.g., linearity of the relationship between the recombination rate and physical distance betwen sites), and/or require information that is not available for most species (e.g., the identification of derived vs. ancestral alleles). Moreover, it remains to be seen whether simpler, more intuitive approaches might yield results that perform to a comparable (or even greater) degree of accuracy.
The approach taken here is conceptually straight-forward, the main biological assumptions being that the sites underlying the analysis have evolved in a neutral fashion for a considerable number of generations (roughly speaking, for at least four times the current effective population size, which is the expected coalescence time to common ancestry under current conditions), and that there be no substantial population structure. All aspects of the analysis are based on samples of the site-frequency spectrum (SFS) for such sites. Letting n be the number of sampled haploid genomes (typically twice the number of individuals in a sample from a diploid population), the number of polymorphic genomic sites with r copies of the derived allele is denoted G r , where r ¼ 1 to ðn 2 1Þ: The number of monomorphic sites is G 0 , and the SFS is defined as G r =G; where G is the total number of monomorphic and polymorphic sites evaluated across the genome. In the following, a mutation in class r will be referred to as rth order, with r ¼ 1 denoting singletons, r ¼ 2 doubletons, etc.
The methods that follow, which adhere to the theory utilized in the stairway method of Liu and Fu (2015), are based on three principles. First, the frequencies of sites residing within different classes are functions of the historical pattern of effective population size (N e )all other things being equal, increases in N e elevate the probability of an allele residing in a particular polymorphic state, but the relative frequency also depends on the sequence of N e experienced by all other allelic classes. Second, the SFS for neutral sites scales with the mutation rate per site per generation (u) (Kimura 1983), so if quantitative information on N e is desired, an estimate of u is required. Third, the frequency of a mutation provides information on its ageunder the process of neutral drift, the time for a new mutation to reach a frequency class is a monotonic function of the frequency, although there is considerable noise around the expectation.
The goal here is to use these principles to determine the long-term series of effective population sizes most compatible with the SFS, and given the measures of interval-specific N e to estimate the temporal history of past population-size changes experienced by segregating polymorphisms. We present analytical solutions for a broad set of genealogical features of a sample that are independent of the demographic history, and use this theory to develop estimators for the average age of single-nucleotide polymorphisms within each frequency class and the average N e experienced during their history. These results are worked out for the case of the unfolded SFS, and extended to the folded SFS, which summarizes the incidence of the minor-allele frequency classes, as investigators only rarely know which allele segregating at a locus is derived. A computationally efficient method for estimating such parameters is presented, validated with comparisons to computer-simulated data, and applied to large population-genomic data sets of Daphnia pulex.

THEORY
The Kingman (1982) coalescent provides the theoretical basis for all that follows. Under this view, members of a genealogy of n samples (assumed to be ( N e ) randomly coalesce each generation until the entire genealogy has congealed to one common ancestor at the base of the tree after the ðn 2 1Þth coalescence event. Although the number of possible tree topologies is enormous with large sample sizes, many of the summary features of the coalescent are known (Hein et al. 2005;Wakeley 2009).
Here, we are concerned with the average features of alleles within different frequency classes (r ¼ 1 to n 2 1) in the sample, which requires an understanding of the nature of the branch segments on which mutations of the different classes can reside. These probabilistic features can be summarized with a knowledge of P k;k2x ðrÞ, the probability that an allele (SNP, or single nucleotide polymorphism) with frequency r=n resides on an uninterrupted branch starting at level k and ending at level ðk 2 xÞ, where k ¼ n denotes the branch tips and k ¼ 1 denotes the base of the tree (Figure 1). For any class of mutations, the underlying branch segments can start as early as level ðn 2 r þ 1Þ (singleton branches always start at level n) and can end as deeply as level 1. This means that internal branches starting at level k can span up to k 2 1 possible coalescence events in the tree. Each coalescence event can potentially be associated with a unique effective population size.
A key point is that the P k;k2x ðrÞ coefficients are completely independent of the underlying demographic history, as the coordinates are simply denoted by the enumerated coalescence events, and are functions of only the sample size (n) and the allele class (r). Only the branch lengths are functions of the population size. As described below, the full set of coefficients (derived in the Appendix) provide the basis for analytical expressions for various useful statistical features of SNPs. Here, we adhere to the infinite-sites model, so that each new mutation arising on a genealogy is assumed to appear at a unique site, with the mutation rate to a novel SNP being equal to u per site per generation.
An interval-specific view of N e We start with the assumption that n sequences have been sampled randomly (from n=2 diploid individuals, or n haploids) at each of L nucleotide sites known to behave in an effectively neutral manner. Under neutrality, for a population with constant effective size, the expected number of sites occupying frequency class r in an unfolded site-frequency spectrum is where u ¼ 4N e u (for diploids, assumed here, or 2N e u for haploids), with u being the mutation rate per nucleotide site per generation (Watterson 1975;Fu 1995;Ewens 2004;Messer 2009). Here, we take a more refined approach by explicitly evaluating the way in which each SNP class reflects the historical series of coalescence events within a sample, averaging over all possible coalescent topologies for a sample of size n. Looking into the past, a tree composed of n sequences (branch tips) experiences ðn 2 1Þ coalescence events to the base, and to each level k we assign an effective population size N k , such that N n denotes the average effective size between the branch tips and the first coalescence event, N n21 represents the average in the interval between the first and second coalescence events, and so on. At level k, the expected time (in generations) to the next coalescence event into the past is (2) Figure 1 A genealogy comprising a sample size of n ¼ 8; with the k values denoting the ordered coalescence events (k ¼ 1 representing the root of the tree). The external branches, which can contain only singleton mutations, are given as solid black lines. There are three internal branches of order r ¼ 2 (upon which doubleton mutations must reside, denoted with large dashes); the red, green, and purple doubleton branches have starting and ending levels of (7,5), (6,5), and (4,2), respectively. The tree also contains single branches of orders r ¼ 3 (tripletons, medium dashes), 4 (quadrupletons, dot/dashes), and 5 (quintupletons, small dashes). The consecutive coalescence times are denoted by t k , with N k denoting the effective population size from level k to k 2 1.
generations. For example, starting with k ¼ n; t n ¼ 4N n =½nðn 2 1Þ denotes the expected number of generations until the first coalescence event in the sample; and the second coalescence event, which depends on N n21 ; is obtained by setting k ¼ n 2 1; i.e., t n21 ¼ 4N n21 = ½ðn 2 1Þðn 2 2Þ: Given knowledge of the expected internal features of the coalescent, for each SNP frequency class, the expected value of G r can be expressed as a function of the full set of relevant N k ; which determine the lengths of branches upon which mutations arise. This also requires expressions for the expected number of branches of order r at each relevant level of the coalescent, averaged over all possible random genealogies in a sample of size n. These are derived in the Appendix.
As an example, consider the simplest case of the singleton class. All singleton mutations must be present on external branches, which always start at level n but may end at any level in the genealogy from n 2 1 (the first coalescent) to 1 (extending to the base of the tree). The expected number of singletons in the sample is where S n;ðn2xÞ is the probability of a singleton branch (starting at level n) not having coalesced by level ðn 2 xÞ. This expression is equal to the sum of the product of the expected number of singleton branches surviving at each level and the length of the subsequent coalescence interval, all multiplied by the probability of a mutation arising per site per generation. Using the expression for S n;ðn2xÞ , Equation (A3) in the Appendix, the preceding expression simplifies to where NðnÞ is the arithmetic average of the interval-specific N i from the top (N n ) to the bottom (N 2 ) levels of the tree. This result applies regardless of the mode of population-size change, showing that an estimate of the arithmetic average N e across all coalescence events is provided by the incidence of singletons, i.e., as NðnÞ ¼ G 1 =ð4uLÞ, where G 1 denotes an estimate of the number of singletons.
Things are more complicated for the higher-order site-frequency classes because internal branches no longer initiate at the same levels. However, by extension from Equation (3b), one can infer that the probability of a mutation arising on a single branch starting at level k, allowing for variable ending points, is 4u NðkÞ=k; where NðkÞ is the arithmetic average of the interval-specific N i starting at level k and descending down to the base of the tree. From Equation (A7), the expected number of order-r branches starting at level k is ½ðr 2 1Þ=ðn 2 kÞ Á W k ðrÞ, where W k ðrÞ is a coefficient defining the expected number of segments of order r present at level k; given by Equation (A5). Summing these contributions over all levels, which can also be written as also obtained by Liu and Fu (2015). These expressions show that the expected frequencies of all mutation classes are defined by differentially weighted averages of the interval-specific N e . When r ¼ 1; Equation (4b) yields (3b), and with constant N e , it reduces to E½G r ¼ 4N e uL=r; in accordance with Equation (1); considerable simplification is also possible if many adjacent N k have the same values (see Supplemental Material). Before proceeding, recall that there are two forms of a site-frequency spectrum. The unfolded distribution, described above, requires information on the ancestral allelic states of each SNP site, ideally inferred from at least two suitably distant outgroup species (Keightley and Jackson 2018). Such a distribution is a summary of all sites having derived-allele frequencies 1=n to ðn 2 1Þ=n: If ancestral allelic states are unknown, as is often the case, one must work with the folded sitefrequency spectrum, which summarizes the minor-allele frequencies in classes 1=n to 1=2. The folded site-frequency spectrum, with 1 # r # n=2; is defined as Average age of a SNP Whereas the previous results are concerned with the demographic history of the population, an alternative viewpoint considers the average ages and demographic features of SNPs of various frequencies. Once the interval-specific estimates of N k are available, the statistical machinery developed in the Appendix can be used to infer both order-specific measures. There, we show that for an unfolded SFS the average age (in generations) of an rth-order SNP in terms of the historical effective population sizes is The expected second moment is expressed as so the variance of ages of SNPs can be obtained as VarðA r Þ ¼ E½A 2 r 2 E 2 ½A r , after substituting in the estimates for the N k : Although the preceding expressions apply to an unfolded SFS, where the designated alleles are known to be derived (by use of appropriate outgroup species for identifying ancestral allelic states), studies without such a luxury must rely on a folded SFS. In this case, each frequency class will be a mixture of derived and ancestral alleles with different average ages. For low-frequency alleles in large samples, almost the entire set of sampled SNPs will consist of derived alleles, and the preceding expressions can still be used to obtain reasonably precise estimates. This follows from Equation (1), which shows that the expected frequency of SNPs of order i is inversely proportional to i. Thus, for i ¼ r and n 2 r, the fractional contribution of the former to the folded distribution is of order ðn 2 rÞ=n provided the N e associated with the two classes are not greatly different (and larger than this if N e is larger for the younger alleles). For r ( n; almost all of the SNPs within folded class r will be derived alleles.
A more precise approach is to explicitly treat the frequencies of the folded distribution as mixtures of classes of derived alleles of order r and ancestral alleles of order n 2 r, with respective relative probabilities p d and p a ¼ 1 2 p d . The expected age of a SNP of order r in a folded SFS can then be written as The components of p d can be estimated by substitution of the estimates for the N k into Equation (4b), and E½A r and E½A n2r , and their expected squared values, are estimated by use of Equations (6a,b). The variance of A Ã r is then Average N e of a SNP For a population experiencing temporal changes in size, alleles of different order will generally experience different long-term effective population sizes from birth to the present. Letting the population size at time s in the past be NðsÞ, the expected average population size experienced by an allele of frequency r is If one has information on the ancestral states of alleles, and hence an unfolded site-frequency spectrum, the mean N e experienced by an allele of order r can be obtained from the theoretical results on the mean time spent in different intervals. Weighting of the intervalspecific durations by their associated N e values leads to see Appendix. With a folded site-frequency spectrum, the weighting approach used for the age of an allele in the preceding section can be applied using the definitions of p d and p a , as well as E½P r and E½P n2r as defined in Equation (8). Nonetheless, with large sample sizes, the proposed approach is still expected to yield fairly accurate information on the average N e of rare alleles. This again follows from Equation (1), which shows that the expected frequency of SNPs of order i is inversely proportional to i.

ESTIMATION PROCEDURE
The results summarized in Equation (4b) amount to a series of n 2 1 equations, each a function of the mutation rate, u, and one or more of the interval-specific effective population sizes, N i . Thus, in principle, working backward, one could apply the elements of the observed SFS to Equation (4b) to recursively estimate the full set of N i necessary to account for the data, i.e., solving a set of n 2 1 equations for n 2 1 unknowns. However, with large numbers of unknowns and imperfectly estimated G r , such an approach leads to aberrant results, including negative population-size estimates. Moreover, in the case of a folded site-frequency spectrum, the number of possible unknown population sizes exceeds the number of observed frequency classes. It then becomes necessary to pool adjacent population sizes so as to reduce the number of parameters to be estimated. Consistent with Liu and Fu (2015), we have adopted a stepwise procedure, implemented in a likelihood framework. Consider a sampled site-frequency spectrum given by G 1 ; . . . ; G n21 , where G 1 is the number of singletons, G 2 the number of doubletons, etc. With L sampled sites, the number of monomorphic sites is G 0 ¼ L 2 G 1 2 ⋯ 2 G n21 . For any set of intervalspecific population sizes, Equations (4b,5) give the expected frequencies of SNPs in the full set of classes. Using a composite-likelihood approach, i.e., assuming that the elements of the sampled SFS are all essentially independent and Poisson distributed with parameters equal to the frequency expectations times L, the likelihood function is given in Supplemental Material.
We have implemented the above procedure in the program epos (Estimating POpulation Sizes), which runs under the UNIX command line. The C sources of epos and tutorial-style documentation are available from github at https://github.com/EvolBioInf/epos. The starting point of the estimation procedure assumes a constant population size throughout the entire history of the sample. The maximum-likelihood estimator of N e is then equivalent to Watterson's (1975) estimator. The next most complicated model involves a single coalescent breakpoint k flanked by two different N e , such that N i is a constant N k for k , i # n, and N k21 for i # k. The formulae for the expected SFS then reduce to a three-parameter model, whose solution requires a search for the combination of k; N k , and N k21 that maximizes the composite likelihood of the observed SFS, which can be found by Newton-Raphson iteration. This procedure is then iterated in a stepwise fashion, with each iteration increasing the number of breakpoints by one, until the difference in adjacent likelihoods no longer improves beyond a critical value. To this end, we employ the Akaike Information Criterion (AIC), moving on to the next iteration provided the log-likelihood has increased by at least 2.0 in the preceding iteration. The end result is Figure 2 Results from the application of epos to the Yoruba SNP data set applied in Lapierre et al. (2017). The Yoruba data set was bootstrapped 10,000 times (average given by solid line) and the 5 and 95% quantiles (lower and upper dashed lines, respectively) computed from these replicates.
a stepwise plot of interval-specific N e estimates, with the breakpoints converted to time (in generations) using the interval-specific expected coalescent times given by Equation (2) and a mutation rate provided by the user.
Several additional features are built into epos. First, it is possible to analyze folded as well as unfolded SFSs. Second, the auxiliary program bootSfs (github.com/EvolBioInf/bootSfs) implements the bootstrap to estimate the sampling variance of the estimated demographic history. Third, it is possible to exclude classes from the SFS, if for example the singleton class is deemed unreliable owing to sequencing errors. Fourth, the user can specify the AIC stopping criterion. Fifth, all possible combinations of breakpoint locations can be re-evaluated at each iteration, as opposed to sequentially adding fixed breakpoints; to accomplish this, there are two versions of the function nextConfig in epos: a fast, greedy version, which adds one new level at a time and a slow, exhaustive version, which goes through all possible combinations of levels. This flexibility is provided because the the number of possible sets of breakpoints increases exponentially as the stepwise estimation procedure advances.
Based on the performance of the Stairway Plot algorithm of Liu and Fu (2015), as applied to a single human (Yoruba) population sample, Lapierre et al. (2017) have raised concerns about the use of modelflexible approaches to estimating historical demography, as opposed to using model-constrained approaches that pre-specify the form of population growth and breakpoints in demographic features. In this particular application, these authors showed that the Stairway Plot algorithm predicts a complex demographic history with multiple recent bottlenecks, with a poor least-squares fit to the observed SFS (with a weighted mean-squared distance of d 2 ¼ 2:9 · 10 23 ). In contrast, simpler pre-specified models (e.g., linear, exponential, and sudden) predicted consistent increases in population size to the present (all with d 2 in the range of 2:2 · 10 24 to 4:1 · 10 24 ). Application of epos to the same data set predicts an increase in population size from the deep past to the present, but with a short intervening population bottleneck $ 500; 000 years ago (Figure 2), and has a fivefold reduction of d 2 to 8:0 · 10 25 . The current N e ≃28; 000 predicted by epos is comparable to that obtained by other methods. Thus, contrary to the conclusions of Lapierre et al. (2017), these results suggest that constrained models are not inherently superior to flexible models, but simply that the quality of the results obtained in the latter context can be suboptimal if the algorithmic approach of Liu and Fu (2015) is applied.
We have further evaluated the utility of epos by fitting N e histories to various demographic scenarios by generating sample SFSs using the coalescent software of Kelleher et al. (2016) and Chen et al. (2009) in the analyses in Figures 3A-E and 3F, respectively. Comparison of our results to those obtained by the algorithms of Liu and Fu (2015) shows that epos performs as well and in some cases better than the Stairway method ( Figure 3). For each evaluated scenario, ten SFSs were Figure 3 Comparison of the estimation quality of epos (blue) to results from the Stairway Plot (SP, red) method of Liu and Fu (2015). In each panel, the means of 10 medians and 5 and 95% quantiles are shown (dotted and dashed lines, based on averages of 10 independently derived SFSs). The black lines denote the true underlying demography used to simulate the data. generated, and 2,000 bootstrap replications were used to find the mean and percentiles of the effective-population size estimates, except under the last scenario (F), where 200 bootstrap replications were used. Epos is at least 1,000 times faster than Liu and Fu's (2015) Stairway procedure. For example, epos and Stairway Plot v2 took 0:00:14 and 6:11:51, respectively, to analyze one site-frequency spectrum under the scenario in Figure 3a.
APPLICATION TO DAPHNIA POPULATION-GENOMIC DATA We applied epos to the SFSs from eight Daphnia pulex populations, an emerging model system in population genomics. A practical issue in any population-genomic study with moderate sequence coverage per site is that not all sites are scored in identical numbers of individuals. In this particular study, for each population 8 to 14 SFSs were available for sample sizes of 40,000 to 2,400,000 nucleotides (Table 1), with separate analyses performed for fourfold redundant silent sites in proteincoding genes and internal intron sites known to behave in a nearly neutral fashion (Lynch et al. 2017). Although the individuals used within these within-population analyses were largely overlapping, the sites employed were fully nonoverlapping. The numbers of individuals associated with each SFS range from 62 to 93.
This type of partitioning is required because the SFS theory involves discrete distributions, i.e., frequencies from different sample sizes should not be amalgamated into a single pooled SFS. However, such replication in analysis also provides some guard against sampling variance issues. For each of the samples, 10,000 bootstraps of the SFS were performed to generate a median demographic-history estimate, assuming a mutation rate of u ¼ 5:7 · 10 29 per site per generation (Keith et al. 2016). The final demographic-history estimates for each population are then given as the means of separate n■ Table 1 The site-frequency spectra analyzed in this study. Details on data acquisition can be found in Maruki et al. (2019) For each population, 8 to 14 SFSs were used, with a range of numbers of sampled individuals and nucleotides as noted in the text median estimates (Figure 4), and a further summary mean over all populations is given in Figure 5. Although there are significant differences among populations, these analyses suggest a fairly consistent demographic history among all populations (focusing on an order-of-magnitude time scale). From $ 10 5 to 3 · 10 6 generations in the past, N e was almost always in the range of 0.5 to 2:0 · 10 6 , with little evidence of dramatic changes. All populations exhibit evidence of a twofold or so decline in N e in the very recent past, followed by an interval of population-size expansion around 20,000 generations ago ( Figure 5). Assuming five to ten generations per year, these recent demographic shifts would represent post-Pleistocene changes, with the point of initiation of population-size expansion being 2,000 to 4,000 years ago (roughly corresponding to the Neopluvial, a period of wetter and cooler climate in North America). The ending points in the demographic profiles ($ 3 · 10 6 million years ago) fall in the mid-Pleistocene. Influences from European settlement, deforestation, and agriculture would date no further back than 5,000 generations, and are not discernible.
Finally, the relationships between the mean N e of SNPs and their average age is given for each population in Figure 6. The left panel provides an example of the variation among sample-size classes for one particular population (with each point representing a particular SFS class for a particular number of individuals scored). The right panel summarizes the average results for each population as simple firstor second-order polynomial least-squares regressions. The main point again is that these Daphnia populations do not exhibit major demographic shifts across allele-frequency classes, with the population average N e associated with SNPs of all ages almost always falling in the range of 800; 000 to 3; 000; 000:

Data availability
The details on data acquisition, curation, and deposition appear in Maruki and Lynch (2019); the FASTQ files of the raw-sequence data are publicly available via the NCBI Sequence Read Archive (accession number SRP155055). Supplemental material available at figshare: https://doi.org/10.25387/g3.10265867.

DISCUSSION
The methods developed herein provide a model-independent means for estimating the past demographic history of a sample, using information on the frequency distribution of nucleotide sites assumed to behave in a neutral to nearly neutral manner. The approach taken assumes that changes in population size occur only at specific points in a genealogy, i.e., at the times of average occurrence of coalesence events. This, of course, will never happen precisely in any natural population. However, as the times of coalescent events vary widely among genealogies, such granularity can be expected to average out. Moreover, the approach taken does provide an increasingly fine dissection of the overall time scale under evaluation with increasing sample size (n). So as shown by Liu and Fu (2015) and herein, the method has the potential to closely approximate the more continuous patterns of population-size changes that likely occur in nature.
In the proposed method of estimation, Epos simply starts with an assumption of constant N e , and then progressively searches for points of change in N e that, when invoked, yield significant improvements in the likelihood of the observed SFS data in a stepwise manner. Application of the bootstrap yields a further smoothing of the output estimates as well as confidence intervals on the overall pattern. Further smoothing is obtained by partitioning the SFS data into classes differing in sample sizes (or from different classes of sites, such as fourfold redundant codon sites vs. internal intron sites, both of which behave in a nearly neutral fashion). SFS sample-size variation will generally be the rule in low-coverage population-genomic sequencing data, where some individuals will have inadequate sequence at random sites.
The theoretical basis of the methods described herein is the same as that adopted by Liu and Fu (2015), although we have derived a number of extensions. In addition, the estimation procedures embodied in Epos deliver advances over the pioneering work of Liu and Fu (2015) in a number of ways. First, by using a Newton method for maximizing the likelihood rather than a (slow) genetic algorithm for optimization, the overall algorithmic approach is considerably more efficient, improving computational speed by over an order of magnitude without sacrificing accuracy in estimation (and in some cases apparently improving it). Second, Epos is capable of an exhaustive search for the best-fit demographic scenario, up to a number of steps specified by the user. Under this exhaustive search model, in adding a new breakpoint to the demography, each step in the iterative fitting re-evaluates the positions of all preceding breakpoints and their flanking N e estimates. Third, epos returns estimates on the average ages and N e (and sampling errors) of alleles within different frequency classes.
One potential concern with our method is its reliance on a composite-likelihood approach that ignores the nonindependence of linked SNPs. There are two reasons to believe that this is a minor issue with respect to the final analyses. First, most organisms have ten or more chromosomes, so only a minor fraction of pairs of loci are even on the same chromosome, and even a smaller fraction are within the $ 10 6 bp where linkage disequilibrium is likely to be significant. Second, our simulation studies on algorithm performance, which generated data based on a recombining chromosome and then applied the composite likelihood, did indeed yield results consistent with simulated demographies. Although desirable, full-likelihood methods allowing for linked loci would be enormously computationally demanding, but more importantly would require detailed information on chromosomal map structures, which are available for few species.
Like all polymorphism-based methods, our approach is expected to become increasingly unreliable at very distant times in the past, owing to the increasing granularity of the coalescent process, and the fact that few polymorphisms are expected to survive for more than 4N e generations. In addition, the ability to estimate very recent population-size changes is a function of the sample size, as there can be no power to estimate a span of time during which there is essentially zero chance of a de novo mutation appearing in a sample.

ACKNOWLEDGMENTS
This work was supported by NIH grants R01-GM101672 and R35-GM122566-01 and NSF grant DEB-1257806 to ML. We thank Yun-Xin Fu for helpful discussion. Figure 6 Left) The average effective population sizes and ages associated with each site-frequency class of alleles for the KAP population of D. pulex, with open symbols for internal intron sites (as defined in Lynch et al. 2017) and closed symbols for fourfold redundant sites. Sample sizes ranging from 72 to 78 diploid individuals have associated numbers of scored sites given in the inset for internal intron sites and fourfold redundant sites, respectively. Right) Summary of the results for each population, simply given as best second-order polynomial fits to the full sets of data.

APPENDIX
Results at various intermediate steps had to be derived before arriving at the main results of the text, some scattered in the prior literature (e.g., Janson and Kersting 2010; Dahmer and Kersting 2015). Here we summarize things in one place to make the overall approach more transparent.

Probability distribution for external branches.
By definition, singleton mutations exist only on external branches, which always start at level k ¼ n in the coalescent and can end at levels k ¼ n 2 1 to 1, where n is the sample size. These locations denote the consecutive coalescence events across the tree (with k ¼ 1 denoting the base of the genealogy). We wish to determine the probability distribution of branch lengths in units of coalescence events across the tree, with P n;k ð1Þ denoting the probability that an external branch (starting at level n) ends at level k. This can be accomplished by letting S n;k denote the probability that a singleton branch descends to at least level k without having been absorbed by a coalescence event. By definition, S n;n ¼ 1, and P n;k ð1Þ ¼ S n;kþ1 2 S n;k : (A1) S n;k is the fraction of singleton branches extending down the tree until at least level k; whereas P n;k ð1Þ denotes the fraction of all singleton branches terminating at level k: To obtain the probability of a particular external branch surviving until the first coalescence event across the tree, note that two draws must be made without replacement from the n initial branch tips. The probability that neither draw involves the focal branch tip is The probability of the focal branch tip continuing to survive the next coalescence event (i.e., not being drawn) is obtained by noting that there are now n 2 1 possible draws, S n;n22 ¼ S n;n21 Á 1 2 1 n 2 1 1 2 1 n 2 2 ¼ ðn 2 2Þðn 2 3Þ nðn 2 1Þ ; which generalizes to, S n;n2x ¼ ðn 2 xÞðn 2 x 2 1Þ nðn 2 1Þ : Substituting into Equation (A1) yields the general expression for a singleton branch terminating at the xth coalescence event in the tree, P n;n2x ð1Þ ¼ 2ðn 2 xÞ nðn 2 1Þ : (A4)

Probability distribution for internal branches
The situation is more complicated with internal branches, which have variable starting and ending points, and also vary in number among alternative tree topologies. However, progress is made possible with a result from Dahmer and Kersting (2015), which states that for a sample of size n the expected number of segments of order r present at level k (just below the coalescence at this point) is n 2 k 2 i n 2 1 2 i ! ¼ kðk 2 1Þ n 2 r Á ðn 2 kÞ!ðn 2 rÞ! ðn 2 k 2 r þ 1Þ!ðn 2 1Þ! ; for k # ðn 2 r þ 1Þ, and W k ðrÞ ¼ 0 for k . ðn 2 r þ 1Þ. This result can also be obtained by extrapolating Equation (14) in Fu (1995). Again note that k ¼ n denotes the branch-tip level, k ¼ n 2 1 denotes the first coalescence event in the sample, and k ¼ 1 denotes the base of the genealogy (the final coalescent), so that starting points (nearer the branch tips) have higher integer values than ending points. From this expression, it is possible to deconvolute the expected number of internal branches initiating at level k; B k ðrÞ, by noting the birth-death process involving segments of order r as one descends down the tree, is the probability that a particular group at level k þ 1 is not a participant in the next coalescence event. Upon rearrangement, and substitution from above, B k ðrÞ ¼ r 2 1 n 2 k Á W k ðrÞ: The highest level for a nonzero value of this birth rate is n 2 r þ 1, and thereafter 0 , B k ðrÞ , 1, except for the case of k ¼ n 2 1 for r ¼ 2, B n21 ð2Þ ¼ 1; as the first coalescence event always produces a doubleton.
From the statistical properties of coalescence events arising subsequent to the origin of a segment noted above, the expected number of internal branches of order r initiating at level k and ending at level k 2 x, where 1 # x # ðk 2 1Þ; is which follows directly from Equation (A4). The complete probability distribution for segment spans of order r is then where the denominator is the expected total number of branches of order r in a genealogy None of the features in this entire section on the genealogical structure of a sample depend on demographic history, although the lengths of the individual branches do.
Average age of a SNP Griffiths (2003) obtained a general expression for the average age of a derived allele of arbitrary frequency under the assumption of constant population size, given in generations, However, here we are concerned with the more complex issue of estimating the average age of SNPs when the population size is not constant. The central challenge is that mutations of various orders can appear on branches that start and end at various levels in the tree, each of which may be associated with a particular N e : Here, we take advantage of a derivation of Griffiths and Taveré (1998), their Equation (5.4), which requires a definition of p n;k ðrÞ, the probability that a random line, at the time there are k total lines in the coalescent, is subtended by r leaves in the tree. This is equivalent to W k ðrÞ=k; with simplification of Equation (A5) leading to p n;k ðrÞ ¼ k 2 1 r Á n 2 k r 2 1 n 2 1 r : Letting t k be the time the coalescent has k lines (defined by Equation (2) in the main text), and T kþ1 ¼ t n þ ⋯ þ t kþ1 , then as Griffiths and Tavare (1998) argue above their Equation (5.1), the expected age of an allele arising on a branch when the coalescent has k lines, is Ut k þ T kþ1 , where U is a uniformly distributed on ½0; 1 independent of all other random variables.
To obtain the moments of the ages, we take advantage of a derivation of Griffiths and Taveré (1998) The denominator, which is independent of j, reduces to X n k¼2 n 2 k r 2 1 kðk 2 1ÞE½t k ¼ 4 X n k¼2 n 2 k r 2 1 N k : Average N e of a SNP The results from the previous section summarize the average amounts of times that alleles spend at the various population sizes. For SNPs of any order r, the average N e experienced can be determined by weighting the relative time spent in each interval by the interval-specific population sizes. From Griffiths and Taveré (1998), their Equation (5.1), we know that (6a) is in fact equivalent to E½A r ¼ P n k¼2 kp n;k ðrÞE½t k Á ðUt k þ T kþ1 Þ P n k¼2 kp n;k ðrÞE½t k ; where U is uniformly distributed on ½0; 1; see also the explanation below (A11). This formula can be interpreted such that the kp n;k -term is proportional to the probability that the SNP of size r occurs within level k. The expectation in the numerator then gives the time how long in the past this level happened. For estimating the average N e of a SNPs in size r, we need to weigh this expectation by the experienced population sizes, leading us to E½P r ¼ P n k¼2 kp n;k ðrÞE½t k Á ðUt k N k þ t kþ1 N kþ1 þ ⋯ þ t n N n Þ P n k¼2 kp n;k ðrÞE½t k Á ðUt k þ T kþ1 Þ

:
For the denominator, note that 4N ℓ ℓðℓ 2 1Þ : So, we compute X n k¼2 kp n;k ðrÞE½t k Á ðUt k þ t kþ1 þ ⋯ þ t n Þ ¼ X n The numerator is actually almost the same except for a different weight of the N ℓ 's, i.e. X n k¼2 kp n;k ðrÞE½t k Á ðUt k N k þ t kþ1 N kþ1 þ ⋯ þ t n N n Þ ¼ X n k¼2 4N k n 2 k r 2 1 n 2 1 r r X n ℓ¼k 4N 2 ℓ ℓðℓ 2 1Þ : Dividing the last two displays then gives Equation (8).