Abstract

Properties of gene genealogies such as tree height ($H$), total branch length ($L$), total lengths of external ($E$) and internal ($I$) branches, mean length of basal branches ($B$), and the underlying coalescence times ($T$) can be used to study population-genetic processes and to develop statistical tests of population-genetic models. Uses of tree features in statistical tests often rely on predictions that depend on pairwise relationships among such features. For genealogies under the coalescent, we provide exact expressions for Taylor approximations to expected values and variances of ratios $X_n/Y_n$, for all 15 pairs among the variables $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, considering $n$ leaves and $2 \le k \le n$. For expected values of the ratios, the approximations match closely with empirical simulation-based values. The approximations to the variances are not as accurate, but they generally match simulations in their trends as $n$ increases. Although $E_n$ has expectation 2 and $H_n$ has expectation 2 in the limit as $n \to \infty$, the approximation to the limiting expectation for $E_n/H_n$ is not 1, instead equaling $\pi^2/3 - 2 \approx 1.28987$. The new approximations augment fundamental results in coalescent theory on the shapes of genealogical trees.

Introduction

Coalescent theory models random genealogies conditional on assumptions about the evolutionary process (Hein et al. 2005; Wakeley 2009). In coalescent theory, a gene genealogy is a tree or network structure that represents a random draw from a coalescent model.

Genealogies in coalescent theory can be summarized using a variety of quantities. For example, for random tree-like genealogies with $n$ lineages, the tree height $H_n$ records the sum of branch lengths on a path from a leaf to the root, and the tree length $L_n$ sums all branch lengths in the tree. The total length $E_n$ of external branches sums over leaves the lengths of paths from leaves to their nearest internal nodes, and the total length of internal branches, $I_n = L_n - E_n$, sums the lengths of all remaining branches.

Studies in coalescent theory have often investigated the properties of tree summaries conditional on assumptions of coalescent models, with the goal of understanding how shapes of the genealogies relate to processes such as population growth and migration (e.g. Slatkin 1996; Rosenberg and Feldman 2002). Because mutations can be viewed as occurring conditionally on underlying genealogies (Hudson 1990), features of genealogical shape affect the patterns of genetic variation produced by coalescent models that permit mutation. Thus, the understanding of summaries of tree shape predicted by coalescent models is a component of the interpretation of patterns of genetic variation in relation to evolutionary processes.

Initial results concerning summaries of genealogical shape focused on single quantities, producing results on quantities such as $H_n$ and $L_n$ (Kingman 1982; Hudson 1983, 1990; Tajima 1983). Studies soon examined the information that resides in the relationships between pairs of summaries; genetic variation statistics such as those of Tajima (1989) and Fu and Li (1993) can be viewed as assessing whether or not one aspect of a tree contains long branches in relation to another.

Recently, Arbisser et al. (2018) performed a detailed investigation of the relationship between $H_n$ and $L_n$ under coalescent models. They studied the mathematical relationship between these two quantities, computing the covariance and correlation coefficient of $H_n$ and $L_n$ under a standard coalescent model with a constant-sized population. Extending the work of Arbisser et al. (2018) on $H_n$ and $L_n$, we (Alimpiev and Rosenberg 2022) reported covariances and correlations for all pairs of variables among $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, where $B_n$ is the mean of the lengths of the two basal branches of a genealogy and $T_k$ is the coalescence time from $k$ to $k-1$ lineages, $2 \le k \le n$. Our compendium in Tables 1 and 2 of Alimpiev and Rosenberg (2022) summarizes pairwise relationships for several of the most commonly used features of coalescent tree shape, recording both new and previously known results.

Table 1. Definitions of random variables associated with various tree summaries.

| Variable | Definition |
|---|---|
| $H_n$ | $\sum_{k=2}^{n} T_k$ |
| $L_n$ | $\sum_{k=2}^{n} k T_k$ |
| $E_n$ | $\sum_{i=1}^{n} e_i^{(n)}$ |
| $I_n$ | $L_n - E_n$ |
| $B_n$ | $\frac{1}{2} T_2 + \left[\sum_{j=3}^{n-1} \sum_{k=2}^{j} \frac{1}{j(j-1)} T_k\right] + \left(\sum_{k=2}^{n} \frac{1}{n-1} T_k\right)$ |

Here, $T_k$ is the random variable representing the coalescence time from $k$ to $k-1$ lineages, and $e_i^{(n)}$ is the (random) length of the $i$th external branch of a tree with $n$ leaves. We define $H_n$, $L_n$, and $E_n$ for $n \ge 2$, $I_n$ for $n \ge 3$, and $B_n$ for $n \ge 4$. The expression for $B_n$ follows a form that incorporates terms associated with all of its contributing branches, following p. 1400 of Uyenoyama (1997) and Section 2.6 of Alimpiev and Rosenberg (2022), and it can be simplified to $B_n = \sum_{k=2}^{n} \frac{1}{k-1} T_k$.

Table 2. Expectations and variances of properties of tree branch lengths.

| $X_n$ | $E[X_n]$ | $\lim_{n\to\infty} E[X_n]$ | $\mathrm{Var}[X_n]$ | $\lim_{n\to\infty} \mathrm{Var}[X_n]$ |
|---|---|---|---|---|
| $H_n$ | $\frac{2(n-1)}{n}$ | $2$ | $8(S_{2,n} - 1) - 4\left(\frac{n-1}{n}\right)^2$ | $\frac{4\pi^2}{3} - 12 \approx 1.15947$ |
| $L_n$ | $2S_{1,n-1}$ | $\infty$ | $4S_{2,n-1}$ | $\frac{2\pi^2}{3} \approx 6.57974$ |
| $E_n$ | $2$ | $2$ | $4$ if $n = 2$; $\frac{8}{(n-1)(n-2)}\left[S_{1,n-1} n - 2(n-1)\right]$ if $n > 2$ | $0$ |
| $I_n$ | $2S_{1,n-1} - 2$ | $\infty$ | $4\left[\frac{2[S_{1,n-1} n - 2(n-1)]}{(n-1)(n-2)} - \frac{2S_{1,n-1}}{n-1} + S_{2,n-1}\right]$ | $\frac{2\pi^2}{3} \approx 6.57974$ |
| $B_n$ | $2S_{2,n-1} - 2 + \frac{2}{n}$ | $\frac{\pi^2}{3} - 2 \approx 1.28987$ | $\frac{2(3S_{2,n-1} n^2 - 2S_{2,n-1}^2 n^2 + n^2 - 4S_{2,n-1} n + 3n - 4)}{n^2}$ | $-\frac{\pi^4}{9} + \pi^2 + 2 \approx 1.04637$ |
| $T_k$ | $\frac{2}{k(k-1)}$ | $\frac{2}{k(k-1)}$ | $\frac{4}{k^2(k-1)^2}$ | $\frac{4}{k^2(k-1)^2}$ |

These expressions can be found in Alimpiev and Rosenberg (2022). Note that for $L_n$ and $I_n$, although the limiting variance is finite, the limiting expectation is infinite (Tavaré et al. 1997; Wakeley 2009, p. 76).

In addition to computing the covariance and correlation coefficient of $H_n$ and $L_n$, Arbisser et al. (2018) also found approximations to the expectation and variance of the ratio $H_n/L_n$ under the coalescent model. This ratio gives a summary of the joint distribution of $H_n$ and $L_n$ that characterizes the relative magnitudes of the variables, a feature not captured by their covariance or correlation. Arbisser et al. (2018) found that although the approximation to $\mathrm{Var}[H_n/L_n]$ differed noticeably from the exact value, as obtained by numerical integration and simulations of the coalescent model, the approximation to $E[H_n/L_n]$ was quite accurate.

In this article, we extend the work of Arbisser et al. (2018) to compute approximations to the expectations and variances for ratios of the 14 remaining pairs among $\{H_n, L_n, E_n, I_n, B_n, T_k\}$. For the expectation and variance of coalescent ratios, this study extends Arbisser et al. (2018) in the same way that Alimpiev and Rosenberg (2022) extended their results for the covariance and correlation coefficient.

Materials and methods

Tree variables

We work with a haploid population of constant size $N$ that follows a standard coalescent model. Time is measured in units of $N$ generations. In this section, we recall the definitions of the coalescence time $T_k$ and tree properties $H_n$, $L_n$, $E_n$, $I_n$, and $B_n$ for sample size $n \ge 2$ and $2 \le k \le n$.

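$T_k$ is defined to be a random variable representing the time to coalescence from $k$ to $k-1$ lineages, for $2 \le k \le n$. Variable $T_k$ has exponential probability density function
$$f_{T_k}(t) = \binom{k}{2} e^{-\binom{k}{2} t}, \quad t \ge 0.$$
The expectation and variance of $T_k$ are
$$E[T_k] = \frac{2}{k(k-1)}, \tag{1}$$
$$\mathrm{Var}[T_k] = \frac{4}{k^2(k-1)^2}. \tag{2}$$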

The tree properties $H_n$, $L_n$, $E_n$, $I_n$, and $B_n$ are defined in terms of the $T_k$. Visual depictions of these properties appear in Fig. 1, and mathematical definitions of these quantities appear in Table 1.

Fig. 1. Properties of genealogical trees. The tree height is $H_n$. The sum of the lengths of all branches is $L_n$. External branches have total length $E_n$ (green). Internal branches have total length $I_n$ (orange). Basal branches have mean length $B_n$ (blue).
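The definitions that depend only on the coalescence times can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that the $T_k$ are independent exponential variables with rate $k(k-1)/2$ (time in units of $N$ generations), and the function names are hypothetical. $E_n$ and $I_n$ are omitted because they also depend on the tree topology. The sketch checks numerically that the three-part expression for $B_n$ in Table 1 agrees with its simplified form.

```python
import random

def sample_coalescence_times(n, rng):
    """Draw T_k ~ Exponential(rate k(k-1)/2) independently for k = 2..n."""
    return {k: rng.expovariate(k * (k - 1) / 2) for k in range(2, n + 1)}

def tree_height(T):
    """H_n: sum of the coalescence times T_k."""
    return sum(T.values())

def tree_length(T):
    """L_n: sum of k * T_k, since k lineages persist for time T_k."""
    return sum(k * t for k, t in T.items())

def basal_mean_long(T):
    """B_n in the three-part form of Table 1 (intended for n >= 4)."""
    n = max(T)
    first = 0.5 * T[2]
    middle = sum(T[k] / (j * (j - 1))
                 for j in range(3, n) for k in range(2, j + 1))
    last = sum(T[k] for k in range(2, n + 1)) / (n - 1)
    return first + middle + last

def basal_mean_short(T):
    """B_n in the simplified form: sum of T_k / (k - 1)."""
    return sum(t / (k - 1) for k, t in T.items())

rng = random.Random(1)
T = sample_coalescence_times(10, rng)
print(tree_height(T) / tree_length(T))          # lies between 1/n and 1/2
print(basal_mean_long(T), basal_mean_short(T))  # the two forms agree
```

Working from a dictionary keyed by $k$ keeps the code close to the summation indices in Table 1.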

We define $S_{p,n} = \sum_{k=1}^{n} 1/k^p$ as a useful shorthand. The limit $\lim_{n\to\infty} S_{p,n} = S_{p,\infty}$ is the Riemann zeta function, usually denoted $\zeta(p)$. In particular, $S_{1,\infty}$ diverges, $S_{2,\infty} = \pi^2/6 \approx 1.64493$, and $S_{3,\infty}$ is Apéry's constant, approximately 1.20206.
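As a small illustration of this shorthand, the partial sums can be evaluated exactly with rational arithmetic. The helper name `S` below is an assumption made for this sketch; it simply mirrors the notation $S_{p,n}$.

```python
from fractions import Fraction
import math

def S(p, n):
    """Exact partial sum S_{p,n} = sum_{k=1}^{n} 1/k^p as a Fraction."""
    return sum(Fraction(1, k**p) for k in range(1, n + 1))

print(S(1, 3))                           # 1 + 1/2 + 1/3
print(float(S(2, 200)), math.pi**2 / 6)  # S_{2,n} approaches pi^2/6 from below
print(float(S(3, 50)))                   # S_{3,n} approaches Apery's constant
print(float(S(1, 200)))                  # S_{1,n} keeps growing with n
```

Exact fractions are convenient for checking closed-form identities at small $n$; floating-point sums are sufficient when only numerical values are needed.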

Taylor approximations to expectations and variances of ratios

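To compute approximate expressions for expected values and variances of the ratios of various tree properties, we rely on Taylor approximations. In particular, consider random variables $X$ and $Y$ with $E[X], E[Y] \neq 0$. For the expectation, we have (second-order) approximation (Elandt-Johnson and Johnson 1999, eq. 3.88):
$$E\left[\frac{X}{Y}\right] \approx \frac{E[X]}{E[Y]} - \frac{\mathrm{Cov}[X,Y]}{E[Y]^2} + \frac{\mathrm{Var}[Y]\,E[X]}{E[Y]^3}. \tag{3}$$
For the variance, we have (first-order) approximation (Stuart and Ord 1994, eq. 10.17):
$$\mathrm{Var}\left[\frac{X}{Y}\right] \approx \left(\frac{E[X]}{E[Y]}\right)^2 \left[\frac{\mathrm{Var}[X]}{E[X]^2} - \frac{2\,\mathrm{Cov}[X,Y]}{E[X]\,E[Y]} + \frac{\mathrm{Var}[Y]}{E[Y]^2}\right]. \tag{4}$$

We use $\tilde{E}[X/Y]$ and $\widetilde{\mathrm{Var}}[X/Y]$ to denote approximations from equations (3) and (4). For both the expectation and the variance, we also take the $n \to \infty$ limit of the approximations.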
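The two approximations translate directly into code. The sketch below is illustrative only: the function and argument names are assumptions, and the formulas are the standard second-order approximation for the mean of a ratio and first-order approximation for its variance. As a check, it evaluates the mean approximation for $X = H_n$ and $Y = T_k$ with $n = 10$ and $k = 3$, using the moments of $H_n$ and $T_k$, and compares the result with the closed form $[(2k^2 - 2k - 1)n - 2k(k-1)]/n$.

```python
def ratio_mean_approx(mean_x, mean_y, var_y, cov_xy):
    """Second-order Taylor approximation to E[X/Y] (equation 3)."""
    return mean_x / mean_y - cov_xy / mean_y**2 + var_y * mean_x / mean_y**3

def ratio_var_approx(mean_x, mean_y, var_x, var_y, cov_xy):
    """First-order Taylor approximation to Var[X/Y] (equation 4)."""
    return (mean_x / mean_y) ** 2 * (
        var_x / mean_x**2 - 2 * cov_xy / (mean_x * mean_y) + var_y / mean_y**2
    )

# Example: X = H_n, Y = T_k with n = 10, k = 3.
n, k = 10, 3
mean_h = 2 * (n - 1) / n               # E[H_n]
mean_t = 2 / (k * (k - 1))             # E[T_k]
var_t = 4 / (k**2 * (k - 1) ** 2)      # Var[T_k]
cov_ht = var_t                         # Cov[H_n, T_k] equals Var[T_k]

approx = ratio_mean_approx(mean_h, mean_t, var_t, cov_ht)
closed_form = ((2 * k**2 - 2 * k - 1) * n - 2 * k * (k - 1)) / n
print(approx, closed_form)
```

Both approximations require nonzero means, because they divide by powers of $E[X]$ and $E[Y]$.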

Exact expectations, variances, and covariances of tree properties

Expected values and variances of variables $H_n$, $L_n$, $E_n$, $I_n$, $B_n$, and $T_k$ that are used in equations (3) and (4) are known, in many cases, from early studies in coalescent theory (Fu and Li 1993; Tavaré et al. 1997; Wakeley 2009). We summarize these expectations and variances in Table 2.

The covariances compiled by Alimpiev and Rosenberg (2022) appear in Table 3. In the case of pairs $(E_n, B_n)$ and $(I_n, B_n)$, the covariances are approximate, as described by Alimpiev and Rosenberg (2022).

Table 3. Covariances of pairs of variables that summarize genealogical trees.

| $(X_n, Y_n)$ | $\mathrm{Cov}[X_n, Y_n]$ | $\lim_{n\to\infty} \mathrm{Cov}[X_n, Y_n]$ |
|---|---|---|
| $H_n, T_k$ | $\frac{4}{k^2(k-1)^2}$ | $\frac{4}{k^2(k-1)^2}$ |
| $H_n, L_n$ | $4S_{2,n-1} - 4 + \frac{4}{n}$ | $\frac{2\pi^2}{3} - 4 \approx 2.57974$ |
| $H_n, E_n$ | $\frac{4}{n}$ | $0$ |
| $H_n, I_n$ | $4S_{2,n-1} - 4$ | $\frac{2\pi^2}{3} - 4 \approx 2.57974$ |
| $H_n, B_n$ | $\frac{4[S_{3,n-1} n^2 - 3S_{2,n-1} n^2 + (4n+1)(n-1)]}{n^2}$ | $-2\pi^2 + 4\zeta(3) + 16 \approx 1.06902$ |
| $L_n, T_k$ | $\frac{4}{k(k-1)^2}$ | $\frac{4}{k(k-1)^2}$ |
| $L_n, E_n$ | $\frac{4S_{1,n-1}}{n-1}$ | $0$ |
| $L_n, I_n$ | $4S_{2,n-1} - \frac{4S_{1,n-1}}{n-1}$ | $\frac{2\pi^2}{3} \approx 6.57974$ |
| $L_n, B_n$ | $\frac{4[S_{3,n-1} n - S_{2,n-1} n + n - 1]}{n}$ | $-\frac{2\pi^2}{3} + 4\zeta(3) + 4 \approx 2.22849$ |
| $E_n, T_k$ | $\frac{4}{k(k-1)(n-1)}$ | $0$ |
| $E_n, I_n$ | $\frac{4S_{1,n-1}}{n-1} - \frac{8S_{1,n-1} n}{(n-1)(n-2)} + \frac{16}{n-2}$ | $0$ |
| $E_n, B_n$ | $\frac{4(S_{2,n-1} n - n + 1)}{n(n-1)}$ | $0$ |
| $I_n, T_k$ | $\frac{4(n-k)}{k(k-1)^2(n-1)}$ | $\frac{4}{k(k-1)^2}$ |
| $I_n, B_n$ | $\frac{4(S_{3,n-1} n - S_{2,n-1} n + n - S_{3,n-1} - 1)}{n-1}$ | $-\frac{2\pi^2}{3} + 4\zeta(3) + 4 \approx 2.22849$ |
| $B_n, T_k$ | $\frac{4}{k^2(k-1)^3}$ | $\frac{4}{k^2(k-1)^3}$ |

For pairs involving $E_n$ or $I_n$, expressions apply for $n \ge 3$; expressions involving $B_n$ apply for $n \ge 4$. The expressions can be found in Alimpiev and Rosenberg (2022).
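Some entries of Tables 2 and 3 can be checked directly from the definitions. The sketch below is a sanity check under the assumption that the coalescence times $T_k$ are mutually independent, as in the standard coalescent; the function names are hypothetical. With independence, $\mathrm{Var}[H_n] = \sum_{k=2}^{n} \mathrm{Var}[T_k]$ and $\mathrm{Cov}[H_n, L_n] = \sum_{k=2}^{n} k\,\mathrm{Var}[T_k]$, and exact rational arithmetic confirms that these sums match the closed forms.

```python
from fractions import Fraction

def S(p, n):
    """Exact partial sum S_{p,n} = sum_{k=1}^{n} 1/k^p."""
    return sum(Fraction(1, k**p) for k in range(1, n + 1))

def var_T(k):
    """Var[T_k] = 4 / (k^2 (k-1)^2)."""
    return Fraction(4, k**2 * (k - 1) ** 2)

def var_H_direct(n):
    """Var[H_n] as a sum of variances of independent T_k."""
    return sum(var_T(k) for k in range(2, n + 1))

def var_H_closed(n):
    """Closed form: 8(S_{2,n} - 1) - 4((n-1)/n)^2."""
    return 8 * (S(2, n) - 1) - 4 * Fraction(n - 1, n) ** 2

def cov_HL_direct(n):
    """Cov[H_n, L_n] = sum of k * Var[T_k] for independent T_k."""
    return sum(k * var_T(k) for k in range(2, n + 1))

def cov_HL_closed(n):
    """Closed form: 4 S_{2,n-1} - 4 + 4/n."""
    return 4 * S(2, n - 1) - 4 + Fraction(4, n)

for n in (2, 5, 20):
    print(n, var_H_direct(n) == var_H_closed(n),
          cov_HL_direct(n) == cov_HL_closed(n))
```

The same approach does not extend to entries involving $E_n$, $I_n$, or $B_n$ as random functions of the topology, which require the derivations in Alimpiev and Rosenberg (2022).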

Evaluating the approximations

For each of 15 pairs of random variables, considering $H_n$, $L_n$, $E_n$, $I_n$, and $B_n$ as well as $T_k$, we substitute expressions from Tables 2 and 3 into equations (3) and (4) to obtain approximate expectations and variances for ratios of pairs of variables. For each pair, we choose one variable for the numerator and the other for the denominator; approximate expectations and variances for the reciprocals can be obtained similarly. We present the approximations in Tables 4 and 5, and we plot them in Figs. 2–5.

Fig. 2. Simulated and theoretical approximations of expectations of ratios of pairs of variables, plotted as functions of sample size $n$. Expressions for theoretical values are taken from Table 4.

Fig. 3. Theoretical approximations $\tilde{E}[X/T_k]$ for variables $X$ in $\{H_n, L_n, E_n, I_n, B_n\}$, plotted as functions of $k$ for $n = 10$, $n = 20$, and $n = 50$. The expressions plotted are taken from Table 4.

Fig. 4. Simulated and theoretical approximations of variances of ratios of pairs of variables, plotted as functions of sample size $n$. Expressions for theoretical values are taken from Table 5.

Fig. 5. Theoretical approximations $\widetilde{\mathrm{Var}}[X/T_k]$ for variables $X$ in $\{H_n, L_n, E_n, I_n, B_n\}$, plotted as functions of $k$ for $n = 10$, $n = 20$, and $n = 50$. The expressions plotted are taken from Table 5.

Table 4. Approximations to expectations of ratios of pairs of variables.

| $(X_n, Y_n)$ | $\tilde{E}[X_n/Y_n]$ | $\lim_{n\to\infty} \tilde{E}[X_n/Y_n]$ |
|---|---|---|
| $H_n, T_k$ | $\frac{(2k^2 - 2k - 1)n - 2k(k-1)}{n}$ | $2k^2 - 2k - 1$ |
| $H_n, L_n$ | $\frac{n-1}{S_{1,n-1} n} - \frac{S_{2,n-1} n - n + 1}{S_{1,n-1}^2 n} + \frac{S_{2,n-1}(n-1)}{S_{1,n-1}^3 n}$ | $0$ |
| $E_n, H_n$ | $\frac{n(2S_{2,n} n^2 - 2n^2 - n + 1)}{(n-1)^3}$ | $\frac{\pi^2}{3} - 2 \approx 1.28987$ |
| $H_n, I_n$ | $\frac{n-1}{(S_{1,n-1}-1)n} - \frac{S_{2,n-1}-1}{(S_{1,n-1}-1)^2} + \frac{S_{2,n-1}(n-1)(n-2) - 4n + 4S_{1,n-1} + 4}{(S_{1,n-1}-1)^3 n(n-2)}$ | $0$ |
| $B_n, H_n$ | $\frac{S_{2,n-1} n - n + 1}{n-1} + \frac{3S_{2,n-1} n^2 - S_{3,n-1} n^2 - 4n^2 + 3n + 1}{(n-1)^2} + \frac{(S_{2,n-1} n - n + 1)(2S_{2,n} n^2 - 3n^2 + 2n - 1)}{(n-1)^3}$ | $\frac{\pi^4}{18} - \frac{\pi^2}{6} - \zeta(3) - 2 \approx 0.56463$ |
| $L_n, T_k$ | $2S_{1,n-1} k^2 - (2S_{1,n-1} + 1)k$ | $\infty$ |
| $E_n, L_n$ | $\frac{(S_{1,n-1}^2 + S_{2,n-1})n - 2S_{1,n-1}^2 - S_{2,n-1}}{S_{1,n-1}^3 (n-1)}$ | $0$ |
| $L_n, I_n$ | $\frac{(S_{1,n-1}^3 + S_{2,n-1})(n-1)(n-2) - S_{1,n-1}^2(2n^2 - 7n + 2) + S_{1,n-1}(n^2 - 8n + 8)}{(S_{1,n-1}-1)^3 (n-1)(n-2)}$ | $1$ |
| $B_n, L_n$ | $\frac{S_{2,n-1} n - n + 1}{S_{1,n-1} n} + \frac{S_{2,n-1} n - S_{3,n-1} n - n + 1}{S_{1,n-1}^2 n} + \frac{S_{2,n-1}(S_{2,n-1} n - n + 1)}{S_{1,n-1}^3 n}$ | $0$ |
| $E_n, T_k$ | $\frac{k(k-1)(2n-3)}{n-1}$ | $2k(k-1)$ |
| $E_n, I_n$ | $\frac{S_{1,n-1}^2(n^2 - 2n + 4) - S_{1,n-1}(2n^2 - n - 2) + (S_{2,n-1} + 1)(n-1)(n-2)}{(S_{1,n-1}-1)^3 (n-1)(n-2)}$ | $0$ |
| $B_n, E_n$ | $\frac{(n^2 + 2S_{1,n-1} n - 8n + 8)(S_{2,n-1} n - n + 1)}{n(n-1)(n-2)}$ | $\frac{\pi^2}{6} - 1 \approx 0.64493$ |
| $I_n, T_k$ | $2k(k-1)(S_{1,n-1} - 1) - \frac{k(n-k)}{n-1}$ | $\infty$ |
| $B_n, I_n$ | $\frac{S_{2,n-1} n - n + 1}{(S_{1,n-1}-1)n} + \frac{(S_{2,n-1} n - n + 1)[S_{2,n-1}(n-1)(n-2) - 4n + 4S_{1,n-1} + 4]}{(S_{1,n-1}-1)^3 n(n-1)(n-2)} + \frac{S_{2,n-1} n - (S_{3,n-1} + 1)(n-1)}{(S_{1,n-1}-1)^2 (n-1)}$ | $0$ |
| $B_n, T_k$ | $\frac{2k(k-1)(S_{2,n-1} n - n + 1)}{n} - \frac{1}{k-1}$ | $\frac{1}{3}(\pi^2 - 6)k(k-1) - \frac{1}{k-1}$ |

Each row gives the approximation for the ratio of the first variable to the second. Expressions involving $E_n$ or $I_n$ apply for $n \ge 3$; expressions involving $B_n$ apply for $n \ge 4$. The value for $(H_n, L_n)$ follows equation 15 of Arbisser et al. (2018). The expressions are obtained using equation 3 and Tables 2 and 3.

Table 5. Approximations to variances of ratios of pairs of variables.

| $(X_n, Y_n)$ | $\widetilde{\mathrm{Var}}[X_n/Y_n]$ | $\lim_{n\to\infty} \widetilde{\mathrm{Var}}[X_n/Y_n]$ |
|---|---|---|
| $H_n, T_k$ | $\frac{2k(k-1)[k(k-1)S_{2,n} n - (k^2 - k + 1)n + 1]}{n}$ | $\frac{1}{3}k(k-1)[(\pi^2 - 6)k^2 - (\pi^2 - 6)k - 6]$ |
| $H_n, L_n$ | $\left(\frac{n-1}{S_{1,n-1} n}\right)^2 \left[\frac{2(S_{2,n} - 1)n^2 - (n-1)^2}{(n-1)^2} - \frac{2[S_{2,n-1} n - (n-1)]}{S_{1,n-1}(n-1)} + \frac{S_{2,n-1}}{S_{1,n-1}^2}\right]$ | $0$ |
| $E_n, H_n$ | $\frac{[2S_{1,n-1} n(n-1) + 2S_{2,n} n^2 (n-2) - (n^2 - 3)(3n - 2)] n^2}{(n-1)^4 (n-2)}$ | $\frac{\pi^2}{3} - 3 \approx 0.28987$ |
| $H_n, I_n$ | $\frac{2S_{2,n} n^2 - 3n^2 + 2n - 1}{(S_{1,n-1}-1)^2 n^2} + \frac{1}{(S_{1,n-1}-1)^4}\left[\frac{[[S_{2,n-1}(n-2) - 4](n-1) + 4S_{1,n-1}](n-1)}{n^2 (n-2)} - \frac{2(S_{1,n-1}-1)(S_{2,n-1}-1)(n-1)}{n}\right]$ | $0$ |
| $B_n, H_n$ | $\frac{(4S_{3,n-1} n^2 + 4S_{2,n} n^2 + 11n^2 - 5n - 10)(n-1)^2 - S_{2,n-1}(4S_{3,n-1} n^2 + 8S_{2,n} n^2 + 13n^2 - 9n - 12)n(n-1) + 4S_{2,n-1}^2 (S_{2,n} n^2 + n^2 - n - 1)n^2}{2(n-1)^4}$ | $\frac{\pi^6}{108} - \frac{\pi^4}{18} - \frac{3\pi^2}{4} + \frac{11}{2} + 2\zeta(3) - \frac{\pi^2 \zeta(3)}{3} \approx 0.03744$ |
| $L_n, T_k$ | $k^2(k-1)^2 S_{1,n-1}^2 \left[\frac{S_{2,n-1}}{S_{1,n-1}^2} - \frac{2}{(k-1)S_{1,n-1}} + 1\right]$ | $\infty$ |
| $E_n, L_n$ | $\frac{2S_{1,n-1}^3 n - S_{1,n-1}^2 (6n - 8) + S_{2,n-1}(n-1)(n-2)}{S_{1,n-1}^4 (n-1)(n-2)}$ | $0$ |
| $L_n, I_n$ | $\frac{2S_{1,n-1}^3 n - S_{1,n-1}^2 (6n - 8) + S_{2,n-1}(n-1)(n-2)}{(S_{1,n-1}-1)^4 (n-1)(n-2)}$ | $0$ |
| $B_n, L_n$ | $\frac{S_{1,n-1}^2 [-2S_{2,n-1}^2 n^2 + S_{2,n-1}(3n-4)n + n^2 + 3n - 4] + 4S_{1,n-1}(S_{2,n-1} n - n + 1)(S_{2,n-1} n - S_{3,n-1} n - n + 1) + 2S_{2,n-1}(S_{2,n-1} n - n + 1)^2}{2S_{1,n-1}^4 n^2}$ | $0$ |
| $E_n, T_k$ | $\frac{k^2(k-1)^2 (n^2 + 2S_{1,n-1} n - 9n + 10)}{(n-1)(n-2)}$ | $k^2(k-1)^2$ |
| $E_n, I_n$ | $\frac{2S_{1,n-1}^3 n - S_{1,n-1}^2 (6n - 8) + S_{2,n-1}(n-1)(n-2)}{(S_{1,n-1}-1)^4 (n-1)(n-2)}$ | $0$ |
| $B_n, E_n$ | $\frac{4S_{1,n-1} n (S_{2,n-1} n - n + 1)^2 - 2S_{2,n-1}^2 (n^2 + 3n - 6)n^2 + S_{2,n-1}(3n-4)(n+6)n(n-1) + (n^2 - 10n + 8)(n-1)^2}{2n^2 (n-1)(n-2)}$ | $-\frac{\pi^4}{36} + \frac{\pi^2}{4} + \frac{1}{2} \approx 0.26159$ |
| $I_n, T_k$ | $\frac{k^2(k-1)[(k-1)S_{1,n-1}^2 (n-1)(n-2) - 2S_{1,n-1}(kn^2 - 4kn + n + 2k) + (k-1)S_{2,n-1}(n-1)(n-2) + kn^2 + n^2 - 9kn + 3n + 10k - 6]}{(n-1)(n-2)}$ | $\infty$ |
| $B_n, I_n$ | $\frac{[S_{2,n-1}(n-1)(n-2) - 4n + 4S_{1,n-1} + 4](S_{2,n-1} n - n + 1)^2}{(S_{1,n-1}-1)^4 n^2 (n-1)(n-2)} + \frac{2[S_{2,n-1} n - (S_{3,n-1} + 1)(n-1)](S_{2,n-1} n - n + 1)}{(S_{1,n-1}-1)^3 n(n-1)} - \frac{2S_{2,n-1}^2 n^2 - S_{2,n-1}(3n-4)n - (n+4)(n-1)}{2(S_{1,n-1}-1)^2 n^2}$ | $0$ |
| $B_n, T_k$ | $\frac{k}{2}\left[\frac{[k(k-1)^2 (3n+2) + 4n](n-1)}{n^2} - (k+1)(k^2 - 3k + 4)S_{2,n-1}\right]$ | $\frac{1}{12}k[(18 - \pi^2)k^3 - 2(18 - \pi^2)k^2 + (18 - \pi^2)k - 4(\pi^2 - 6)]$ |

Each row gives the approximation for the ratio of the first variable to the second. Expressions involving $E_n$ or $I_n$ apply for $n \ge 3$; expressions involving $B_n$ apply for $n \ge 4$. The value for $(H_n, L_n)$ follows equation 18 of Arbisser et al. (2018). The expressions are obtained using equation 4 and Tables 2 and 3.

For pairs $(X_n, Y_n)$, we estimate the values of $E[X_n/Y_n]$ and $\mathrm{Var}[X_n/Y_n]$ under the coalescent model by simulation using ms (Hudson 2002), performing 100,000 replicate simulations for each tree size $n = 2, 3, \ldots, 50$. We plot the simulated values alongside the approximate values from Tables 4 and 5 in Figs. 2 and 4.
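For the ratio $H_n/L_n$, which depends only on the coalescence times, the comparison between simulation and approximation can be sketched without a full genealogy simulator. The code below is a stand-in for illustration, not the ms-based procedure used in the article: it assumes independent exponential $T_k$ with rate $k(k-1)/2$, uses far fewer replicates, and its function names are hypothetical. Ratios involving $E_n$, $I_n$, or $B_n$ would require simulating tree topologies as well.

```python
import random

def harmonic(p, n):
    """Partial sum S_{p,n} = sum_{k=1}^{n} 1/k^p in floating point."""
    return sum(1 / k**p for k in range(1, n + 1))

def taylor_mean_H_over_L(n):
    """Table 4 entry for (H_n, L_n), obtained from equation 3."""
    s1, s2 = harmonic(1, n - 1), harmonic(2, n - 1)
    return ((n - 1) / (s1 * n)
            - (s2 * n - n + 1) / (s1**2 * n)
            + s2 * (n - 1) / (s1**3 * n))

def simulate_H_over_L(n, replicates, seed=0):
    """Monte Carlo mean and sample variance of H_n / L_n."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(replicates):
        times = [(k, rng.expovariate(k * (k - 1) / 2)) for k in range(2, n + 1)]
        height = sum(t for _, t in times)
        length = sum(k * t for k, t in times)
        ratios.append(height / length)
    mean = sum(ratios) / replicates
    var = sum((r - mean) ** 2 for r in ratios) / (replicates - 1)
    return mean, var

sim_mean, sim_var = simulate_H_over_L(n=10, replicates=20_000)
print(sim_mean, sim_var, taylor_mean_H_over_L(10))
```

Fixing the seed makes the run reproducible; increasing `replicates` reduces Monte Carlo error at the cost of run time.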

Results

Expectations of the ratios

The approximate expected values in Table 4, as approximations of ratios, have the form of rational functions. As $n$ grows, the approximate expectations of $H_n/L_n$, $H_n/I_n$, $E_n/L_n$, $E_n/I_n$, $B_n/L_n$, and $B_n/I_n$ approach 0. This behavior is sensible when considering the properties of the coalescent model: in the numerators, $E_n$ has expectation 2, and $H_n$ and $B_n$ have bounded expectations in the limit as $n \to \infty$; in the denominators, $L_n$ and $I_n$ have expectations that grow without bound (Table 2). Similarly, approximate expectations of the ratios $L_n/T_k$ and $I_n/T_k$, with $L_n$ or $I_n$ in the numerator and $T_k$ in the denominator, grow to infinity as $n$ increases. The approximation to $E[L_n/I_n]$ approaches 1 in the limit as $n \to \infty$: as the number of leaves in the tree grows, internal branches occupy an increasingly large fraction of the total branch length.

For pairs of variables that both have finite expectation, the approximate expectations of their associated ratios ($H_n/T_k$, $E_n/H_n$, $E_n/T_k$, $B_n/H_n$, $B_n/E_n$, and $B_n/T_k$) also approach finite values in the limit as $n \to \infty$. It is interesting to observe that although $\lim_{n\to\infty} E[E_n] = \lim_{n\to\infty} E[H_n] = 2$ (Table 2), $\lim_{n\to\infty} \tilde{E}[E_n/H_n] = \pi^2/3 - 2 \approx 1.28987 \neq 1$. In other words, although expectations of the individual variables approach the same value, we expect $E_n/H_n$ to be somewhat larger than 1 on average.
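The approach of the finite-$n$ approximation for $E_n/H_n$ to its limit can be examined numerically. The sketch below, with hypothetical function names, evaluates the closed form $n(2S_{2,n} n^2 - 2n^2 - n + 1)/(n-1)^3$ and, as a cross-check, the second-order ratio approximation applied to $E[E_n] = 2$, $E[H_n] = 2(n-1)/n$, $\mathrm{Var}[H_n] = 8(S_{2,n} - 1) - 4((n-1)/n)^2$, and $\mathrm{Cov}[H_n, E_n] = 4/n$.

```python
import math

def S2(n):
    """Partial sum S_{2,n} = sum_{k=1}^{n} 1/k^2."""
    return sum(1 / k**2 for k in range(1, n + 1))

def approx_mean_E_over_H(n):
    """Closed-form approximation for E[E_n/H_n]; note S_{2,n}, not S_{2,n-1}."""
    return n * (2 * S2(n) * n**2 - 2 * n**2 - n + 1) / (n - 1) ** 3

def approx_mean_from_moments(n):
    """The same quantity built from the moments of E_n and H_n."""
    mean_e, mean_h = 2.0, 2 * (n - 1) / n
    var_h = 8 * (S2(n) - 1) - 4 * ((n - 1) / n) ** 2
    cov_eh = 4 / n
    return mean_e / mean_h - cov_eh / mean_h**2 + var_h * mean_e / mean_h**3

limit = math.pi**2 / 3 - 2
for n in (5, 50, 500, 5000):
    print(n, approx_mean_E_over_H(n), approx_mean_from_moments(n), limit)
```

The two evaluations agree up to floating-point error, and both decrease toward the limit as $n$ grows.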

For each of the 10 pairs of variables among $\{H_n, L_n, E_n, I_n, B_n\}$, the approximate expectations from Table 4 are plotted in Fig. 2 together with the simulated values. Although some divergences are present for small $n$, the approximate and simulated values match closely.

The approximate expectations of ratios involving $T_k$ are shown in Fig. 3 as functions of $k$ for each of three values of $n$. $L_n$ is the fastest-growing variable according to the expression for its expectation (Table 2), and the graph for $\tilde{E}[L_n/T_k]$ is topmost in all three plots. As expectations of $H_n$ and $E_n$ are close (Table 2), the graphs for $\tilde{E}[H_n/T_k]$ and $\tilde{E}[E_n/T_k]$ are close in Fig. 3.

Variances of the ratios

The limits of approximations of variances of ratios are presented in Table 5. They behave similarly to the expectations in Table 4. Because $L_n$ and $I_n$ have expectations that grow without bound, for the ratios $H_n/L_n$, $H_n/I_n$, $E_n/L_n$, $B_n/L_n$, $E_n/I_n$, and $B_n/I_n$, which have $L_n$ or $I_n$ in the denominator, the limits of the variance approximations are 0. As $n$ grows, the denominators grow much faster than the numerators, and the values are therefore increasingly concentrated around 0. Hence, the variances also approach 0.

Because $L_n$ and $I_n$ are much larger than the coalescence times $T_k$, approximations to variances of $L_n/T_k$ and $I_n/T_k$ diverge to infinity as $n$ increases. Interestingly, however, the approximate variance of $L_n/I_n$, a ratio of two quantities with diverging expectations, approaches 0.

The variance approximations with finite nonzero limits are those for $H_n/T_k$, $E_n/H_n$, $E_n/T_k$, $B_n/H_n$, $B_n/E_n$, and $B_n/T_k$. All give ratios of two variables with finite expectation and variance as $n \to \infty$ (Table 2).

Figure 4 shows the expressions from Table 5 together with the simulated values. Compared to the plots of expectations of ratios (Fig. 2), differences between the simulated and approximate variances are prominent at small $n$. For the variances of $H_n/L_n$, $B_n/H_n$, and $B_n/L_n$, the simulated and approximate values differ substantially even as $n$ increases. Because the theoretical value of $\mathrm{Cov}[E_n, B_n]$ that contributes to the approximate variance of $B_n/E_n$ is itself an approximation, one of the larger differences between simulation and approximation occurs in the plot for $\widetilde{\mathrm{Var}}[B_n/E_n]$.

Figure 5 shows approximate variances of ratios involving $T_k$ for varying $k$, for each of three values of $n$. Qualitatively, the values for approximate variances behave similarly to expectations in Fig. 3: in particular, the vertical placement of the curves follows the same order. Our approximations to the variances of $L_n/T_k$ and $I_n/T_k$ grow fastest, as the numerators are typically large and the expected value of the denominator $T_k$ decreases as $k$ grows. Approximations to variances of $H_n/T_k$, $E_n/T_k$, and $B_n/T_k$ all display much slower growth; for these quantities, the expectations of numerators of the ratios are bounded above by 2 for all $n$.

Discussion

In this article, we have computed approximations to expected values and variances of ratios of various branch lengths under the standard coalescent model. We have considered all 15 possible pairs of variables in $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, a set of variables whose properties have been studied in detail individually. We have also assessed the accuracy of approximations to the expectation and variance by comparing them with values computed by simulation. We have observed that the approximate expressions behave in a way that matches mathematical intuition about the behavior of random variables associated with the branch lengths.

In plots of the various approximations, we have illustrated how the random variables relate to each other, both among $\{H_n, L_n, E_n, I_n, B_n\}$ (Figs. 2 and 4) and between pairs including one of $\{H_n, L_n, E_n, I_n, B_n\}$ along with $T_k$ (Figs. 3 and 5). As $n$ grows large, the ratios involving $L_n$ and $I_n$ have nearly identical behavior in the plots, an observation that is explained by the fact that internal branches take up increasingly large fractions of the total branch length. In the limit as $n \to \infty$, expectations of both $H_n$ and $E_n$ approach a constant value of 2 (Table 2), and $\mathrm{Cov}[H_n, E_n]$ approaches 0 (Table 3). However, we observed that $\lim_{n\to\infty} \tilde{E}[E_n/H_n] = \pi^2/3 - 2 \approx 1.28987$ is not equal to $\lim_{n\to\infty} E[E_n] / \lim_{n\to\infty} E[H_n] = 1$. For the ratio $B_n/E_n$, the approximation aligns with the naive prediction, $\lim_{n\to\infty} \tilde{E}[B_n/E_n] = \pi^2/6 - 1 = \lim_{n\to\infty} E[B_n] / \lim_{n\to\infty} E[E_n]$, even though $\mathrm{Cov}[E_n, B_n]$ is also zero in the limit (Table 3). For $B_n$ and $H_n$, which possess a high correlation, $\lim_{n\to\infty} \tilde{E}[B_n/H_n] = \pi^4/18 - \pi^2/6 - \zeta(3) - 2 \approx 0.56463$, whereas $\lim_{n\to\infty} E[B_n] / \lim_{n\to\infty} E[H_n] = \pi^2/6 - 1 \approx 0.64493$.
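The contrast between the naive ratio of limiting expectations and the limiting Taylor approximation can be reproduced from the limiting moments alone. The sketch below is illustrative: the dictionaries and names are assumptions, and the inputs are the limiting expectations, variances, and covariances reported for $H_n$, $E_n$, and $B_n$ (Tables 2 and 3).

```python
import math

ZETA3 = 1.2020569031595943  # Apery's constant, zeta(3)
PI2 = math.pi**2

def ratio_mean_approx(mean_x, mean_y, var_y, cov_xy):
    """Second-order Taylor approximation to E[X/Y]."""
    return mean_x / mean_y - cov_xy / mean_y**2 + var_y * mean_x / mean_y**3

# Limiting moments as n -> infinity.
mean = {"H": 2.0, "E": 2.0, "B": PI2 / 3 - 2}
var = {"H": 4 * PI2 / 3 - 12, "E": 0.0}
cov = {("E", "H"): 0.0,
       ("B", "E"): 0.0,
       ("B", "H"): -2 * PI2 + 4 * ZETA3 + 16}

results = {}
for (x, y), c in cov.items():
    naive = mean[x] / mean[y]
    taylor = ratio_mean_approx(mean[x], mean[y], var[y], c)
    results[(x, y)] = (naive, taylor)
    print(f"{x}/{y}: naive {naive:.5f}, Taylor {taylor:.5f}")
```

For $B_n/E_n$ the two values coincide because both the limiting variance of $E_n$ and the limiting covariance vanish; for $E_n/H_n$ and $B_n/H_n$ the nonzero limiting variance of $H_n$ shifts the approximation away from the naive ratio.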

Previously, we evaluated covariances and correlation coefficients under the coalescent model for the pairs of variables that we consider here, obtaining exact covariances and correlations for 13 of 15 pairs and approximations for the other two. We obtained limiting expressions for these covariances and correlations as $n \to \infty$. The approximate values that we have provided here for expectations and variances of ratios make use of these previous results concerning covariances, adding to the understanding of the properties of joint distributions of pairs of genealogical variables in coalescent theory.

Many statistical tests of population-genetic models rely on a model prediction of an equivalence between two quantities, framed as a null hypothesis that a test statistic equals a particular value. The prediction is often formulated as a null hypothesis that a difference between two quantities equals 0 or that their ratio equals a null value such as 1. In coalescent theory, tests that evaluate site-frequency spectra for agreement with predictions of coalescent models tend to use differences or other linear combinations (Zeng et al. 2006; Achaz 2009; Ferretti et al. 2010, 2017; Ronen et al. 2013; Fu 2022). However, several modeling studies and inference procedures in coalescent theory do emphasize ratios (Slatkin 1996; Uyenoyama 1997; Schierup and Hein 2000; Rosenberg and Hirsh 2003; Eldon 2011; Arbisser et al. 2018), as do some test statistics (Schlötterer 2002; Lohse and Kelleher 2009). Widely used tests in the area of molecular evolution, such as tests of the relative count of nonsynonymous and synonymous substitutions and the McDonald–Kreitman test of polymorphism and divergence, also make use of ratios (Yang 2014).

The choice of a difference or a ratio in formulating a test statistic can rely on several factors. Ratios are unitless, so that their values do not depend on conventions chosen during computation (e.g. scaling time in units of $N$ or $2N$). Ratios might take values in a prescribed range that can be simply interpreted, such as the range of the coalescent ratio $H_n/L_n$ from $1/n$ to $1/2$ (Arbisser et al. 2018). However, the statistical properties of random variables formulated as differences are generally easier to compute from the properties of the separate random variables whose difference is taken than are the properties of corresponding statistics formulated as ratios. In general, corresponding differences and ratios in coalescent theory have not been formally compared for features such as their power to reject the null hypothesis when processes such as natural selection or population or species divergence affect the shapes of evolutionary trees. Our work to obtain approximate expectations and variances of ratios can augment understanding of scenarios in which coalescent ratios are considered, and it can assist in evaluating the relative utility of difference-based and ratio-based statistics.
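The unitless property and the bounded range of the ratio of tree height to tree length can be illustrated with a short simulation. This is a minimal sketch, assuming the standard coalescent representation $H_n = \sum_{k=2}^{n} T_k$ and $L_n = \sum_{k=2}^{n} k T_k$ with independent exponential $T_k$; the function names and the `scale` parameter are illustrative.

```python
import random

def coalescence_times(n, rng, scale=1.0):
    """Draw T_2, ..., T_n; T_k is exponential with mean scale * 2/(k(k-1))."""
    return {k: scale * rng.expovariate(k * (k - 1) / 2) for k in range(2, n + 1)}

def height_over_length(times):
    """H_n / L_n, with H_n = sum of T_k and L_n = sum of k * T_k."""
    height = sum(times.values())
    length = sum(k * t for k, t in times.items())
    return height / length

n = 10
rng = random.Random(1)
ratios = [height_over_length(coalescence_times(n, rng)) for _ in range(10_000)]
print(min(ratios), max(ratios))  # every value should lie between 1/n and 1/2

# Same random draws under two time scalings (e.g. units of N versus 2N).
rng_a, rng_b = random.Random(7), random.Random(7)
r_a = height_over_length(coalescence_times(n, rng_a, scale=1.0))
r_b = height_over_length(coalescence_times(n, rng_b, scale=2.0))
print(r_a, r_b)  # the ratio is unchanged by the choice of time unit
```

The bounds follow because each $T_k$ is weighted by $k$ in $L_n$, with $2 \le k \le n$, so that $2H_n \le L_n \le nH_n$.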

We have found that approximations for fixed $n$ and in the limit as $n \to \infty$ are quite accurate in predicting the expected values seen in coalescent simulations of the ratios (Fig. 2). For the variances, the approximations are generally less accurate, although in most cases, graphs of the approximations and simulated values have similar shape (Fig. 4). These approximations are obtained from a Taylor approximation for the variance of a ratio (equation 4), and higher-order approximations of this variance could potentially be applied by use of Taylor’s theorem; as the order of the approximation increases, however, the complexity of the resulting formula also increases. For those variances for which the approximation and simulation are not close in Fig. 4, we advise caution in using the variances in settings in which a precise approximation is needed.
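A comparison of this kind can be sketched for the single pair $(H_n, L_n)$, whose moments follow directly from the independence of the coalescence times. The code below is an illustration rather than the procedure used for the figures: it assumes the standard first-order Taylor form for the variance of a ratio and the second-order form for its mean, draws coalescence times directly instead of running ms, and uses hypothetical function names.

```python
import random

def moments_h_l(n):
    """Exact means, variances, and covariance of H_n and L_n."""
    # E[T_k] = 2/(k(k-1)); for an exponential variable, Var[T_k] = E[T_k]^2.
    means = {k: 2 / (k * (k - 1)) for k in range(2, n + 1)}
    mu_h = sum(means.values())
    mu_l = sum(k * m for k, m in means.items())
    var_h = sum(m**2 for m in means.values())
    var_l = sum((k * m)**2 for k, m in means.items())
    cov = sum(k * m**2 for k, m in means.items())
    return mu_h, mu_l, var_h, var_l, cov

def ratio_taylor(mu_x, mu_y, var_x, var_y, cov):
    """Assumed Taylor approximations to the mean and variance of X/Y."""
    mean = mu_x / mu_y - cov / mu_y**2 + var_y * mu_x / mu_y**3
    var = (mu_x / mu_y)**2 * (
        var_x / mu_x**2 - 2 * cov / (mu_x * mu_y) + var_y / mu_y**2
    )
    return mean, var

def simulate_ratio(n, reps, seed=0):
    """Sample mean and variance of H_n/L_n over simulated genealogies."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        t = {k: rng.expovariate(k * (k - 1) / 2) for k in range(2, n + 1)}
        ratios.append(sum(t.values()) / sum(k * v for k, v in t.items()))
    mean = sum(ratios) / reps
    var = sum((r - mean)**2 for r in ratios) / (reps - 1)
    return mean, var

n = 10
approx_mean, approx_var = ratio_taylor(*moments_h_l(n))
sim_mean, sim_var = simulate_ratio(n, reps=20_000)
print(f"mean: Taylor {approx_mean:.4f}, simulated {sim_mean:.4f}")
print(f"var:  Taylor {approx_var:.5f}, simulated {sim_var:.5f}")
```

Repeating the comparison over a range of $n$ gives a rough picture of where the variance approximation tracks the simulated values and where it does not.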

Data Availability

The ms command for simulations is ms n 100000 -T (100,000 replicates), where $n$ is taken from $\{2, 3, \ldots, 50\}$ and gives the number of leaves of the simulated trees.
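The full set of simulation commands can be generated programmatically. The following sketch builds one ms invocation per value of $n$ and, only when called explicitly, runs them and saves the trees. The helper names, the output directory, and the output file names are hypothetical, and the script assumes that the ms executable is installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path

def ms_command(n, reps=100_000):
    """Argument list for `ms n reps -T`; -T makes ms print the gene trees."""
    return ["ms", str(n), str(reps), "-T"]

commands = {n: ms_command(n) for n in range(2, 51)}
print(" ".join(commands[2]))  # ms 2 100000 -T

def run_all(out_dir="ms_trees"):
    """Run every command, writing trees to out_dir/trees_n<n>.txt (hypothetical names)."""
    if shutil.which("ms") is None:
        raise RuntimeError("ms executable not found on PATH")
    Path(out_dir).mkdir(exist_ok=True)
    for n, cmd in commands.items():
        with open(Path(out_dir) / f"trees_n{n}.txt", "w") as out:
            subprocess.run(cmd, stdout=out, check=True)
```

Calling `run_all()` performs the simulations; it is left uncalled here because the complete set of runs is large.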

Funding

The authors acknowledge support from NIH grants R01 GM131404 and R01 HG005855 and NSF grant BCS-2116322.

Conflicts of interest

None declared.

Literature cited

Achaz G. Frequency spectrum neutrality tests: one for all and all for one. Genetics. 2009;183(1):249–258.

Alimpiev E, Rosenberg NA. A compendium of covariances and correlation coefficients of coalescent tree properties. Theor Popul Biol. 2022;143:1–13.

Arbisser IM, Jewett EM, Rosenberg NA. On the joint distribution of tree height and tree length under the coalescent. Theor Popul Biol. 2018;122:46–56.

Elandt-Johnson RC, Johnson NL. Survival Models and Data Analysis. New York (NY): Wiley; 1999.

Eldon B. Estimation of parameters in large offspring number models and ratios of coalescence times. Theor Popul Biol. 2011;80(1):16–28.

Ferretti L, Ledda A, Wiehe T, Achaz G, Ramos-Onsins SE. Decomposing the site frequency spectrum: the impact of tree topology on neutrality tests. Genetics. 2017;207(1):229–240.

Ferretti L, Perez-Enciso M, Ramos-Onsins S. Optimal neutrality tests based on the frequency spectrum. Genetics. 2010;186(1):353–365.

Fu YX. Variances and covariances of linear summary statistics of segregating sites. Theor Popul Biol. 2022;145:95–108.

Fu YX, Li WH. Statistical tests of neutrality of mutations. Genetics. 1993;133(3):693–709.

Hein J, Schierup M, Wiuf C. Gene Genealogies, Variation and Evolution. Oxford: Oxford University Press; 2005.

Hudson RR. Testing the constant-rate neutral allele model with protein sequence data. Evolution. 1983;37(1):203–217.

Hudson RR. Gene genealogies and the coalescent process. Oxford Surv Evol Biol. 1990;7:1–44.

Hudson RR. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics. 2002;18(2):337–338.

Kingman JFC. On the genealogy of large populations. J Appl Probab. 1982;19(A):27–43.

Lohse K, Kelleher J. Measuring the degree of starshape in genealogies – summary statistics and demographic inference. Genet Res (Camb). 2009;91(4):281–292.

Ronen R, Udpa N, Halperin E, Bafna V. Learning natural selection from the site frequency spectrum. Genetics. 2013;195(1):181–193.

Rosenberg NA, Feldman MW. The relationship between coalescence times and population divergence times. In: Slatkin M, Veuille M, editors. Modern Developments in Theoretical Population Genetics. Oxford: Oxford University Press; 2002. Chapter 9, p. 130–164.

Rosenberg NA, Hirsh AE. On the use of star-shaped genealogies in inference of coalescence times. Genetics. 2003;164(4):1677–1682.

Schierup MH, Hein J. Consequences of recombination on traditional phylogenetic analysis. Genetics. 2000;156(2):879–891.

Schlötterer C. A microsatellite-based multilocus screen for the identification of local selective sweeps. Genetics. 2002;160(2):753–763.

Slatkin M. Gene genealogies within mutant allelic classes. Genetics. 1996;143(1):579–587.

Stuart A, Ord JK. Kendall’s Advanced Theory of Statistics, Volume 1, Distribution Theory. 6th ed. Chichester: Wiley; 1994.

Tajima F. Evolutionary relationship of DNA sequences in finite populations. Genetics. 1983;105(2):437–460.

Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123(3):585–595.

Tavaré S, Balding DJ, Griffiths RC, Donnelly P. Inferring coalescence times from DNA sequence data. Genetics. 1997;145(2):505–518.

Uyenoyama MK. Genealogical structure among alleles regulating self-incompatibility in natural populations of flowering plants. Genetics. 1997;147(3):1389–1400.

Wakeley J. Coalescent Theory. Greenwood Village (CO): Roberts & Company; 2009.

Yang Z. Molecular Evolution: A Statistical Approach. Oxford: Oxford University Press; 2014.

Zeng K, Fu YX, Shi S, Wu CI. Statistical tests for detecting positive selection by utilizing high-frequency variants. Genetics. 2006;174(3):1431–1439.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Editor: R Hernandez