Egor Lappo, Noah A Rosenberg, Approximations to the expectations and variances of ratios of tree properties under the coalescent, G3 Genes|Genomes|Genetics, Volume 12, Issue 10, October 2022, jkac205, https://doi.org/10.1093/g3journal/jkac205
Abstract
Properties of gene genealogies such as tree height (H), total branch length (L), total lengths of external (E) and internal (I) branches, mean length of basal branches (B), and the underlying coalescence times (T) can be used to study population-genetic processes and to develop statistical tests of population-genetic models. Uses of tree features in statistical tests often rely on predictions that depend on pairwise relationships among such features. For genealogies under the coalescent, we provide exact expressions for Taylor approximations to expected values and variances of ratios $X_n/Y_n$, for all 15 pairs among the variables $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, considering n leaves and $2 \le k \le n$. For expected values of the ratios, the approximations match closely with empirical simulation-based values. The approximations to the variances are not as accurate, but they generally match simulations in their trends as n increases. Although En has expectation 2 and Hn has expectation 2 in the limit as $n \to \infty$, the approximation to the limiting expectation for $E_n/H_n$ is not 1, instead equaling $\pi^2/3 - 2 \approx 1.28987$. The new approximations augment fundamental results in coalescent theory on the shapes of genealogical trees.
Introduction
Coalescent theory models random genealogies conditional on assumptions about the evolutionary process (Hein et al. 2005; Wakeley 2009). In coalescent theory, a gene genealogy is a tree or network structure that represents a random draw from a coalescent model.
Genealogies in coalescent theory can be summarized using a variety of quantities. For example, for random tree-like genealogies with n lineages, the tree height Hn records the sum of branch lengths on a path from a leaf to the root, and the tree length Ln sums all branch lengths in the tree. The total length En of external branches sums over leaves the lengths of paths from leaves to their nearest internal nodes, and the total length of internal branches, In, sums the lengths of all remaining branches.
Studies in coalescent theory have often investigated the properties of tree summaries conditional on assumptions of coalescent models, with the goal of understanding how shapes of the genealogies relate to processes such as population growth and migration (e.g. Slatkin 1996; Rosenberg and Feldman 2002). Because mutations can be viewed as occurring conditionally on underlying genealogies (Hudson 1990), features of genealogical shape affect the patterns of genetic variation produced by coalescent models that permit mutation. Thus, the understanding of summaries of tree shape predicted by coalescent models is a component of the interpretation of patterns of genetic variation in relation to evolutionary processes.
Initial results concerning summaries of genealogical shape focused on single quantities, producing results on quantities such as Hn and Ln (Kingman 1982; Hudson 1983, 1990; Tajima 1983). Studies soon examined the information that resides in the relationships between pairs of summaries; genetic variation statistics such as those of Tajima (1989) and Fu and Li (1993) can be viewed as assessing whether or not one aspect of a tree contains long branches in relation to another.
Recently, Arbisser et al. (2018) performed a detailed investigation of the relationship between Hn and Ln under coalescent models. They studied the mathematical relationship between these two quantities, computing the covariance and correlation coefficient of Hn and Ln under a standard coalescent model with a constant-sized population. Extending the work of Arbisser et al. (2018) on Hn and Ln, we (Alimpiev and Rosenberg 2022) reported covariances and correlations for all pairs of variables among $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, where Bn is the mean of the lengths of the two basal branches of a genealogy and Tk is the coalescence time from k to k − 1 lineages, $2 \le k \le n$. Our compendium in Tables 1 and 2 of Alimpiev and Rosenberg (2022) summarizes pairwise relationships for several of the most commonly used features of coalescent tree shape, recording both new and previously known results.
Table 1. Definitions of tree properties.

Variable | Definition |
---|---|
Hn | $\sum_{k=2}^{n} T_k$ |
Ln | $\sum_{k=2}^{n} k T_k$ |
En | |
In | $L_n - E_n$ |
Bn | |
Here, Tk is the random variable representing the coalescence time from k to k—1 lineages, and is the (random) length of the ith external branch of a tree with n leaves. We define Hn, Ln, and En for , In for , and Bn for . The expression for Bn follows a form that incorporates terms associated with all of its contributing branches, following p. 1400 of Uyenoyama (1997) and Section 2.6 of Alimpiev and Rosenberg (2022), and it can be simplified to .
Table 2. Expectations and variances of tree properties.

Xn | $E[X_n]$ | $\lim_{n\to\infty} E[X_n]$ | $\mathrm{Var}[X_n]$ | $\lim_{n\to\infty} \mathrm{Var}[X_n]$ |
---|---|---|---|---|
Hn | | 2 | | |
Ln | | $\infty$ | | |
En | 2 | 2 | | 0 |
In | | $\infty$ | | |
Bn | | | | |
Tk | | | | |
These expressions can be found in Alimpiev and Rosenberg (2022). Note that for Ln and In, although the limiting variance is finite, the expectation is infinite (Tavaré et al. 1997; Wakeley 2009, p. 76).
In addition to computing the covariance and correlation coefficient of Hn and Ln, Arbisser et al. (2018) also found approximations to the expectation and variance of the ratio $H_n/L_n$ under the coalescent model. This ratio gives a summary of the joint distribution of Hn and Ln that characterizes the relative magnitudes of the variables, a feature not captured by their covariance or correlation. Arbisser et al. (2018) found that although the approximation to $\mathrm{Var}[H_n/L_n]$ differed noticeably from the exact value, as obtained by numerical integration and simulations of the coalescent model, the approximation to $E[H_n/L_n]$ was quite accurate.
In this article, we extend the work of Arbisser et al. (2018) to compute approximations to the expectations and variances for ratios of the 14 remaining pairs among $\{H_n, L_n, E_n, I_n, B_n, T_k\}$. For the expectation and variance of coalescent ratios, the study extends Arbisser et al. (2018) in the same way that Alimpiev and Rosenberg (2022) extended it for the covariance and correlation coefficient.
Materials and methods
Tree variables
We work with a haploid population of constant size N that follows a standard coalescent model. Time is measured in units of N generations. In this section, we recall the definitions of the coalescence time Tk and tree properties Hn, Ln, En, In, and Bn for sample size $n \ge 2$ and $2 \le k \le n$.
The tree properties Hn, Ln, En, In, and Bn are defined in terms of the Tk. Visual depictions of these properties appear in Fig. 1, and mathematical definitions of these quantities appear in Table 1.

Fig. 1. Properties of genealogical trees. The tree height is Hn. The sum of the lengths of all branches is Ln. External branches have total length En (green). Internal branches have total length In (orange). Basal branches have mean length Bn (blue).
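The five tree properties can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: it assumes the standard Kingman coalescent in the time units stated above, so that with k lineages the waiting time Tk is exponential with rate k(k−1)/2 and a uniformly chosen pair of lineages coalesces. The function name `simulate_tree_summaries` is hypothetical.

```python
import random

def simulate_tree_summaries(n, rng):
    """Simulate one coalescent tree with n >= 2 leaves; return (H, L, E, I, B)."""
    # Each active lineage is stored as (time at which its branch starts, is_leaf).
    lineages = [(0.0, True)] * n
    now = 0.0        # time elapsed backward from the present
    total = 0.0      # L_n: total branch length
    external = 0.0   # E_n: total external branch length
    basal = 0.0      # sum of the lengths of the two basal branches
    for k in range(n, 1, -1):
        # Assumption: T_k is exponential with rate k(k-1)/2 (time in units of N).
        t_k = rng.expovariate(k * (k - 1) / 2)
        now += t_k
        total += k * t_k
        # A uniformly chosen pair of lineages coalesces; pop the larger index first.
        i, j = sorted(rng.sample(range(k), 2), reverse=True)
        for idx in (i, j):
            start, is_leaf = lineages.pop(idx)
            if is_leaf:
                external += now - start
            if k == 2:
                basal += now - start  # the final two branches meet at the root
        lineages.append((now, False))
    height = now
    return height, total, external, total - external, basal / 2

rng = random.Random(1)
reps = 20000
sums = [0.0] * 5
for _ in range(reps):
    for idx, value in enumerate(simulate_tree_summaries(10, rng)):
        sums[idx] += value
# Sample means of H, L, E, I, B for n = 10
print([round(s / reps, 3) for s in sums])
```

Under these assumptions, the sample means of the height and external length should sit near 2(1 − 1/n) and 2, respectively.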
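We use partial sums of the form $\sum_{i=1}^{n} 1/i^m$ as a useful shorthand. The limit of such a sum as $n \to \infty$ is the Riemann zeta function, usually denoted $\zeta(m)$. In particular, the sum with $m = 1$ diverges, $\zeta(2) = \pi^2/6$, and $\zeta(3)$ is Apéry's constant, approximately 1.20206.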
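A short numerical check illustrates the behavior of these sums. This is a sketch under the assumption that the shorthand is the partial sum of 1/i^m over i from 1 to n; the helper name `harmonic_sum` is hypothetical.

```python
import math

def harmonic_sum(n, m):
    """Partial sum of 1 / i**m for i = 1, ..., n (hypothetical helper name)."""
    return sum(1.0 / i**m for i in range(1, n + 1))

for n in (10, 1000, 100000):
    # m = 1 keeps growing (roughly like log n); m = 2 and m = 3 settle quickly.
    print(n, harmonic_sum(n, 1), harmonic_sum(n, 2), harmonic_sum(n, 3))

print("pi^2 / 6 =", math.pi**2 / 6)
```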
Taylor approximations to expectations and variances of ratios
We use $\tilde{E}[X/Y]$ and $\widetilde{\mathrm{Var}}[X/Y]$ to denote the approximations from equations (3) and (4). For both the expectation and the variance, we also take the limit of the approximations as $n \to \infty$.
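For reference, the textbook Taylor (delta-method) approximations for a ratio of random variables X and Y with E[Y] ≠ 0 are sketched below. It is an assumption, not something confirmed by the text above, that these are the forms to which equations (3) and (4) refer; they do reproduce the limiting value $\pi^2/3 - 2$ quoted in the Abstract for $E_n/H_n$.

```latex
\begin{align*}
\tilde{E}\!\left[\frac{X}{Y}\right]
  &= \frac{E[X]}{E[Y]} - \frac{\mathrm{Cov}[X,Y]}{E[Y]^2}
     + \frac{\mathrm{Var}[Y]\,E[X]}{E[Y]^3}, \\
\widetilde{\mathrm{Var}}\!\left[\frac{X}{Y}\right]
  &= \frac{E[X]^2}{E[Y]^2}
     \left( \frac{\mathrm{Var}[X]}{E[X]^2}
          - \frac{2\,\mathrm{Cov}[X,Y]}{E[X]\,E[Y]}
          + \frac{\mathrm{Var}[Y]}{E[Y]^2} \right).
\end{align*}
```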
Exact expectations, variances, and covariances of tree properties
Expected values and variances of variables Hn, Ln, En, In, Bn, and Tk that are used in equations (3) and (4) are known, in many cases, from early studies in coalescent theory (Fu and Li 1993; Tavaré et al. 1997; Wakeley 2009). We summarize these expectations and variances in Table 2.
The covariances compiled by Alimpiev and Rosenberg (2022) appear in Table 3. In the case of pairs (En, Bn) and (In, Bn), the covariances are approximate, as described by Alimpiev and Rosenberg (2022).
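For the quantities that depend only on the coalescence times, the moments follow directly from the independence of the Tk. The sketch below assumes that Tk is exponential with rate k(k−1)/2 (time in units of N generations), that Hn is the sum of the Tk, and that Ln is the sum of the k·Tk; all function names are hypothetical. Moments involving En, In, and Bn depend on tree topology and are not covered here.

```python
def tk_mean(k):
    """E[T_k] for an exponential waiting time with rate k(k-1)/2."""
    return 2.0 / (k * (k - 1))

def tk_var(k):
    """Var[T_k]; for an exponential variable the variance is the squared mean."""
    return tk_mean(k) ** 2

def height_moments(n):
    """Mean and variance of H_n = sum_{k=2}^{n} T_k."""
    ks = range(2, n + 1)
    return sum(tk_mean(k) for k in ks), sum(tk_var(k) for k in ks)

def length_moments(n):
    """Mean and variance of L_n = sum_{k=2}^{n} k T_k."""
    ks = range(2, n + 1)
    return sum(k * tk_mean(k) for k in ks), sum(k * k * tk_var(k) for k in ks)

def height_length_cov(n):
    """Cov[H_n, L_n]; by independence of the T_k only the terms k Var[T_k] remain."""
    return sum(k * tk_var(k) for k in range(2, n + 1))

for n in (2, 10, 100):
    print(n, height_moments(n), length_moments(n), height_length_cov(n))
```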
Table 3. Covariances of pairs of tree properties.

(Xn, Yn) | $\mathrm{Cov}[X_n, Y_n]$ | $\lim_{n\to\infty} \mathrm{Cov}[X_n, Y_n]$ |
---|---|---|
Hn, Tk | | |
Hn, Ln | | |
Hn, En | | 0 |
Hn, In | | |
Hn, Bn | | |
Ln, Tk | | |
Ln, En | | 0 |
Ln, In | | |
Ln, Bn | | |
En, Tk | | 0 |
En, In | | 0 |
En, Bn | | 0 |
In, Tk | | |
In, Bn | | |
Bn, Tk | | |
For pairs involving En or In, expressions apply for ; expressions involving Bn apply for . The expressions can be found in Alimpiev and Rosenberg (2022).
Evaluating the approximations
For each of 15 pairs of random variables, considering Hn, Ln, En, In, and Bn as well as Tk, we substitute expressions from Tables 2 and 3 into equations (3) and (4) to obtain approximate expectations and variances for ratios of pairs of variables. For each pair, we choose one variable for the numerator and the other for the denominator; approximate expectations and variances for the reciprocals can be obtained similarly. We present the approximations in Tables 4 and 5, and we plot them in Figs. 2–5.
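The substitution step can be sketched for the pair (Hn, Ln), for which every required moment depends only on the coalescence times. The code assumes the standard second-order Taylor forms for the expectation and variance of a ratio (taken here to correspond to equations (3) and (4)) and independent exponential Tk with rate k(k−1)/2; the function names are hypothetical.

```python
import math

def ratio_mean_approx(mean_x, mean_y, var_y, cov_xy):
    """Second-order Taylor approximation to E[X/Y]."""
    return mean_x / mean_y - cov_xy / mean_y**2 + var_y * mean_x / mean_y**3

def ratio_var_approx(mean_x, mean_y, var_x, var_y, cov_xy):
    """First-order Taylor approximation to Var[X/Y]."""
    return (mean_x / mean_y) ** 2 * (
        var_x / mean_x**2 - 2 * cov_xy / (mean_x * mean_y) + var_y / mean_y**2
    )

def height_length_moments(n):
    """Moments of H_n = sum T_k and L_n = sum k T_k for independent T_k."""
    ks = range(2, n + 1)
    var_t = {k: (2.0 / (k * (k - 1))) ** 2 for k in ks}
    mean_h = sum(2.0 / (k * (k - 1)) for k in ks)
    mean_l = sum(2.0 / (k - 1) for k in ks)
    var_h = sum(var_t[k] for k in ks)
    var_l = sum(k * k * var_t[k] for k in ks)
    cov_hl = sum(k * var_t[k] for k in ks)
    return mean_h, mean_l, var_h, var_l, cov_hl

for n in (5, 10, 50):
    mh, ml, vh, vl, c = height_length_moments(n)
    print(n, ratio_mean_approx(mh, ml, vl, c), ratio_var_approx(mh, ml, vh, vl, c))

# Limiting inputs for E_n/H_n: both means equal 2, the covariance is 0,
# and Var[H_n] tends to 4*pi^2/3 - 12.
print(ratio_mean_approx(2.0, 2.0, 4 * math.pi**2 / 3 - 12, 0.0))
```

Swapping the roles of the two variables in the function arguments gives the approximation for the reciprocal ratio.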

Fig. 2. Simulated and theoretical approximations of expectations of ratios of pairs of variables, plotted as functions of sample size n. Expressions for theoretical values are taken from Table 4.
Fig. 3. Theoretical approximations $\tilde{E}[X/T_k]$ for variables X in $\{H_n, L_n, E_n, I_n, B_n\}$, plotted as functions of k for n = 10, n = 20, and n = 50. The expressions plotted are taken from Table 4.

Fig. 4. Simulated and theoretical approximations of variances of ratios of pairs of variables, plotted as functions of sample size n. Expressions for theoretical values are taken from Table 5.
Fig. 5. Theoretical approximations $\widetilde{\mathrm{Var}}[X/T_k]$ for variables X in $\{H_n, L_n, E_n, I_n, B_n\}$, plotted as functions of k for n = 10, n = 20, and n = 50. The expressions plotted are taken from Table 5.
Table 4. Approximate expectations of ratios.

(Xn, Yn) | $\tilde{E}[X_n/Y_n]$ | $\lim_{n\to\infty} \tilde{E}[X_n/Y_n]$ |
---|---|---|
Hn, Tk | | |
Hn, Ln | | 0 |
En, Hn | | $\pi^2/3 - 2$ |
Hn, In | | 0 |
Bn, Hn | | |
Ln, Tk | | $\infty$ |
En, Ln | | 0 |
Ln, In | | 1 |
Bn, Ln | | 0 |
En, Tk | | |
En, In | | 0 |
Bn, En | | |
In, Tk | | $\infty$ |
Bn, In | | 0 |
Bn, Tk | | |
Expressions involving En or In apply for ; expressions involving Bn apply for . The value for (Hn, Ln) follows equation 15 of Arbisser et al. (2018). The expressions are obtained using equation 3 and Tables 2 and 3.
Table 5. Approximate variances of ratios.

(Xn, Yn) | $\widetilde{\mathrm{Var}}[X_n/Y_n]$ | $\lim_{n\to\infty} \widetilde{\mathrm{Var}}[X_n/Y_n]$ |
---|---|---|
Hn, Tk | | |
Hn, Ln | | 0 |
En, Hn | | |
Hn, In | | 0 |
Bn, Hn | | |
Ln, Tk | | $\infty$ |
En, Ln | | 0 |
Ln, In | | 0 |
Bn, Ln | | 0 |
En, Tk | | |
En, In | | 0 |
Bn, En | | |
In, Tk | | $\infty$ |
Bn, In | | 0 |
Bn, Tk | | |
Expressions involving En or In apply for ; expressions involving Bn apply for . The value for (Hn, Ln) follows equation 18 of Arbisser et al. (2018). The expressions are obtained using equation 4 and Tables 2 and 3.
For pairs (Xn, Yn), we simulate the values of $E[X_n/Y_n]$ and $\mathrm{Var}[X_n/Y_n]$ under the coalescent model using ms (Hudson 2002), performing 100,000 replicate simulations for each tree size n. We plot the simulated values alongside the approximate values from Tables 4 and 5 in Figs. 2 and 4.
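The comparison between simulation and approximation can be sketched for the pair (Hn, Ln), whose values depend only on the coalescence times. This is a stand-in for illustration, not the ms-based procedure: it draws the Tk directly as independent exponentials with rate k(k−1)/2, uses fewer replicates than the 100,000 used in the study, and assumes the standard second-order Taylor forms for the approximations. All function names are hypothetical.

```python
import random

def simulate_h_over_l(n, reps, seed=0):
    """Draw H_n / L_n for `reps` independent sets of coalescence times."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        h = l = 0.0
        for k in range(2, n + 1):
            t_k = rng.expovariate(k * (k - 1) / 2)
            h += t_k
            l += k * t_k
        ratios.append(h / l)
    return ratios

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

def taylor_h_over_l(n):
    """Taylor approximations to E[H_n/L_n] and Var[H_n/L_n] from exact moments."""
    mh = ml = vh = vl = c = 0.0
    for k in range(2, n + 1):
        mean_t = 2.0 / (k * (k - 1))
        var_t = mean_t**2
        mh, ml = mh + mean_t, ml + k * mean_t
        vh, vl, c = vh + var_t, vl + k * k * var_t, c + k * var_t
    mean_approx = mh / ml - c / ml**2 + vl * mh / ml**3
    var_approx = (mh / ml) ** 2 * (vh / mh**2 - 2 * c / (mh * ml) + vl / ml**2)
    return mean_approx, var_approx

n, reps = 10, 20000
ratios = simulate_h_over_l(n, reps, seed=1)
sim_mean, sim_var = mean_and_variance(ratios)
approx_mean, approx_var = taylor_h_over_l(n)
print("simulated:", sim_mean, sim_var)
print("approximate:", approx_mean, approx_var)
```

Ratios involving En, In, or Bn require the tree topology as well as the coalescence times, so they need a full genealogy simulator such as ms.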
Results
Expectations of the ratios
The approximate expected values in Table 4, as approximations of ratios, have the form of rational functions. As n grows, the approximate expectations of $H_n/L_n$, $H_n/I_n$, $E_n/L_n$, $E_n/I_n$, $B_n/L_n$, and $B_n/I_n$ approach 0. This behavior is sensible when considering the properties of the coalescent model: in the numerators, En has expectation 2 and Hn and Bn have bounded expectation in the limit as $n \to \infty$; in the denominators, Ln and In have expectations that grow without bound (Table 2). Similarly, approximate expectations of ratios $L_n/T_k$ or $I_n/T_k$ with Ln and In in the numerator and Tk in the denominator grow to infinity as n increases. The approximation to $E[L_n/I_n]$ approaches 1 in the limit as $n \to \infty$: as the number of leaves in the tree grows, internal branches occupy an increasingly large fraction of the total branch length.
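For pairs of variables that both have finite expectation, the approximate expectations of their associated ratios ($H_n/T_k$, $E_n/H_n$, $B_n/H_n$, $E_n/T_k$, $B_n/E_n$, and $B_n/T_k$) also approach finite values in the limit as $n \to \infty$. It is interesting to observe that although $\lim_{n\to\infty} E[E_n] = \lim_{n\to\infty} E[H_n] = 2$ (Table 2), $\lim_{n\to\infty} \tilde{E}[E_n/H_n] = \pi^2/3 - 2 \approx 1.28987$. In other words, although expectations of the individual variables approach the same value, we expect $E_n/H_n$ to be somewhat larger than 1 on average.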
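The limiting value for $E_n/H_n$ can be traced through a short calculation. The sketch below assumes the standard second-order Taylor form for the expectation of a ratio, independent coalescence times with $\mathrm{Var}[T_k] = 4/[k^2(k-1)^2]$, limiting expectations of 2 for both variables, and a limiting covariance of 0.

```latex
\begin{align*}
\lim_{n\to\infty} \mathrm{Var}[H_n]
  &= \sum_{k=2}^{\infty} \frac{4}{k^2 (k-1)^2}
   = \frac{4\pi^2}{3} - 12 \approx 1.15947, \\
\lim_{n\to\infty} \tilde{E}\!\left[\frac{E_n}{H_n}\right]
  &= \frac{2}{2} - \frac{0}{2^2}
     + \frac{2\left(\frac{4\pi^2}{3} - 12\right)}{2^3}
   = \frac{\pi^2}{3} - 2 \approx 1.28987.
\end{align*}
```

The excess over 1 comes entirely from the variance of the denominator: the covariance term vanishes in the limit, but $\mathrm{Var}[H_n]$ does not.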
For each of the 10 pairs of variables among $\{H_n, L_n, E_n, I_n, B_n\}$, the approximate expectations from Table 4 are plotted in Fig. 2 together with the simulated values. Although some divergences are present for small n, the approximate and simulated values match closely.
The approximate ratios involving Tk are shown in Fig. 3 as functions of k for each of three values of n. Ln is the fastest-growing variable according to the expression for its expectation (Table 2), and the graph for $\tilde{E}[L_n/T_k]$ is topmost in all three plots. As expectations of Hn and En are close (Table 2), the graphs for $\tilde{E}[H_n/T_k]$ and $\tilde{E}[E_n/T_k]$ are close in Fig. 3.
Variances of the ratios
The limits of approximations of variances of ratios are presented in Table 5. They behave similarly to the expectations in Table 4. Because Ln and In have expectations that grow without bound, for the ratios with Ln or In in the denominator ($H_n/L_n$, $H_n/I_n$, $E_n/L_n$, $E_n/I_n$, $B_n/L_n$, and $B_n/I_n$), the limits of the variance approximations are 0. As n grows, the denominators grow much faster than the numerators, and the values are therefore increasingly concentrated around 0. Hence, the variances also approach 0.
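The slow decay toward 0 can be illustrated numerically for $H_n/L_n$. This is a sketch assuming the standard first-order Taylor form for the variance of a ratio and independent exponential Tk with rate k(k−1)/2; the function name `variance_approx_h_over_l` is hypothetical.

```python
def variance_approx_h_over_l(n):
    """Taylor approximation to Var[H_n / L_n] computed from exact moments."""
    mean_h = mean_l = var_h = var_l = cov_hl = 0.0
    for k in range(2, n + 1):
        mean_t = 2.0 / (k * (k - 1))   # E[T_k]
        var_t = mean_t**2              # Var[T_k] for an exponential variable
        mean_h += mean_t
        mean_l += k * mean_t
        var_h += var_t
        var_l += k * k * var_t
        cov_hl += k * var_t
    bracket = (var_h / mean_h**2
               - 2 * cov_hl / (mean_h * mean_l)
               + var_l / mean_l**2)
    return (mean_h / mean_l) ** 2 * bracket

for n in (10, 100, 1000, 10000):
    print(n, variance_approx_h_over_l(n))
```

Because the expectation of Ln grows only logarithmically in n, the approximation shrinks slowly even for large sample sizes.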
Because Ln and In are much larger than the coalescence times Tk, approximations to variances of $L_n/T_k$ and $I_n/T_k$ diverge to infinity as n increases. Interestingly, however, the approximate variance of $L_n/I_n$, a ratio of two quantities with diverging expectations, approaches 0.
The variance approximations with finite nonzero limits are those for $H_n/T_k$, $E_n/H_n$, $B_n/H_n$, $E_n/T_k$, $B_n/E_n$, and $B_n/T_k$. All give ratios of two variables with finite expectation and variance as $n \to \infty$ (Table 2).
Figure 4 shows the expressions from Table 5 together with the simulated values. Compared to the plots of expectations of ratios (Fig. 2), differences between the simulated and approximate variances are prominent at small n. For the variances of , and , the simulated and approximate values differ substantially even as n increases. Because the theoretical value of that contributes to the approximate variance of is itself an approximation, one of the larger differences between simulation and approximation occurs for the plot for .
Figure 5 shows variances of ratios involving Tk for varying k, for each of three values of n. Qualitatively, the values for approximate variances behave similarly to expectations in Fig. 3: in particular, the vertical placement of the curves follows the same order. Our approximations to the variances of $L_n/T_k$ and $I_n/T_k$ grow fastest, as the numerators are typically large and the expected value of the denominator Tk decreases as k grows. Approximations to variances of $H_n/T_k$, $E_n/T_k$, and $B_n/T_k$ all display much slower growth; for these quantities, the expectations of numerators of the ratios are bounded above by 2 for all n.
Discussion
In this article, we have computed approximations to expected values and variances of ratios of various branch lengths under the standard coalescent model. We have considered all 15 possible pairs of variables in $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, a set of variables whose properties have been studied in detail individually. We have also assessed the accuracy of approximations to the expectation and variance by comparing them with values computed by simulation. We have observed that the approximate expressions behave in a way that matches mathematical intuition about the behavior of random variables associated with the branch lengths.
In plots of the various approximations, we have illustrated how the random variables relate to each other, both among $\{H_n, L_n, E_n, I_n, B_n\}$ (Figs. 2 and 4) and between pairs including one of $\{H_n, L_n, E_n, I_n, B_n\}$ along with Tk (Figs. 3 and 5). As n grows large, the ratios involving Ln and In have nearly identical behavior in the plots, an observation that is explained by the fact that internal branches take up increasingly large fractions of the total branch length. In the limit as $n \to \infty$, expectations of both Hn and En approach a constant value of 2 (Table 2), and $\mathrm{Cov}[H_n, E_n]$ approaches 0 (Table 3). However, we observed that $\lim_{n\to\infty} \tilde{E}[E_n/H_n]$ is not equal to $\lim_{n\to\infty} E[E_n] / \lim_{n\to\infty} E[H_n]$. For the ratio , the approximation aligns with the naive prediction, , even though is also zero in the limit (Table 2). For Bn and Hn, which possess a high correlation, , whereas .
Previously, we evaluated covariances and correlation coefficients under the coalescent model for the pairs of variables that we consider here, obtaining exact covariances and correlations for 13 of 15 pairs and approximations for the other two. We obtained limiting expressions for these covariances and correlations as . The approximate values that we have provided here for expectations and variances of ratios make use of these previous results concerning covariances, adding to the understanding of the properties of joint distributions of pairs of genealogical variables in coalescent theory.
Many statistical tests of population-genetic models rely on a model prediction of an equivalence between two quantities, framed as a null hypothesis that a test statistic equals a particular value. The prediction is often formulated as a null hypothesis that a difference between two quantities equals 0 or that their ratio equals a null value such as 1. In coalescent theory, tests that evaluate site-frequency spectra for agreement with predictions of coalescent models tend to use differences or other linear combinations (Zeng et al. 2006; Achaz 2009; Ferretti et al. 2010, 2017; Ronen et al. 2013; Fu 2022). However, several modeling studies and inference procedures in coalescent theory do emphasize ratios (Slatkin 1996; Uyenoyama 1997; Schierup and Hein 2000; Rosenberg and Hirsh 2003; Eldon 2011; Arbisser et al. 2018), as do some test statistics (Schlötterer 2002; Lohse and Kelleher 2009). Widely used tests in the area of molecular evolution, such as tests of the relative count of nonsynonymous and synonymous substitutions and the McDonald–Kreitman test of polymorphism and divergence, also make use of ratios (Yang 2014).
The choice of a difference or a ratio in formulating a test statistic can rely on several factors. Ratios are unitless, so that their values do not depend on conventions chosen during computation (e.g. scaling time in units of N or 2N). Ratios might take values in a prescribed range that can be simply interpreted, such as the range of the coalescent ratio $H_n/L_n$ from $1/n$ to $1/2$ (Arbisser et al. 2018). However, the statistical properties of a difference are generally easier to compute from the properties of its two constituent random variables than are the properties of the corresponding ratio. In general, corresponding differences and ratios in coalescent theory have not been formally compared for features such as their power to reject the null hypothesis when processes such as natural selection or population or species divergence affect the shapes of evolutionary trees. Our work to obtain approximate expectations and variances of ratios can augment understanding of scenarios in which coalescent ratios are considered, and it can assist in evaluating the relative utility of difference-based and ratio-based statistics.
We have found that approximations for fixed n and in the limit as $n \to \infty$ are quite accurate in predicting the expected values seen in coalescent simulations of the ratios (Fig. 2). For the variances, the approximations are generally less accurate, although in most cases, graphs of the approximations and simulated values have similar shape (Fig. 4). These approximations are obtained from a Taylor approximation for the variance of a ratio (equation 4), and higher-order approximations of this variance could potentially be applied by use of Taylor's theorem; as the order of the approximation increases, however, the complexity of the resulting formula also increases. For those variances for which the approximation and simulation are not close in Fig. 4, we advise caution in using the variances in settings in which a precise approximation is needed.
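The prescribed range of the ratio of tree height to tree length follows directly from the definitions. The short derivation below assumes $H_n = \sum_{k=2}^{n} T_k$ and $L_n = \sum_{k=2}^{n} k T_k$ with all $T_k > 0$.

```latex
\begin{align*}
2 H_n = \sum_{k=2}^{n} 2\,T_k
  \;\le\; \sum_{k=2}^{n} k\,T_k = L_n
  \;\le\; \sum_{k=2}^{n} n\,T_k = n H_n
\quad\Longrightarrow\quad
\frac{1}{n} \;\le\; \frac{H_n}{L_n} \;\le\; \frac{1}{2}.
\end{align*}
```

Rescaling every $T_k$ by a common factor leaves the ratio unchanged, which is the sense in which it is unitless.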
Data Availability
The ms command for simulations is ms n 100000 -T, where n gives the number of leaves of the simulated trees and ranges over the tree sizes considered.
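With the -T option, ms writes each simulated genealogy as a Newick string, from which the tree properties can be computed. The following is a minimal sketch of that post-processing step, not the authors' pipeline: it assumes binary, ultrametric trees with a branch length on every non-root node, and the function names are hypothetical. Because the ratios of interest are unitless, the time units used in the ms output do not need to be rescaled for them.

```python
def parse_newick(text):
    """Parse a Newick string into nested (branch_length, children) tuples."""
    s = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while s[pos] == ",":
                pos += 1
                children.append(parse_node())
            pos += 1  # consume ")"
        # Skip an optional node label (leaves carry labels).
        while pos < len(s) and s[pos] not in ":,()":
            pos += 1
        length = 0.0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return length, children

    return parse_node()

def tree_summaries(root):
    """Return (H, L, E, I, B) for a parsed binary, ultrametric tree."""
    total = external = 0.0
    stack = list(root[1])
    while stack:
        length, children = stack.pop()
        total += length
        if not children:
            external += length
        stack.extend(children)
    # Height: follow first children from the root down to a leaf.
    height = 0.0
    node = root
    while node[1]:
        node = node[1][0]
        height += node[0]
    # Mean length of the branches that descend directly from the root.
    basal = sum(child[0] for child in root[1]) / len(root[1])
    return height, total, external, total - external, basal

example = "((1:0.1,2:0.1):0.3,3:0.4);"
print(tree_summaries(parse_newick(example)))
```

The parser is recursive, so very large trees may require an iterative rewrite or a higher recursion limit.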
Funding
The authors acknowledge support from NIH grants R01 GM131404 and R01 HG005855 and NSF grant BCS-2116322.
Conflicts of interest
None declared.