Accurate halo mass functions from the simplest excursion set theory

Excursion set theory is a powerful and widely used tool for describing the distribution of dark matter haloes, but it is normally applied with simplifying approximations. We use numerical sampling methods to study the mass functions predicted by the theory without approximations. With a spherical top-hat window and a constant $\delta=1.5$ threshold, the theory accurately predicts mass functions with the $M_{200}$ mass definition, both unconditional and conditional, in simulations of a range of matter-dominated cosmologies. For $\Lambda$CDM at the present epoch, predictions lie between the $M_\mathrm{200m}$ and $M_\mathrm{200c}$ mass functions. In contrast, with the same window function, a nonconstant threshold based on ellipsoidal collapse predicts uniformly too few haloes. This work indicates a new way to simply and accurately evaluate halo mass functions, clustering bias, and assembly histories for a range of cosmologies. We provide a simple fitting function that accurately represents the predictions of the theory for a wide range of parameters.


INTRODUCTION
The theory of excursion sets 1 is a powerful tool for understanding dark matter haloes and how their distribution connects to the cosmological initial conditions.First used to predict halo mass functions (Bond et al. 1991), excursion sets are also employed to study other aspects of the halo distribution, such as their merger rates (Lacey & Cole 1993) and spatial clustering (Cole & Kaiser 1989;Mo & White 1996) (for a review, see Zentner 2007).The basic idea of the excursion set approach is that a particle's density environment in the initial conditions directly predicts its halo membership at later times.If the density contrast field (x) ≡ [(x) − ρ]/ ρ averaged on the mass scale  exceeds a preset threshold  c , then the particle initially at x is part of a halo of at least mass  at a preset later time.The masses  satisfying this condition comprise the excursion set associated with the threshold  c (Adler 2000).The mass of the particle's host halo is taken to be the largest mass in the excursion set.
In particular, consider the density contrast field  (  ) (x, ) extrapolated to the time  using linear-order cosmological perturbation theory and averaged on the mass scale .At fixed position x and time ,  (  ) can be regarded as executing a random walk in decreasing , starting from  (∞) = 0. Then the particle's host mass is the  for which  (  ) first crosses the threshold  c .To make calculations of the first-crossing distribution analytically tractable, the random walk is ordinarily approximated to be Markovian, i.e., each step is uncorrelated with other steps.If the threshold  c is taken to be the spherical collapse threshold,  c = 1.686,Bond et al. (1991) showed that this leads to the halo mass function of Press & Schechter (1974).Approximations to the first-crossing distribution with correlated steps have also been studied (Peacock & Heavens 1990; Maggiore & Ri-★ E-mail: mdelos@carnegiescience.edu 1 Also known as extended Press-Schechter theory.otto 2010a; Paranjape et al. 2012;Musso & Sheth 2012;Farahi & Benson 2013;Musso & Sheth 2014a,b;Nikakhtar et al. 2018).
A common refinement of this approach is to employ a non-constant threshold  c .Motivated by the Bond & Myers (1996a) model of ellipsoidal collapse, Sheth et al. (2001) considered a threshold that depends on the three-dimensional shape of the tidal deformation tensor    ≡ −    , where  is a solution to ∇ 2  = −.Ellipsoidal collapse is delayed by tidal forces, so the threshold  c becomes higher if    is significantly aspherical.The random walk is now in the six independent components of  (  )   , and Chiueh & Lee (2001), Sheth & Tormen (2002), and Sandvik et al. (2007) tested the first-crossing distributions that result therefrom.To simplify the problem, however, Sheth et al. (2001) exploited how the typical anisotropy of  (  )   depends on  to approximate a mass-dependent "moving" threshold,  c ().Since smaller  is associated with more ellipsoidal tides,  c is taken to be a decreasing function of .Compared to a constant threshold, halo mass functions resulting from the moving threshold of Sheth et al. (2001) yield a better match to the results of numerical simulations, assuming uncorrelated steps, although Robertson et al. (2009) found that this may not hold when steps are correlated.Further refinements to the ellipsoidal collapse threshold have also been explored (Angrick & Bartelmann 2010;Ludlow et al. 2014;Borzyszkowski et al. 2014), while recent machine-learning-based studies have questioned whether anisotropic information is relevant at all to halo mass predictions (Lucie-Smith et al. 2018, 2019, 2020).
In this work, we relax the approximations by considering the full 6-dimensional  (  )   and accounting fully for correlations between steps, under several choices of averaging window function.We use direct numerical sampling, which can be done quite efficiently (e.g.Nikakhtar et al. 2018), to generate halo mass functions for a range of cosmologies.We compare these mass functions both to the standard analytical predictions and to simulation results.Specifically, we consider both the scale-free and the concordance ΛCDM cos-mological simulations of Diemer & Kravtsov (2015).We test both unconditional and conditional mass functions.
We find that for the standard choice of window function -the spherical top-hat in real space -the ellipsoidal collapse threshold predicts too few haloes of every mass.In contrast, a constant  c = 1.5 threshold yields halo mass functions that closely match those of the scale-free simulations with the  200 mass definition.The predicted conditional mass functions related to halo clustering bias and assembly history are also generally accurate.For ΛCDM, the same threshold yields predictions that lie between  200m and  200c mass functions.Other common choices of window function cannot match the simulation results as closely with either constant or ellipsoidal-collapse-motivated thresholds.Mass functions predicted by excursion set theory with the top-hat window and a constant threshold are nearly independent of the linear power spectrum, when they are considered as a function of the rms density variance , but they can exhibit significant variations for extreme spectral indices or when there are features in the power spectrum.
This article is organized as follows.Section 2 describes our approach for numerically sampling the trajectories  (  ) and  (  )   .In Sec. 3, we use those sampled trajectories to generate halo mass functions, and we compare the outcomes for different thresholds and window functions.In Sec. 4, we compare the excursion set mass functions to those derived from cosmological simulations, considering both unconditional and conditional mass functions.In Sec. 5, we explore the degree to which excursion set mass functions adhere to a universal form when they are expressed in terms of , and we provide a fitting function.We present conclusions in Sec. 6. Appendix A tests how well the mass functions that we calculate are numerically converged, while Appendix B details the extent to which the predicted conditional mass functions stick to the same universal form.

NUMERICAL SAMPLING OF EXCURSIONS
This section describes how we numerically sample the trajectories of  or    as a function of the averaging scale.

Spherical collapse
For simplicity, we begin by restricting our consideration to the density contrast (x), which is a function of comoving position x.Let (k) = ∫ d 3 x e −ik•x (x) be its Fourier transform.The density contrast averaged on radius  with the window function  is whence it follows from the definition of the power spectrum () that Here P () ≡ [ 3 /(2 2 )]() is the dimensionless power spectrum.
In general,  () is a function ranging from 1 for  ≪ 1 to 0 for  ≫ 1.
If we discretize the windowing radii  into a sequence  1 <  2 < ... <   , then ⟨ (  ) (x) (  ) (x)⟩ is the  ×  covariance matrix of the vector of  (  ) ≡   .Under this discretization, a trajectory in  is just a random vector   distributed according to a -variate Gaussian distribution with mean 0 and covariance ⟨    ⟩ given in accordance with Eq. ( 3).There are many ways to sample random   from such a distribution (e.g.Nikakhtar et al. 2018); we follow Delos et al. (2019) in diagonalizing where  is an orthogonal matrix and   are the eigenvalues of ⟨⟩.
A randomly sampled trajectory is then given by where each   is sampled from the univariate Gaussian distribution of mean 0 and variance   .
The upper panels of Fig. 1 show examples of density trajectories sampled for three choices of spherical window function  (): (i) the top-hat window function,  () = (3/ 3 )(sin  −  cos ), which cuts off sharply in real space; (ii) the Gaussian window function,  () = e −  2 /2 , which is smooth in both real and Fourier space; and (iii) the sharp -space window function,  () = 1 if  < 1 and 0 otherwise.
We express these trajectories in terms of the windowing mass scale  ∝  3 .We consider two power spectra: () ∝   with  = −1 (left) and  = −2.5 (right).The windowing mass scales are expressed in units of  * , the mass scale associated with the windowing radius  * such that ( * ) = 1.Here is the variance of  ( ) (x), the density field averaged on the radius .Note that  * depends on the window function.When results are expressed in these units, the normalization of the power spectrum () is irrelevant, and the normalization of the windowing mass scale  is also irrelevant.
Figure 1 shows example trajectories in , illustrating typical behaviour with the different window functions.For the sharp -space window, steps in  as  is varied are uncorrelated, which leads to an extremely noisy trajectory.In contrast, the (real-space) top-hat window (blue) leads to much less noisy trajectories, and those with the Gaussian window (orange) are smooth.It is interesting to note how the impact of Gaussian windowing is not intermediate between tophat and sharp- windows, despite the mathematical sense in which the Gaussian window is intermediate between the other two.

Ellipsoidal collapse
We now extend the above treatment to consider trajectories in the tidal tensor    .In terms of the Fourier-transformed density contrast,    has the explicit expression Since it is a symmetric 3 × 3 matrix, it has six independent components.Three of them can be taken to be the eigenvalues  1 ≥  2 ≥  3 , which describe the strength of tidal deformation along principal axes, while the remaining three degrees of freedom orient those axes.Note that  =  1 +  2 +  3 .Following standard terminology for ellipsoidal collapse (e.g.Bond & Myers 1996a), we define the ellipticity  ≡ ( 1 −  3 )/(2) and prolateness  ≡ ( 1 +  3 − 2 2 )/(2); thus , , and  parametrize the eigenvalues of .We decompose    into the density contrast  (upper panels), the ellipticity  (middle panels), and the prolateness  (lower panels, expressed in units of ).We consider a single trajectory for each of three averaging window functions (different colours) and two different scale-free power spectra (left-hand versus right-hand panels).The mass  is expressed in units of the mass scale  * on which the rms variance in  (, upper axis) is unity.Where  < 0, we do not plot  and .The trajectories with the sharp -space window (green) are extremely noisy, which reflects a lack of correlation between steps in  as  is varied.For the top-hat (blue) and Gaussian (orange) windows, the trajectories are much smoother.The circles mark first crossing of the spherical collapse threshold  c = 1.686 for each trajectory, while the squares mark first crossing of the ellipticity-and prolateness-dependent ellipsoidal collapse threshold  c = 1.686  ec (, ).
The tidal tensor averaged on the radius  with a window function  is straightforwardly from which it follows that where    is the Kronecker delta (equal to 1 if  =  and 0 otherwise).Equation (10) results from carrying out the angular integrals in Eq. ( 9).If we discretize the windowing radii  1 <  2 < ... <   again, then ⟨ (  ) (x) (  )  (x)⟩ is a 6 × 6 covariance matrix, since  has 6 independent components.By diagonalizing this covariance matrix, we can sample random trajectories   ,   ,   using the same methods as in Sec.2.1.The trajectories in Fig. 1 were generated using this approach, and the lower panels show the ellipticity  and prolateness  as a function of the averaging mass scale .

Conditional trajectories
We will also have occasion to sample trajectories in  ( ) conditioned on it taking a particular value  ( r ) = δ when windowed on a chosen scale r.Given the discretization scheme  1 < ... <   again, the conditional distribution of the   ≡  (  ) is Gaussian with mean and covariance (see Eqs. 3 and 6).Here Δ  ≡   − ⟨  ⟩ |  ( r ) = δ .These results follow from a classic theorem on conditional Gaussian distributions (e.g.Appendix D of Bardeen et al. 1986).Similarly to Sec. 2.1, we can diagonalize where  is an orthogonal matrix and   are the eigenvalues of ⟨ΔΔ⟩|  ( r ) = δ .A random trajectory is now given by where each   is sampled from the Gaussian distribution of mean 0 and variance   .

MASS FUNCTIONS FROM EXCURSION SET THEORY
In the excursion set approach, a particle is deemed to belong to a halo of mass  if that is the largest mass scale for which the (linearly evolved) density contrast  averaged around that particle's location exceeds some threshold  c .That is, the halo mass is the location of first crossing of the threshold, if the trajectory in  is viewed as a random walk in decreasing averaging scale .The circles in the upper panels of Fig. 1 mark these first crossings if the spherical collapse threshold  c = 1.686 is adopted.We also consider the ellipsoidal collapse threshold  c = 1.686  ec (, ), where  ec is the solution to as approximated by Sheth et al. (2001).Note that  ec ≥ 1.The squares in Fig. 1 mark each trajectory's first crossing of the ellipsoidal collapse threshold.With respect to the horizontal axis, the circles and squares indicate the mass of the particle's host halo, as determined by spherical and ellipsoidal collapse, respectively.We discretize the windowing radii  1 <  2 < ... <   such that (  ) = 0.28 (see Eq. 6).This choice ensures that the probability is negligible that a trajectory would already exceed the collapse threshold  c at or above the maximum radius   , which is important because in that case the first crossing would be missed.We consider power spectra () ∝   with  = −1,  = −1.5,  = −2, and  = −2.5; the lower limit  1 of the discretization is taken such that ( 1 ) ranges from about 400 in the first case to about 30 in the last.When we sample one-dimensional trajectories in  alone, we use  = 3200 logarithmically spaced averaging radii, while to sample six-dimensional trajectories in    , we reduce this number to  = 800. 2 The spacing between successive windowing mass scales  ∝ 3 is listed in Table 1 for each power spectrum.
For each power spectrum, we sample 10 5 trajectories in  alone and 10 5 trajectories in    .For each trajectory, we find the largest windowing scale for which  exceeds the threshold for spherical or ellipsoidal collapse, and then we attempt to interpolate the precise scale at which the crossing occurred.The first crossing sets the mass  of the halo hosting the particle associated with the trajectory in question.Since we are sampling trajectories associated with arbitrary points in the initial density field, our sample is of arbitrary dark matter particles.Therefore, the distribution of first-crossing masses is precisely the differential fraction d  /d of all mass that resides in haloes of mass .We will generally present halo mass functions as d  /d ln , the differential mass fraction per logarithmic interval numpy matrix operations (Harris et al. 2020), the author's personal computer samples about 400 trajectories in  per second for  = 3200 or about 150 trajectories in    per second for  = 800 (including the decomposition into , , and ).See also Nikakhtar et al. (2018) for a potentially faster approach.
Table 1.Spacing of steps in the top-hat window mass scale  for the different power spectra.We also show the spacing in  (Eq.6).For other window functions, the steps are different by under 5 per cent.For one-dimensional trajectories in  alone, we use much tighter spacing than for six-dimensional trajectories in    at similar computational expense.

𝛿 (spherical)
(ellipsoidal) A great convenience of considering scale-free cosmologies, with () ∝   , is that a change in time is equivalent to a change in mass scale. 3This means that we can improve the statistical precision of this calculation -particularly at the large-mass end -by stacking the distributions of first-crossing masses evaluated at different times.We assume scale-independent growth (valid for a dark matter-dominated universe) and adopt growth factors  ranging from (  )/( 1 ) to 1 separated by factors of 1.03, where ( 1 ) ∼ O (100) and (  ) = 0.28 are the rms variance at the minimum and maximum window scales, respectively, as described above.Thus, at the smallest  (earliest time), no haloes are expected within the resolution limit implied by the choice of window scales.By uniformly scaling each previously sampled trajectory in  by the growth factor  (or equivalently scaling the thresholds by 1/), we obtain the first-crossing mass distribution at the time when the growth factor was .The characteristic mass scale  * () at this earlier time is defined to be the mass associated with the window radius  * such that ( * ) = 1 (see Eq. 6).In units of  * (), the underlying first-crossing mass distributions at different  must be all exactly the same.
We count first-crossing masses / * () in logarithmic bins of width Δ ln  = 0.33.For each mass bin, we stack the counts from all growth factors  for which the bin lies fully between the lower and upper mass limits,  1 / * () and   / * (), where   is the mass associated with the window radius   .Since  * () grows with , the distributions at lower  (earlier times) tend to improve the count of first-crossing masses at high / * ().We will show in Sec. 4 that the statistical uncertainty in d  /d ln  resulting from this procedure is mostly at the per cent level.Note that the uncertainty of the count in each mass bin is not simply Poissonian, because the counts contributed by different  are correlated.
The solid curves in Fig. 2 show the d  /d ln  that result from this calculation, for both the spherical collapse (blue) and ellipsoidal collapse (orange) thresholds.We consider three different window functions  (different columns) and the two power spectra () ∝   with  = −1 and  = −2.5 (different rows).For comparison, we We consider both the constant spherical-collapse threshold  c = 1.686 (blue curves) and the ellipsoidal collapse threshold  c = 1.686  ec (, ) (orange curves; see Eq. 15).The upper panels are for a  ( ) ∝  −1 power spectrum, while the lower panels are for  ( ) ∝  −2.5 .From left to right, we show results for the top-hat, Gaussian, and sharp -space window functions.The d  /d ln  mass functions are expressed in units of the characteristic mass scale  * on which the rms variance is 1; note that  * depends on the window function.For comparison, the dashed curves show the classic analytic predictions (Press & Schechter 1974;Sheth et al. 2001), which are the same in the left-hand, centre, and right-hand panels.Meanwhile, the black dotted curves show the  200 mass functions from the simulations of Diemer & Kravtsov (2015) (which depend on the window only because  * does).The numerically evaluated mass functions with top-hat and Gaussian windows differ significantly from the analytic approximations.For these windows, the ellipsoidal collapse threshold yields too few haloes of every mass compared to the simulation results.also show as dashed lines the standard analytic predictions,  (Sheth et al. 2001) for the ellipsoidal collapse threshold.Here   is shorthand for () (Eq.6) evaluated for the radius  ∝  1/3 associated with the mass scale .For spherical collapse with the sharp -space window, our numerical calculation matches the analytic prediction almost exactly.This outcome is expected since the analytic prediction is exact under the assumption that steps in  are uncorrelated, and that assumption is satisfied for sharp -space windowing.For ellipsoidal collapse with the same window, the analytic prediction is close to our numerical result but does not exactly match it; this outcome reflects that Eq. ( 18) is only approximate even under the assumption that steps are uncorrelated.
The top-hat and Gaussian windows yield more remarkable outcomes.For the top-hat window, the numerically evaluated mass function from spherical collapse nearly matches the analytic prediction for ellipsoidal collapse (not spherical collapse).A similar outcome was noticed before by Musso & Sheth (2014b).Meanwhile, the numerically evaluated mass function from ellipsoidal collapse is completely different, predicting significantly too little mass in haloes of every mass (as noted by Robertson et al. 2009).Similar behaviour is also true for the Gaussian window.To the extent that ellipsoidal collapse improves the match between excursion set mass functions and simulation results (Sheth et al. 2001), this outcome suggests that the improvement may only have been an artifact of the assumption that steps in the trajectories of  are uncorrelated.In the next section, we will test the degree to which excursion set theory with a constant threshold  c can predict mass functions that match simulation results.
There is one important source of error in our numerically evaluated mass functions.Due to the discretization of window scales, it is possible for the procedure to miss a sufficiently "brief" crossing of the  =  c threshold.That is, we may have  (  ) <  c and  ( +1 ) <  c , but  ( ) >  c for some   <  <  +1 .This effect tends to improperly shift the first-crossing distribution to smaller window scales, an outcome that can be seen in the comparison between the analytic prediction for spherical collapse and the numerical counterpart in the rightmost panels of Fig. 2. The numerical distribution (solid blue curve) lies slightly below the analytic one (dashed blue curve) for most of the mass range, except for the lowest masses for the () ∝  −1 power spectrum, where the numerical curve is slightly above the analytic one.In Appendix A, we test the degree to which our numerically evaluated mass functions are converged with respect to the resolution of the discretization scheme.We find that the mass functions converge much more readily for the top-hat and Gaussian windows than for the sharp- window.This result can be understood by appealing to the example trajectories in Fig. 1.Trajectories with the sharp- window are extremely noisy, which makes "brief" threshold crossings likely.In contrast, trajectories with the top-hat and Gaussian windows are much smoother.Although our numerically evaluated mass functions with the sharp -space window may be inaccurate at the 10 per cent level, there is no reason to expect the top-hat and Gaussian counterparts to be inaccurate to any comparable degree.

Scale-free cosmology
The dotted curves in Fig. 2 compare our results to mass functions derived from cosmological simulations.We use the publicly available halo catalogues for the scale-free simulations of Diemer & Kravtsov (2015), which are part of the Erebos simulation suite (Diemer 2020a).There are four simulations, initialized with power spectra () ∝   with  = −1, −1.5, −2, and −2.5, although we only represent two of them in Fig. 2.These simulations involve dark matter particles only and adopt flat, matter-dominated (Einstein-de Sitter) cosmologies.Catalogued haloes are identified with the rockstar halo finder (Behroozi et al. 2013a), and the calculation of their masses is described by Diemer (2020a).We use the  200 mass definition, which is the mass enclosed within a sphere centred on the halo's inner cusp, where that sphere is the largest one that encloses average density 200 times the cosmological average.We consider only field haloes (as opposed to subhaloes). 4 Note that the simulated mass functions in Fig. 2 depend on the window function only because they are expressed in units of the characteristic mass scale  * (on which the rms variance of  is 1), which depends on the window function.
For each simulation, a range of snapshots are available, separated by factors of about 1.03 in the scale factor  and hence in the growth factor .The snapshots span a total factor in  ranging from 6 to 11, depending on the simulation.Since these simulations involve scale-free cosmologies, different snapshots sample different portions of the same halo mass distribution, as we discussed in the previous section.The mass scale  * grows over time, so earlier snapshots sample haloes of larger / * while later snapshots sample haloes of smaller / * .The simulations involved 1024 3 particles in a periodic box, and Table 2 lists how the particle mass and the total box mass compare to  * for the earliest and latest snapshots of each simulation.We count haloes in the same bins of width Δ ln  = 0.33 as in the previous section.Since haloes with too few particles can be influenced by discreteness or other resolution artifacts, we follow Diemer (2020b) in only considering haloes for which  200 is more than 500 times the particle mass. 4Halo mass functions from the same simulations were previously analysed by Diemer (2020b).While scale-free simulations are less widely studied, mass functions from ΛCDM simulations in the same Diemer (2020a) suite were shown by Mansfield & Avestruz (2021) to agree with a wide range of independent simulations.2.0 × 10 8 1.9 × 10 −1 1.5 × 10 4 1.4 × 10 −5 ∝  −2 3.0 × 10 10 2.8 × 10 1 1.6 × 10 4 1.5 × 10 −5 ∝  −2.5 2.4 × 10 13 2.2 × 10 4 5.0 × 10 3 4.7 × 10 −6 Figure 2 shows that with the top-hat window (left-hand panels), the excursion set mass functions with the spherical collapse threshold  c = 1.686 (solid blue curve) lie close to the simulation mass functions but are offset in mass.However, there is no reason to adopt precisely the spherical collapse threshold, which is associated with the time at which a spherical shell collapses to radius 0 (under simplifying assumptions).We expect in general that a particle will cross a halo's  200 boundary (the radius of the 200-times-overdense sphere), to contribute to the halo's  200 , significantly earlier.Also, if the true threshold  c exhibits stochasticity, this can be statistically equivalent to a lower threshold without stochasticity (Maggiore & Riotto 2010b).Thus, it is natural to choose a lower threshold, which would shift the excursion set mass functions to the right (higher masses).
Figure 3 shows excursion set mass functions with the top-hat window and a lower threshold  c = 1.5.Here we consider all of the four different power spectra () ∝   with  = −1, −1.5, −2, and −2.5 (different panels).In each case, we compare to the mass function from the respective simulation.We include 90 per cent confidence uncertainty bands, which we estimate by bootstrapping.For the excursion set predictions, we resample 10 5 samples (with replacement) from the previously sampled trajectories.For the simulations, we split the simulation volume into 64 cubes and resample 64 at random (with replacement).In both cases, we evaluate d  /d ln  for 101 such resamplings, and the uncertainty bands extend between the 5th and 95th percentiles.Where d  /d ln  is close to its maximum value, the uncertainty in the excursion set predictions is around 1 per cent, too small to be visible.
The match between simulations and predictions is quite close, remaining at the 10 per cent level except where d  /d ln  drops off steeply at high masses.Moreover, we will show in Sec.4.2 that the discrepancy at high masses for () ∝  −2.5 is an artificial consequence of the finite simulation box size.The success of this model is particularly notable because it has only a single parameter, the threshold  c .We emphasize that once  c is chosen, there is no further freedom, and that the same value,  c = 1.5, works well for all four of the power spectra.This constant threshold evidently suffices for excursion set theory to predict halo mass functions accurately, and no correction for ellipsoidal collapse appears to be necessary.
The possibility that ellipsoidal collapse does not improve excursion set predictions of halo masses may come as a surprise.The ellipsoidal collapse threshold in Eq. ( 15) is known to accurately predict the outcome of the collapse of a local maximum in the unwindowed density field (Delos et al. 2019;Delos & White 2023;Ondaro-Mallea et al. 2023).Ellipsoidal collapse may also be relevant to the association of haloes with local maxima in the windowed initial density field (e.g.Bond & Myers 1996b;Castorina et al. 2016).However, the standard excursion set theory considered in this work is conceptually quite different.Rather than tracking haloes, it tags individual particles with the mass of their host halo.The threshold  c is associated with the time at which a particle becomes part of a halo of this mass, which could involve the particle's current halo growing through accretion but could also involve that halo accreting onto a larger host.It is unclear the extent to which the ellipsoidal collapse threshold in Eq. ( 15) -which marks the time for all three axes of a homogeneous ellipsoid to collapse (under simplifying assumptions about the behaviour of the first two axes as they approach collapse; see Bond & Myers 1996a;Sheth et al. 2001) -should be expected to apply to this problem.Moreover, our result is consistent with the findings of Lucie-Smith et al. (2018,2019,2020), which showed that tidal shear and three-dimensional shape information do not improve machine-learning-based halo mass predictions compared to using spherically averaged density information alone.
The accuracy of the constant-threshold excursion set theory is also surprising because it cannot hold at the level of individual particles (as pointed out by Bond et al. 1991).For example, neighboring particles are associated in general with haloes of different masses, even if those masses are large enough to imply that the particles belong to the same halo.This deficiency has motivated alternative approaches that attempt to identify the precise set of particles in the initial conditions that will belong to a halo at late times, often by associating haloes with local maxima in the windowed density field (e.g.Appel & Jones 1990;Bond & Myers 1996a,b;Manrique et al. 1998;Hanami 2001;Ludlow & Porciani 2011;Paranjape & Sheth 2012b;Rossi 2012;Paranjape et al. 2013;Rossi 2013;Hahn & Paranjape 2014;Castorina et al. 2016;Musso & Sheth 2021, 2023).Nevertheless, the conceptual simplicity of the excursion set approach is a major advantage.Moreover, we will show next that the theory can accurately predict not only unconditional but also conditional mass functions, making it suitable for a wide range of applications.

Conditional mass functions
A powerful feature of excursion set theory is its capacity to predict halo mass functions in regions that have a specified density contrast  at a larger mass scale.These conditional mass functions employ constrained trajectories in , which can be sampled as described in Sec.2.3.Conditional mass functions naturally enable treatments of halo clustering bias (e.g.Cole & Kaiser 1989;Mo & White 1996;Sheth & Tormen 1999;Ma et al. 2011;Paranjape & Sheth 2012a;Zhang et al. 2014;Zheng et al. 2023).A conditional mass function can also be interpreted as the mass function of a given halo's progenitors at some earlier time (e.g.Lacey & Cole 1993;Cole et al. 2000;Giocoli et al. 2007;Parkinson et al. 2008;Angulo & White 2010;Benson et al. 2013;Jiang & van den Bosch 2014;Nadler et al. 2023).We now test the degree to which conditional mass functions predicted by the  c = 1.5 excursion set theory match the results of the scale-free simulations.
As periodic simulations of flat cosmologies, the simulations that we use enforce  = 0 on the scale of the box.Ordinarily, this constraint does not significantly influence structures at scales much smaller than the box (e.g.Power & Knebe 2006), because the amplitudes of density contrasts are typically much larger at those scales than they would be expected to be at the box scale.However, for () ∝  −2.5 , the rms density contrast  scales as  ∝  −1/12 .Since the smallest halo mass that we consider (500 simulation particle masses) is about 5 × 10 −7 times the mass of the full simulation box, this means that between the smallest scale and the box scale of the () ∝  −2.5 simulation, there is only a factor of about 3 in .
For the () ∝  −2.5 simulation, we consider three snapshots corresponding to box masses ranging from about 10 8 to about 10 10  * .Using a set of window scales separated by Δ ln  = 0.002, we generate 10 6 trajectories for each box size, conditioned on  = 0 at the box scale. 5We count first-crossing masses (with  c = 1.5) in the same mass bins as in Sec.4.1.In Fig. 4, we show both these conditional mass functions and the mass functions from the corresponding simulation snapshots.For comparison, we also repeat the unconditional mass function.The conditional mass functions appear to match the simulation results reasonably well.Evidently, the  c = 1.5 model accurately accounts for halo clustering bias, at least in this case.
Next, we test whether the  c = 1.5 model can predict halo progenitor mass functions.For haloes of mass  final =  * at the scale factor , the dotted curves in Fig. 5 show the differential fraction of their mass that was in haloes of mass  prog at the earlier scale factor /(1 +  prog ).Specifically, we employ the () ∝  −1.5 simulation and consider field haloes between the masses  * /1.03 and 1.03 * at each scale factor . Then we track the progenitors of those haloes and all of their subhaloes (as determined by the rockstar and consistent-trees codes; Behroozi et al. 2013a,b) back to the scale factor /(1+ prog ) and determine the mass function of the host haloes of all of those progenitors (where a field halo is regarded as its own host).We count these progenitors in bins of width Δ ln  prog = 0.1 and stack the counts from all scale factors  for which there is a simulation snapshot.
Excursion set theory predicts progenitor mass functions in the following way.For trajectories that first cross  c = 1.5 at the  final window scale, we seek the distribution of first crossings of the higher threshold (1+ prog ) c .With a succession of window scales spaced by Δ ln  = 0.002, we use the method of Sec.2.3 to sample trajectories that cross  c = 1.5 at the mass  final =  * .To ensure that this crossing is the first, we include window scales above  final up to the point that  =  c would represent a 5 upward deviation, and then we reject any trajectories that exceed  c at any mass scale larger than  final .We obtain 10 5 trajectories in this way.For each  prog , we find the first-crossing distribution of these trajectories for the threshold (1 +  prog ) c , counting first crossings in the same bins of width Δ ln  prog = 0.1 as we used for simulated haloes.The resulting conditional mass functions are shown in Fig. 5 as solid curves.
The simulated progenitor mass functions in Fig. 5 match the excursion set predictions reasonably well.The main difference is that the simulated mass functions tend to extend to slightly higher masses (including, for low  prog ,  prog >  final ).This outcome is likely connected to the existence of "backsplash" haloes that pass through a host halo before becoming (at least briefly) field haloes again (e.g.Diemer 2021).This possibility means that the mass of a dark matter particle's field-halo host can decrease in time, which would explain why  prog can exceed  final .However, in excursion set theory, the mass of a particle's field-halo host can only increase over time.A modified halo mass definition that accounts for backsplash haloes (either treating them as subhaloes of their previous hosts or taking them to be field haloes at all times until their last infall) may be needed to match excursion set predictions more precisely.

Concordance cosmology
Excursion set theory with the top-hat window and  c = 1.5 evidently predicts  200 halo mass functions accurately for a range of scalefree cosmologies.Figure 6 now tests the model's predictions for a concordance (ΛCDM) cosmology.We compare the simulations of Diemer & Kravtsov (2015) that were carried out with Planck Collaboration et al. ( 2014) cosmological parameters, which are also part of the Erebos simulation suite (Diemer 2020a).There are three 1024 3 -particle simulations with periodic box sizes 187, 373, and 746 Mpc.Since concordance cosmology includes dark energy, it is necessary to clarify the halo mass definition further.We consider both the  200m mass definition, that of the sphere that has 200 times the average matter density, and the  200c mass definition, that of the sphere of density 200 times the total (or critical) energy density.We consider four different redshifts (different panels), in each case stacking halo counts from all three simulations.As before, we count haloes in mass bins of width Δ ln  = 0.33 and consider only haloes larger than 500 times the simulation particle mass.To generate the excursion set predictions, we sample 4 × 10 5 trajectories in  between the masses 1.5 × 10 10 M ⊙ (at which  ≃ 4) and 1.5 × 10 16 M ⊙ (at which  = 0.28).We take an interval of log.diff.Δ ln  = 0.002, resulting in  = 1327 window scales.We obtain first-crossing distributions at different redshifts  by employing the growth factor (e.g.Bueno Belloso et al. 2011), where 2  1 is the hypergeometric function and  m / Λ is the ratio of matter to dark energy density as a function of .We normalize the top-hat window mass as  = (4/3)  DM  3 , where  DM is the comoving dark matter density, and we count the resulting halo masses in the same mass bins as we used for the simulated haloes.
Figure 6 shows that the excursion set predictions closely match the simulation results at high redshifts (left-hand panels), as we expect from the results of the previous subsection, since matter dominates at these redshifts.At lower redshifts (right-hand panels), dark energy begins to dominate, leading the simulated mass functions with the two different mass definitions to diverge from each other.We find that the excursion set prediction does not match either simulated mass function but instead lands between them, although it tends to match the  200m mass function at the lowest and highest masses.The influence of the mass definition on halo mass functions has been widely discussed (e.g.Sheth et al. 2001;Tinker et al. 2008;Despali et al. 2016;Diemer 2020b), and it is possible that a different mass definition would yield a closer match at low redshifts to the excursion set theory predictions.6It may also be appropriate to vary the threshold  c with redshift, due to the increasing influence of dark energy (e.g.Lacey & Cole 1993).mass definition (motivated by spherical collapse arguments, e.g.Bryan & Norman 1998).While it matches the  c = 1.5 prediction at  = 0 for a large portion of the mass range, the match is not strong at all masses and redshifts.

UNIVERSALITY OF MASS FUNCTIONS?
When the halo mass function is expressed in units of  instead of , excursion set theory with uncorrelated steps predicts that it is universal.That is, let d  /d ln  be the differential mass fraction in haloes of the mass scale for which the rms density variance is .Then, under the approximation that steps in  as the window scale is varied are uncorrelated, d  /d ln  is predicted to have the same form for any power spectrum. 7This universality arises because if steps in  are uncorrelated, irrespective of the power spectrum.
Much attention has been paid to whether the mass functions in simulations are indeed universal and the degree to which they deviate (Jenkins et al. 2001;White 2002;Reed et al. 2003;Warren et al. 2006;Lukić et al. 2007;Reed et al. 2007;Tinker et al. 2008;Crocce et al. 2010;Bhattacharya et al. 2011;More et al. 2011;Courtin et al. 2011;Watson et al. 2013;Juan et al. 2014;Bocquet et al. 2016;Despali et al. 2016;Bocquet et al. 2020;Diemer 2020b;Ondaro-Mallea et al. 2022).However, excursion set theory with properly correlated steps does not predict that d  /d ln  is necessarily universal.For example, for a Gaussian window and a () ∝   power spectrum, which manifestly depends on the spectral index .In this section, we explore the universality of our numerically evaluated d  /d ln .2020) cosmological parameters, as extrapolated at linear order using cosmological perturbation theory.Growth of dark matter perturbations for  ≳ 10 2 Mpc −1 is suppressed because baryonic matter resists clustering on such small scales.On the top, we show the approximate mass scale associated with each wavenumber .
shown in Fig. 10, is evaluated using cosmological perturbation theory at linear order using the CLASS Boltzmann solver (Blas et al. 2011), and we extrapolate below the code's resolution limit with the analytic solution of Hu & Sugiyama (1996).We adopt cosmological parameters from Planck Collaboration et al. (2020). 11We use the natural normalization for the top-hat window mass,  = (4/3)  DM  3 , where  DM ≃ 3.31 × 10 10 M ⊙ Mpc −3 is the comoving dark matter density.We generate 10 5 trajectories in  using window masses ranging from 10 −30 M ⊙ (for which  ≃ 40) to 1.1 × 10 16 M ⊙ (for which  ≃ 0.28) and separated by Δ ln  = 0.002.We count first-crossing s in bins of width Δ ln  = 0.1.For this power spectrum, the spectral index d ln /d ln  runs from close to 1 at large scales down to nearly −3 at small scales, but this effect does not lead the predicted d  /d ln  (dotted curve in Fig. 9) to deviate significantly from Eq. ( 23).Instead, there is only significant deviation near  ∼ 10-20.This is associated with a characteristic feature in the power spectrum (Fig. 10) near  ∼ 10 2.5 Mpc −1 , which arises because baryonic matter resists clustering at smaller scales, suppressing the growth rate of dark matter perturbations on those scales (e.g.Hu & Sugiyama 1996;Bertschinger 2006).The effect of this deviation from universality is clearer in Fig. 11 23).Equation ( 23) matches the excursion set prediction reasonably well except near the dip in halo abundance below around 10 3 M ⊙ , which is associated with a feature in the power spectrum (see Fig. 10).
where we define Note that the differential mass fraction itself is then since d ln /d ln  = −1, and the differential halo number density is related by Eq. ( 16).Not only is Eq. ( 24) valid for arbitrary  c , but it can be applied to conditional mass functions as well, as we discuss next.

Conditional mass functions
Focusing on trajectories that are constrained to cross  0 at  0 , we may redefine is the differential fraction, in haloes of mass , of particles that satisfy the condition.For excursion set theory with uncorrelated steps and a constant threshold, d  /d ln  necessarily has the same form, not only for any power spectrum, but also for any  0 and  0 .However, with properly correlated steps, d  /d ln  may depend on all of these parameters.
In Appendix B, we test the universality of d  /d ln , interpreted as a conditional mass function as described above, for excursion set theory with top-hat windowing.We show that d  /d ln  is approximately universal, and is fit well by Eq. ( 24), only as long as  c ≳  0 +  0 .This condition would often be satisfied in the context of halo clustering bias, where the region under consideration is much larger than the haloes (so  ≫  0 ).However, it would fail, for example, when considering progenitor mass functions at a recent time (so  c is not much larger than  0 ), which are relevant when constructing halo merger trees.In the opposite regime,  c ≲  0 +  0 , the mass function d  /d ln  depends strongly on  0 and  0 , and it also depends on whether or not the crossing of  0 at  0 is constrained to be the first crossing of  0 .The first-crossing constraint is appropriate for progenitor mass functions but is not appropriate for studies of halo clustering bias.Compared to Eq. ( 24), d  /d ln  tends to become more sharply peaked in the  c ≲  0 +  0 regime, and it peaks at lower values of  (lower masses).

CONCLUSIONS
We evaluated halo mass functions from excursion set theory by direct numerical sampling, without the simplifying approximations used in most previous work.When a real-space spherical top-hat window function is employed, excursion set theory with a constant  c = 1.5 threshold accurately predicts halo mass functions with the  200 mass definition in cosmological simulations of a range of matter-dominated cosmologies.The model is also able to account for halo clustering bias and to predict progenitor mass functions with good accuracy.For a concordance ΛCDM cosmology, predicted mass functions lie between simulated mass functions with the  200m and  200c mass definitions once dark energy starts to become important.
In contrast, a nonconstant threshold based on ellipsoidal collapse predicts too few haloes of every mass, except if a sharp -space window function is used.The physical reason for this surprising outcome is unclear, but we note that the physical picture relevant to excursion set theory is significantly more complicated than that assumed by the modeling that motivates the ellipsoidal collapse threshold.Moreover, our result is consistent with previous studies that used machine learning to explore which information in the initial conditions is needed to predict halo masses.Apparently, the simplest excursion set theory -that with the standard spherical top-hat window and a constant threshold -is sufficient to produce accurate halo mass functions and assembly histories.The widely discussed refinement of considering a nonconstant threshold does not appear to be needed or appropriate.
Excursion set theory with the top-hat window and  c = 1.5 threshold predicts a halo mass function that is nearly universal when expressed in terms of the rms density variance  and is closely approximated by Eq. ( 24).In this sense, it agrees with a wide variety of studies noting that the mass function is either universal or nearly so, although the functional form of Eq. ( 24) is somewhat different from what has been previously proposed.Moreover, within the regimes relevant to studies of halo clustering bias or of halo progenitors in the distant past, the accuracy of Eq. ( 24) even extends to conditional mass functions.However, significant deviations from universality can arise at very small mass scales, for extreme spectral indices, or when there are features in the power spectrum.Conditional mass functions greatly deviate from universality in the range of parameters relevant to studies of recent progenitors (so, for example, the approximation of Eq. 24 is likely less suitable for constructing merger trees than is the direct calculation).It remains to be seen whether all of these deviations would accord with simulation results., 400 (red), 800 (green), 1600 (orange), and 3200 (blue).The corresponding steps in ln  and ln  are indicated in the respective panels.The left-hand and right-hand panels consider power spectra  ( ) ∝   with  = −1 and −2.5, respectively.The upper, central, and lower panels use top-hat, Gaussian, and sharp -space windows, respectively.For spherical collapse, we generate these d  /d ln  using a sample of 10 5 trajectories in , while for ellipsoidal collapse, we use only 10 4 trajectories in    , which leads to a higher level of statistical noise.For the sharp- window (bottom) with the spherical collapse threshold (crosses), we mark the exact analytic d  /d ln  with horizontal lines.For this window, it is difficult to achieve convergence even for very high  , a consequence of how abruptly the trajectories in  can jump (see Fig. 1).However, for the top-hat and Gaussian windows, the trajectories in  are much smoother, and the d  /d ln  appear to be reasonably well converged as long as Δ ln  ≲ 0.01.

Figure 1 .
Figure 1.Example trajectories of the tidal tensor    as a function of averaging mass scale  at a fixed position and time.We decompose    into the density contrast  (upper panels), the ellipticity  (middle panels), and the prolateness  (lower panels, expressed in units of ).We consider a single trajectory for each of three averaging window functions (different colours) and two different scale-free power spectra (left-hand versus right-hand panels).The mass  is expressed in units of the mass scale  * on which the rms variance in  (, upper axis) is unity.Where  < 0, we do not plot  and .The trajectories with the sharp -space window (green) are extremely noisy, which reflects a lack of correlation between steps in  as  is varied.For the top-hat (blue) and Gaussian (orange) windows, the trajectories are much smoother.The circles mark first crossing of the spherical collapse threshold  c = 1.686 for each trajectory, while the squares mark first crossing of the ellipticity-and prolateness-dependent ellipsoidal collapse threshold  c = 1.686  ec (, ).

Figure 2 .
Figure2.Differential fraction of particles d  /d ln  in haloes of mass , as evaluated via excursion set theory through our numerical procedure (solid curves).We consider both the constant spherical-collapse threshold  c = 1.686 (blue curves) and the ellipsoidal collapse threshold  c = 1.686  ec (, ) (orange curves; see Eq. 15).The upper panels are for a  ( ) ∝  −1 power spectrum, while the lower panels are for  ( ) ∝  −2.5 .From left to right, we show results for the top-hat, Gaussian, and sharp -space window functions.The d  /d ln  mass functions are expressed in units of the characteristic mass scale  * on which the rms variance is 1; note that  * depends on the window function.For comparison, the dashed curves show the classic analytic predictions(Press & Schechter 1974;Sheth et al. 2001), which are the same in the left-hand, centre, and right-hand panels.Meanwhile, the black dotted curves show the  200 mass functions from the simulations ofDiemer & Kravtsov (2015) (which depend on the window only because  * does).The numerically evaluated mass functions with top-hat and Gaussian windows differ significantly from the analytic approximations.For these windows, the ellipsoidal collapse threshold yields too few haloes of every mass compared to the simulation results.

Figure 3 .
Figure 3. Comparing halo mass functions from excursion set theory with simulation results.As in Fig. 2, the upper panels show the differential fraction d  /d ln  of mass that resides in haloes of mass .For the excursion set predictions, we use top-hat windowing and adopt the threshold  c = 1.5.For the simulations, we adopt the  200 mass definition.Different panels represent different scale-free power spectra.Shading indicates the 90 per cent confidence uncertainty bands.The lower panels show the (base e) logarithmic differences Δ ln(d  /d ln  ) between simulations and predictions.Excursion set theory with  c = 1.5 appears to predict the simulation mass functions in all cases to a high level of accuracy.

Figure 4 .
Figure 4. Testing how accurately excursion set theory with top-hat window and  c = 1.5 accounts for bias due to the simulation box size.The points (with 2 Poisson uncertainty bars) are mass functions d  /d ln  for the  ( ) ∝  −2.5 simulation at three different snapshots (different colours), specified by how the box size compares to the characteristic mass scale  * (at which  = 1).For visual convenience, points corresponding to the same mass bin have slightly different horizontal offsets for different simulations.The solid curves (with 2 Poisson uncertainty bands) show the conditional mass functions predicted by excursion set theory for the same box size.For comparison, the black curve is the unconditional mass function.

PFigure 5 .
Figure5.Testing how accurately excursion set theory with top-hat window and  c 1.5 predicts progenitor mass functions.For field haloes of mass  final =  * at a scale factor  in the  ( ) ∝  −1.5 simulation, the dotted curves show the differential fraction d  /d ln  prog of their mass that is determined to have been in field haloes of mass  prog at the earlier scale factor /(1 +  prog ).The solid curves show the corresponding predictions of the excursion set theory.Different colours correspond to different  prog .Predictions generally match the simulation results well.The principal discrepancy is at the largest masses, which may be related to backsplash haloes, as discussed in the text.

Figure 6 .
Figure 6.Testing excursion set theory predictions in a concordance cosmology.The upper panels show the differential fraction d  /d ln  of mass that resides in haloes of mass .For the excursion set predictions (solid curves), we use top-hat windowing and the threshold  c = 1.5, as in Fig. 3.For the simulations (dotted curves), we consider the  200m (black) and  200c (magenta) mass definitions.Shading marks the estimated 2 Poisson uncertainty for each curve.Different panels correspond to different redshifts.The lower panels show the (base e) logarithmic differences Δ ln(d  /d ln  ) between simulations and predictions.Excursion set theory with  c = 1.5 accurately predicts the simulation mass functions at high redshift, when matter dominates, but at low redshifts the prediction lies between the  200m and  200c mass functions.

Figure 10 .
Figure 10.Cold dark matter power spectrum at redshift  = 0 for Planck Collaboration et al. (2020) cosmological parameters, as extrapolated at linear order using cosmological perturbation theory.Growth of dark matter perturbations for  ≳ 10 2 Mpc −1 is suppressed because baryonic matter resists clustering on such small scales.On the top, we show the approximate mass scale associated with each wavenumber .

Figure 11 .
Figure 11.Differential fraction d  /d ln  of dark matter in haloes of mass  for a concordance cosmology at redshift  = 0; we use the power spectrum in Fig. 10.The solid blue curve shows the excursion set prediction for a top-hat window and  c = 1.5 threshold, while the shading indicates the 2 Poisson uncertainty range.The dashed curve shows the mass function predicted by Eq. (23).Equation (23) matches the excursion set prediction reasonably well except near the dip in halo abundance below around 10 3 M ⊙ , which is associated with a feature in the power spectrum (see Fig.10).

Figure A1 .
Figure A1.Testing the degree to which numerically sampled set halo mass functions are converged.As in Sec. 3, we plot the differential fraction d  /d ln  of particles deemed to reside in haloes of mass , considering both the spherical collapse threshold  c = 1.686 (crosses) and the ellipsoidal collapse threshold  c = 1.686  ec (, ) (squares).Each sequence of 5 differently coloured points marks d  /d ln  at the same mass ; the horizontal separation is only for visual convenience.We evaluate these d  /d ln  for different numbers  of logarithmically spaced mass scales:  = 200 (purple), 400 (red), 800 (green), 1600 (orange), and 3200 (blue).The corresponding steps in ln  and ln  are indicated in the respective panels.The left-hand and right-hand panels consider power spectra  ( ) ∝   with  = −1 and −2.5, respectively.The upper, central, and lower panels use top-hat, Gaussian, and sharp -space windows, respectively.For spherical collapse, we generate these d  /d ln  using a sample of 10 5 trajectories in , while for ellipsoidal collapse, we use only 10 4 trajectories in    , which leads to a higher level of statistical noise.For the sharp- window (bottom) with the spherical collapse threshold (crosses), we mark the exact analytic d  /d ln  with horizontal lines.For this window, it is difficult to achieve convergence even for very high  , a consequence of how abruptly the trajectories in  can jump (see Fig.1).However, for the top-hat and Gaussian windows, the trajectories in  are much smoother, and the d  /d ln  appear to be reasonably well converged as long as Δ ln  ≲ 0.01.

Table 2 .
Diemer & Kravtsov (2015)limits for the scale-free simulations ofDiemer & Kravtsov (2015)employed in this work compare to the characteristic mass scale  * (on which the rms variance of  is 1).Here we evaluate  * using a top-hat window.
27)and let d  /d ln  be the differential fraction of these trajectories that first cross  c at  (which depends on ).Since  is a function of , which is in turn a function of the mass scale, d  /d ln  is a conditional mass function of the sort discussed in Sec.4.2, useful for studying halo clustering bias and the mass functions of halo progenitors.