Improving the Accuracy of Halo Mass Based Statistics For Fast Approximate N-body Simulations

Approximate N-body methods, such as FastPM and COLA, have been successful in modelling halo and galaxy clustering statistics, but their low resolution on small scales is a limitation for applications that require high precision. Full N-body simulations can provide better accuracy but are too computationally expensive for a quick exploration of cosmological parameters. This paper presents a method for correcting distinct haloes identified in fast N-body simulations, so that various halo statistics improve to percent-level accuracy. The scheme seeks empirical corrections to halo properties such that the virial mass is the same as that of the corresponding halo in a full N-body simulation. The modified outer density contour of the corrected halo is determined on the basis of the FastPM settings and the number of particles inside the halo. This method only changes some parameters of the halo finder and does not require any extra CPU cost. We demonstrate that the adjusted halo catalogues of FastPM simulations significantly improve the precision of halo mass-based statistics from redshifts $z=0.0$ to $1.0$, and that our calibration can be applied to different cosmologies without needing to be recalibrated.


INTRODUCTION
Numerical simulations are a key tool for cosmological studies to determine the fundamental characteristics of the universe (see e.g. Angulo & Hahn 2022, for a review). These simulations can be used to generate mock catalogues of the real universe with varying cosmological parameters. Full N-body simulations with large volume and high mass resolution are still very expensive in terms of CPU time and disk storage (e.g. Potter et al. 2017; Heitmann et al. 2019; Ishiyama et al. 2021; Hernández-Aguayo et al. 2023). Approximate N-body solvers (Colavincenzo et al. 2019; Lippich et al. 2019; Monaco 2016) provide an alternative to full N-body simulations to quickly produce realisations of the large-scale structure (LSS). These approximate methods include low-resolution N-body techniques such as FastPM (Feng et al. 2016; Chartier et al. 2021) and COLA (Tassev et al. 2013; Howlett et al. 2015; Izard et al. 2016, 2018; Wright et al. 2023; Ding et al. 2024), and schemes based solely on Lagrangian perturbation theory such as PINOCCHIO (Monaco et al. 2002; Rizzo et al. 2017). More recent developments include the use of deep neural networks to attempt to predict the formation of nonlinear structures of the universe (e.g. He et al. 2019).
Although the halo catalogues from approximate simulations are correlated with those from full N-body simulations with the same initial conditions, the dark matter distribution on small scales is substantially less accurate. Increasing the particle-mesh resolution and performing finer timesteps may enhance the accuracy on small scales, but the resulting improvement may not be worth the additional computational cost if the precision stays below the several percent level for the dark-matter power spectrum at scales $k > 1\,h\,{\rm Mpc}^{-1}$. The reason for a deficit of the predicted matter power spectrum at these scales in the approximate methods is their inability to resolve the detailed density profile of haloes around the centre (e.g. Neto et al. 2007), as well as the substructure they contain (e.g. Gao et al. 2012).
★ E-mail: guohong@shao.ac.cn
† E-mail: vspringel@mpa-garching.mpg.de
For the same reason, the haloes actually reach a modified virial equilibrium where the density profile is much shallower than the NFW profile (Navarro et al. 1997) expected from high-resolution simulations, and this leads to an underestimate of the halo mass function compared to full N-body simulations. These defects limit the utility of the approximate simulations for cosmological applications in studies of large-scale structure, weak gravitational lensing, the SZ effect, etc. To retain the advantages of approximate methods, there have been efforts to improve their accuracy while preserving their low computational cost. Dai et al. (2018, 2020) proposed a gradient-based approach that mimics the short-range force that is absent in FastPM simulations. After calibration, the matter power spectrum can be improved to a precision of a few percent for a range of wave numbers $k$ and different redshifts. However, the free parameters in this method must be calibrated for different simulation settings. Furthermore, their research focused on the properties and substructures of FOF haloes. Fiorini et al. (2021) modified the FOF haloes identified in COLA to spherical overdensity haloes to facilitate research on modified gravity models.
This paper presents a method for correcting the mass $M_{200}$ of individual haloes, so that the accuracy of basic halo statistics is improved to the percent level when compared to full N-body simulations. Our approach is based on the fact that the particle distribution on small scales is less concentrated in approximate simulations, meaning that the halo radii contain fewer particles than in full N-body simulations. As a result, the material belonging to the halo does not reach a density contrast of 200 times the critical density, but only a lower value. To quantify this effect, we ran FastPM simulations with different particle numbers and cosmological parameters to compare them with full N-body simulations. This enabled us to study the relation between the effective halo density and the number of particles in the haloes. This, in turn, allows us to apply a fiducial halo overdensity, resulting in a corrected halo mass estimate. An advantage of this method is that its free parameters depend only on the starting redshift and the number of time steps used for the FastPM simulation and are not significantly influenced by other cosmological parameters.
This paper is organised as follows. We first describe in Section 2 the details of our correction method for approximate simulations and then investigate its performance for different halo statistics, including the halo mass function and the halo clustering power spectrum in real and redshift spaces. In Section 3, we run a set of five pairs of full N-body and matching FastPM simulation boxes with different cosmological models to test the applicability of our method to different cosmologies. In Section 4, we then extend the correction method to higher redshifts. In Section 5, we apply our method to the creation of mock galaxy catalogues with halo occupation distribution modelling. Finally, we discuss our results in Section 6, and summarise in Section 7.

METHODS
In this section, we provide a brief overview of the approximate FastPM approach and discuss the details of the full N-body simulations of the IllustrisTNG300-Dark series, which we use for comparison. We use the same initial conditions for the N-body simulation and run FastPM for a range of particle numbers to empirically determine the relationship between the halo density, the halo particle number, and the total particle number density.

Approximate FastPM method
FastPM (Feng et al. 2016) is a powerful and efficient approximate particle-mesh N-body solver that uses modified kick and drift operators to ensure that linear growth is accurately reproduced on large scales, even with a limited number of time steps. It has the same initial condition generator as the N-GenIC (Springel 2015) and AREPO codes (Springel 2010; Weinberger et al. 2020), meaning the same initial conditions are generated when using the same random number seed. Alternatively, FastPM can also be used with externally prescribed initial conditions, such as a linear density perturbation field at $z = 0$ that is then scaled back to the starting redshift.
We opt for a linear time step in the scale factor for our FastPM simulations, which produces better halo statistics than logarithmic time steps (Feng et al. 2016). We ran all FastPM simulations from $z = 9$ to $z = 0$ in 40 steps. The Fourier mesh size is set to four times the number of particles per dimension, i.e. Nmesh=4. For example, a simulation of $1024^3$ particles has a mesh size of $4096^3$. We have tested cases with different Nmesh, and the results indicate that Nmesh=4 gives the best compromise between the quality of the halo properties and the invested computational resources.
We consider only the dark-matter simulations of TNG, which were conducted with the moving-mesh code AREPO in three different box sizes and with varying resolutions. The TNG simulations were carried out with the Planck15 cosmology (Planck Collaboration et al. 2016), with $\Omega_m = 0.3089$, $\Omega_b = 0.0486$, $h = 0.6774$, $n_s = 0.9667$, and $\sigma_8 = 0.8159$. The simulations were evolved from redshift $z = 127$ to the present day. We focus on the dark-matter-only runs TNG300-2-Dark and TNG300-3-Dark, which have $1250^3$ and $625^3$ dark matter particles, respectively, in a $205\,h^{-1}$ Mpc periodic box. We will refer to these simulations as TNG.
We compare the halo properties between FastPM and N-body simulations by running FastPM simulations with the same initial conditions as TNG. Unfortunately, FastPM cannot work with an uneven number of particles, such as $625^3$. To compensate for this, we run FastPM with slightly different total particle numbers of $480^3$ and $640^3$ for TNG300-3-Dark ($625^3$), and with $960^3$ and $1280^3$ for TNG300-2-Dark ($1250^3$), as indicated in Table 1.
For fair comparisons, we applied the phase-space halo finder ROCKSTAR (Behroozi et al. 2013) to both FastPM and TNG. We focus on the halo mass definition $M_{200c}$, i.e. the mass of a sphere whose mean enclosed density is 200 times the critical density, $\rho_{\rm cr}$.
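As a small illustration of the $M_{200c}$ definition, the following sketch converts a halo mass into the radius of the sphere whose mean density is 200 times the critical density. The function name `r200c`, the argument `ez2`, and the numerical value of the critical density are our own illustrative choices, not part of ROCKSTAR.

```python
import math

RHO_CRIT_H2 = 2.775e11  # critical density today in h^2 Msun / Mpc^3 (standard value)

def r200c(m200c, h=0.6774, ez2=1.0):
    """Radius in Mpc of the sphere with mean density 200 * rho_cr.

    m200c is in Msun; ez2 = (H(z)/H0)^2 rescales rho_cr and equals 1 at z = 0.
    """
    rho_cr = RHO_CRIT_H2 * h**2 * ez2              # Msun / Mpc^3
    volume = m200c / (200.0 * rho_cr)              # Mpc^3
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

print(f"R200c of a 1e14 Msun halo: {r200c(1e14):.3f} Mpc")
```

Any other overdensity threshold can be handled by replacing the factor 200, which is the degree of freedom the correction method exploits.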

One-to-one halo matches
When the same initial conditions are applied, the large-scale structures of the TNG and FastPM simulations agree with each other (Feng et al. 2016). However, the one-to-one halo comparisons between FastPM and full N-body simulations may differ depending on the cosmological parameters and mass resolutions. To quantify the discrepancies on a one-to-one halo basis, we need to identify pairs of cross-matched haloes in the TNG and FastPM simulations, which were both run with the same initial conditions. For a given main halo in TNG with a radius of $R_{200c}$, we search for all the neighbouring haloes within $2R_{200c}$ in the FastPM simulation box. If more than one halo is found, we select the most massive halo as the appropriate match.
In Figure 1, we show a halo with a mass of approximately $M_{200c} \sim 10^{14}\,{\rm M}_\odot$ to demonstrate the similarity between FastPM and TNG. The spatial positions and shapes of the haloes in both simulations are comparable. However, the halo masses in FastPM are usually much lower than those in TNG, and the percentage difference in mass decreases with the halo mass. Figure 2 shows examples of halo density profiles for TNG haloes (solid lines) and FastPM haloes.
The upper panel of Figure 3 displays the ratio of the halo masses of the corresponding pairs when the haloes in TNG are matched with FastPM. It is evident that the masses of most haloes in the FastPM simulation are lower than the corresponding ones in TNG, and that the deficit is more pronounced for less massive haloes. The middle panel shows the distance of the matched FastPM haloes to the corresponding TNG halo divided by the radius of the TNG halo, with the blue line showing the running median value. For massive haloes, the one-to-one matched haloes in the FastPM simulation are always situated within the virial sphere of the corresponding TNG halo, and the pair distance is always very small. For low-mass haloes, the distances become significantly larger in terms of the TNG halo radius, and there are more haloes located outside of the original TNG halo sphere. Although the average distance remains within the virial sphere of the primary TNG halo, the number of misidentifications probably starts to increase.
To obtain reliable statistics, we only consider halo pairs as valid matches when they are separated by no more than one halo radius and the mass ratio $M_{\rm TNG}/M_{\rm FastPM}$ is between 0.5 and 2. The bottom panel of Figure 3 shows the percentage of one-to-one halo matches that meet this reliability criterion. As expected, this fraction decreases with decreasing halo mass. However, for a wide range of halo masses, the percentage of successfully identified one-to-one halo matches is greater than 90%.
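The matching rule described above can be sketched as follows. This is a brute-force illustration under our own assumptions (array layout, the function name `match_haloes`, and a periodic cubic box); for full catalogues a spatial tree would be the practical choice.

```python
import numpy as np

def match_haloes(pos_ref, r_ref, pos_fast, mass_fast, box):
    """For each reference (TNG-like) halo, return the index of the most
    massive fast-simulation halo within 2 * r_ref, or -1 if none is found.

    pos_ref, pos_fast: (N, 3) positions; r_ref: (N,) radii; box: box length.
    """
    matches = np.full(len(pos_ref), -1, dtype=int)
    for i, (p, r) in enumerate(zip(pos_ref, r_ref)):
        d = pos_fast - p
        d -= box * np.round(d / box)            # periodic minimum-image offset
        dist = np.sqrt((d ** 2).sum(axis=1))
        near = np.flatnonzero(dist < 2.0 * r)   # candidates inside 2 * R200c
        if near.size:
            matches[i] = near[np.argmax(mass_fast[near])]
    return matches
```

Selecting the most massive candidate rather than the nearest one guards against pairing a large halo with a nearby small fragment.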
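The reliability criterion for matched pairs translates directly into a boolean mask. The sketch below assumes per-pair arrays of separation, TNG radius, and the two masses; the function name `reliable_pairs` is hypothetical.

```python
import numpy as np

def reliable_pairs(sep, r_tng, m_tng, m_fast):
    """True where a matched pair is separated by at most one TNG halo radius
    and the mass ratio M_TNG / M_FastPM lies between 0.5 and 2."""
    ratio = m_tng / m_fast
    return (sep <= r_tng) & (ratio >= 0.5) & (ratio <= 2.0)

# Toy example: only the first pair passes both cuts.
sep = np.array([0.5, 1.5, 0.2])
r_tng = np.array([1.0, 1.0, 1.0])
m_tng = np.array([1e13, 1e13, 1e13])
m_fast = np.array([8e12, 9e12, 3e12])
mask = reliable_pairs(sep, r_tng, m_tng, m_fast)
print("robust fraction:", mask.mean())
```

Taking the mean of the mask in bins of halo mass gives the matched fraction of the kind shown in the bottom panel of Figure 3.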

Empirical halo mass correction
Figure 3. Middle panel: Distance of the halo centres of pairs of matched haloes in FastPM and TNG simulations, in units of the radius of the halo in the TNG simulation. The blue line marks a running median value. Most haloes in the FastPM simulation are located within the virial sphere of the corresponding TNG halo, and the more massive the halo, the smaller the spatial offsets tend to be. Lower panel: Fraction of matched halo pairs in the FastPM and TNG simulations that can be considered robust identifications. Even for low particle numbers per halo, more than $\sim 90$ percent of the haloes can be robustly matched across the simulations.
We attempt to match the mass of the haloes in FastPM to those in TNG by drawing spheres of increasing radius around the halo centre in FastPM and calculating the enclosed mass. When the mass inside the sphere is equal to the halo mass in TNG, we assume that the new haloes contain a similar amount of material as in TNG. However, even when the halo masses are equal, other halo properties may still differ.
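The sphere-growing step can be sketched as below, assuming equal-mass particles and a list of particle distances from the FastPM halo centre. The function name `effective_overdensity` and its return convention are our own illustrative choices.

```python
import numpy as np

def effective_overdensity(radii, m_particle, m_target, rho_cr):
    """Grow a sphere around the halo centre until the enclosed mass reaches
    m_target (the mass of the matched TNG halo).

    Returns (radius, mean enclosed density / rho_cr), or None if the particle
    set does not contain enough mass. Radii must be positive.
    """
    r = np.sort(np.asarray(radii, dtype=float))
    n_needed = int(np.ceil(m_target / m_particle))   # particles needed
    if n_needed < 1 or n_needed > r.size:
        return None
    r_match = r[n_needed - 1]                        # radius of the last particle
    m_enc = n_needed * m_particle
    mean_rho = m_enc / (4.0 / 3.0 * np.pi * r_match ** 3)
    return r_match, mean_rho / rho_cr
```

Because FastPM haloes are less concentrated, the sphere must extend further out than in TNG, so the resulting overdensity falls below 200.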
Figure 4. The dashed lines give our corresponding fitting formula, which is meant to match this empirical finding, adjusted however at the low particle number end in order to avoid an overcorrection of the halo abundance there. The different colours correspond to different particle resolutions employed in the FastPM simulations.
To gain a more concise representation of the mean correction, we calculated the density inside the newly defined haloes and studied the connection between the mean enclosed density and the original particle number of the halo. Figure 4 shows the relationship between the number of particles and the overdensity of the corrected haloes determined in our FastPM simulation set, with the colours corresponding to the results for different total numbers of particles used for the FastPM simulations, as labelled. We can observe that the inferred density of small haloes is usually lower than that of larger ones, until the haloes become large enough and a plateau appears. We can describe the behaviour of this empirical relation with three parameters: a slope for the rising part at low masses, a turning point in the number of particles at which an approximately constant value is attained, and a height $h$ for the overdensity value of the emerging plateau. As a function of the number of particles inside the original FastPM halo, a fitting formula for the average overdensity of the empirically corrected FastPM haloes is given by Eq. (1), where we denote $f_c$ as the overdensity of the corrected halo. For each halo in the original FastPM simulation, we assign an overdensity $f_c$ according to Eq. (1) based on its particle number. The new halo mass and its other properties are then calculated from the sphere with an enclosed density of $f_c\,\rho_{\rm cr}$.
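Once an overdensity $f_c$ has been assigned to a halo, the corrected mass follows from a spherical overdensity measurement at that threshold. The following is a minimal sketch with equal-mass particles; it is not ROCKSTAR's implementation, and the function name `so_mass` is hypothetical.

```python
import numpy as np

def so_mass(radii, m_particle, f_c, rho_cr):
    """Spherical overdensity mass: mass inside the largest particle radius at
    which the mean enclosed density still exceeds f_c * rho_cr.

    Returns (mass, radius); (0.0, 0.0) if the threshold is never reached.
    Radii are distances from the halo centre and must be positive.
    """
    r = np.sort(np.asarray(radii, dtype=float))
    m_enc = m_particle * np.arange(1, r.size + 1)          # cumulative mass
    mean_rho = m_enc / (4.0 / 3.0 * np.pi * r ** 3)        # mean enclosed density
    inside = np.flatnonzero(mean_rho >= f_c * rho_cr)
    if inside.size == 0:
        return 0.0, 0.0
    last = inside[-1]
    return float(m_enc[last]), float(r[last])

radii = [0.1, 0.2, 0.3, 0.4]
print(so_mass(radii, 1.0, 200.0, 1.0))   # strict threshold, small sphere
print(so_mass(radii, 1.0, 20.0, 1.0))    # lower threshold, larger mass
```

Lowering the threshold from 200 to a smaller $f_c$ enlarges the sphere and therefore raises the halo mass, which is the direction of correction needed for FastPM haloes.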
The parametric form of Eq. (1) can be used to fit the data of the filtered set of halo pairs. However, when applied to all haloes in a FastPM simulation, the resulting mass function does not match that from a full N-body simulation at the low-mass end. This is due to the coarser resolution of the structures in the FastPM box, which leads to a higher rate of spurious haloes and a larger number of smaller haloes that have not yet merged. This effect is clearly illustrated in Figure 7, where the halo located in the centre of the TNG panel is incorrectly identified as two smaller haloes in FastPM. To address this problem, we found that changing the slope value to $\simeq 45$ and keeping the turning point at $10^3$ particles (as suggested in Figure 4) produces the best reproduction of the halo mass function and clustering of the TNG haloes.
The height of the plateau $h$ is slightly influenced by the resolution of the FastPM simulation, as demonstrated in Figure 5. This figure shows the connection between $h$ and the mean particle separation of the simulation, with the following fitting formula,
$$h = 26.81\,\log\left(\frac{\text{mean particle separation}}{h^{-1}\,{\rm Mpc}}\right) + 166.11. \qquad (2)$$
Here $h$ on the left-hand side is the plateau height, while $h^{-1}$ Mpc on the right-hand side is the usual length unit. The mean particle separation represents the resolution of the simulation, and the height $h$ is found to depend linearly on its logarithm. This implies that a fitting formula for $f_c$ can be easily derived when the particle number density is changed. It is important to note that the configuration of the FastPM parameters has a significant impact on the fitting parameters. We also emphasise that the fitting parameters depend on the adopted Nmesh value, which we set as four times the particle number per dimension. In this sense, the force resolution and the mass resolution are coupled in this study.
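To make the resolution dependence of Eq. (2) concrete, the sketch below evaluates the plateau height for the four FastPM particle numbers in the $205\,h^{-1}$ Mpc box. It assumes that the mean particle separation is the box length divided by the number of particles per dimension and that the logarithm is base 10; the base is not stated explicitly in the text, so this is an assumption, as is the function name `plateau_height`.

```python
import math

def plateau_height(box_size, n_per_dim):
    """Plateau overdensity from Eq. (2).

    box_size in h^-1 Mpc; the mean particle separation is box_size / n_per_dim.
    ASSUMPTION: the logarithm in Eq. (2) is taken to be base 10.
    """
    sep = box_size / n_per_dim
    return 26.81 * math.log10(sep) + 166.11

for n in (480, 640, 960, 1280):
    print(n, round(plateau_height(205.0, n), 1))
```

Under these assumptions the plateau drops as the particle separation shrinks, consistent with the weak resolution dependence described above.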

𝑓 𝑐 -corrected halo catalogues
We can use the ROCKSTAR code (Behroozi et al. 2013) to find the particle groups and the spherical overdensity mass of the haloes, with the empirically determined halo overdensity $f_c$. This public code, which usually employs a constant overdensity value, can be easily modified to determine a variable spherical overdensity mass $M_{f_c}$ instead by changing its properties.c file. Our modification replaces the original definition of the halo mass $M_{500}$ in ROCKSTAR with our $f_c$-corrected value, while the original standard $M_{200}$ is still kept for comparison. We compare the standard $M_{200}$ halo catalogues of the FastPM and TNG simulations to validate the $M_{f_c}$ halo catalogues constructed for FastPM. Taking into account the limited force resolution and coarse timestepping of the approximate methods, we only analyse the properties of individual haloes without considering subhaloes. Figure 6 shows a comparison of pairs of matched haloes, considering the FastPM masses of the $M_{200}$ halo (left panel) and the $M_{f_c}$ halo (right panel) compared to the spherical overdensity masses of the corresponding haloes in TNG. The line with unit slope marks the identity. It is evident that the $M_{f_c}$ halo masses fit the TNG haloes quite well from the low-mass end to the high-mass end, while the plain $M_{200}$ halo masses are always slightly lower than those found in the TNG simulation. The matching works quite well, especially at the high-mass end, while for low-mass haloes, a larger number of spurious haloes and haloes that are not identified in the FastPM simulation create substantial scatter.
A statistical comparison is presented in Figure 8, which compares the halo mass functions of four FastPM simulations with those of their corresponding TNG simulations. FastPM runs of $480^3$ and $640^3$ particles are compared with TNG-3, while FastPM runs of $960^3$ and $1280^3$ are compared with TNG-2. The dashed lines indicate the results when the ordinary $M_{200}$ halo masses are used, whereas the solid lines are based on the $M_{f_c}$ halo masses for FastPM runs. The thin vertical lines mark halo masses corresponding to 100 particles for the four different FastPM simulations. It is evident that the $M_{200}$ halo catalogues have a large discrepancy compared to the TNG catalogues, the difference in the halo number density being greater than 10% even for the most massive haloes. In contrast, the precision of the $M_{f_c}$ catalogues is much improved, and the halo number density agrees with TNG to within 5% for almost the entire mass range, except for slightly larger deviations at the low and high mass ends.
We also analyse the halo clustering power spectrum in both real and redshift spaces for the corrected halo catalogues. Figure 9 shows the comparison of the bulk velocities of the haloes in the FastPM $1280^3$ run and the corresponding haloes in TNG-2. We randomly select one spatial direction vector for each halo and compare the velocity along this direction in the left panel, as well as the absolute value of the full velocity in the right panel. The results show that the magnitude and individual components of the velocity match well with TNG on average, so the velocity can be considered reliable for placing the haloes into redshift space.
We can compare the halo power spectra of FastPM and full N-body simulations by selecting haloes with a mass of more than 100 particles for each FastPM simulation. The results are shown in Figure 10, with the upper panel showing the power spectrum in real space and the lower panel showing the corresponding measurement in redshift space at redshift $z = 0.3$. The dashed lines represent the results for the selected $M_{200}$ haloes, while the solid lines represent the selected $M_{f_c}$ haloes. Compared to the original $M_{200}$ halo catalogues, the power spectrum of the $M_{f_c}$ halo catalogues shows a significant improvement and is accurate to within 5% for the entire range of wave numbers. Even at the smallest scale, the precision is still better than 2 percent for the power spectrum in both real and redshift spaces.
In the analysis of the halo power spectrum in redshift space, multipole moments are of significant importance. Therefore, we present the results for the quadrupole moment of the redshift-space halo power spectrum, $P_2(k)$, in Figure 11. Since the outcomes are consistent across the different particle numbers of $480^3$, $640^3$, $960^3$, and $1280^3$, we only display the results for $1280^3$, for clarity. It can be observed that both the original FastPM and the $f_c$ method closely follow the trends exhibited by TNG for the quadrupole moment. However, unlike for the real-space and redshift-space power spectra, the $f_c$ method does not yield a significant improvement in the multipole moments on large scales, but it is clearly better than the original FastPM results on small scales of $k > 1\,h\,{\rm Mpc}^{-1}$. The ratios $P_2(k)_{\rm FastPM}/P_2(k)_{\rm TNG}$ for the original FastPM and $f_c$ results are shown in the bottom panel as the yellow and blue lines, respectively. The large discrepancies at around $k \sim 0.5\,h\,{\rm Mpc}^{-1}$ are simply caused by the small values of $P_2(k)_{\rm TNG}$, which approaches zero at these scales.
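The 100-particle mass limits marked by the vertical lines can be estimated from the box size and particle number alone. The sketch below assumes a dark-matter-only run in which all of $\Omega_m$ is carried by the simulation particles, and uses the standard value of the critical density; the function name `particle_mass` is hypothetical.

```python
RHO_CRIT = 2.775e11  # critical density today in h^2 Msun / Mpc^3

def particle_mass(box, n_per_dim, omega_m=0.3089):
    """Particle mass in h^-1 Msun for a dark-matter-only box.

    box is the side length in h^-1 Mpc, n_per_dim the particle number per
    dimension. ASSUMPTION: all matter (omega_m) is in the particles.
    """
    return omega_m * RHO_CRIT * (box / n_per_dim) ** 3

for n in (480, 640, 960, 1280):
    print(n, f"{100 * particle_mass(205.0, n):.3e} h^-1 Msun for 100 particles")
```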
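Placing haloes into redshift space amounts to shifting each position along the line of sight by the peculiar velocity divided by $aH$. The sketch below uses the plane-parallel approximation and assumes a flat ΛCDM background, comoving positions in $h^{-1}$ Mpc, and peculiar velocities in km/s; the function name `to_redshift_space` is our own.

```python
import numpy as np

def to_redshift_space(x_los, v_los, z, box, omega_m=0.3089):
    """Redshift-space coordinate along the line of sight.

    x_los: comoving position in h^-1 Mpc; v_los: peculiar velocity in km/s.
    ASSUMPTION: flat LCDM, so H(z) = 100 h sqrt(Om (1+z)^3 + 1 - Om) km/s/Mpc.
    """
    a = 1.0 / (1.0 + z)
    hubble = 100.0 * np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    s = np.asarray(x_los) + np.asarray(v_los) / (a * hubble)
    return np.mod(s, box)                      # periodic wrap into the box

# At z = 0 a velocity of 100 km/s shifts a halo by 1 h^-1 Mpc.
print(to_redshift_space([10.0, 204.5], [100.0, 100.0], 0.0, 205.0))
```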
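For reference, the multipole moments discussed above follow the standard Legendre decomposition of the anisotropic power spectrum. The notation ($\mu$ for the cosine of the angle between the wave vector and the line of sight, $\mathcal{L}_\ell$ for the Legendre polynomials) is the conventional one and is assumed here rather than taken from the text.

```latex
\begin{equation*}
P_\ell(k) = \frac{2\ell+1}{2}\int_{-1}^{1} P(k,\mu)\,\mathcal{L}_\ell(\mu)\,{\rm d}\mu ,
\qquad
\mathcal{L}_2(\mu) = \frac{3\mu^2-1}{2} .
\end{equation*}
```

Because $P_2(k)$ changes sign at intermediate scales, ratios of quadrupoles become unstable near the zero crossing, which explains the large excursions around $k \sim 0.5\,h\,{\rm Mpc}^{-1}$.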

APPLICATION TO DIFFERENT COSMOLOGIES
In Section 2, we developed the $f_c$-correction method to substitute the standard $M_{200}$ halo catalogues in FastPM simulations with modified ones that are more similar to the results of full N-body simulations such as IllustrisTNG300-Dark. We now want to investigate whether this method can be used in other contexts, particularly for different cosmologies, as this would enable the utilisation of cost-effective FastPM simulations in cosmological inference. We conducted five sets of simulations with varying cosmological parameters, spanning a much wider range than is currently allowed by cosmological constraints. Full N-body simulations of this validation set were run with the AREPO code, just as the IllustrisTNG series, while FastPM runs used default settings. The five pairs of simulations had the same box size of $205\,h^{-1}$ Mpc and a total particle number of $512^3$ to reduce computational cost. The cosmological parameters of the models are listed in Table 2 for reference.
For each pair of simulations with the same cosmological parameters and initial conditions, we apply the $f_c$-correction method to the FastPM simulations to obtain halo catalogues with plain $M_{200}$ and corrected $M_{f_c}$ halo masses. Figure 12 shows the resulting halo mass functions and the halo clustering power spectrum in real space. The vertical black line in the upper panel represents a fiducial halo mass composed of 100 particles. As the particle mass differs only slightly between the cosmologies, we draw just one vertical line for simplicity. The same mass limit has also been applied to select the haloes used for calculating the clustering power spectra shown in the lower panel. The precision achieved for the $M_{f_c}$ halo catalogue is consistent with that found in Section 2, and both the halo mass function and the power spectrum agree within 5% with the full N-body results. For the model 's1', the accuracy of the correction for the halo mass function is slightly worse at the massive end, but this could still be due to small number statistics. Nevertheless, these results demonstrate that the empirically calibrated $f_c$-correction method can be applied to different cosmologies over a wide range of cosmological parameters without requiring a renewed calibration.
Figure 11. Comparisons of the quadrupole moment $P_2(k)$ between FastPM and TNG at redshift $z = 0.3$ in redshift space. The dark, yellow, and blue lines correspond to the outcomes of TNG, FastPM, and $f_c$, respectively. The top panel illustrates the power spectrum quadrupole moments, while the bottom panel displays the ratios $P_2(k)_{\rm FastPM}/P_2(k)_{\rm TNG}$ for the original FastPM and $f_c$ results.

REDSHIFT DEPENDENCE
Previously, we discussed why approximate methods are much faster than full N-body simulations. This is because they only require tens of steps, whereas full N-body simulations require thousands. In addition, FastPM uses a comparatively low force resolution (plain PM instead of, e.g., TreePM), which is faster to compute. In the setup studied, FastPM only needs 40 steps, which are linearly spaced from $a = 0.1$ to $a = 1$. This means that FastPM has very few steps at higher redshifts. In fact, it only takes 17 steps to reach $z = 1.0$, suggesting that statistics at high redshifts may not be as accurate as at $z = 0$.
Figure 12. Comparison of halo mass functions (upper panel) and halo power spectra (lower panel) for 5 sets of simulations with different cosmological parameters, as listed in Table 2. The dashed lines are the results of ordinary halo catalogues based on an overdensity of 200 relative to the critical density, whereas the solid lines are for $f_c$-overdensity values in the FastPM halo catalogues. For the 5 sets of cosmology settings explored, the precision of the halo mass function and the power spectrum stays (almost) always within 5 percent, implying that the correction method can be applied independently of cosmology for a reasonable range around the ΛCDM concordance model.
We analyse five redshifts, $z = 0.0$, 0.3, 0.5, 0.7, and 1.0, which cover the most important redshift range for cosmological studies with galaxy surveys. Applying the same methods as in Section 2, Figure 13 shows the relationship between particle number and the mean effective overdensity of the mass-matched haloes identified at different redshifts. The solid lines represent the average results for the matched pairs of haloes and demonstrate three distinct systematic characteristics. Firstly, the slope of the increase at low masses is similar for different redshifts. Secondly, the height of the plateau, $h$, remains constant. Thirdly, the turning point moves to a lower particle number with increasing redshift. Therefore, we use the same slope and height $h$ as for redshift $z = 0$, but we simply multiply the turning point by the cube of the scale factor, $a^3$, to account for its variation with redshift. The corresponding fitting formula is given by Eq. (3). The dashed lines in Figure 13 illustrate the fit results, where a shallower slope was deliberately used to prevent an excessive correction of the halo mass function at the low mass end. The halo mass function and the halo power spectra for the corrected catalogues are shown in Figure 14. For the original $M_{200}$ halo catalogues, the halo mass function and power spectra generated by FastPM become worse with increasing redshift. Although the precision of the corrected results also decreases somewhat at higher redshifts, we still find that they meet the accuracy goal of 5% for both the halo mass function and the halo power spectrum. Compared to results at other redshifts, the results for $z = 1.0$ are less accurate, particularly at the massive end of the halo mass function but also for the clustering power spectrum. However, this is not unexpected, since FastPM only invests 17 steps to obtain the results at redshift $z = 1.0$, and thus the halo mass function is expected to have a poor accuracy for the most non-linear objects.
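The step count quoted above follows from the linear spacing in the scale factor. A minimal check, with the hypothetical helper `steps_to_redshift` counting the complete steps taken before a target redshift is reached:

```python
def steps_to_redshift(z_target, z_start=9.0, z_end=0.0, n_steps=40):
    """Number of complete, linearly spaced scale-factor steps taken between
    z_start and z_target, for a run of n_steps from z_start to z_end."""
    a_start = 1.0 / (1.0 + z_start)
    a_end = 1.0 / (1.0 + z_end)
    da = (a_end - a_start) / n_steps           # constant step in scale factor
    a_target = 1.0 / (1.0 + z_target)
    # Small offset guards against floating-point round-off at exact multiples.
    return int((a_target - a_start) / da + 1e-9)

print(steps_to_redshift(1.0), "of 40 steps are completed before z = 1")
```

With $\Delta a = 0.0225$, the interval from $a = 0.1$ to $a = 0.5$ spans about 17.8 steps, so fewer than half of the steps are spent above $z = 1$.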
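The redshift scaling of the turning point described in this section is simple enough to tabulate. The sketch below takes the $z = 0$ turning point of $10^3$ particles from Section 2 and multiplies it by $a^3$; the function and argument names are hypothetical, and the remaining terms of Eq. (3) are not reproduced here.

```python
def turning_point(z, n_turn_z0=1.0e3):
    """Turning-point particle number at redshift z: the z = 0 value
    multiplied by the cube of the scale factor a = 1 / (1 + z)."""
    a = 1.0 / (1.0 + z)
    return n_turn_z0 * a ** 3

for z in (0.0, 0.3, 0.5, 0.7, 1.0):
    print(z, round(turning_point(z)))
```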

HALO OCCUPATION DISTRIBUTION MODELING
The halo occupation distribution (HOD) approach is a statistical technique used to model the distribution of galaxies within dark matter haloes (see e.g. Jing et al. 1998; Berlind & Weinberg 2002; Cooray & Sheth 2002; Yang et al. 2004; Zheng et al. 2007; Zehavi et al. 2011; Guo et al. 2015). The method prescribes the probability of a dark matter halo of a given mass hosting a certain number of galaxies. HOD models have been used to investigate a variety of astrophysical phenomena, from understanding galaxy clustering and the shape of galaxy power spectra to tracing the evolution of cosmic structures over time. Additionally, HOD models are an essential tool for interpreting the clustering patterns of galaxies in large-scale surveys, giving us insight into the underlying cosmological framework and the relationship between dark matter and luminous structures in the universe.
In order to model the clustering of galaxies, we use the HOD approach and the parameterisation proposed by Zheng et al. (2007). This involves splitting the mean occupation function, $\langle N_{\rm tot}(M)\rangle$, which is the average number of galaxies in a given sample located in haloes of mass $M$, into two components, central galaxies $\langle N_{\rm cen}(M)\rangle$ and satellite galaxies $\langle N_{\rm sat}(M)\rangle$, with the functional forms given by Zheng et al. (2007). This model has three parameters for satellite galaxies: the cutoff mass scale $M_0$, the normalisation mass scale $M_1'$, and the power-law slope $\alpha$ at the high-mass end. The central galaxy occupation function is characterised by the cutoff halo mass $M_{\min}$ and the scatter between the galaxy luminosity and the halo mass, $\sigma_{\log M}$. For testing purposes, we used the values of these HOD parameters for galaxy samples with $r$-band luminosity thresholds $M_r < -19$, $-20$, and $-21$ listed in Table 2 of Guo et al. (2015). The halo masses are all expressed in units of $h^{-1}\,{\rm M}_\odot$. We then employ the Python package Halotools (Hearin et al. 2017) to generate our galaxy catalogues and measure the projected galaxy correlation function $w_p$.
To further facilitate comparisons, we also applied the abundance matching method. We first rank order all the FastPM haloes by their uncorrected masses and then use abundance matching with the TNG simulation to assign the "corrected" halo masses. We compare the HOD model predictions from the original FastPM, the $f_c$ method, and the abundance matching results in Figure 15. For each set of HOD parameters, we generate 100 mock galaxy catalogues with different random number seeds to estimate the measurement errors for the TNG300-2-Dark simulation, the original FastPM halo catalogue, the abundance matching halo catalogue and the $f_c$-corrected halo catalogue with $1280^3$ particles (see Table 1). The mean galaxy number density for each of the three luminosity thresholds in each simulation is presented in Table 3. The galaxy number density in the original FastPM halo catalogues is usually more than 10% lower than in the N-body simulation, while the $f_c$ correction improves the accuracy to within 1 percent or better. As for the abundance matching method, the number density almost equals the one in the TNG simulation.
Figure 15. The ratios $w_{p,{\rm Fast}}/w_{p,{\rm TNG}}$ for the original FastPM (dashed lines), the $f_c$ method (solid lines) and the abundance matching method (dot-dashed lines). The error bars denote standard errors from counting statistics. From top to bottom, the three panels represent the results for the three luminosity threshold samples of $M_r < -19$, $M_r < -20$ and $M_r < -21$, respectively. We can observe that the abundance matching method shows some improvement over the original halo catalogue, particularly in enhancing the representation of high-luminosity galaxies. However, it does not significantly enhance the statistical properties of low-luminosity galaxies. When the $f_c$ method is employed, the accuracy of the measurement is improved on all scales.
We computed the projected galaxy correlation function $w_p(r_p)$ by integrating the 3D two-point correlation function along the line of sight up to a maximum distance of $40\,h^{-1}$ Mpc, as described in Guo et al. (2015). We show the ratios $w_{p,{\rm Fast}}/w_{p,{\rm TNG}}$ for the original FastPM (dashed lines), the $f_c$ method (solid lines) and the abundance matching method (dot-dashed lines) in Figure 15. The three panels correspond to the three luminosity thresholds of $M_r < -19$, $M_r < -20$ and $M_r < -21$, respectively. The abundance matching method shows some improvement over the original halo catalogue, particularly for high-luminosity galaxies, but it does not significantly improve the statistics of lower-luminosity galaxies. When the $f_c$ method is employed, measurements on both large and small scales are significantly improved. The deviation from the TNG simulation is around 1% on large scales and less than 5% on small scales. This finding confirms that the empirically calibrated $f_c$-correction approach can be used for HOD schemes, and demonstrates that the method is an effective tool for creating mock galaxy catalogues quickly.
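As an illustration of the five-parameter model named above, the sketch below implements the mean occupation functions in the form commonly attributed to Zheng et al. (2007): an error-function step for centrals and a power law, modulated by the central occupation, for satellites. We assume this standard form; the parameter values in the usage lines are purely illustrative and are not the Guo et al. (2015) values.

```python
import math

def n_cen(m, log_mmin, sigma_logm):
    """Mean central occupation: smooth step in log10 halo mass."""
    return 0.5 * (1.0 + math.erf((math.log10(m) - log_mmin) / sigma_logm))

def n_sat(m, log_mmin, sigma_logm, m0, m1p, alpha):
    """Mean satellite occupation: power law above the cutoff mass m0,
    multiplied by the central occupation (assumed standard form)."""
    if m <= m0:
        return 0.0
    return n_cen(m, log_mmin, sigma_logm) * ((m - m0) / m1p) ** alpha

# Illustrative parameters only (masses in h^-1 Msun).
params = dict(log_mmin=12.0, sigma_logm=0.2)
for m in (1e11, 1e12, 1e13, 1e14):
    total = n_cen(m, **params) + n_sat(m, m0=1e12, m1p=1e13, alpha=1.0, **params)
    print(f"{m:.0e}", round(total, 3))
```

In a mock catalogue, centrals are typically drawn from a Bernoulli distribution and satellites from a Poisson distribution with these means.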
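The abundance matching step used for comparison can be sketched as a rank-order reassignment of masses. Since both simulations share the same volume, matching by rank is equivalent to matching by cumulative number density; the function name `abundance_match` and the handling of unequal catalogue lengths are our own assumptions.

```python
import numpy as np

def abundance_match(m_fast, m_ref):
    """Assign to the i-th most massive fast-simulation halo the mass of the
    i-th most massive reference halo (same simulation volume assumed).
    Haloes without a counterpart in rank receive NaN."""
    m_fast = np.asarray(m_fast, dtype=float)
    ref_sorted = np.sort(np.asarray(m_ref, dtype=float))[::-1]   # descending
    order = np.argsort(m_fast)[::-1]                             # ranks of fast haloes
    n = min(order.size, ref_sorted.size)
    corrected = np.full(m_fast.size, np.nan)
    corrected[order[:n]] = ref_sorted[:n]
    return corrected

print(abundance_match([1.0, 5.0, 3.0], [10.0, 2.0, 7.0]))
```

By construction this reproduces the reference mass function, but it leaves the ranking of the fast haloes unchanged.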
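For reference, the line-of-sight integration described above is conventionally written as follows, where $r_p$ and $\pi$ denote the separations perpendicular and parallel to the line of sight. The symbols $\pi$ and $\pi_{\max}$ are the customary notation and are assumed here rather than taken from the text.

```latex
\begin{equation*}
w_p(r_p) = 2\int_{0}^{\pi_{\max}} \xi(r_p,\pi)\,{\rm d}\pi ,
\qquad
\pi_{\max} = 40\,h^{-1}\,{\rm Mpc} .
\end{equation*}
```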

DISCUSSION
In principle, the overdensity correction affects more than just the halo mass; it also changes properties such as the halo radius, the scale radius, the spin parameter and the formation time. We found that the halo radius $R_{200}$ associated with $M_{200}$ was smaller in the original FastPM simulation than in the full N-body simulations. With the correction, however, the new halo radius was more consistent with the full N-body value.
Although reasonable estimates of the halo radius were obtained, the halo scale radius and concentration parameter in the FastPM simulations cannot be accurately determined as a result of insufficient small-scale resolution.
We also compare the spin parameter ($\lambda$) in the original and corrected FastPM halo catalogues and find that the agreement with the full N-body results is improved with the overdensity correction. However, for the halo formation time (defined as the epoch when a halo reaches half of its peak mass over the whole merger tree), both the corrected and uncorrected FastPM haloes show considerable scatter compared to the full N-body measurements. This implies that the halo formation time should be used with caution in the FastPM simulations.
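The formation time definition quoted above can be made concrete with a short sketch. It assumes that the mass history along a single branch of the merger tree is available as arrays of redshift and mass, and it interpolates linearly in redshift between snapshots; both the function name and the interpolation are illustrative assumptions rather than the procedure used in this work.

```python
import numpy as np

def formation_redshift(redshifts, masses):
    """Redshift at which a halo first reaches half of its peak mass.

    redshifts, masses : mass history along one merger-tree branch (any order).
    """
    z = np.asarray(redshifts, dtype=float)
    m = np.asarray(masses, dtype=float)
    # Order snapshots forward in time (highest redshift first).
    order = np.argsort(-z)
    z, m = z[order], m[order]

    half_peak = 0.5 * m.max()
    # First snapshot at which the mass is at or above half of the peak value.
    i = int(np.argmax(m >= half_peak))
    if i == 0:
        return z[0]  # already above the threshold at the earliest snapshot
    # Linear interpolation in redshift between the bracketing snapshots.
    frac = (half_peak - m[i - 1]) / (m[i] - m[i - 1])
    return z[i - 1] + frac * (z[i] - z[i - 1])

z_form = formation_redshift([0.0, 0.5, 1.0, 2.0], [10.0, 8.0, 4.0, 1.0])
```

Because the threshold depends on the peak mass, any mass bias that varies along the history propagates directly into the inferred formation redshift, which is one reason why this quantity is sensitive to limited resolution.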
The present work focused only on the FastPM technique. However, simulations created with L-PICOLA (Howlett et al. 2015) should have similar features and should be adjustable with a comparable correction approach. We also expect that halo catalogues produced with COLA could be corrected with the direct method presented here, modulo a possible need for recalibration. Since subhaloes are not well resolved in such rapid simulation techniques, including substructures will require additional effort to calibrate the halo catalogues. We postpone this exploration to future work.

CONCLUSIONS
This paper proposes a simple method for rectifying the mass definition of haloes identified in approximate simulations such as FastPM. Our goal is to make the halo catalogues generated from these cost-effective simulations more similar to those from full N-body simulations, particularly in terms of the halo mass function and halo clustering. The corrected halo catalogues can then be used as input for galaxy assignment models, such as HOD modelling, to create more accurate mock galaxy surveys. Our conclusions can be summarised as follows.
(i) The halo masses in the FastPM simulations are significantly underestimated at the low-mass end. Without any corrections, the halo mass functions are underestimated by more than 10% for low-mass haloes (Figure 3).
(ii) We propose a correction method for the definition of halo overdensity (Equation 3), which is based on comparing the halo masses from the FastPM simulations and the full N-body simulations of the IllustrisTNG suite. This method is accurate (within 5% in most cases) and can be used to obtain accurate mass definitions for the FastPM simulation outputs at different redshifts (up to $z = 1$) and for different cosmologies.
(iii) We tested simulations with different resolutions and found that the halo power spectra measured at various redshifts and cosmologies using the overdensity-corrected FastPM haloes are in agreement with the full N-body results, as demonstrated in Figures 12 and 14.
(iv) We further tested the HOD modelling approach on the FastPM haloes that had been adjusted with the overdensity correction. With this correction, the FastPM haloes reproduce the clustering results of the full N-body simulations to within 5%. This is a considerable improvement over the original FastPM halo catalogues in terms of small-scale clustering measurements. The correction method proposed in this work can therefore be very beneficial in creating galaxy mock catalogues for various cosmologies and redshifts for upcoming surveys.

Figure 1. Visualisation of the particle distribution in a slice from TNG (left panel) and FastPM (right panel), at the same position in the simulation box. In the left panel, the black circle marks a halo with an enclosed overdensity of 200 times the critical density in TNG. In the right panel, the black circle is the exact same circle as in the left panel, whereas the blue circle gives the boundary of the halo in FastPM when constructed with an equal overdensity value of 200. The green circle, on the other hand, is the corresponding boundary of a halo with the corrected overdensity, which (on average) contains the same mass as TNG haloes of this size. Unlike the spherical overdensity masses, the locations and shapes of haloes in FastPM agree well with those in TNG.

Figure 2. Exemplary halo density profiles for TNG (solid lines) and FastPM (dotted lines) haloes. Colors correspond to different halo mass bins, and the vertical lines are the corresponding halo radii $R_{200}$. The density near the centers of FastPM haloes is washed out and much lower than for the full N-body TNG haloes. The mass missing in the core is instead spread out more smoothly around the edge and outside of the haloes.

Figure 3. Upper panel: Ratio of the FastPM halo mass and the corresponding TNG halo mass in matched pairs of haloes, as a function of TNG halo mass. The masses of most haloes in the FastPM simulation are measured to be lower than the corresponding ones in TNG. The shaded regions correspond to the 25, 50 and 80 percentiles of the distribution around the median values.

Figure 4. The relation between particle number and the effective halo overdensity required for a FastPM halo to reach exactly the same $M_{200}$ halo mass as found for the corresponding halo in a full N-body simulation. The solid lines are the average values of these overdensities for the mass-matched haloes. The dashed lines give our corresponding fitting formula, which is meant to match this empirical finding but is adjusted at the low particle number end in order to avoid an overcorrection of the halo abundance there. The different colors correspond to different particle resolutions employed in the FastPM simulations.

Figure 5. Mean values of the overdensity plateau in Figure 4, for different particle densities of the underlying FastPM simulations. The blue line is a log-linear fit to the measurements, which allows the resolution dependence to be accounted for in Eq. (1).

Figure 6. Left panel: Comparison of the spherical overdensity halo masses obtained for FastPM and TNG simulations in matched halo pairs when using the canonical overdensity of 200 with respect to the critical density in both cases. Right panel: Here the FastPM masses obtained with the corrected overdensity are compared to the TNG halo masses. The oblique black line marks the identity relation in both cases. The overdensity-corrected halo masses fit the TNG halo masses rather well, as desired.

Figure 7. Illustration of the spurious haloes in FastPM caused by its lack of small-scale modes.The halo in the centre of the TNG simulation (left panel) is wrongly identified as two smaller haloes in the FastPM simulation (right panel).This is caused by the decreased halo concentration in FastPM as indicated in Figure 2.

Figure 9. Halo bulk velocity comparison for one-to-one matched pairs of haloes in FastPM and full N-body simulations.In the left panel, the velocity along a randomly chosen direction is shown, whereas the right panel compares the absolute values of the halo velocities.

Figure 10. Halo clustering power spectrum comparison between FastPM and TNG at redshift $z = 0.3$. The line styles are the same as in Fig. 8. The top panel shows the power spectrum in real space while the bottom panel compares the power spectrum in redshift space. The precision of the results for the overdensity-corrected halo catalogues is improved to lie within 5 percent of the full N-body results for both types of power spectra.

Figure 13.Similar to Fig. 4, but for different redshifts, as labelled by the colors.The solid lines give the average overdensity values needed for FastPM haloes to recover the masses expected based on matched pairs of TNG haloes, while dashed lines are our corresponding correction formulae, tuned to avoid an overcorrection of the halo abundance at the low mass end.

Figure 14. Comparison of halo mass functions (upper panel) and halo power spectra (lower panel) between FastPM and full N-body simulations for different redshifts. The dashed lines are the results of halo catalogues with a spherical overdensity of 200, while the solid lines are for overdensity-corrected FastPM halo catalogues. The precision reached with the corrected-overdensity method is about 5 percent or better for the different redshifts, except for the highest mass end in the halo mass function at redshift $z = 1.0$. The precision of the halo clustering power spectra is improved to be within 5 percent for all examined redshifts.

Figure 15. The ratios $w_{\rm p,Fast}/w_{\rm p,TNG}$ for the original FastPM (dashed lines), the overdensity-corrected method (solid lines) and the abundance matching method (dot-dashed lines). The error bars denote standard errors from counting statistics. From top to bottom, the three panels show the results for the three luminosity threshold samples of $M_r < -19$, $M_r < -20$ and $M_r < -21$, respectively. The abundance matching method shows some improvement over the original halo catalogue, particularly for high-luminosity galaxies. However, it does not significantly improve the clustering statistics of low-luminosity galaxies. When the overdensity correction is employed, the accuracy of the measurement is improved on all scales.

Table 2. Cosmology parameters used in validation runs.