SP(k) -- A hydrodynamical simulation-based model for the impact of baryon physics on the non-linear matter power spectrum

Upcoming large-scale structure surveys will measure the matter power spectrum to approximately percent level accuracy with the aim of searching for evidence for new physics beyond the standard model of cosmology. In order to avoid biasing our conclusions, the theoretical predictions need to be at least as accurate as the measurements for a given choice of cosmological parameters. However, recent theoretical work has shown that complex physical processes associated with galaxy formation (particularly energetic feedback processes associated with stars and especially supermassive black holes) can alter the predictions by many times larger than the required accuracy. Here we present $\texttt{SP(k)}$, a model for the effects of baryon physics on the non-linear matter power spectrum based on a new large suite of hydrodynamical simulations. Specifically, the ANTILLES suite consists of 400 simulations spanning a very wide range of the"feedback landscape"and show that the effects of baryons on the matter power spectrum can be understood at approaching the percent level in terms of the mean baryon fraction of haloes, at scales of up to $k \lesssim 10 \, h \, $Mpc$^{-1}$ and redshifts up to $z=3$. For the range of scales and redshifts that will be probed by forthcoming cosmic shear measurements, most of the effects are driven by galaxy group-mass haloes ($M \sim 10^{13-14}$ M$_\odot$). We present a simple Python implementation of our model, available at $\href{https://github.com/jemme07/pyspk}{\mathrm{https{:}//github.com/jemme07/pyspk}}$, which can be used to incorporate baryon effects in standard gravity-only predictions, allowing for marginalisation over baryon physics within cosmological pipelines.


INTRODUCTION
Measurements of the growth of large-scale structure (LSS) provide an important test of our cosmological theoretical framework (Peebles 1980;Bond et al. 1980;Davis et al. 1985;Kaiser 1987;Peacock & Dodds 1994). They are independent of, and complementary to, constraints from analyses of fluctuations in the cosmic microwave background (CMB) and geometric probes, such as Type Ia supernovae (SNe) and baryon acoustic oscillations (BAOs). The different LSS tests (e.g., galaxy clustering, Sunyaev-Zel'dovich power spectrum, cosmic shear, CMB lensing, redshift-space distortions, etc.) characterise the clustering of matter on different scales and its evolution over cosmic time. On the largest scales where the fluctuations are small and gravity is the sole force of relevance, perturbation theory is sufficiently accurate to calculate the clustering of matter. However, at present most LSS tests probe well into the non-linear regime, since that is typically where most of the observational signal and much of the cosmological sensitivity originates (e.g., Amon & Efstathiou 2022). The standard approach is therefore to adopt the (semi-)analytic 'halo model' formalism (Peacock & Smith 2000;Seljak 2000;Ma & Fry 2000;Cooray & Sheth 2002;Smith et al. 2003;Mead et al. 2016;Acuto et al. 2021), which is often calibrated using aspects of large N-body cosmological simulations, or to use such N-body sim-★ E-mail: j.salcidonegrete@ljmu.ac.uk † E-mail: i.g.mccarthy@ljmu.ac.uk ulations to directly correct linear theory in an empirical fashion (e.g., Takahashi et al. 2012;Heitmann et al. 2016;Lawrence et al. 2017;DeRose et al. 2019;Euclid Collaboration et al. 2019;Angulo et al. 2021). These approaches would be fully adequate if the matter in the universe were composed entirely of dark matter. However, baryons contribute a non-negligible fraction of the matter density (Ω /Ω ≈ 0.157 ± 0.001; Planck Collaboration et al. 2020) and work based on cosmological hydrodynamical simulations has shown that feedback processes associated with galaxy formation can have a relatively large effect (typically many times larger than the statistical precision of upcoming surveys) on the matter distribution on scales of up to a few tens of megaparsecs Schneider & Teyssier 2015;Mummery et al. 2017;Springel et al. 2018;van Daalen et al. 2020). It is therefore crucially important that we model these effects as accurately as possible, as they will introduce significant biases in the inferred cosmological parameters from upcoming surveys if no action is taken (e.g., Semboloni et al. 2011;Schneider et al. 2020;Castro et al. 2021;Debackere et al. 2021).
One way to tackle this challenging problem is to modify the gravity-only predictions with simple analytic prescriptions for baryon physics that have some number of associated free parameters and to jointly constrain the cosmological and feedback parameters through comparisons to LSS observables. Examples of this approach include the halo model HM of Mead et al. (2016) (see also Mead et al. 2021) or the 'baryonification' method of Schneider & Teyssier (2015) and Aricò et al. (2021), which directly modifies the outputs of gravity-only simulations by adjusting the radial position of particles within haloes in order to mimic the effects of baryons on small scales. Some of the strengths of these approaches include: i) they are considerably computationally cheaper than running full cosmological hydrodynamical simulations; ii) the baryon prescriptions are generally flexible and easy to adjust; and iii) they can be straightforwardly incorporated within existing pipelines based on gravity-only simulations or the halo model.
Recent cosmic shear surveys have employed these methods in an attempt to account for and measure, baryonic effects. For example, the fiducial Kilo Degree Survey (KiDS) analysis employs a halo model prescription where the effects of baryons are incorporated by marginalising over a phenomenological 'bloating' parameter that modifies the concentrations of dark matter haloes (e.g., Asgari et al. 2021;Heymans et al. 2021;Tröster et al. 2021). The Dark Energy Survey (DES) have taken a different approach, by introducing scale cuts to remove small-scale measurements that are most affected by baryonic effects (e.g., DeRose et al. 2019;Amon et al. 2022;Krause et al. 2021;Secco et al. 2022). This approach is motivated by the concern that errors in the modeling of baryonic effects could lead to biased cosmological parameter estimates. However, introducing scale cuts comes at the expense of losing valuable information on small scales, which can be important for constraining cosmological parameters and testing extensions to the standard model. More recently, the full DES data (including small-scale measurements) has been reanalysed using the baryonification approach to account for the impact of baryon physics (Chen et al. 2023;Aricò et al. 2023). Both the re-analysis of the DES data and a recent joint analysis of KiDS cosmic shear and Sunyaev-Zel'dovich effect data (Tröster et al. 2022) using a more physical halo model (Mead et al. 2020), detect the impact of baryon physics even in current data, at the approximately 2 − 3 level. These studies conclude that the implied level of suppression of the power spectrum is consistent with the predictions of recent calibrated hydrodynamical simulations such as BAHAMAS (see below). While the halo model and baryonification approaches have some important strengths, there are also some disadvantages to these methods. Particularly, the modelling of baryon physics and its back reaction on dark matter is simplistic and generally not self-consistent and that there may be non-negligible degeneracies between the various baryonic 'nuisance' parameters themselves and also between those parameters and the cosmological parameters being varied. Marginalisation over the uncertain baryon physics may therefore lead to a significant degradation of the cosmological constraining power. Of course, cosmological hydrodynamical simulations also have adjustable free parameters (as discussed immediately below), but the resulting diversity of outcomes in terms of the matter clustering is likely to be more constrained in hydrodynamical simulations than simple empirical models would allow. For example, the baryon fraction-halo mass relation cannot be arbitrarily steep in simulations because high-mass objects are assembled from the accretion of lower-mass objects (e.g., Balogh et al. 2008) and haloes can recapture gas as they grow, too. Also, naturally emerging conditions such as convective and virial equilibrium place constraints on the radial distribution of matter within haloes, and so on. The upshot is that we expect many of the parameters that characterise mass distributions of groups and clusters to be physically correlated rather than independent of each other.
Full cosmological hydrodynamical simulations could therefore be employed as a means of self-consistently incorporating the impact of baryons on LSS when constraining cosmology. However, this has so far not proved possible due to their computational expense, noting that a typical MCMC chain in cosmology can require ∼ 10 5 evaluations whereas currently available suites of simulations typically only contain a handful of realisations. Furthermore, a major obstacle in directly simulating the impact of baryon physics on the matter clustering is that current simulations do not resolve all of the physical scales necessary to capture such processes in an ab initio way. Consequently, so-called subgrid models are required to include these effects and they often have considerable uncertainties, which are not unlike the uncertainties in simple baryon models discussed above (although the impact of subgrid models is often bounded by physical constraints, whereas the empirical models may not be, as discussed above). Indeed, previous simulation work has shown that variations of the parameters associated with the efficiencies of feedback processes even within plausible bounds can lead to relatively large differences in the predicted properties of galaxy groups and clusters (e.g., Puchwein et al. 2008;McCarthy et al. 2010;Le Brun et al. 2014;Planelles et al. 2014;Oppenheimer et al. 2021) which dominate the matter clustering (van Daalen & Schaye 2015;Mead et al. 2020). A consequence of these variations in the predicted properties of groups/clusters is relatively large study-to-study variations in the predicted impact of baryons on the matter power spectrum (see the simulation comparisons in Chisari et al. 2019 andvan Daalen et al. 2020), in spite of the simulations being more constrained than phenomenological models.
As discussed recently in Oppenheimer et al. (2021), the variations in the predicted properties of groups/clusters and the impact of baryons on the matter power spectrum from hydrodynamical simulations in the literature is not unexpected. It is a consequence of not being able to derive the efficiencies for the relevant feedback processes from first principles (see discussion in Schaye et al. 2015). The efficiency of feedback in simulations must therefore generally be calibrated in order to ensure they reproduce particular observed quantities, after which the realism of the simulations may be tested against independent quantities. The approach of the BAHAMAS programme ) (see also the recent FABLE simulations; Henden et al. 2018Henden et al. , 2020, was to explicitly calibrate the feedback efficiencies so that they reproduce the observed baryon fractions of galaxy groups. Aside from an explicit dependence on the universal baryon fraction (White et al. 1993), Ω /Ω , which is tightly constrained by the CMB, the baryon fractions of groups should be insensitive to changes in cosmology and therefore represents a fairly ideal quantity on which to calibrate the feedback. Note also that since the growth of fluctuations is fundamentally a gravitational process, by ensuring the simulations have the correct baryon fractions on the scale of groups/clusters, the impact of baryons on ( ) ought to be strongly constrained by this approach. van Daalen et al. (2020) have recently confirmed this simple picture by demonstrating that the differences in the predicted impact of baryons on the present-day ( ) from different simulations can be understood at approximately the percent level up to ≈ 1 ℎ Mpc −1 in terms of the differences in baryon fraction in the various simulations at a mass scale of ∼ 10 14 M . We highlight here that the simulations analysed in that study varied by more than a factor of a thousand in mass resolution, used different hydro solvers and subgrid physics implementations, assumed different baseline cosmologies, and some (specifically BAHAMAS) varied the initial conditions to explore the potential impact of cosmic variance. None of these variations were found to significantly affect the impact of baryons on ( ) after differences due to the group baryon fractions were factored out.
The results of van Daalen et al. (2020) are very promising and potentially offer a path forward for incorporating the impact of baryons on the matter power spectrum from cosmological hydrodynamical simulations. Note that with a mapping between the baryon fraction and the impact of baryons ( ), there is no longer a necessity to calibrate the hydrodynamical simulations to high precision to match some particular data set. Instead, one can properly account for the uncertainties in the feedback/subgrid effects on ( ) by, for example, conservatively marginalising over the uncertainties in the observed baryon fractions. Furthermore, with a simple mapping, the correction to the gravity-only clustering could be straightforwardly included in cosmological pipelines because it can be rapidly evaluated. However, before these aims can be achieved a number of limitations must first be overcome. First and foremost, the quantitative link between halo baryon fraction and the suppression of the power spectrum must be established for a wider range of models. van Daalen et al. (2020) used a small number (≈ 10) of publicly-available simulations which likely do not bracket the full range of possible behaviours. Furthermore, the mapping between baryon fraction and ( ) was established only at = 0 and up to a maximum wavenumber of ≈ 1 ℎ Mpc −1 , both of which are insufficient for current and future LSS tests.
In the present study, we overcome these limitations by presenting a new large suite of cosmological hydrodynamical simulations designed specifically to build a mathematical model for the suppression of the matter power spectrum with the baryon fractions of galaxy groups as its input. The ANTILLES suite contains 400 cosmological hydrodynamical simulations that vary both the important parameters that characterise the efficiencies of stellar and active galactic nuclei (AGN) feedback in the simulations as well as the employed hydrodynamics scheme, and span an unprecedentedly wide range of behaviours in terms of baryon fractions and ( ) modifications. We develop a model that can reproduce the effects of baryons on ( ) to typically better than ≈2% precision up to = 10 ℎ Mpc −1 out to a redshift of = 3.
The present study is structured as follows. In Section 2 we describe the new suite of cosmological hydrodynamical simulations. In Section 3 we present SP(k), an empirical model that provides the mapping between the observable baryon fractions of groups/clusters and the suppression of the matter power spectrum, ( ). In Section 4 we test SP(k) against the independent BAHAMAS simulations. In Section 5 we discuss some limitations of the present work and in Section 6 we summarise our findings.

SIMULATIONS
To quantify the potential impact of baryon physics on the non-linear matter power spectrum, our simulations must satisfy a number of requirements. Firstly, the simulated volumes must be sufficiently large to contain a representative sample of the haloes that contribute most significantly to the matter power spectrum. They must also be of sufficiently high resolution to resolve the range of scales over which cosmological measurements are made and so that we resolve the locations of important feedback processes (specifically, the simulations contain the haloes from which the majority of the energetic feedback originates). Finally, we need to explore a wide range of feedback possibilities, which we dub the "feedback landscape", such that key properties such as the baryon fractions (stellar and gas fractions) span a wide range and conservatively bracket current and hopefully future observational constraints.
With these requirements in mind, we have created a new bespoke suite of 400 simulations, the ANTILLES suite. Each simulation has a box size of 100 Mpc/ℎ on a side, which van Daalen et al. (2020) have shown is sufficiently large to capture the relative effects of baryons on Table 1. The cosmological parameters for the simulations used in this study. We adopt a flat ΛCDM cosmology with WMAP 9-year based cosmological parameters (Hinshaw et al. 2013). Ω m , Ω Λ , Ω b are the average densities of matter, dark energy, and baryonic matter in units of the critical density at redshift = 0; 0 is the Hubble constant, 8 is the square root of the linear variance of the matter distribution when smoothed with a top-hat filter of radius 8 ℎ −1 cMpc, and is the scalar power-law index of the power spectrum of primordial adiabatic perturbations.

Cosmological Parameter
Value Thus, each simulation has 256 3 baryon and dark matter particles (each), corresponding to (initial) masses 1 of g = 1.09×10 9 M and dm = 5.51 × 10 9 M respectively, given our choice of cosmology (below). The gravitational softening is fixed to 4ℎ −1 kpc in physical coordinates below = 3 and in comoving coordinates at higher redshifts.
We adopt a flat ΛCDM cosmology consistent with the WMAP 9year results (Hinshaw et al. 2013). The cosmological parameters used in these simulations are listed in Table 1. As we focus on the relative impact of baryons on the matter power spectrum, we do not expect the precise choice of cosmology to be important for our purposes. We nevertheless discuss the possible dependence of our results on cosmology in Section 5.
The Boltzmann code 2 (Lewis et al. 2000) was used to compute the transfer functions which were supplied to a modified version of the N-G IC 3 code to include second-order Lagrangian Perturbation Theory to create the initial conditions at a starting redshift of = 127. When producing initial conditions for hydrodynamical simulations, we use the separate transfer functions computed by for each individual component (i.e., baryons and dark matter). Similarly, in order to avoid any offset in the amplitude of the matter power spectrum of the hydro simulations with respect to their dark matter-only counterpart at large scales, for the dark matter-only version of the simulations, we generated two separate fluids, one with the dark matter transfer function and the other with the baryon transfer function (Valkenburg & Villaescusa-Navarro 2017;van Daalen et al. 2020). Additionally, the same random phases were used to generate each set of initial conditions. Hence, comparisons made between the different simulations are not subject to cosmic variance complications.
The simulation suite was run with a modified version of the -3 smoothed particle hydrodynamics (SPH) code (last described by Springel 2005) and includes a full treatment of gravity and hydrodynamics. Specifically, we use version of -3 that was modified for the EAGLE project .
In order to examine the potential effects of different hydrodynamic schemes, the ANTILLES suite comprises a set of simulations  (Baldry et al. 2012;Bernardi et al. 2013;Moustakas et al. 2013;Driver et al. 2022), for the gas fractions of groups and clusters (Vikhlinin et al. 2006;Maughan et al. 2008;Pratt et al. 2009;Rasmussen & Ponman 2009;Sun et al. 2009;Lin et al. 2012;Gonzalez et al. 2013;Sanderson et al. 2013;Lovisari et al. 2015;Pearson et al. 2017), the latter of which are derived from resolved X-ray observations, and the fit to the median baryon fraction from the latest HSC-XXL weak gravitational lensing data from Akino et al. (2022), including a correction for the contribution of blue galaxies and the diffuse intracluster light. The light shaded region encloses the 1 uncertainty. The orange line in the middle panel highlights the median gas − 500c relation from observations, including a correction from the inferred scaling relations from the X-ray selected sample of galaxy groups and clusters in Sereno et al. (2020). Comparing the solid black curves to the solid orange curve (i.e., median relations) in the middle panel, the simulations conservatively bracket the observed gas fractions, the observed galaxy stellar mass function (left panel), as well as the median total baryon fraction (right panel).
that use the standard flavour of SPH, as as well as set that uses the more recent state-of-the-art formulation. The improvements within include the use of the pressure-entropy formulation of SPH derived by Hopkins (2013), the artificial viscosity switch from Cullen & Dehnen (2010), an artificial conduction switch similar to that of Price (2008), the C 2 kernel of Wendland (1995), and the time-step limiters of Durier & Dalla Vecchia (2012).
Important non-gravitational processes, such as stellar and AGN feedback, that are not resolved by the simulations are implemented as subgrid physical models. A full description of these subgrid models can be found in Schaye et al. (2015). In summary: (i) Radiative cooling and photoheating are implemented elementby-element as in Wiersma et al. (2009a), including the 11 elements found to be important, namely, H, He, C, N, O, Ne, Mg, Si, S, Ca, and Fe. Hydrogen reionization is implemented by switching on the full Haardt & Madau (2001) background at redshift = 7.5.
(ii) Star formation is implemented stochastically following the pressure-dependent Kennicutt-Schmidt relation as in . Above a density threshold * H = 0.1 cm −1 , which is designed to track the transition from a warm atomic to an unresolved cold molecular gas phase (Schaye 2004), gas particles have a probability of forming stars determined by their pressure. We use a single fixed density threshold for these simulations, as opposed to a metallicity-dependent threshold in EAGLE.
(iii) Time-dependent stellar mass loss due to winds from massive stars and AGB stars, core collapse supernovae and type Ia supernovae, is tracked following Wiersma et al. (2009b).
(iv) Stellar feedback is implemented using the kinetic wind model of Dalla , which differs from the thermal implementation used in EAGLE. This is motivated by resolution considerations, particularly that relative high resolution is required for efficient thermal feedback due to its stochastic implementation. We instead adopt a kinetic implementation, as also adopted in BA-HAMAS.
(v) Seed BHs of mass M = 1 × 10 6 M , are placed in haloes with a mass greater than 2.75 × 10 11 M (corresponding to ≈ 50 DM particles) and tracked following the methodology of Springel et al. (2005) and Booth & Schaye (2009). Once seeded, BHs can grow via Eddington-limited gas accretion, at a rate which is proportional to the Bondi-Hoyle-Lyttleton rate, as well as through mergers with other BHs following Booth & Schaye (2009).
(vi) Feedback from AGN is implemented following the stochastic heating scheme. A fraction of the accreted gas onto the BH is released as thermal energy with a fixed heating temperature into the surrounding gas following Booth & Schaye (2009).
In order to explore a wide feedback landscape that brackets current observational constraints (with their associated uncertainties) on the stellar and gas fractions, we systematically vary the main subgrid parameters governing the efficiencies of stellar and AGN feedback. In particular, the galaxy (star) formation efficiency is quite sensitive to variations of the wind velocity ( ) and mass-loading ( ) parameters. These parameters have a significant effect on the star formation histories of galaxies and the shape of the GSMF, especially at the low-mass end where stellar feedback is expected to dominate (e.g. Schaye et al. 2010;McCarthy et al. 2017). For AGN feedback, the two main parameters that determine how frequent and energetic the feedback events are, are the number of neighbouring gas particles to heat ( heat ) and the temperature increase of such particles (Δ heat ). In particular, changes in Δ heat have a significant impact on the gas mass fraction of groups and clusters of galaxies , which can be understood by recognising that the likelihood of significant gas ejection will be favourable if Δ heat vir . Finally, in the Booth & Schaye (2009) accretion model implemented in the simulations, the "boost factor", , relative to pure Bondi-Hoyle accretion is a power-law function of the local density for gas above a pivot point, * H,BH , which is set to the star formation threshold * H = 0.1cm −3 , as the simulations resolve lower densities, where no cold, molecular gas phase, is expected. At higher densities, the simulations lack the cold gas phase, which would boost the Bondi-Hoyle rate due to its sound speed dependence (Schaye 2004;Booth & Schaye 2009). We set the power-law exponent to = 2 for high densities, and the boost factor is set to go to unity at low density, where the simulations are well resolved. Nevertheless, for gas at 10 4 Table 2. Subgrid parameters varied in this study. and are the wind velocity and mass-loading factor used in the kinetic wind stellar feedback model as per . Δ heat is the temperature increase of gas particles during BH feedback events, and heat is the number of neighbouring gas particles to be heated. * H,BH is the density threshold above which the BH accretion rate is boosted in the Booth & Schaye (2009) accretion model due to the lack of resolution of the cold gas phase at high densities. K, at the particle resolution of the simulations, we only resolve the Jeans length for densities < 10 −3 cm −3 . Hence, we explore boosting the accretion rate at lower densities to compensate for the lack of resolution. We find that changing the pivot point, * H,BH , can have a significant impact at the mass-scale at which AGN feedback becomes efficient, effectively changing the shape of the knee of the GSMF. The five parameters varied in this study, with their respective range of values, are provided in Table 2.
In order to ensure a full parameter space coverage, we employed a Latin hypercube sampling with multidimensional uniformity (LHSMDU) developed by Deutsch & Deutsch (2012). We used 200 nodes to uniformly sample the parameter space. Finally, we "mirrored" these 200 simulations using the exact same subgrid physics implementation and sampled parameters, but using a standard flavour of SPH, as opposed to the more recent state-of-the-art formulation. This gives a total of 400 simulations in the ANTILLES sample.
The GSMF, median gas fraction, and median total baryon fraction as a function of halo mass spanned by the simulations at redshift = 0.125 is shown in Fig. 1. The figure shows how our parameter space exploration easily brackets the current observational constraints. Note that when comparing the simulations to the observations in the middle panel (gas mass fractions), one should compare the solid black curves, which represent the median relations of each of the 400 simulations, to the solid orange curve, which represents the median relation from resolved X-ray observations. (The orange data points represent individual observed groups and clusters, we do not show individual simulated clusters for clarity, but we note the intrinsic scatter in the simulated relations is similar to the observed intrinsic scatter). Thus, the simulations conservatively bracket both the observed gas fractions and the observed stellar mass function. If one adopts the statistical uncertainties on the median baryon mass fractions 4 from the recent study of Akino et al. (2022), then our simulations span a range of approximately −7 to +6 with respect to the observed baryon fraction at a mass scale of ≈ 10 14 M .
Note that our desire to span a much wider range of baryon fractions than is apparently allowed by current observations is motivated by two factors: i) the low-redshift data we compare to may have nonnegligible biases; and ii) the baryon fractions of higher redshift (e.g., 4 We note that for our simulations, we include all stellar and gas particles within a spherical overdensity radius. Hence, in order to make reasonable comparisons with the fits in Akino et al. (2022), we included an additional 15% contribution to the total stellar masses from the contribution of blue galaxies, and 30% additional stellar mass to the brightest cluster galaxies (BCGs) to account for the diffuse intracluster light (ICL, see Akino et al. 2022). ≈ 0.5 − 1) groups and clusters, which give rise to much of the lensing signal, is not well constrained at present by observations, thus we want a range of simulated behaviours that is wide enough to hopefully encapsulate future measurements of high-redshift systems as well. With regards to possible biases in current data, observational measurements are always subject to both random and systematic errors. For instance, the observationally-derived stellar masses are subject to systematic errors originating from, e.g., stellar population modelling, spectral energy distribution fitting, surface brightness profile fitting and corrections for dust extinction. Gas fraction measurements from X-ray data of groups and clusters are subject to uncertainties in, e.g., deviations from spherical symmetry and hydrostatic equilibrium, and modelling of selection effects, particularly in the group regime.
Finally, we note that the prior ranges on each of the subgrid parameters were essentially selected by examining the baryon fraction results from a small number of test simulations, so it is no surprise that the resulting hypercube spans a large range of baryon fractions.
Note also that while we use the combined set of 400 simulations to construct our model for the suppression of the matter power spectrum (below), we have also explored analysing the two SPH suites separately. While for a given set of subgrid parameters the choice of SPH flavour can affect the resulting baryon fractions, the reaction in terms of ( ) is independent of the choice of SPH flavour at a given baryon fraction. This is consistent with the findings of van Daalen et al. (2020) and allows us to construct a single (more accurate) model in terms of baryon fractions using the entire set of 400 ANTILLES simulations.

MODELLING BARYON PHYSICS EFFECTS ON ( )
We begin by analysing the relative (fractional) impact of baryon physics on the total matter power spectrum for the 400 simulations.   Given the selection of prior ranges for each subgrid parameter to conservatively bracket existing observational constraints and their corresponding uncertainties, our simulation suite encompasses baryonic effects that surpass observational bounds, with a range of parameters more extreme than those used in most cosmological simulations (see e.g. Tröster et al. 2022). The effect of AGN feedback is of particular importance, due to its ejective nature at early times . The removal of large quantities of gas not only affects the distribution of the remaining gas, but it also has a gravitational effect on the dark matter, causing it to expand as well. This effect is referred to as the 'back reaction' on baryons on the dark matter (e.g., van Daalen et al. 2011). The net result of baryon ejection and dark matter expansion is that ( ) can be suppressed by a non-negligible amount (relative to forthcoming statistical errors from cosmic shear surveys) out to wavenumbers of ∼ 0.1 ℎ Mpc −1 , corresponding to physical scales of ∼ 30 Mpc/ℎ. In the following sections we will present our model based on the correlation between the power spectrum suppression and the total baryon fraction of haloes of different masses. Note that in this study, halo masses are measured within both 200c and 500c , i.e. the radius within which the mean density is 200 or 500 times the critical density of the Universe respectively. While both spherical overdensity apertures provide similar results, we find that using 200c gives slightly better accuracy. Hence, in the following sections, we will presents all of our results and plots in terms of 200c , while also 5 We characterise the ratio in terms of the power spectrum, ( ). But note that since ( ) ∝ Δ 2 ( ), our results are equivalent to a ratio of provide the best fitting parameters for our model in both 200c and 500c .

The correlation between the power spectrum suppression and the total baryon fraction
Recently, van Daalen et al. (2020) presented an empirical model that uses the mean baryon fraction of group mass haloes ( 200c = 10 14 M ) as a predictor for the power suppression for scales 1.0 ℎ Mpc −1 , achieving an accuracy of ≈ 1%. Fundamentally, the effect of baryon physics on the matter power spectrum is a gravitational process, which is why we expect the baryon fraction of haloes (which is a direct measure of the mass that has been removed) that contribute most significantly to ( ) to be a strong predictor of the effects on ( ). Taking our cue from van Daalen et al. (2020), in Fig. 2 we colour code each run by its median total baryon fraction of haloes of mass 200c = 10 14 M , and normalised by the universal baryon fraction (Ω /Ω ). In agreement with van Daalen et al. (2020), the figure shows a clear gradient of the power spectrum suppression as a function of the total baryon fraction of group mass of haloes. At closer inspection, the correlation is not perfect at small scales of ≥ 3 ℎ Mpc −1 . Nevertheless, we will show below that a simple and accurate model for the ( ) suppression can be formulated in terms of the median baryon fraction.
In this work, we aim to extend the van Daalen et al. (2020) model to smaller scales and higher redshifts, which we will achieve by taking into account the increasing importance of lower mass haloes as we push in both directions. In order to assess the strength of the correlation between the baryon fraction of haloes of different mass, we compute the power spectra at three equally spaced scales in log 10 ( = 0.1, 0.9 and 8.0 ℎ Mpc −1 ), which are shown as vertical dashed lines in Fig. 2. We show in Fig. 3 the fractional impact of baryons on the total matter power spectrum as a function the total baryon fraction for two different halo of masses (10 13 M and 10 14 M ) for these three scales. The colours correspond to the same scales of the vertical lines in Fig. 2. The two sets of SPH schemes used for the simulations are shown with different symbols.  Fig. 3 shows that the suppression of the power spectrum at a given scale correlates better with the total baryon fraction of haloes of different mass. For instance, the suppression at = 0.9ℎ Mpc −1 (orange symbols) is better correlated with the baryon fraction of haloes of mass 10 14 M (right) than with 10 13 M (left). On the other hand, for small scales, = 8.0ℎ Mpc −1 (green symbols), the suppression is better correlated with the baryon fraction of lower mass haloes (left). This dependency can be explained in terms of the differing relative contribution of haloes of different mass to the total matter power spectrum (e.g., van Daalen & Schaye 2015;Mead et al. 2020). The strength of the correlation for very large scales, = 0.1 ℎ Mpc −1 (blue symbols), seems insensitive to choice of halo mass (for the two masses shown here) which is mainly just because feedback does not significantly alter the power spectrum on these scales, so there is little dynamic range in hydro ( )/ DM ( ) to couple to the baryon fractions.
Note that whilst the selection of SPH flavor for a particular set of subgrid parameters may have an impact on the resulting baryon fractions, the response exhibited in ( ) is invariant with regard to the selection of SPH flavor at a given baryon fraction. This is consistent with the findings of van Daalen et al. (2020).
In Fig. 4 we quantify the strength of the correlation between the suppression of the total matter power spectrum and the total baryon fraction of haloes of different mass as a function of scale using the Spearman's rank correlation coefficient, . We calculate the correlation coefficient at 40 equally spaced scales 6 in log 10 in the range 0.1 ≤ [ℎ Mpc −1 ] ≤ 10. Each panel represents a different redshift in the simulations.
For each redshift, the lines are colour coded by the halo mass used to estimate the baryon fraction. We use a mass bin width of 0.1 dex around each indicated halo mass. To help guide the eye, we have highlighted the area where ≥ 0.90, i.e. where the correlation between the suppression of the total matter power spectrum and the total baryon fraction is exceptionally strong, and it can be well described using a monotonic function. 6 We re-bin the power spectra and 'smooth' them by calculating the median power for each bin.
To guide the eye, we have added three vertical lines in Fig. 4 to roughly divide small, intermediate, and large scales: (i) Small scales ( 4 ℎ Mpc −1 ): the baryon fraction of progressively lower mass haloes provides a better correlation with the suppression for increasing . The converse is also true; i.e., the strength of the correlation using higher mass haloes decreases for increasing .
(iii) Large scales ( 0.4 ℎ Mpc −1 ): the strength of the correlation decreases rapidly regardless of the halo mass. This is expected, as for (very)large scales, the suppression of the total matter power spectrum should be close to unity independent of the baryon fraction of haloes (non-monotonic, see Fig. 3).

The optimal halo mass
As we aim to determine the halo masses that best predict the suppression of the power spectrum at each scale, we define the optimal mass,ˆ, 200c , as the halo mass used to measure the baryon fraction that maximises as a function of scale and redshift: Figure 5 showsˆ, 200c as a function of scale for the five redshifts shown in Fig. 4. We used alternating filled semicircles to allow us to show overlapping points from different redshift bins. For each redshift, the shape of the relation roughly reflects the three distinct behaviours described above: at small scales ( 4 ℎ Mpc −1 ), the baryon fraction of haloes decreasing in mass provide a better correlation for increasing ; at intermediate scales (0.4 [ℎ Mpc −1 ] 4), haloes with a characteristic mass provide the best correlation (i.e., the optimal mass plateaus); and at large scales, the strength of the correlation decreases rapidly regardless of the halo mass, as the total matter power spectrum suppression is close to unity, independent of the baryon fraction of haloes (see Figs. 3 and 4). In order to avoid noise while fitting, without loss of generality, we assume thatM for each corresponding scale . Colours indicate the suppression at 5 equally spaced scales in logarithmic scale (log 10 ). Each symbol represents one of the 400 models. Solid lines show the best fit using Eq. (4) without a scale and redshift dependence, i.e. each scale shown has been fitted using a simple exponential plateau function.
remains constant for ≤ 0.4 ℎ Mpc −1 (data points are show in faint colours). Note that this is also consistent with the results of van Daalen et al. (2020), i.e., to a good approximation, the total baryon fraction for a single halo mass provides enough information to predict the power spectrum suppression due to baryons for 1.0 ℎ Mpc −1 . It is worth noting that the optimal mass exhibits a monotonic behaviour as function of redshift. Therefore, we approximateˆ, 200c Table 3. Best-fit parameters for the optimal mass in Eq. (2), expressed either in terms of 200 or 500 . For the exponential plateau functional form, is the asymptote of the function as →0, is the value for = 1, and characterises the rate of change. The redshift dependence of the parameter is given in Eq. (3). using the following exponential plateau functional form,

Spherical overdensity
where ( ) is the asymptote of the function as →0, ( ) is the value for = 1, and ( ) characterises the rate of change. We impose a simple polynomial ansatz for the redshift dependence of these parameters as follows, where = { , , }. We use a simple least squares estimator to fit the simulation data and we provide the best-fitting parameter values in Table 3. The solid lines in Fig. 5 show the best fit using Eq. (2).
(2) to compute the suppression of the total matter power spectrum as a function the total baryon fraction at redshift = 0.125. Note that we do not assume any functional form for the − halo relation. Rather, we use a piecewise cubic Hermite interpolating polynomial using the narrow binned data from the simulations to compute the baryon fractions at the required optimal mass.
The figure shows that usingˆ, 200c ( , ) significantly reduces the scatter of the relationship at all scales shown, from = 0.1 ℎ Mpc −1 , up to the one-dimensional Nyquist frequency of the simulations, Ny = / ≈ 8 ℎ Mpc −1 , where is the cube root of the total number of particles, and is the length of the cubic box. Given the wide range of physical models and the different hydrodynamic schemes used in this study, the fact that such a tight correlation is achieved solely based on the baryon fraction of haloes of different masses is remarkable. One might have expected that a thorough characterisation of the mass density profiles around haloes (as opposed to a simple aperture baryon fraction) would have been required to obtain this level of precision. For example, Debackere et al. (2020) have shown that the behaviour of the profiles between 500 and 200 can affect the matter power spectrum if the profiles are allowed to vary significantly over this range. Evidently, there is a very strong physical correlation between the integrated baryon fractions and the shape of the profile over this range in our simulations, such that most of the physical effect on the power spectrum is encoded in the integrated baryon fractions alone.

SP(k) -A model for the suppression of ( )
We find that, similar to Eq. (2), an exponential plateau functional form provides a good approximation to the fractional impact of baryons  Table 4. ( , ) is the asymptote of the function in Eq. (4) for large˜values, modelled using an exponential functional form (Eq. 6), ( , ) is the suppression for˜= 0, modelled using a logistic functional form (Eq. 7), and ( , ) is the rate of change, modelled with log-normal functional form (Eq. 8).
on the total matter power spectrum: where˜is the baryon fraction at the optimal halo mass normalised by the universal baryon fraction, i.e.: and where ( , ) is the asymptote of the function for large˜values, ( , ) is the suppression for˜= 0, and ( , ) is the rate of change parameter. In the left panel of Fig. 7, we show an illustration of the suppression of the total matter power spectrum as a function of˜, as parametrised by Eq. (4).
We use this functional form to fit the 400 simulations at each scale individually. The best fit for each scale at = 0.125 is shown as solid lines in Fig. 6. Equation (4), without a scale and redshift dependence, provides a good approximation to the full range of scales up to the Nyquist frequency. Furthermore, we approximate each of the parameters in Eq. (4) based on their behaviour as a function of scale using an exponential, logistic and log-normal functional forms (respectively): We model the evolution of each parameter as a polynomial function in redshift, using Eq. (3), with = { , , , , , , , } accordingly.
In order to reduce the sensitivity to outliers in our wide range of baryon fractions in the 400 simulations, for our parameter estimation utilised a Huber loss function defined as where is the measured variable hydro ( )/ DM ( ) andˆis the predicted values from our model using using Eqs. (4) to (8). The value of was chosen to represent the median absolute deviation. We minimised the loss function Eq. (9) at five anchor redshifts ( = 0.125, 0.5, 1.0, 2.0 and 3.0). The best-fit parameters are provided in Table 4. We present a publicly available Python implementation of our model as py-SP(k). All relevant information about installation and usage can be found at https://github.com/jemme07/pyspk.
The choice of functional form in Eq. (4) provides insight into the effect of baryons on the total matter power spectrum. In the righthand side of Fig. 7 we show the scale and redshift dependence of the best fit parameters in Table 4. The figure shows that for very large scales ( ∼ 0.1 ℎ Mpc −1 ), both and are close to unity, especially at high redshift, i.e. the suppression of the matter power spectrum is a flat function of the baryon fraction of haloes. In other words, at very large scales, baryons do not significantly affect the matter power spectrum. As we examine progressively smaller scales, on the one hand, is an increasing function of , i.e. models with large median baryon fraction (˜→ 1) exhibit weaker suppression values, or even a power spectrum enhancement for small scales ( 4 ℎ Mpc −1 ). On the other hand, is a decreasing function of , i.e. models with low median baryon fraction (˜→ 0) present stronger suppression, especially for scales 1 ℎ Mpc −1 . Finally, the rate of change parameter, , dictates how quickly the suppression changes from the value of at low baryon fractions, to at large baryon fractions. For very large scales, the value of is irrelevant, as both and are close to unity. For scales 0.2 [ℎ Mpc −1 ] 1, 1, i.e. there is a rapid transition from strong suppression values at low baryon fractions, to weak suppression values at large baryon fractions. This is indeed the case if we examine the behaviour the suppression for that scales range in Fig. 6 (orange and green lines). There is a rapid transition giving rise to a "curved" relationship that asymptotes to the value of at high baryon fractions. For smaller scales ( 1 ℎ Mpc −1 ), →0, i.e. there is a slow transition from strong suppression values at low baryon fractions, to weak suppression (or enhancement) values at large baryon fractions. Examining Fig. 6 again, this slow transition gives rise to a more linear relation (rather than curved) for small scales (red and purple lines).

Model accuracy
The level of agreement between the simulations and SP(k) is shown in Fig. 8. To appreciate the performance of the model, in the top panel, we show the fractional impact of baryons on the total matter power spectrum against data for 8 randomly selected models from the 400 simulations at redshift = 0.1 spanning a wide range of baryon fractions. Solid lines show the power spectra computed directly from the simulations. Our model is not restrictive to a particular shape of the baryon fraction -halo mass relation, and in order to show its flexibility, we show in dashed lines our best-fitting model using binned data for the − halo relation obtained from the simulations, while in dotted lines we use a simple power-law functional form fit to the same relation in the halo mass range of interest (see also Section 3.5).
Whilst the data from the simulations show complex features, the overall behaviour is captured well by our model. By construction, the suppression goes to unity for low values of . The characteristic "spoon-like" shape of the power spectrum suppression, the amplitude of which depends on the baryon fraction of haloes, is well captured by the model. In the middle panel we show the ratio between the measurements of the suppression as measured in the hydrodynamical simulations, to our best-fitting model for all 400 models. The solid blue line shows the median ratio as a function of , while the blue dashed lines enclose the 15.9 and 84.1 percentiles (one sigma). In the bottom panel we show the mean absolute percentage error (MAPE), defined as, where is the measured variable hydro ( )/ DM ( ) andˆis the predicted values from our model. The figure shows that the model is able to reproduce the wide range of baryonic effects explored here remarkably well. For all the scale range up to the Nyquist frequency, the MAPE is ≈ 1%. For large scales, the model recovers the power spectrum suppression to better than 1% accuracy, while for the smallest scales considered, the error for most models within < 5%, with an average of < 2% for the entire range of baryon fractions. We accomplish this significant reduction in the scatter for the power spectrum suppression-baryon fraction relation in our model by selecting the optimal mass that maximises for each scale. Nevertheless, for the smallest scales that can be resolved in the simulations (close to the Nyquist frequency, Ny ≈ 8 ℎ Mpc −1 ), while the model can recover a mostly monotonic relation, there is still significant scatter around it, as shown with purple dots in Fig. 6. As the scatter increases with scale , this gives rise to the heteroscedastic behaviour of the errors, i.e. the error in the model increases with scale (dashed blue lines). Nonetheless, our model is an unbiased estimator of the true baryonic effects, as the ratio between the measurements of the suppression as measured in the hydrodynamical simulations to our best-fitting model is centered around unity at all scales.
In Fig. 9 we use the individual model errors for the 400 simulations to compute a 2D mean, and maximum, absolute percentage (2) and given along the top axis. The three coloured lines indicate the total baryon fraction using the optimal mass for the three BAHAMAS models (Low AGN, fiducial BAHAMAS, and High AGN models. McCarthy et al. 2017McCarthy et al. , 2018.
error distribution as a function of scale and baryon fraction for different redshifts. The baryon fraction is computed using the optimal mass of haloes for each corresponding scale and redshift using Eq.
(2). The mean and maximum errors are computed over linearly spaced baryon fraction bins and logarithmically spaced bins between 0.1 and 10.0 ℎ Mpc −1 . Only pixels with 5 or more data points are shown. The top panels show that our model has a mean error of 2% for the entire scale and redshift range. Furthermore, as shown in the bottom panels of Fig. 9, our model is less accurate for increasing . This can be seen as the increasing contours of maximum absolute percentage error towards the right-hand side of each panel at the bottom of the figure. The maximum absolute percentage error reaches values of 4% to 6% for scales close to the Nyquist frequency at low redshift, and up to 8% to 10% at = 1 − 2. Additionally, the model is less accurate for simulations with extreme feedback prescriptions (low baryon fractions), for scales ∼ 1 ℎ Mpc −1 . While a wide range of feedback prescriptions were used to build and calibrate our model, it is important to highlight that for simulations with mean group-scale baryon fractions roughly consistent with observations, the accuracy of our model is at its best, with mean error of ≈1%, and a maximum mean error of ≈3% for the whole scale and redshift range. As an ex-ample, Fig. 9 shows such accuracy around the fiducial BAHAMAS simulation (orange line), which was specifically calibrated to reproduce the galaxy stellar mass function and the observed gas fractions of galaxy groups and clusters at redshift ≈ 0 (see Section 4),

How the baryon fraction shapes the suppression of the total matter power spectrum
We now use our model to systematically explore the effect of the − halo relation on the shape of the suppression of the total matter power spectrum. For the mass range that can be relatively well probed in current X-ray and Sunyaev-Zel'dovich effect observations (roughly 10 13 200c [M ] 10 15 ), the total baryon fraction of haloes can be roughly approximated by a power-law with constant slope (e.g. Mulroy et al. 2019;Akino et al. 2022). Hence, we use the following parameterisation, Consistent with van Daalen et al. (2020), the top panel shows the large effect of the normalisation of the baryon fraction -halo mass relation on the suppression of the matter power spectrum. As mentioned above, the effect of baryon physics on the matter power spectrum is fundamentally a gravitational process. Hence, the larger the amount of mass removed from haloes (lower ), the stronger the suppression on ( ) relative to a baryon complete (DM-only) model. Furthermore, very low baryon fractions (associated with stronger AGN feedback) can have a non-negligible impact the power spectrum suppression on very large scales ( ∼0.1 ℎ Mpc −1 , dark-blue line).
In the bottom panel, we show that the slope of the baryon fraction -halo mass relation has a large effect on the scale of the minimum of the "spoon-like" shape of the power spectrum suppression.

Alternative approaches to fitting the suppression
Our approach above was to apply simple parametric forms to describe the correlation between the power spectrum suppression and the baryon fraction of haloes that contribute most significantly at a given scale. We have shown that, in spite of its simplicity, the model provides a very accurate description of our large simulation data set. Nevertheless, our approach to modelling the simulation data is not unique and there may be alternative (e.g., non-parametric) approaches that could lead to even more accurate results. For example, principle component analysis (PCA) applied to simulation power spectra has been used as a way of characterising the impact of baryons and allowing for their marginalisation in cosmological pipelines (e.g., Eifler et al. 2015). A related strategy is to model the principle components via Gaussian Processes (e.g., Heitmann et al. 2014) or sparse polynomial chaos expansion (e.g., Euclid Collaboration et al. 2019), although to our knowledge these methods have so far been applied to only the non-linear correction to the matter power spectrum and not on the impact of baryons. In addition, neural networks have recently been used to construct a baryon emulator by Aricò et al. (2021). The neural net was trained on a large sample of power spectra generated by the baryonification approach.
While there are various advantages and disadvantages to the alternative approaches described above, we have elected to use the approach described above because it is more intuitive in terms of visualising the correlations and it allows us to more easily enforce physically-motivated boundary conditions, such as requiring that suppression of the power spectrum goes to unity as →0 (very large scales). That being said, we have also experimented with both a Gaussian Process-based emulator and a simple neural network using the baryon fractions as the input quantities. We find that these nonparametric approaches do not significantly improve on the accuracy of our fiducial analytic method and we therefore elected to retain the latter as our main model.

TESTING AGAINST BAHAMAS
We employ the BAHAMAS suite of cosmological hydrodynamical simulations  to test the accuracy of our model using simulations outside our calibration set. The reference BAHAMAS simulation uses the same cosmology and particle mass resolution as the simulations used in this study, but in a much larger box. The fiducial BAHAMAS run consist of a comoving volume with 400 cMpc ℎ −1 on a side and 2 × 1024 3 particles of DM = 3.85 × 10 9 M ℎ −1 dark matter particle mass, and g = 7.66 × 10 8 M ℎ −1 initial gas particle mass. A full discussion of the sub-grid implementation, including the prescriptions for star formation, gas heating and cooling, BH formation and supernovae and AGN feedback models can be found in McCarthy et al. (2017) (see also Schaye et al. 2010). As the BAHAMAS suite of simulations aim to study the impact of baryonic process on LSS, it was of paramount importance to ensure that group size and more massive haloes had the correct baryon fractions, as these haloes contribute the most to the matter power spectrum. Therefore, the BAHAMAS approach was to calibrate the efficiencies of the stellar and AGN feedback to reproduce galaxy stellar mass function and the observed gas fractions of galaxy groups and clusters at redshift ≈0. We utilise three different variations of the BAHAMAS simulations that differ only in the strength of their AGN feedback prescription. This strength is determined by the AGN sub-grid heating temperature, which is the temperature increase given to gas particles associated to each AGN feedback event.
Additionally, as a "proof of concept", we test our model against two additional simulations, one that varies the mass resolution, but is otherwise identical to the fiducial BAHAMAS model, and one that changes the cosmology by including the effects of massive neutrinos, both on the background expansion rate and the growth of density fluctuations (see further discussion in Section 5). For the high resolution study, we used the "high res" model presented in McCarthy et al. (2017), which left the subgrid feedback parameters unchanged with respect to the BAHAMAS fiducial resolution model, but it has 8 times better mass resolution (comoving volume of 100 cMpc ℎ −1 on a side and 2 × 512 3 particles). This represents a 'strong' convergence test in the terminology of Schaye et al. (2015). We note that only the particle data at redshifts = 0 and = 0.125 for these runs is still available. Hence, we limited our comparison for the "high res" run to redshift = 0.125.
We have also used our model to predict the baryonic suppression for the most extreme massive neutrino variation of the WMAP9 cosmology in the BAHAMAS suite (Mummery et al. 2017;McCarthy et al. 2018). This model left the subgrid feedback parameters unchanged with respect to the fiducial BAHAMAS model, but uses a total summed neutrino mass of = 0.48 eV. For this model, the amplitude of the density fluctuations at the epoch of recombination , as inferred by WMAP9 data assuming massless neutrinos, was held fixed in order to retain agreement with the observed CMB angular power spectrum, and the value of 8 was adjusted accordingly. Additionally, the matter density parameter of cold dark matter was adjusted to maintain a flat universe model. All other cosmological parameters are held fixed.
In Fig. 11 we show the impact of baryons on the total matter power spectrum for the three BAHAMAS models (Low AGN, fiducial, and High AGN), the high resolution (HiRes) and massive neutrino ( = 0.48 eV) variations. In order to compute the baryon fraction from the simulations to use in Eq. (5), we used a piecewise cubic Hermite interpolating polynomial using narrow bins of 200c with 0.1 dex width. This approach provides a high degree of flexibility, as simulations show evidence for a slightly mass-dependent slope (Farahi et al. 2018). Additionally, we show in dotted lines the results using power law fits to the median baryon fraction in the halo mass range of interest at each redshift.
The figure shows that our model follows the simulation response typically to within 1%, with a maximum error of a ∼ 3% across the entire range of scales and redshifts shown. The errors at all scales and redshifts are consistent with the expected errors from the mean error contours in Fig. 9. The largest errors are at 2 [ℎ Mpc −1 ] 5 at = 1 for most models. In particular, for the lowest baryon fraction model, as expected from the maximum error contours in Fig. 9.
For the massive neutrino cosmology, the largest errors are at 2 [ℎ Mpc −1 ] 5 at = 0.5 and = 0.1. By construction, the model recovers the suppression (or rather, lack thereof) at the large scales. Hence, the errors are very small at the largest of scales. Finally, we do not find a significant decrease of the performance of the model when using a simple power law, compared to a more precise interpolation of the baryon fraction as a function of halo mass.

DISCUSSION
It is important to note that our model was built from a suite of simulations that widely varied the feedback parameters but only within a single cosmology (i.e., the cosmological parameters were fixed). How worried should we be about possible degeneracies between astrophysics in cosmology? In other words, does the scaling derived here apply to other cosmologies? Our previous work based on the BAHAMAS simulations has shown that the effects of baryon physics on the matter distribution appear to be separable from (independent of) variations in the cosmological parameters, including extensions to massive neutrinos (Mummery et al. 2017), dynamical dark energy , and a running of scalar spectral index . In those studies, we demonstrated that the resulting cluster baryon fractions and the suppression of the matter power spectrum match those of our fiducial cosmology model to typically percent level accuracy or better. This is consistent with the findings of previous work based on the baryonification formalism, which has also shown that the suppression of the power spectrum due to baryons is independent of variations in the parameters of the standard cosmological model to approximately the percent level, with the exception of a clear dependence on the universal baryon fraction, Ω /Ω (Schneider et al. 2020;Aricò et al. 2021).
The dependence on the universal baryon fraction is straightforward to interpret: in the absence of feedback, groups and clusters are expected to contain a baryon fraction that reflects that of the universe as a whole (White et al. 1993), thus a higher (lower) universal fraction requires more (less) efficient feedback to obtain a particular (e.g., observed) baryon fraction in groups/clusters. In other words, if the baryon fractions of groups/clusters are used as an independent constraint on the feedback-induced suppression of the matter power spectrum, one must also specify the universal baryon fraction that is assumed. This is why our model takes as its independent variable the halo baryon fraction normalised by the universal baryon fraction. In practice, the quantity Ω /Ω is precisely pinned down by observations of the CMB, such that the uncertainties in baryon fractions of haloes are dominated by the statistical and systematic errors in the halo (total and baryon) mass measurements. Nevertheless, marginalising over the uncertainties in both the halo and universal baryon fractions is recommended. Going forward, it will be important to continue to check for potential degeneracies between baryon and cosmological physics as we extend beyond the standard model of cosmology and also explore other cosmological observables (e.g., galaxy clustering, cluster counts, and so on).
Another caveat of our study is that, while we have explored a wide range of the feedback parameter volume in the present study, this has only been done within the context of a given framework (or parametrisation) for feedback. It will be important for other groups to independently test and perhaps refine the predictions for the impact of baryon physics on the matter power spectrum as new simulations become available. The study of van Daalen et al. (2020), which compared a limited number of simulations in the literature but which spanned a wide range of resolutions, hydro solvers, and feedback frameworks, is very suggestive that the physics of ( ) is mainly driven by the baryon fractions. However, these simulations may not explore the full range of possibilities (see e.g., Debackere et al. 2020;Amon & Efstathiou 2022), so it is important to continue exploring the effects of baryons using a wide variety of methods, including the halo model, the baryonification formalism, and different hydrodynamical simulations.
As an initial "proof of concept", we test the separability of feedback parameters (and resolution effects) from the cosmological model in Fig. 11 using a higher resolution simulation and one that changes the cosmology by including the effects of massive neutrinos. While this is a limited test, we find that our model based solely on the baryon fraction recovers the suppression of the matter power spectrum to a typically better than ∼ 2% accuracy. Further investigation including different feedback prescriptions, cosmological models, and resolution effects will be thoroughly explored in a follow-up paper.

SUMMARY AND CONCLUSIONS
We have presented ANTILLES, a new suite of 400 hydrodynamical simulations that explores the "feedback landscape" associated with baryon physics. We have shown that a relatively simple yet remarkably accurate model can be constructed for the suppression of the matter power spectrum using only the mean baryon fraction of haloes (specifically the baryon fraction for haloes that dominate the power spectrum at a given scale, or wavenumber). Our work follows on from the recent study of van Daalen et al. (2020), by expanding greatly the number of simulations used to map this relationship and by pushing to a wider range of scales and redshifts, which are requirements for current and upcoming large-scale structure surveys including cosmic shear.
The main specific findings of our study may be summarised as follows: • In agreement with van Daalen et al. (2020), we find that the fractional impact of baryon physics on the present-day non-linear matter power spectrum up to ∼ 1 ℎ Mpc −1 correlates very strongly with the baryon fraction of haloes with total masses of ∼ 10 14 M (see Figure 2). At smaller scales, however, the relation weakens somewhat, as also found by van Daalen et al. (2020).
• The weakening in the relation between the power spectrum suppression and the baryon fraction at a mass scale of ∼ 10 14 M is a result of lower mass haloes contributing more significantly at smaller scales (see also van Daalen & Schaye 2015;Mead et al. 2020). Using the simulations, we have empirically determined the halo mass scale whose baryon fraction correlates most strongly with the suppression of the power spectrum as a function of both scale and redshift (see Fig. 4 and Fig. 5). We refer to this mass scale as the 'optimal' mass.
• Our ( ) suppression model, called SP(k), is constructed by fitting a simple parametric form to the power spectrum suppression as a function of scale ( ), redshift ( ), and the normalised baryon fraction at the optimal scale. See Eq. (4).
• We characterise the error in our best-fit model relative to our simulation training set in Fig. 8 and Fig. 9, showing our model to be accurate to typically better than ≈2% percent accuracy over the range 0.1 [ℎ Mpc −1 ] 10 and from = 0.1-3.  Fig. 11).
• We make a Python implementation of our model publicly available at https://github.com/jemme07/pyspk With a fast, accurate model for characterising the effects of baryon physics on the non-linear matter power spectrum, it is possible to straightforwardly incorporate these effects in existing theoretical pipelines based on gravity-only calculations (e.g., emulators for the absolute non-linear power spectrum or the halo model). This will allow one to consistently propagate the uncertainties in astrophysics (due to our uncertainties in the baryon fractions of groups/clusters as a function of mass and redshift) through to the cosmological constraints. An advantage that our model has over existing methods based on the halo model or the 'baryonification' formalism is that it effectively depends only on a single physically-meaningful parameter (the baryon fraction). The benefit of this is that observational constraints on the baryon fraction could be used to inform the priors used in cosmological analysis, which in turn should help minimise the degradation of cosmological constraints that results from marginalising over baryon effects (e.g. Amon & Efstathiou 2022). Furthermore, by providing a set of analytic equations, the model can be easily "inverted" and allows for rapid experiments to be conducted, providing a powerful tool to explore the differential effects of baryonic physics.
Finally, as our model is expressed in terms of the baryon fraction of groups and clusters, we require observational constraints to either inform the priors that will be used in cosmological pipelines or to be used directly in a joint analysis. While the baryon fractions are pinned down reasonably precisely in the low-redshift universe (perhaps 0.2), there are currently very few constraints at higher redshifts for haloes with 10 14 M . The situation may soon change, though, as eROSITA (X-ray), Advanced ACT (tSZ, kSZ) and Simons Observatory (tSZ, kSZ) data of large numbers of highgroups starts to become available. An important consideration in these analyses will be carefully modelling the selection function to enable an unbiased measurement of the group baryon fractions. Analyses of mock X-ray and SZ observations of cosmological hydrodynamical simulations will be important in this endeavour.