Non-linear CMB lensing with neutrinos and baryons: FLAMINGO simulations vs. fast approximations

Weak lensing of the cosmic microwave background is rapidly emerging as a powerful probe of neutrinos, dark energy, and new physics. We present a fast computation of the non-linear CMB lensing power spectrum which combines non-linear perturbation theory at early times with power spectrum emulation using cosmological simulations at late times. Comparing our calculation with lightcones from the FLAMINGO 5.6 Gpc cube dark-matter-only simulation, we confirm its accuracy to 1% (2%) up to multipoles L = 3000 (L = 5000) for a nuLambdaCDM cosmology consistent with current data. Clustering suppression due to small-scale baryonic phenomena such as feedback from active galactic nuclei can reduce the lensing power by of order 10%. To our perturbation theory and emulator-based calculation we add SP(k), a new fitting function for this suppression, and confirm its accuracy compared to the FLAMINGO hydrodynamic simulations to 4% at L = 5000, with similar accuracy for massive neutrino models. We further demonstrate that scale-dependent suppression due to neutrinos and baryons approximately factorize, implying that a careful treatment of baryonic feedback can limit biasing neutrino mass constraints.


INTRODUCTION
Cosmological bounds on the sum of neutrino masses, M ν = m ν ≤ 0.12 eV, are rapidly converging on the lower bound M ν = 0.06 eV from laboratory experiments, promising a measurement in the next several years under restrictive assumptions about the cosmological model (de Salas et al. 2018;Capozzi et al. 2018;Esteban et al. 2020;Aghanim et al. 2020a;Palanque-Delabrouille et al. 2020;Alam et al. 2021;Abbott et al. 2022).However, relaxing these assumptions by allowing greater variation in the neutrino sector, dark energy, or scale-dependent bias, or by considering different combinations of probes, can substantially weaken cosmological bounds to M ν ≲ 0.5 eV (Upadhye 2019;Di Valentino et al. 2020;Di Valentino & Melchiorri 2022;Sgier et al. 2021), while relatively cosmologyindependent terrestrial experiments are consistent with up to M ν ∼ 2 eV (Aker et al. 2019(Aker et al. , 2022)).Meanwhile, tensions in the Hubble expansion rate and small-scale clustering demand a deeper understanding of the interdependent constraints on neutrinos and other components of the standard cosmological model (Leauthaud et al. 2017;Cai et al. 2022;Amon et al. 2023;Abdalla et al. 2022).
Weak gravitational lensing of the cosmic microwave background (CMB), or CMB lensing, is a promising approach to measuring M ν .It is immune to scale-dependent galaxy biasing, which has the potential to bias M ν measurements from galaxy redshift surveys.Its source, the CMB, is well-understood, consisting of photons which last scattered over a very narrow range of redshifts around 1100.Additionally, unlensed CMB temperature and polarization perturbations are very accurately characterized using linear perturbation theory applied to adiabatic, nearly-scale-invariant Gaussian random density fluctuations.Thus, CMB lensing evades biases due to in-trinsic alignments and photometric redshift errors that affect lowerredshift weak lensing surveys (Weinberg et al. 2013).
Next-generation CMB surveys are expected to improve upon the signal-to-noise ratio of the Planck survey by over an order of magnitude (Abazajian et al. 2016;Ade et al. 2019;Liu et al. 2022).Aside from mapping the CMB temperature on smaller scales, they will substantially advance our knowledge of the polarized CMB.Systematic uncertainties due to astrophysical foregrounds and atmospheric noise are significantly smaller for the polarization than the temperature, and the lack of a small-scale primordial B mode polarization reduces the uncertainty due to cosmic variance.Thus the next generation of experiments will be able to quantify CMB lensing to Legendre moments L of a few thousand, or angles below 10 arcmin.
Precision prediction of CMB lensing at these scales requires an understanding of non-linear corrections to the clustering of matter.Several approaches to non-linear clustering have been explored in recent years.Non-linear perturbation theory typically begins with the continuity and Euler equations of fluid dynamics, whose non-linear terms couple different Fourier modes together; see Crocce & Scoccimarro (2006); McDonald (2007); Taruya & Hiramatsu (2008); Matsubara (2008a,b); Pietroni (2008); Lesgourgues et al. (2009).We focus here on the Time-Renormalization Group (Time-RG) perturbation theory of Pietroni (2008); Lesgourgues et al. (2009), designed for massive-neutrino cosmologies with scale-dependent clustering growth, as implemented in the redTime code of Upadhye (2019).Another approach to non-linear corrections begins with the halo model of clustering, detailed in Ma & Fry (2000); Seljak (2000); Cooray & Sheth (2002), tuned or supplemented by fitting functions to agree with large computer simulations, as in Smith et al. (2003); Bird et al. (2012); Takahashi et al. (2012); Mead et al. (2015Mead et al. ( , 2020)).As the Halofit function of Bird et al. (2012) was fit to neutrino simulations, we also consider it here.Finally, the most accurate estimates of non-linear clustering, within limited ranges of parameters and redshifts, are emulators based upon large suites of N-body simulations; see Heitmann et al. (2009); Lawrence et al. (2010); Knabenhans et al. (2021); Moran et al. (2023).In the present study, we use Euclid Emulator 2 of Knabenhans et al. (2021) and the Mira-Titan IV emulator of Moran et al. (2023).
As we are particularly interested in the small-scale (k ≳ 0.1 h Mpc) suppression of clustering, hence CMB lensing, due to massive neutrinos, we must distinguish this suppression from the small-scale hydrodynamic effects of baryons.Cooling and clumping of baryons leads to the formation of supernovae, which expel baryonic matter from galaxies.Baryonic clustering at the centers of large halos feeds Active Galactic Nuclei (AGN), which, in turn, heat the baryonic gas and expel baryonic particles.The combined effect of these phenomena is a suppression of clustering on ∼ 1 Mpc scales, while hydrodynamic models typically predict enhanced clustering on much smaller scales.We model hydrodynamic effects through an innovative fitting function, SP(k), by Salcido et al. (2023), which uses the fact that the total hydrodynamic suppression is strongly correlated with the baryonic content of halos of a characteristic mass; 1 see also van Daalen et al. (2020); Pandey et al. (2023).We briefly explore a oneparameter generalization of SP(k), showing that it covers a wide range of hydrodynamic models.
Our goal in this work is a fast, accurate computation of CMB lensing in the non-linear regime, quantified by the power spectrum of the lensing potential ϕ, whose gradient determines the angle by which a CMB photon is deflected.We combine the Mira-Titan IV emulator at low redshifts with Time-RG perturbation theory at high redshifts to yield a rapid computation of the matter power spectrum that is accurate at the times and length scales necessary for quantifying CMB lensing.Baryonic feedback effects are modelled using SP(k).Running in under a second, our calculation converges to sub-percentlevel precision for Legendre moments L ≤ 10000.We release our code, hyphi, publicly at github.com/upadhye/hyphi .
Rigorously quantifying the errors in hyphi requires a set of highresolution, large-volume numerical simulations.The FLAMINGO simulation suite of Schaye et al. (2023); Kugel et al. (2023) is an ideal testing ground for hyphi CMB lensing calculations.With cosmological parameters chosen to match data from CMB probes and galaxy surveys, its CMB lensing power spectrum closely matches state-of-the-art measurements, as we show below.FLAMINGO includes the largest-particle-number hydrodynamic simulation reaching z = 0, which is necessary for covering the range of redshifts contributing the most to CMB lensing, and its relatively high mass resolution of 7 × 10 9 M ⊙ provides a wealth of information on the impact of non-linear clustering.Further, the FLAMINGO suite independently varies the neutrino masses and hydrodynamic feedback, both of which suppress small-scale clustering, allowing us to investigate their separate effects on lensing.
We find close agreement between hyphi and the FLAMINGO simulations across a wide range of neutrino masses, 0.06 eV ≤ M ν ≤ 0.24 eV; source redshift bins spanning 0 ≤ z ≤ 25; and a variety of feedback models.In particular, comparison with the FLAMINGO M ν = 0.06 eV 5.6 Gpc-box dark-matter-only (DMO) simulation demonstrates the accuracy of hyphi to 1% up to L = 3000 and 2% up to L = 5000.Dividing the range 0 ≤ z ≤ 25 into eight source 1 SP(k) is publicly available at github.com/jemme07/pyspk .mass bins, we find 5% agreement to at least L = 4000 for each bin.FLAMINGO simulations with larger M ν and hydrodynamic feedback are run in smaller 1 Gpc boxes, meaning that sample variance contributes more to the discrepancy between hyphi and simulations, but even so, the two agree to 4% up to L = 4000 in a model with M ν = 0.24 eV and the standard hydrodynamic feedback.Furthermore, we demonstrate that neutrino and baryonic suppression of the lensing power factorize, facilitating a marginalization over feedback parameters to constrain M ν .We show that this result extends to a wide variety of feedback models, including those with jets, as well as those reducing the cluster gas fraction well below its best-fit value.
This article is organized as follows.Section 2 provides overviews of CMB lensing, the FLAMINGO suite of simulations, and our modeling of baryonic feedback.DMO models are considered in Sec. 3, which studies the dependence of CMB lensing on the matter clustering, quantifies its suppression by massive neutrinos, and compares our fast computation to FLAMINGO.Baryonic effects are included in Sec. 4, which demonstrates that these two types of scale-dependent suppression factorize.Finally, Sec. 5 shows our conclusions.

CMB lensing
For a thorough, modern review of CMB lensing, see Lewis & Challinor (2006), which we briefly outline here.Lensing may be understood as deflecting a light ray from the surface of last scattering as it passes through density perturbations along our past light cone.The result is that a ray incident on a detector at angle n initially came from an angular position n +⃗ ϵ on the sky.CMB temperature perturbations Θ(n) := δT (n)/ T on the last-scattering surface therefore are distorted by lensing to the observed perturbations Θ(n) = Θ(n +⃗ ϵ).
Qualitatively, the effects of lensing on the CMB temperature and polarization power spectra split into three regimes, defined by large, intermediate, and small scales.On large scales, with Legendre moments ℓ ≪ 300, lensing has only a small effect on perturbations.Acoustic peaks on intermediate scales ℓ ∼ 1000 are smeared out by lensing.On small scales, ℓ ≫ 3000, where diffusion damping sharply reduces the power of the unlensed CMB perturbations, lensing and other secondary anisotropies dominate the power.
At each position along a photon's path to us, its deflection is proportional to the gradient of the local gravitational potential.Thus its total deflection is the gradient of the line-of-sight integral of the gravitational potential, weighted by a lensing kernel g defined below.This integral, known as the lensing potential ϕ(n), sources the deflection ⃗ ϵ = ⃗ ∇ϕ.Gravitational lensing of the CMB may be quantified statistically using the power spectrum C ϕϕ L of ϕ.In the Limber approximation of Limber (1953), appropriate to the scales of interest to us here, this is (1) (3) Here, P Φ and P m are, respectively, the power spectra of the gravitational potential and total matter in units of volume; χ(z) is the comoving distance to redshift z; χ ⋆ = χ(z ⋆ ); and z ⋆ is the redshift of the baryon drag epoch.The cosmological parameters Ω m,0 and H 0 are the present-day values of the matter density as a fraction of the critical density ρm,0 / ρcrit,0 , and the derivative of the logarithm of the scale factor with respect to conformal time d log(a)/dτ, respectively.
In turn, the power spectrum of the lensed temperature perturbation Θ(n) is found from the appropriately-weighted convolution of C ϕϕ L with the unlensed power C ΘΘ ℓ : Here we work in the flat-sky limit in which ⃗ ℓ is the two-dimensional Fourier conjugate to the direction n.Factors of L and ℓ − L arise from gradients of the lensing potential power spectrum.The first term on the right hand side describes the lensing-induced smearing of C ΘΘ ℓ , while the second represents a smooth suppression of power due to lensing.Lensed CMB polarization power spectra are similarly found by convolution with C ϕϕ L ; see Lewis & Challinor (2006).Thus C ϕϕ L is necessary for quantifying lensing of the CMB, and is useful for comparing the observed lensing to theoretical predictions.Our goal henceforth is a computation of C ϕϕ L , including non-linear clustering as well as power suppression by massive neutrinos and hydrodynamic effects.

The FLAMINGO simulations
The FLAMINGO simulations used here are thoroughly described in Schaye et al. (2023); Kugel et al. (2023).They employed the swift code of Schaller et al. (2023), which includes gravitation, hydrodynamic feedback, and subgrid models for unresolved physics relevant to galaxy formation, including metal-dependent radiative cooling, star formation, stellar evolution, and stellar and AGN feedback.Neutrinos were incorporated into the simulations through the δ f method of Elbers et al. (2021), and included in the initial conditions of MonofonIC, Hahn et al. (2021Hahn et al. ( , 2020)), as detailed and implemented in Elbers et al. (2022); Elbers (2022b,a).
Table 1 lists the code and cosmological parameters for the FLAMINGO simulations used in this work.We will use the largevolume run L5p6_m10_DMO to quantify the accuracy of our C ϕϕ L computation at high redshifts; the L sim = 2.8 Gpc runs, L2p8_m9_DMO and L2p8_m9, to study the impact of baryonic feedback; and the several L sim = 1 Gpc runs to asses the impact of different neutrino masses and feedback processes on C ϕϕ L .Lightcones are data products of the FLAMINGO simulations of Schaye et al. (2023).Lightcone maps apply the HEALPix pixelization of Górski et al. (2005) to spherical shells, recording the total mass contained in each resulting volume element.Shell thicknesses in redshift space, z i+1 − z i , are 0.05 up to z = 3, and then 0.25 up to z = 5, gradually increasing to 5 at z = 15, corresponding to homogeneous-universe comoving radii χ(z) between χ i = χ(z i ) and χ i+1 = χ(z i+1 ).Each shell is divided into 12N 2 side = 3, 221, 225, 472 pixels, each of angular size 166 arcsec 2 , with the HEALPix parameter N side = 16, 384.For each HEALPix angular pixel j, the lensing convergence κ j is computed through a summation discretizing the line-of-sight integral over comoving distance: Here, ∆χ i := χ i+1 − χ i ; ⟨χ⟩ i is the mean comoving distance in redshift shell i; χ ⋆ is the comoving distance to the surface of last scattering; M i j is the mass in shell i and HEALPix pixel j; and Mi j = 4π 3 ρm,0 (χ 3 i+1 −χ 3 i )/(12N 2 side ) is the homogenous-universe mass in i and j.We fix χ ⋆ = χ(z ⋆ ), with z ⋆ = 1089.80as measured by Aghanim et al. (2020a).
Standard power spectrum computation codes such as the polspice code of Szapudi et al. (2000) cannot process more than 2 31 ≈ 2 × 10 9 pixels, so we downsample the κ map to N side = 8192.Then we use polspice with the -pixelfile YES option to compute its power spectrum, which we subsequently multiply by 4/L 4 to yield the lensing potential power spectrum C ϕϕ L .We confirm by comparison to power spectra from lower-resolution maps that our C ϕϕ L with N side = 8192 has converged to better than 1% (2%) up to L = 5000 (L = 6000), which is sufficient for testing hyphi.
The lensing potential power spectrum from the L sim = 5.6 Gpc FLAMINGO simulation, L5p6_m10_DMO, averaged over lightcones computed from eight different positions in the simulation volume, is compared with several recent observations in Fig. 1.Agreement across two orders of magnitude in the Legendre moment is impressive, demonstrating that the FLAMINGO simulation suite is an appropriate tool for quantifying non-linear CMB lensing.Since we are primarily concerned with neutrino and baryonic suppression effects, which are largest at small scales L ≳ 1000 where data error bars are large, we henceforth focus on FLAMINGO directly as a means of quantifying these small-scale effects and testing their predictions.
Figure 2 shows FLAMINGO lensing potential power spectra for several models.Although neutrinos and baryons both suppress C ϕϕ L by ∼ 10%, these effects depend very differently upon scale.The neutrino suppression extends down to L ≈ 100, while the baryonic suppression is visible only beyond L = 1000.Quantifying the difference between these effects is a major goal of this article.
The other obvious difference between L2p8_m9 and the Plancklike models, Planck and PlanckNu0p24Fix, in Fig. 2, is in their large-scale power.This difference is due to the fact the the Plancklike models are simulated in smaller boxes whose lightcones are limited to z ≤ 3, so our C ϕϕ L computation based upon Eq. ( 5) is limited to that redshift range.A direct comparison of these power spectra to the data would first require some approximation of the higher-z contribution to C ϕϕ L , such as the use of the FLAMINGO matter power spectra in the line-of-sight integral of Eqs.(1-3).However, since our goal is to test fast approximations using the lightcones, we take the opposite Table 1.List of FLAMINGO simulations, with simulation parameters and cosmological parameters.n LC is the number of high-redshift simulation lightcones produced; these cover the range 0 ≤ z ≤ 3 for L sim = 1000 Mpc/h; 0 ≤ z ≤ 5 for L sim = 2800 Mpc/h; and 0 ≤ z ≤ 25 for L sim = 5600 Mpc/h.For feedback fits and standard deviations, as well as a description of the feedback models realizing the f gas and SMF listed, see Schaye et al. (2023); Kugel et al. (2023).approach, and limit our perturbative and emulated computations to the redshift range covered by each lightcone.
One strength of the FLAMINGO simulation suite is that it contains multiple resolutions and box sizes for the same cosmology, allowing for convergence tests.Figure 3 compares C ϕϕ L integrated up to z = 3, computed from six different FLAMINGO simulations, with box sizes ranging from 1 Gpc to 11.2 Gpc and mean interparticle spacings ranging from 0.27 Mpc to 2.22 Mpc.Although the smaller-box simulations might be expected to underpredict power on large scales, we see that this effect is under a percent by L = 100.Similarly, although the simulation with the largest interparticle spacing, L11p2_m11_DMO, has significant small-scale errors, for L5p6_m10_DMO we see that these only appear for L > 10000.The runs L1_m9_DMO, L1_m8_DMO, L2p8_m9_DMO, and L5p6_m10_DMO all agree to 1% up to L = 3000 and < 1.5% beyond L = 20000, so we may use any of them for testing hyphi.In particular, FLAMINGO simulations varying the neutrino mass and the hydrodynamic feedback parameters have the same resolution as L1_m9_DMO, so Fig. 3 confirms their suitability for our purposes.
We consider one further issue related to FLAMINGO lightcones, that of simulation volume replication, with L5p6_m10_DMO and L11p2_m11_DMO as particular examples.The radius of a lightcone extending to z = 25 exceeds the simulation box size L sim for L5p6_m10_DMO, but not for L11p2_m11_DMO.Lightcones are generated by replicating the simulation volume consistently with its periodic boundary conditions.A photon propagating from z = 25 to z = 0 in L5p6_m10_DMO could pass through the same structures twice, and different portions of a large-z shell could see the same structures, introducing spurious correlations into our κ map.We consider several methods of mitigating such volume replication effects.We compute multiple κ maps, one for each z shell, and then combine them using one of four methods to obtain C ϕϕ L .
(i) Sum all κ maps along the entire line of sight as in Eq. ( 5), then compute C ϕϕ L from the resulting κ map.This is our standard C ϕϕ L computation, used in subsequent figures unless noted otherwise.
Figure 4 compares the second, third, and fourth methods to the first.Immediately apparent is the fact that the two box sizes agree to < 1% at all L, even though the L11p2_m11_DMO simulation does not suffer from volume replication errors.Thus we see that these errors are negligible for our purposes, at least within the context of our power spectrum ratios smoothed with centered 100-point moving averages.Also apparent is the close agreement between methods (ii) and (iii), showing that bin rotation and separate-bin power spectrum computation decorrelate different z bins in very similar ways.Next, consider small scales, L ≥ 1000, which are most relevant to tests of the non-linear C ϕϕ L .The sum-of-bins calculation, method (ii), and the sum-of-rotated-bins calculation, method (iii), agree with the standard method (i) to 1% across this entire range, for both simulations and both light cone sizes.Thus any of these three methods is sufficiently accurate for percent-level tests of the non-linear C ϕϕ L .Meanwhile, method (iv) exhibits significant power loss, and should not be used for testing hyphi.
Finally, consider L < 1000 in Fig. 4. Evidently, breaking the red- .For L5p6_m10_DMO, with a 5.6 Gpc box, methods (ii) (sum of powers of z bins), (iii) (power of sum of rotated z bins), and (iv) (power of sum of rotated z shells) are show divided by the standard method (i), computed over the region z ≤ 2 (dashed) and z ≤ 25 (solid).Also shown are methods (ii) and (iv) divided by (i) for L11p2_m11_DMO, whose large simulation volume makes it immune to volume replication errors.
shift range into a greater number of intervals, and rotating the resulting κ maps, leads to a greater power suppression; method (iv) uses 40 redshift shells, while methods (ii) and (iii) use 12 redshift bins each.Furthermore, the presence of this suppression even in the L11p2_m11_DMO power spectra and the z ≤ 2 L5p6_m10_DMO power spectrum, for which the light cone radii are less than L sim , shows that it is not due to volume replication effects.
The oscillatory nature of this low-L power suppression for method (iv), as well as its L-dependence, suggests that it is due to the exclusion of baryon acoustic oscillations (BAO) along the line of sight.Since χ ⋆ /2 is the peak of the lensing kernel of Eq. (3), the Legendre moment L is approximately associated with the length scale χ ⋆ /(2L) ∼ 5000L −1 Mpc/h, which for L ∼ 50 corresponds to the BAO scale.Thus, not only is map rotation unnecessary for suppressing volume replication effects, but excessive rotation erroneously throws out actual large-scale correlations which ought to be included in the power spectrum.
We conclude that κ maps from FLAMINGO lightcones, computed using method (i) above for the total power and methods (ii) or (iii) for tomographic redshift bins, are suitable for testing calculations of C ϕϕ L in the non-linear regime L ≳ 1000 at the percent level.At larger scales, L ∼ 100, the redshift-binned methods (ii) and (iii) underpredict power by 2%-3%, an error of which we should be aware when testing the application of hyphi to tomographic bins.

Baryonic feedback
Stellar and AGN feedback in the FLAMINGO simulations was calibrated using machine learning to two observables: the galaxy stellar mass function (SMF) at z = 0, measured in Driver et al. (2022); and cluster gas fractions, f gas , measured using X-ray and weak lensing observations, as compiled in Kugel et al. (2023).The SMF constrains the galaxy-halo connection, while f gas correlates with the magnitude of the clustering suppression due to baryonic feedback (van Daalen et al. 2020).This approach allowed FLAMINGO to explore the parameter space of feedback models in an observationallyrelevant manner.For example, the high-gas-fraction model fgas+2σ in Table 1 increased f gas by twice the observational uncertainty, while the low-gas-fraction model fgas-2σ decreased f gas by the same amount.For a complete description of feedback in FLAMINGO, see Schaye et al. (2023); Kugel et al. (2023).
Figure 5 demonstrates the impact of neglecting baryonic feedback above a redshift z b on the lensing potential power spectrum.The figure compares FLAMINGO runs L2p8_m9_DMO and L2p8_m9 from Table 1, which are, respectively, dark-matter-only and hydrodynamic runs in 2.8 Gpc boxes.Neglecting feedback at all redshifts (z b = 0) leads to a maximum error of 8%, which is reduced to 4% (0.4%) by including feedback for z b ≤ 1 (z b ≤ 2).Thus, percent-level accuracy in the computation of C  2023) (see also Semboloni et al. 2011;van Daalen et al. 2020;Debackere et al. 2021) demonstrated that, for the purpose of determining hydrodynamic suppression of the matter power spectrum, these phenomena may be reduced to their effect on the characteristic relative baryon fractions fb (z) = f b ( M, z)/(Ω b /Ω m ) of halos of characteristic mass M(z) at redshift z.Given fb , whether measured from observations or computed through simulations, the hydrodynamic suppression factor P m,hyd /P m,dmo is fit to 2% accuracy in the range k ≤ 12 h/Mpc and z ≤ 3 by the SP(k) function of that reference.
Given the full functional form of fb (M, z), we could in principle compute the baryonic suppression directly using a halo model.However, this function is difficult to measure.Salcido et al. (2023) showed that, for each z, the k-dependent baryonic suppression is most strongly correlated with the baryon fraction at a single characteristic mass M(z).This correlation was shown to be independent of the strength of the subgrid feedback.
Our calculations in Sec. 4 will use fb (M, z) measured directly from the FLAMINGO L1_m9 simulations, evaluated at the M(z) of Salcido et al. (2023).Qualitatively, fb is large for halos at high redshifts, before structures such as AGN have formed; for halos too small to form such structures; and for the largest halos, which efficiently capture baryons, and are representative of the universe as a whole.The SP(k) fit covers the range k ≤ 12 h/Mpc and z ≤ 3 necessary for ≈ 2%-level accuracy in C ϕϕ L .Above z = 3, we assume that the power spectrum suppression due to baryons is negligible, an approximation consistent with Fig. 5.For k > 12 h/Mpc, we again assume a negligible suppression, an approximation which we confirm to have a sub-percent-level impact on C ϕϕ L up to L = 9000.

Non-linear perturbation theory
In order to calculate the non-linear perturbative CDM+baryon (CB) power spectrum in the presence of massive neutrinos, we employ the Time-Renormalization Group (Time-RG) perturbation theory of Pietroni (2008); Lesgourgues et al. (2009).Time-RG integrates the non-linear continuity and Euler equations of fluid dynamics over time for each wave number, making it well-suited to massive neutrinos, which introduce a scale-dependence into the growth factor, and to dark energy models with evolving equations of state.
Let η = ln(a/a in ) for some initial scale factor a in , and let primes denote derivatives with respect to η.Let the perturbation indices 0 and 1 correspond, respectively, to the density contrast δ = (ρ CB − ρCB )/ ρCB and the velocity divergence θ = − ⃗ ∇ •⃗ v/H; thus, for example, P 01 (k) represents P δθ (k).Then at each k, the Time-RG equations of motion for the CB power spectra P ab are where I ′ acd,be f = −Ξ bg I acd,ge f − Ξ eg I acd,bg f − Ξ f g I acd,beg + 2A acd,be f (10) with all other γ abc zero, and summation over repeated indices implicit.Here, wave number superscripts denote functional dependence, so that P ⃗ k ab denotes the power spectrum P ab ( ⃗ k) and B ⃗ k,⃗ q,⃗ p abc the bispectrum B abc ( ⃗ k,⃗ q, ⃗ p).We use q X ⃗ k,⃗ q,⃗ p as shorthand for for any function X( ⃗ k,⃗ q, ⃗ p), where δ (D) is the three-dimensional Dirac delta function.In Time-RG, the bipsectrum integrals I acd,be f , initialized to zero, are the repositories of non-linear information, and are sourced by the modecoupling integral A acd,be f .
We may add to this system an evolution equation for the lensing potential power spectrum of Eq. (1).Substituting k = L/χ, we recast the line-of-sight integral as an integral over our time coordinate η of where a, z, H, and χ are now functions of η.Since P cb , hence P m , is available at each time step in our Time-RG integration, C ϕϕ L may simply be added to our system of Eqs.(6-11) with very little additional computational cost.Furthermore, if P m contains a hydrodynamic suppression factor parameterized by some variables, we can compute a separate C ϕϕ L for each of several choices of variables all at once, again with little additional computation.

Non-linear lensing power
Next, we use non-linear perturbation theory to consider the dependence of C ϕϕ L upon clustering in different redshift ranges.The contribution of z min ≤ z ≤ z max is found by integrating Eq. ( 12) from η(z max ) to η(z min ), where η(z) = − ln[(1 + z)a in ].We use P m = ( f cb P 1/2 00 + f ν P 1/2 ν ) 2 , as in Saito et al. (2008), where P 00 is the Time-RG power spectrum of Eq. ( 6).We shall see in the remainder of this section that Time-RG is sufficiently accurate to provide a qualitative picture of the contributions of different redshifts.
Figure 6 shows the contributions of several redshift bins to the total lensing potential power spectrum.The domination of the low-L lensing power by low-z clustering is explained by Fig. 7. Matter power spectra P m (k, z) typically peak around k = 0.01 h/Mpc, a scale which contributes to C ϕϕ L for L = 10 at z ≈ 0.5, and for L = 30 at z ≈ 2. Interestingly, at L ∼ 1000, the total contribution from z > 5 in Fig. 6 is ≈ 30%, since the lower-z clustering contributing to this L is drawn from increasingly large k, for which the power spectrum rapidly declines.
Heat maps quantifying the fractional contributions of different wave numbers and scale factors to C ϕϕ L are shown in Fig. 8, which divides dC ϕϕ L /dη from Eq. ( 12) by the final a = 1 lensing potential power.For L = 1000 (top) and L = 3000 (bottom), the contributions peak at (a, k) of (0.60, 0.26 h/Mpc) and (0.79, 0.84 h/Mpc), respectively, and are significant within factors of ∼ 3 of these values.Thus, the accurate computation of C ϕϕ L up to L of several thousand requires reliable estimates of the matter power spectrum for z ≲ 5 and k ≲ 2 h/Mpc.
Since our focus is non-linear lensing at high L, we must also quantify the contribution of non-linear clustering to C ϕϕ L , which we may do by substituting the difference between non-linear and linear matter power spectra for P m in Eq. ( 12). Figure 9 shows the result.Across the entire range L ≤ 10000 considered, the total non-linear contribution for all z > 3 is ≤ 7% of C ϕϕ L , while that for all z > 5 is ≤ 2%.Thus our numerical tests of the non-linear convergence power spectrum in the remainder of this section should focus on z ≲ 3 − 5.

N-body fits and emulation
Though Sec. 3.1 used the Time-RG non-linear perturbation theory to compute the matter power spectrum P m (k), we may substitute any non-linear P m (k) into Eq.( 12).Perturbation theory accurately approximates P m (k) at high redshifts but becomes increasingly inaccurate for z < 1, while approximations calibrated to N-body simulations have prioritized low-redshift accuracy.This work considers three such simulation-based approximations.Halofit, inspired by the halo model, splits the non-linear power spectrum into a quasi-linear term, which approaches the linear power spectrum on large scales, and a non-linear term, which is fit to simulations; see Smith et al. (2003); Bird et al. (2012); Takahashi et al. (2012).We implement the Halofit model of Bird et al. (2012) which has been fit for massive neutrino cosmologies.
The second approximation is the Mira-Titan IV (MT4) emulator of Moran et al. (2023) which uses Gaussian process modelling to interpolate a suite of 111 simulations.Chosen carefully to span a large parameter space, this suite includes 101 massive neutrino cosmologies with 0.00017 ≤ ω ν ≤ 0.01.The third approximation, Euclid Emulator 2 (EE2) of Knabenhans et al. (2021), covers a greater range of redshifts and wave numbers than the MT4 emulator, at the cost of a smaller neutrino mass range, m ν ≤ 0.15 eV.Halofit has been calibrated up to z = 3, the MT4 emulator to z = 2, and EE2 to z = 10; at higher redshifts, we revert to Time-RG in Eq. ( 12).
MT4 and EE2 are based upon simulations with mass resolutions of 10 10 M ⊙ and 10 9 M ⊙ , respectively, compared with 7 × 10 9 M ⊙ for standard FLAMINGO simulations, and 8 × 10 8 M ⊙ for L1_m8_DMO.MT4 reaches k ≈ 7 h/Mpc and EE2 reaches ≈ 9 h/Mpc, compared with 17 h/Mpc for standard FLAMINGO simulations and 33 h/Mpc for L1_m8_DMO.Thus the FLAMINGO simulation suite is appropriate for testing emulator results.
Figure 10 compares these non-linear methods at a range of redshifts.Evidently, the two emulators provide the most accurate power spectra at all redshifts for which they are available, with the MT4 emulator slightly more accurate at z ≥ 1 and 1 h/Mpc ≲ k ≲ 7 h/Mpc.As the MT4 emulator is restricted to the range k ≤ 5/Mpc ≈ 7 h/Mpc, we logarithmically extrapolate its power spectrum beyond that wave number, an approximation whose accuracy evidently diminishes around z = 2.For z > 2, Time-RG is somewhat more accurate than Halofit at large scales, k ≲ 2 h/Mpc, and less accurate at small scales.

Tests of C ϕϕ L
Figure 11 compares lensing potential power spectra from linear and Time-RG perturbation theories, Halofit, and the two emulators to the FLAMINGO lensing potential power spectrum, computed as described in Sec.2.2.As might be expected from the comparison to a higher-resolution simulation in Fig. 10, both perturbation theories underpredict small-scale power, with linear and Time-RG perturbation theory underpredicting by ≥ 3% above L ≈ 300 and L ≈ 2400, respectively.Halofit is more accurate than Time-RG for 200 ≲ L ≲ 1000 but underpredicts power by ≥ 3% above L ≈ 1300.In this and subsequent plots, the ratios to N-body power spectra are summed over all available lightcones and smoothed using centered 100-point moving averages.
Meanwhile, the emulator-based calculations are highly accurate.The two emulators, MT4 and EE2, agree at the percent level at L ≲ 6000, with MT4 somewhat more accurate for 1000 ≲ L ≲ 4000 and less accurate for L ≳ 5000.Due to its greater neutrino mass range,  12) over the appropriate redshift range, using the Mira-Titan IV Emulator for P m at z ≤ 2 and Time-RG perturbation theory at z > 2. This is compared with the FLAMINGO N-body C ϕϕ L contributions from the same redshift bins.Inner and outer dotted lines show errors of 2% and 5%, respectively.and its slight accuracy advantage in Fig. 11, we will focus henceforth on MT4.Our standard C ϕϕ L calculation (MT4+TRG) will use the MT4 emulator for the matter power spectrum in the range z ≤ 2 and Time-RG perturbation theory above that redshift.
The accuracy of this standard MT4+TRG calculation for matter sources in several redshift bins is assessed in Fig. 12.Aside from low-L bumps consistent with Fig. 4, our MT4+TRG calculation is accurate to ≤ 2% up to L = 3000.For all but one redshift bin, MT4+TRG is ≤ 5% accurate for L ≤ 6000.Thus our calculation is accurate not just for the total CMB lensing power, but also for individual redshift bins, as needed for tomographic analyses such as Peacock & Bilicki ( 2018

Massive neutrino suppression
We begin by considering the effects of neutrinos on the matter power itself.Following the approach of Lesgourgues et al. (2009); Upadhye et al. (2014); Upadhye (2019), we include neutrinos at the fully linear level, interpolating the ratio δ ν /δ cb from camb (Lewis et al. 2000;Lewis & Bridle 2002).Thus, we include neutrinos in the gravitational potential Φ(k, z) of Eq. ( 7) as Ratios of matter power spectra differing only in M ν , with Ω m,0 , Ω b,0 , h, n s , A s , w 0 , and w a held fixed, exhibit a characteristic "spoon" feature shown in Fig. 13 that even non-linear perturbation theory fails to capture.Hannestad et al. (2020) showed this to be a consequence of halo formation.Halofit, which was calibrated in Bird et al. (2012) to fit this feature, agrees closely with the N-body ratio, at least at low z, while the Mira-Titan IV emulator is accurate to a few percent.
Figure 14 plots the matter power spectrum for a massive neutrino cosmology with M ν = 0.24 eV.In spite of Halofit's accurate computation of the neutrino spoon for z < 2, we find that the matter power itself is more accurately computed for z ≤ 2 by the Mira-Titan IV emulator, and for z > 2 at large scales, k ≤ 1 h/Mpc, by Time-RG perturbation theory.Thus, our MT4+TRG calculation continues to be the best approximation to the matter power spectrum for the k .Spoon-like feature in the ratios of power spectra of two models, PlanckNu0p24Fix_DMO and Planck_DMO from Table 1, differing only in their neutrino mass sums M ν .Halofit has been calibrated to fit low-redshift spoons accurately, while perturbation theory is qualitatively incapable of reproducing this non-perturbative effect.MT4 is limited to z ≤ 2.
and z relevant to computing the CMB lensing power spectrum in νΛCDM cosmologies.Next, we return to our computation of the lensing potential power spectrum in massive neutrino models.We tested our fully-linear neutrino approximation by allowing neutrinos to respond linearly to the non-linear CB growth, using the code of Chen et al. (2021), and found the impact on C ϕϕ L to be < 0.06% even for neutrino fractions as large as Ω ν,0 h 2 = 0.01.Chen et al. ( 2023) used a non-linear perturbation theory for massive neutrinos to show that the non-linear neutrino power is at most 2-3 times the linear response power for Ω ν,0 h 2 ≤ 0.005, corresponding to M ν = 0.47 eV, so we may safely bound the impact of neutrino non-linearity to < 0.2% over this range.This is a negligible source of error for near-future experiments, so we use the fully-linear-ν approximation of Eq. ( 13) henceforth.
Errors in the CMB lensing potential power spectra for a model with M ν = 0.24 eV are shown in Fig. 15.The FLAMINGO simulation uses a 1000 Mpc box, for which we have lightcones up to z = 3, so the other calculations are integrated only to z = 3 for comparison.In accordance with our previous results, the MT4+TRG calculation is accurate to < 2.5% up to L = 2000 and to < 5% up to L = 6000.These error estimates are limited by noise in the simulation itself, since only one high-z lightcone is available in the high-M ν run.

Matter and convergence power spectra
Dashed curves in Fig. 10 compare the total matter power spectrum from the high-resolution FLAMINGO hydrodynamic simulation, L1_m8, to approximations applying the SP(k) fitting function of Salcido et al. (2023) to each of the following: linear and Time-RG perturbation theories, Halofit, and MT4 and EE2 emulators.Since SP(k) applied to P m is a multiplicative correction, the ratio of corresponding dashed and solid lines is the same in all cases, and the deviation between each dashed and solid line is due to error in SP(k).At all redshifts, this error is ≤ 2% up to k ≈ 5 h/Mpc.
Aside from the small systematic overprediction of power at small scales, k ≳ 5 h/Mpc, for the hydro models, the DMO and hydro curves are qualitatively similar.Thus, our earlier conclusions apply as well to the hydro power spectra of Fig. 10: the two emulators predict the power spectrum accurately across their tested redshift ranges, while above z = 2, Time-RG perturbation theory accurately calculates the power up to k ≈ 1 h/Mpc and Halofit above k ≈ 2 h/Mpc.
Each of these methods, along with the SP(k) fitting function, has been applied to the computation of the CMB lensing potential power spectrum in the dashed curves of Fig. 11.They are compared to the FLAMINGO hydrodynamic simulation, L2p8_m9, whose lightcones cover redshifts z ≤ 5. Our standard MT4+TRG+SP(k)computation is 2% accurate up to L = 3000 and 5% accurate to L = 6000, while the EE2+SP(k)computation, which reaches smaller scales, is 4% accurate all the way to L = 10000.Figure 15 shows a similar accuracy level, 3% to L = 2000 and 5% to L = 5000, for a model with both baryonic feedback and massive neutrinos, PlanckNu0p24Fix.
Figure 16 plots neutrino spoons for the lensing potential power spectra, analogous to Fig. 13 for the matter power.The power spectrum ratio is less sensitive to hydrodynamic effects than the power spectrum itself, which is why the dark-matter-only and hydrodynamic curves are nearly identical.Furthermore, this ratio also reduces errors present in perturbation theory and the emulator, so that the MT4+TRG spoon is correct to < 2.5% for all L ≤ 10000.
Nevertheless, we may wonder about the impact of the systematic ≈ 2% overprediction of the spoon depth seen in Fig. 16.While this would bias neutrino mass measurements from the spoon feature, at present the C ϕϕ L spoon is not a reliable observable.Firstly, this is because the upturn in the spoon occurs only after L = 8000, well beyond current observational capabilities.Secondly and more fundamentally, the spoon itself is a comparison between the real, observed universe, whose cosmological parameters we do not know, and a hypothetical universe differing only in M ν , significantly complicating its use as an observable.Theoretical investigations of the spoon, along the lines of Hannestad et al. (2020), may prefer to use Halofit rather than MT4+TRG.

Factorizability of suppressions
Since neutrinos and baryonic feedback both suppress small-scale clustering, one may worry that errors in our feedback approximations will lead to significant biases in the neutrino mass determination.However, Mummery et al. (2017), using the BAHAMAS simulations of McCarthy et al. (2017McCarthy et al. ( , 2018)), demonstrated for several clustering statistics that neutrino and baryonic suppressions factorize.That is, the combined effect of neutrino and baryon suppression is the product of the individual suppression factors.Here we show that this factorization of neutrino and baryon effects also applies to the FLAMINGO CMB lensing potential power spectrum.
Consider the Planck-like cosmologies of Planck_DMO, PlanckNu0p24Fix_DMO, Planck, and PlanckNu0p24Fix from Table 1, which have either a minimal (M ν = 0.06 eV) or high (M ν = 0.24 eV) neutrino mass sum, and are simulated using dark matter only (dmo) or a hydrodynamic (hyd) simulations.We may define three different suppression factors, capturing the effects of neutrinos, baryons, and both at the same time: Figure 17 demonstrates the factorizability of neutrino and baryon suppressions in the CMB lensing potential power spectrum, using Figure 18.Our p f feedback parameterization (dashed), which raises the L1_m9 baryon fraction fb to the power p f before applying the SP(k)fit of Salcido et al. (2023), is compared to a wide range of FLAMINGO hydrodynamic suppression factors (solid).Dashed curves range from p f = 0.5 (uppermost curve) to p f = 1.5 (lowest curve) in increments of ∆p f = 0.2.simulations as well as our standard MT4+TRG+SP(k)calculation.While the simulations and emulators differ by ≈ 2% on small scales, in keeping with our previous results, in each case factorizability of the neutrino and baryon suppressions is accurate to better than 1%.This is due to the fact that the two suppressions depend very differently on length and time.Neutrino free-streaming suppresses clustering below the free-streaming length, corresponding to k FS ≲ 0.1 h/Mpc for typical masses, which is manifested in a ∼ 10% suppression of C ϕϕ L even at L = 100.Meanwhile, baryonic suppression is negligible below L = 1000 and is most significant around L = 10000.Further, the neutrino free-streaming length is larger at earlier times, and the resulting impact upon C ϕϕ L occurs over a wide range of redshifts, as opposed to the baryonic suppression, which was shown in Fig. 5 to be dominated by z ≲ 1.
The reader may wonder how well such factorizability applies to very different hydrodynamic feedback models.While SP(k)was calibrated to a wide range of feedback models (Salcido et al. 2023), a thorough exploration of the parameter space is beyond the scope of this article.For simplicity, consider a one-parameter family of generalizations which raise fb (z) = f b ( M, z)/(Ω b /Ω m ) to a non-negative power p f before applying SP(k).In this case, p f = 0 implies no hydrodynamic feedback; p f < 1 implies feedback that is weaker than that of the FLAMINGO fiducial model; and p f > 1 to feedback that is stronger.Figure 18 compares this approximation to a wide range of FLAMINGO feedback methods and finds 0.5 ≤ p f ≤ 1.5 to cover this range, to L ≈ 4000, aside from one model, Jet_fgas-4σ.
Figure 19 shows that factorizability of C ϕϕ L holds to excellent precision for a wide range of p f .While errors at 100 ≲ L ≲ 6000 tend to rise with p f , they remain ≤ 0.2% all the way to p f = 2, which has substantially stronger feedback, and are < 0.1% at all L for the fiducial case, p f = 1.Also shown is the factorization error in FLAMINGO itself, which is ≤ 0.3% everywhere and has a similar functional form to hyphi results.Factorizability opens up the possibility of a new SP(k)-like feedback fit applying directly to C ϕϕ L (hyd)/C ϕϕ L (dmo), whose parameters could be treated as nuisance parameters whose marginalization would reveal the underlying neutrino suppression.
While Salcido et al. (2023) found that factorizability is a good approximation at the 2% level with a history-independent baryonic correction model, the possibility remains that factorizability fails below this error threshold, or that baryonic correction is somewhat history- dependent.This limits the bound that we can place on the accuracy of factorizability, though we note that the direct factorizability measurement from FLAMINGO in Fig. 19 is consistent with our result of sub-percent-level factorization errors.

Applicability to data constraints
Complete forecast constraints such as those of McCarthy et al. (2022), but using SP(k)through hyphi, would first require forecast constraints on the halo baryon fraction fb (z) as a function of redshift.As such, they are beyond the scope of this article.However, before concluding, we comment on the applicability of hyphi to data and forecast constraints.
The simplest, but least powerful, approach is to parameterize fb (M, z) and then to treat these parameters as nuisance parameters over which to marginalize.Salcido et al. (2023) explores two-and three-parameter approximations to fb (M, z).A one-parameter approach, described in Sec.4.2, begins with a representative fb (M, z) measured from a hydrodynamic simulation, and raises it to a power p f prior to applying SP(k), with p f marginalized over as a nuisance parameter.Figure 18 demonstrates that the range 0.5 ≤ p f ≤ 1.5 covers a wide range of feedback models.
However, marginalizing over broad priors on fb (M, z) offers no significant advantages over directly parameterizing P m,hyd (k)/P m,dmo (k) as a function of subgrid feedback parameters, as done by Mead et al. (2020) in HMcode.As such, we anticipate that the resulting forecast constraints will be similar to those of Mc-Carthy et al. (2022).
The true strength of SP(k)-based power spectrum computations such as hyphi is that fb (M, z) is a physical observable.Measurements of the baryon fraction may be used to narrow the range over which fb (M, z) are marginalized.For example, Salcido et al. (2023) parameterizes fb (M, z) and then provides parameter ranges consistent at the 2σ and 3σ level with the low-redshift measurements of Akino et al. (2022).As CMB lensing is sensitive to baryonic suppression at higher redshifts, z ∼ 1, SZ cluster constraints such as those of Pandey et al. (2022) also provide useful priors on the baryon fraction, with the caveat that SZ surveys are sensitive to halos with masses larger than the M(z) used in SP(k).

CONCLUSION
Weak lensing of the cosmic microwave background is becoming increasingly important as a galaxy-bias-independent constraint on the sum of neutrino masses M ν .Our ability to interpret upcoming lensing measurements at small scales depends crucially on our ability to distinguish between the scale-dependent clustering suppressions due to baryons and neutrino free-streaming in the non-linear regime.Employing the FLAMINGO suite of simulations, the largest-particlenumber hydrodynamic simulations reaching z = 0 to date, we systematically compared several methods for computing the non-linear CMB lensing potential power spectrum, implemented in our hyphi code.
A technical point in the PlanckNu0p24Fix_DMO computation deserves further comment.The FLAMINGO PlanckNu0p24Fix_DMO lightcone covers the redshift range 0 ≤ z ≤ 3, so we compare it with the hyphi C ϕϕ L including only lens masses from that range of redshifts.Although hyphi can begin integrating C ϕϕ L at z = 3, we find that doing so accurately requires a low absolute error tolerance, and correspondingly longer running times.Instead, in the dashed curves of Fig. A2, we have integrated C ϕϕ L over all redshifts, but printed two separate power spectra at z of 3 and 0. These are then differenced to obtain the power spectra shown in the figure, allowing us to achieve running times comparable to those of L5p6_m10_DMO.This technique is also applicable to tomographic analyses.For example, printing C ϕϕ L at redshifts 3, 2, 1, and 0, then differencing successive outputs, will provide fast and accurate power spectra in the 2 ≤ z ≤ 3, 1 ≤ z ≤ 2, and 0 ≤ z ≤ 1 bins, respectively.
Each hyphi run requires an input transfer function for the purpose of normalizing its matter power spectrum.Furthermore, the linear neutrino density as a function of time is interpolated from transfer functions over a range of redshifts.We find accurate results using transfer functions at the following 12 redshifts : 200, 100, 50, 20, 10, 5, 4, 3, 2, 1, 0.5, and 0. Currently, hyphi is written to accept the 13-column transfer function files output by current versions of the camb code.Additionally, in its most accurate setting, hyphi uses the Mira-Titan IV emulator, which must be compiled separately.Further instructions for compilation and usage of hyphi are available at the web site listed above.
Two options are provided for implementing hydrodynamic corrections in hyphi through SP(k).Firstly, the baryon fraction for the BAHAMAS simulations of McCarthy et al. (2017McCarthy et al. ( , 2018) ) is included with the code, and may be raised to a positive power p f as in Sec.4.2 prior to application of SP(k).Secondly, a variant of the power-law baryon fraction approximation of Akino et al. (2022), as implemented in SP(k) by Salcido et al. (2023), is included in hyphi.Its three parameters determine the normalization, mass-dependence, and redshift-dependence of the baryon fraction.
ϕϕ L from each κ map, before summing to obtain the total C ϕϕ L .(iii) Compute a κ map for each of the above z bins, then rotate each κ by a random angle.The rotated z-binned κ maps are then summed and the result used to compute C ϕϕ L .(iv) Compute a κ map for each redshift shell, with ∆z = 0.05 up to z = 3, and then randomly rotate each one.The rotated κ maps are summed and the result used to compute C ϕϕ L .

Figure 3 .
Figure 2. C ϕϕ L from FLAMINGO L2p8_m9, Planck, and PlanckNu0p24Fix simulations (dashed) and their corresponding DMO simulations (solid) computed over the range of redshifts given inTable 1. (Inset) Baryonic suppression is evident at small scales relevant for next-generation probes.

Figure 4 .
Figure 4. Effects of binning and bin/shell rotation on C ϕϕ L .For L5p6_m10_DMO, with a 5.6 Gpc box, methods (ii) (sum of powers of z bins), (iii) (power of sum of rotated z bins), and (iv) (power of sum of rotated z shells) are show divided by the standard method (i), computed over the region z ≤ 2 (dashed) and z ≤ 25 (solid).Also shown are methods (ii) and (iv) divided by (i) for L11p2_m11_DMO, whose large simulation volume makes it immune to volume replication errors.

Figure 5 .
Figure5.Impact of neglecting hydrodynamic suppression of the CMB lensing potential power spectrum for redshifts z ≥ z b .In the L ≤ 25000 range, z b = 1 and 2 lead to errors of 4% and 0.4%, respectively.

Figure 6 .Figure 7 .
Figure 6.Fractional contributions to the total lensing potential power spectrum C ϕϕ L due to clustering in different redshift bins.Low Ls are dominated by low-z clustering, while L ∼ 1000 receives significant contributions from a range of redshifts.

Figure 8 .Figure 9 .
Figure 8.Heat map showing dC ϕϕ L /dη/C ϕϕ L , that is, the fractional contribution of each wave number and scale factor, to the total a = 1 lensing potential power spectrum.(Top) L = 1000.(Bottom) L = 3000.Large scales k < L/χ * do not contribute to C ϕϕ L .

Figure 11 .
Figure 11.Lensing potential power spectrum computed using linear (lin) or Time-RG (TRG) perturbation theories, Halofit (HF), the Mira-Titan IV (MT4) emulator, or Euclid Emulator 2 (EE2), compared with the FLAMINGO power spectrum.HF is used up to z = 3, MT4 to z = 2, and EE2 to z = 10, above which C ϕϕ L is computed using Time-RG.Solid lines compare DMO calculations to the FLAMINGO L5p6_m10_DMO C ϕϕ L computed up to z = 25.Dashed lines, using SP(k) for the baryonic suppression, are compared with the FLAMINGO L2p8_m9 C ϕϕ L computed to z = 5.

Figure 12 .
Figure 12.Accuracy of C ϕϕ L contributions from several redshift bins.Our computation integrates Eq. (12) over the appropriate redshift range, using the Mira-Titan IV Emulator for P m at z ≤ 2 and Time-RG perturbation theory at z > 2. This is compared with the FLAMINGO N-body C ); Krolewski et al. (2021); Chang et al. (2023); Abbott et al. (2023); Wang et al. (2023).
Figure13.Spoon-like feature in the ratios of power spectra of two models, PlanckNu0p24Fix_DMO and Planck_DMO from Table1, differing only in their neutrino mass sums M ν .Halofit has been calibrated to fit low-redshift spoons accurately, while perturbation theory is qualitatively incapable of reproducing this non-perturbative effect.MT4 is limited to z ≤ 2.

Figure 17 .
Figure17.Factorizability of neutrino and hydrodynamic suppressions of the CMB lensing potential power spectrum.FLAMINGO simulations (solid) and MT4+TRG calculations (dashed) of the neutrino suppression factor F ν , the baryon suppression factor F b , the combined suppression factor F ν+b , and the product of the first two, which accurately approximates the third.

Figure 19 .
Figure 19.Error in factorizability |F ν × F b /F ν+b − 1| vs. p f in the generalized feedback model raising fb ( M, z) to the power p f in SP(k).Here, p f < 1 (p f > 1) corresponds to weaker (stronger) feedback than the FLAMINGO fiducial model.Also shown is the FLAMINGO factorizability error.

Figure A1 .Figure A2 .
Figure A1.Numerical convergence of hyphi C ϕϕ L computation.Each C ϕϕ Lcomputation, labeled by its absolute and relative tolerances as well as its running time on a standard desktop computer, is compared to a high-quality run with absolute and relative error tolerances ϵ abs = 10 −20 and ϵ rel = 10 −12 , respectively, whose running time is 26.16 s.