HYMALAIA: A Hybrid Lagrangian Model for Intrinsic Alignments

The intrinsic alignment of galaxies is an important ingredient for modelling weak-lensing measurements, and a potentially valuable cosmological and astrophysical signal. In this paper, we present HYMALAIA: a new model to predict the intrinsic alignments of biased tracers. HYMALAIA is based on a perturbative expansion of the statistics of the Lagrangian shapes of objects, which is then advected to Eulerian space using the fully non-linear displacement ﬁeld obtained from 𝑁 -body simulations. We demonstrate that HYMALAIA is capable of consistently describing monopole and quadrupole of halo shape-shape and matter-shape correlators, and that, without increasing the number of free parameters, it does so more accurately than other perturbatively inspired models such as the non-linear alignment (NLA) model and the tidal-alignment-tidal-torquing (TATT) model.


INTRODUCTION
Light travelling from distant galaxies towards telescopes on Earth is slightly disturbed by the gravitational interaction with the matter distribution in its path (Schneider 2006).This physical process, usually known as weak gravitational lensing, has become a powerful tool for scientific discovery in cosmology (see Refregier 2003;Hoekstra & Jain 2008;Munshi et al. 2008;Bartelmann 2010;Weinberg et al. 2013;Kilbinger 2015;Mandelbaum 2018, for reviews).Weak lensing is currently one of the most informative cosmological probes available, (Li et al. 2023;Dalal et al. 2023;Dark Energy Survey and Kilo-Degree Survey Collaboration et al. 2023;Abbott et al. 2022;Heymans et al. 2021;Asgari et al. 2021), and it is bound to increase in importance with the arrival of data from Stage IV experiments such as the Rubin Observatory's Legacy Survey of Space and Time (LSST) (The LSST Dark Energy Science Collaboration et al. 2018), Euclid (Laureĳs et al. 2011), or the Nancy Grace Roman Space Telescope (NGRST, formerly WFIRST) (Spergel et al. 2015).
The improvement of the quality and quantity of weak-lensing measurements poses the challenge of improving theoretical models and controlling systematical errors, among which can be mentioned photometric redshift errors (Salvato et al. 2019;Fischbacher et al. 2023), baryonic effects (Huterer & Takada 2005;van Daalen et al. 2011;Hearin et al. 2012;Eifler et al. 2015;Chisari et al. 2018;Huang et al. 2019;Chisari et al. 2019;Aricò et al. 2023), and intrinsic alignments (IA) (Joachimi et al. 2015;Kiessling et al. 2015;Kirk et al. 2015).The latter refers to the fact that the shape of galaxies can be ★ E-mail: francisco.maion@dipc.orginfluenced by the matter distribution in their vicinity (e.g.elliptical galaxies tend to align with the principal axis of the local matter tidal tensor), a process by which spatial correlations between these shapes are imprinted.As a consequence, the measurement of galaxy shapes is no longer an unbiased estimator of the cosmic shear, but of its sum with the intrinsic shear.Additionally, IA will also be an important source of systematic errors in spectroscopic surveys, such as DESI, due to target-selection bias (Hirata 2009;Lamman et al. 2023).Having a reliable way to separate IA from the cosmological signal of interest will be essential to get robust constraints on cosmological parameters.
Besides its importance as a systematic effect, the intrinsic alignments of galaxies can be viewed as an interesting cosmological observable by itself.One can explore its sensitivity to specific physical processes (Kogai et al. 2018;Harvey et al. 2021), but also use it as an independent cosmological probe (Schmidt et al. 2015;Chisari et al. 2016;Akitsu et al. 2021a;Tsaprazi et al. 2022;Okumura & Taruya 2022;Kurita & Takada 2023;van Dompseler et al. 2023), which is expected to provide complementary information to more conventional galaxy clustering statistics, making constraints significantly stronger (Taruya & Okumura 2020;Okumura & Taruya 2023).A full exploration of this potential, however, requires fast and accurate modelling of the signal, with a minimal number of free parameters, so as not to dilute the cosmological information present in the data.
Current models developed to describe IA can be separated into two classes.On the more complex side are models which assign shapes and orientations to galaxies placed into gravity-only simulations, which we will refer to as semi-analytic models (Heavens et al. 2000;Heymans et al. 2006;Joachimi et al. 2013a,b;Hoffmann et al. 2022;Alfen et al. 2023); these have the advantage of being physically motivated, but the disadvantage of being computationally costly.Moreover, these models must make assumptions regarding the processes through which alignments are imprinted onto galaxies, that may or may not give an accurate description of the physics at play in the real Universe.In an attempt at being insensitive to details of the galaxy formation processes, models based on perturbation theory (PT) (Catelan et al. 2001;Hirata & Seljak 2004;Bridle & King 2007;Blazek et al. 2019;Vlah et al. 2020Vlah et al. , 2021;;Bakx et al. 2023) or the halo model (Fortuna et al. 2020;Mahony et al. 2022) have also been developed.These have the advantages of being computationally cheap, and being applicable regardless of the specifics of galaxy formation physics; on the downside, their constraining power can depend strongly on the priors given for their free parameters, creating a need for calibration of these priors from simulations and observations.Furthermore, their range of application may be limited, with PT breaking down at small scales when structure formation becomes non-linear, and the halo-model being inaccurate at the transition between the 2-halo and 1-halo terms.
In the case of galaxy clustering, recent works have proposed a way to overcome this limitation of perturbatively inspired models, while retaining the agnostic position relative to galaxy formation.Through the so-called hybrid Lagrangian bias approach, one combines the perturbative bias expansion in Lagrangian space with the fully non-linear displacement field from -body simulations, to obtain a model in Eulerian space which is not subject to the limitations of PT in predicting the non-linear evolution of the dark-matter fluid (Modi et al. 2020).This method has been shown to describe the power spectrum of galaxies in real and redshift space well beyond the limits of PT methods (Zennaro et al. 2023;Kokron et al. 2021;Pellejero-Ibañez et al. 2022), for very different assumptions on the galaxy formation model (Zennaro et al. 2022;Kokron et al. 2022), and it has been applied to real and simulated data to obtain unbiased cosmological constraints (Hadzhiyska et al. 2021;Pellejero-Ibañez et al. 2023); it has also been shown to describe well statistics beyond the 2-point function, such as k-nearest neighbour cumulative distribution functions (kNN-CDFs) (Banerjee et al. 2022).Such results provide strong motivation for applying the hybrid Lagrangian bias method to the modelling of IA: one expects to obtain an accurate and agnostic model that extends well beyond the limits of typical PT models.
In this work, we construct such a model.We develop a model based on a Lagrangian bias expansion of shapes, which is then advected to Eulerian space using fully non-linear displacements from -body simulations; we refer to this model as HYMALAIA -an acronym standing for HYbrid Model Advected from LAgrangian space for Intrinsic Alignments.Besides presenting the model, we test it against measurements of shape power-spectra of halos extracted from -Body simulations, evaluating qualitatively and quantitatively the accuracy with which the model describes them.Tests with measurements of the shape power-spectra of galaxies are deferred to future work.This article is structured as follows.In section 2 we describe the bias expansion formalism; in section 3 we describe our hybrid model, as well as previously existing ones which we will employ for comparison; in section 4 we give a brief account of the simulations used for building our model and for comparing models between each other; section 5 gives an account of the methods we employ to quantitatively compare the performance of different models; we present our results in section 6 and conclude in section 7.

BIAS EXPANSION
Modern theories of structure formation postulate that the abundance of any large-scale structure (LSS) tracer can be written as a combination of three parts, and the subscript  is to emphasize that so far we are working with scalar quantities.The first term, usually referred to as the local part can be described as a linear combination of operators built from the matter field: Here, the constants   encapsulate the complex information coming from the formation process of each tracer, and quantify the response of the tracer density to changes in the basis operators (see Desjacques et al. 2018 and references therein).This form is valid both in Lagrangian and Eulerian space, with the caveat that the bias parameters will have different meanings.The operators entering this expansion will have to satisfy the equivalence principle, be compatible with homogeneity and isotropy of the LSS, and have the correct symmetry properties (i.e. if we wish to perform this expansion for a trace-free rank 2 tensor, then all the operators must also be trace-free rank 2 tensors).Counting the powers of the linear matter density appearing in each of these operators, one can then order them perturbatively; at a fixed order in perturbation theory, there is a finite number of operators contributing to equation ( 2), which can be explicitly written down.Retaining terms up to second order, this expansion reads in which  2 is defined by and    is the traceless tidal field The second term in equation ( 1), usually referred to as the higherderivative part, captures the dependence of the tracer density on non-local effects such as baryonic feedback (Lewandowski et al. 2015), or simply the collapse of matter from a typical region of radius  * , usually denoted as the non-locality scale.These terms involve spatial derivatives of local gravitational observables, and hence are suppressed at large scales by (  * ) 2 , which becomes a new expansion parameter.At order (1 + 1) in   and (  * ) 2 , one can derive the operator contributing to the expansion, given by where  is the dark matter overdensity.Finally, the third term in (1) represents stochastic contributions.These have to be taken into account in order to model effects that small scales have on large, perturbative scales.Here, we will consider only one such term, namely The stochastic field  is uncorrelated with the dark matter density field and has vanishing expectation value.Again, it should have the same tensorial symmetry properties as the quantity for which we are performing the expansion, i.e., it would be a symmetric trace-free tensor field    if the tracer field also has these properties.For the case of a scalar, its two-point correlator in Fourier space takes the form where   are free parameters which cannot be predicted from perturbative techniques.

Eulerian Shape Expansion
Although this formalism has been employed most commonly to expand the density of biased tracers, it has recently been generalized to allow its application in describing tensorial quantities such as the shape field (Schmitz et al. 2018;Blazek et al. 2019;Vlah et al. 2020;Taruya & Akitsu 2021).Let    denote the 3-dimensional shape field, and we define the normalized trace-free part of it as in which  0 = 3 =1   , the trace of the shape tensor.In this case, they find the expression for the second-order bias expansion of galaxy shapes to be in which the operators ( ⊗ )   ,    can be defined as where  is the dimensionless velocity divergence,  = −(    )/(  ).Notice that we denote density bias parameters by the letter  with a subscript indicating which operator it accompanies; for the shape expansion we denote the bias parameters by the letter , with a subscript indicating what is the operator they accompany, in accordance to the notation of Schmitz et al. (2018).

Lagrangian Shape Expansion
The key assumption behind our work is that the shape field will evolve from Lagrangian to Eulerian space only by having its amplitude rescaled by the local volume contraction or expansion, but without suffering any shear (see equation 2.3 of Chen & Kokron (2024)).This can be expressed by the equation meaning that the functional shape of the bias expansion in Lagrangian space is the same as that in Eulerian space, with the difference that now all the operators are computed in Lagrangian space, and therefore the bias parameters have a different physical meaning from the Eulerian ones (Schmitz et al. 2018;Taruya & Akitsu 2021).For the purpose of analyzing actual measurements one needs predictions in Eulerian space, and these can be obtained from    (q) through the equation which is simply a reflection of the conservation law in Eq. ( 12); in this expression,  is the displacement field, obtained through any calculation method available (e.g.: perturbation theory or simulations).
Our goal is to test how well such a model will be able to reproduce realistic shape correlations of haloes, thus implicitly testing the assumptions made earlier.

Density Weighting
Notice, that even though we have written    (q) as if it was a well defined field for each point in Lagrangian space, in reality we can only measure the tracer ellipticities at the positions where tracers are available.Hence, one should define a continuous tracer-shape field in Lagrangian space by in which the sum over  is in fact over all objects available, making it clear that g is in fact the density-weighted reduced shape field (Blazek et al. 2015).Let us assume a halo population with density field 1+ ℎ and reduced-shape field  ℎ   , then the density-weighted reduced shape field is given by g Let us then expand the halo density to first order,  ℎ ≈  1   , and we can write the relevant bias expansion keeping only up to second order terms Notice that even if we were to retain higher order terms in the density bias expansion, they would not contribute if one chooses to keep only up to second order terms in the final, density-weighted, reduced shape bias expansion.We also discard the term    ∇ 2  since it contributes at order (2 + 1) in   and (  * ) 2 respectively, so that we effectively consider it to be a third-order term.Interestingly, this shows that even if only the linear term is maintained in the shape expansion, density-weighting generates a term which is of the form    , and we can redefine the bias parameter associated to this operator as The functional form of this term is fully expected, and was already included in the second-order expansion for the reduced shape.However, taking equation (18) as a working hypothesis, one can then test explicitly whether the recovered value of c  is compatible with  1   , which would be an indication that the linear term for the shape expansion is sufficient.

MODELS
In this section we will describe the different models which will then be used to fit the measured shape power spectra.The following subsection will be dedicated to establishing useful definitions and notation, before we actually plunge into the description of the different models.

Basic Definitions
To study the IA signal we will decompose the reduced shape tensor    into two measures of ellipticity, equivalent to the Stokes' parameters in the study of light polarization, but these definitions still have the disadvantage of being dependent on the particular coordinate system chosen.Fourier transforming these quantities one can define the following combinations (Crittenden et al. 2002), in which   is the angle of the k vector in the 2D projected plane.
Assuming the line of sight to be given by ẑ, this can be defined as This decomposition into  and  modes is coordinate independent, and therefore makes it easier to compare among different reported measurements.
We will generally be interested in computing the auto spectra of the  and  modes, as well as their cross spectra with the density field in which we do not write these expressions for    and    since these spectra are expected to be compatible with zero (Hirata & Seljak 2004), as long as no parity-breaking interactions are involved (Biagetti & Orlando 2020).Notice that the above spectra are not isotropic, but depend on , the cosine of the angle between k and the line of sight direction.We will generally choose to work with the multipoles of this 2-dimensional power spectrum, given by and L ℓ is the Legendre polynomial of order ℓ.Cosmic shear will only generate -modes, while IAs generate -modes which will have a non-zero auto power spectrum, ().Fundamentally, though, it will not generate correlation between  and -modes, which therefore can still be used as a tool to mitigate systematic effects in WL measurements.

Previously Existing Models
In this subsection we will describe previously existing models, which will later be used in comparisons to HYMALAIA.

Linear Alignment
We begin by reviewing the linear alignment (LA) model for IA (Catelan et al. 2001;Hirata & Seljak 2004).This model assumes that the intrinsic shear of galaxy shapes is proportional to the line-of-sight projected tidal tensor, therefore, Fourier transforming this shear field one obtains and we can combine equations ( 21) and ( 22) to compute its  and -mode decomposition leading to the following predictions for the power spectra of -and -modes of the IA field in which    is a constant introduced to model the stochastic noise contribution to the auto spectra (cf.equation ( 8)).

Non-Linear Alignment
When attempting to model the intrinsic shear power spectrum of galaxies, Hirata & Seljak (2004) and later Bridle & King (2007) noticed that substituting all occurrences of the  lin in these expressions by    , the non-linear matter power spectrum, generally was a better description of the measurements; the model resulting from this prescription is generally known as non-linear alignment model (NLA).

Tidal-Alignment-Tidal-Torquing
The Tidal-Alignment-Tidal-Torquing (TATT) model (Blazek et al. 2019) goes beyond the simple assumption of linear alignment, employing a second order Eulerian bias expansion such as the one in equation ( 10), but without the Laplacian term, and neglecting the contribution of the velocity-shear operator.The expression in equation ( 10) is then evaluated retaining up to second-order contributions, and the relevant correlators are computed using standard perturbation theory techniques; notice that, as pointed out already in Blazek et al. (2019), these calculations are not complete up to 1-loop.It is also important to point out that the leading order contributions to this model, which are simply the linear alignment model predictions (see equation ( 28)), are evaluated by substituting the occurrences of the linear matter power spectrum by the non-linear predictions given by Halofit (Smith et al. 2003;Takahashi et al. 2012), in a similar way to what is done in the non-linear alignment model.Finally, an important caveat to mention is that the public implementation of TATT available in FAST-PT (Blazek et al. 2019;Fang et al. 2017) outputs the predictions for the IA terms in the Limber approximation, which makes the assumption that only modes perpendicular to the line-ofsight ( = 0) contribute to the spectra (Limber 1953); for  (ℓ ) the dependence on  is separable, and can therefore be included exactly a posteriori.This is not the case for   , and therefore we include an approximate angle dependence, analogous to the one in equation ( 28).
We refer the reader to (Blazek et al. 2019;Schmitz et al. 2018) for more details regarding the TATT model, and suffice with enumerating its free parameters at fixed redshift,   ,    ,  ⊗ ,    .

Effective Field Theory
As a generalization of LA, one can consider the one-loop effective field theory model for intrinsic alignments (EFT of IA) from Vlah et al. (2020).It is formulated in the spirit of the Effective Field Theory of Large-Scale Structure (EFT of LSS) (see, e.g.: Baumann et al. 2012;Carrasco et al. 2012;Carroll et al. 2014), which has seen great success in describing the clustering properties of galaxies on mildly nonlinear scales.
The first step is to once again expand the shape field of the relevant biased tracer as a linear combination of operators computed from the matter field, such as in equation ( 2).Unlike what was described in sections 2.1 and 2.2, we will now have to perform this expansion considering all operators until 3rd order in the linear matter density; this is required so that the computations of the two-point functions are fully consistent, including all 1-loop diagrams.While these operators are mostly independent at the field level, many contributions to the two-point function become degenerate.As a result, the local part of the bias expansion of    depends on 6 parameters in total, which we will label   ,  2,1 ,  2,2 ,  2,3 ,  3,1 ,  3,2 .The subscripts serve to indicate that at linear order, only one operator appears with corresponding bias parameter   , while at second (third) order three (two) new contributions appear.
However, the deterministic part of the bias expansion is in general insufficient to describe the behaviour of the shape field in the mildly non-linear regime.Concretely, one should allow for (i) higherderivative effects and (ii) stochastic contributions, in similar spirit to the bias models above.Treating the non-locality scale  on a similar footing as the non-linear scale, two new contributions appear, one for the expansion of the density field, and another for the expansion of the shape field where the superscript (1) indicates that we consider only the linear density field.Stochastic contributions must also be taken into account in order to model effects that small scales have on large, perturbative scales.Altogether, the six multipole power spectra of interest now depend on 9 bias parameters in total: six from the local expansion of    , two higher-derivative parameters  ∇ 2 ,  ∇ 2 and one stochastic amplitude    for the monopoles of the E-and B-mode auto-spectra, as in equation ( 28).In the subsequent comparison, we will also consider a 7-parameter EFT model in which the bias parameters  3,1 ,  3,2 are omitted.This is expected to increase the amount of information on the linear alignment parameter recovered by the EFT model (see Bakx et al. 2023).For explicit expressions of the relevant power spectra and further background on the EFT of IA we also refer to Bakx et al. (2023).

HYMALAIA
Given that one knows the shape field in Lagrangian space by specifying it with the Lagrangian bias expansion1 , we need information on the displacement field (q) to evaluate equation ( 14), thus advecting this field to Eulerian space.A comment is in order at this point, regarding the stochastic field introduced in equation ( 13).To be fully consistent with equation ( 14) one would have to define the stochastic field     (q) in Lagrangian space and then advect it to Eulerian space.It is instructive to calculate the advected stochastic field using the Zeldovich approximation for the displacement field, which gives and we choose to keep just the lowest order term in this expression.
The advection of equation ( 31) can be obtained from perturbative approaches (Rampf 2021;Bernardeau et al. 2002), or evaluated directly from -body simulations, obtaining fully non-linear predictions for this field (Modi et al. 2020).
In building our hybrid model, we will follow the second approach, employing the simulations of volume (1440 Mpc/ℎ) 3 described in Table 1 to perform the advection of the Lagrangian fields.Having advected the fields to Eulerian space, one can then compute their auto and cross power-spectra, and finally linearly combine them with the bias parameters to obtain the full model for the biased tracers.In the following sections we expand on each of these steps.

Numerical Implementation
The process which we sketched above can be detailed into a series of steps that we perform to obtain the non-linearly advected Lagrangian operators: (i) We begin by computing the linearly evolved Lagrangian density field and smoothing it at some typical scale Λ ℎ/Mpc, to remove modes with wavenumbers  > Λ, that are not expected to contribute to the physical process at hand; (ii) We compute the Lagrangian fields using the smoothed linear density field; (iii) We subtract the Lagrangian positions of particles from their Eulerian positions at the relevant redshift, thus obtaining the displacement field   = x  () − q  of the -body simulation; (iv) We advect the quantities from Lagrangian to Eulerian space using the discretized version of equation ( 14), and we have denoted by   the region in Lagrangian space such that particles end up in the grid cell x  .
These calculations are performed using the simulations of volume (1440 ℎ −1 Mpc) 3 described in Table 1.These are the largest simulations in the BACCO simulation project, which we will describe briefly in the following section.The Lagrangian fields are computed in a uniform grid of 1080 3 cells, from a damped linear density field, such that   ( > 1ℎ/Mpc) = 0.This damping is necessary to remove modes which are too small, and thus not expected to contribute physically to the formation of the alignments of galaxies at the scales of interest.The number of cells was chosen to match the number of outputted particles in the simulation; even though it is run with 4320 3 particles, only one every 64 are actually stored.The advected fields are interpolated to a grid with the same number of cells using a Cloud-in-Cell (CIC) density assignment scheme.The damping scale   = 1 ℎ/Mpc was chosen so that we only include modes well below the Nyquist frequency    ≈ 2.35 ℎ/Mpc, and are thus not subject to aliasing effects (Orszag 1972).

Power Spectra
Once we have the relevant operators in Eulerian space,    (x), (x), we can build -and -mode fields for each of these operators, and then compute their auto and cross power spectra.Hence, the model for the shape power spectrum multipoles is given by in which  ∈ {, },  O are the bias parameters associated with the operator O, and    is a free constant introduced to model the stochastic noise present in the halo sample.Hence, HYMALAIA at its most complex form has 5 free parameters that need to be adjusted, Figure 1 shows some of the basis power spectra entering the equations above; we chose to display only a subset, focusing on those with the largest amplitudes.

SIMULATIONS
The simulations used in this work are summarized in Table 1, and are part of the BACCO simulations, described in (Contreras et al. 2020;Angulo et al. 2021;Zennaro et al. 2023).These are gravity-only simulations run using a modified version of L-GADGET-3 (Springel 2005;Angulo et al. 2021).The initial conditions are computed at  = 49, and the simulation is evolved until  = 0.At each output redshift we have stored FoF groups and SUBFIND subhalos.These simulations are run using the Fixing and Pairing (F&P) technique (Angulo & Pontzen 2016), with the purpose of reducing the variance in the derived −point functions.This procedure consists in generating initial conditions for a simulation without randomly sampling the amplitudes, but doing so only for the phases; the amplitudes instead are all fixed to the square root of the power spectrum of interest.One then generates a second set of initial conditions with all phases summed of  -this is equivalent to multiplying all Fourier modes by −1; the expected effect is that the deviations of the -point functions from the ensemble average in the two simulations will be anti-correlated, and averaging over their statistics will cancel these deviations exactly.
The effect of this technique has been characterized for a wide range of observables (Villaescusa-Navarro et al. 2018;Chuang et al. 2019;Klypin et al. 2020;Maion et al. 2022), but never for the shape powerspectra.Nevertheless, one can obtain rough expectations about the behaviour of the variance of these spectra from results established in these works.At large scales, Villaescusa-Navarro et al. ( 2018); Maion et al. (2022) have shown that the variance reduction can be as large as several orders of magnitude, and as small as having roughly no effect at all.On small scales, the scenario is more predictable, since non-linearities in the formation of structures will generally erase any memory of this particular choice of initial conditions, making the power spectra uncorrelated at these scales.If this is the case, then the variance reduction is merely a factor of 2, the same as one would obtain by running two independent simulations.Another possible scenario observed in Maion et al. (2022) is that the errors in the paired spectra instead of being anti-correlated are positively correlated.In the limiting case where this correlation coefficient is unity, averaging over these two power spectra will have no variance reduction effect at all, since the power spectra are mathematically the same.
In the remainder of this work we will assume that pairing reduces the variance at all scales by a factor of 2, and will not attempt to model the effect of fixing, but limit ourselves to noticing that values of reduced  2 below 1 at large-scales are most likely due to the variance reduction arising from this procedure.At small scales we can compute the covariance of the shape power spectra using the jacknife technique (see appendix E for further details), and thus investigate the effect of F&P. Figure E3 shows that for all spectra, with the exception of   , the variance is reduced by a factor of 2. For the two exceptional spectra, however, the variance can be reduced by as much as a factor of 5; surprisingly, this effect is seen to increase when going to small-scales, which is opposite to the trends observed in previous works (Angulo & Pontzen 2016;Villaescusa-Navarro et al. 2018;Klypin et al. 2020;Chuang et al. 2019), and to the theoretical arguments developed in (Angulo & Pontzen 2016;Maion et al. 2022).This seems to indicate that the shape-noise is highly dependent on the initial phases, and it can be greatly suppressed by pairing.

Halo Samples
We have split the halo population in this simulation into 4 groups selected by mass.Table 2 displays the information on the selected mass bins and the approximate number density of each type of halo.The smallest halos considered, with mass  ≈ 10 12  ⊙ , will be composed of approximately ∼ 300 particles.The mass-definition used for this selection was the one usually referred to as  200, given by 10 1 10 0 10 1 10 0

B[(s s)], B[(s s)]
Figure 1.Basis spectra employed in the construction of the HYMALAIA model.Some of these spectra are negative, and hence we show their absolute values, for visualization purposes.The part of these spectra which is positive is shown with circular markers, and the part which is negative is shown with triangular markers in a softer color.The top panel shows the monopole of the auto and cross power-spectra computed between the basis operators entering the bias expansion in equation ( 10).Black solid lines show the predictions of NLA for each of the spectra at which they are available; notice that on large-scales they match the lowest order contribution to HYMALAIA.Bottom panel is analogous but displays quadrupole instead of monopole.Not all the spectra that enter the bias expansion are displayed.We have decided to only show the five with largest amplitudes.
Table 1.Parameters of the simulations used in this work.Notice that for each cosmology, we have two simulations, one smaller, with a volume of (512 ℎ −1 Mpc) 3 , and one larger with volume of (1440 ℎ −1 Mpc) 3 .

Shape Definition
For each halo in the simulation, we define its shape through the expression 2 2 This is commonly called the inertia tensor in the literature, and denoted by the letter .We decide to keep the notation consistent with other fields of physics, and reserve the name moment of inertia for the quantity ) , in which  runs through the  particles contained in the relevant object, while the definition in equation ( 39) is denoted by the letter  and called shape tensor.
for which   is the mean of   for the particles belonging to this specific halo, and  is an index that goes over all halo particles.Many definitions of the halo shape are possible (Bett 2012), some of them designed to better capture the expected physical effects relevant to galaxy alignments (Tenneti et al. 2014;Kurita et al. 2020).Nevertheless, in this work we have used the one introduced above because we expect the bias expansion to be general enough to describe measurements with any of the shape definitions (see appendix B of Bakx et al. 2023 for cases where this may not hold).

METHODS
In this section we will describe the methods employed in the analyses to be presented in the following section.

Shape Power Spectrum Estimator
In this section we discuss briefly the methods used to measure the -and -mode shape power spectra of IA (Kurita et al. 2020).Let  be the shape tensor of a halo, then we can define its ellipticity as3 and from this, we can formally define a continuous ellipticity field as In practice, we will interpolate the halo ellipticities onto a regular grid of 384 3 cells, using a cloud-in-cell (CIC) method, giving a Nyquist frequency of  Ny ≈ 2.35 ℎ/Mpc.This allows us to efficiently evaluate these fields in Fourier space using fast Fourier transforms (FFT); once these fields have been computed in Fourier space we can define the -and -mode fields analogously to equation ( 21).
Our estimator for the multipole ℓ of the cross power-spectrum between any two fields  and  is given by in which  = k • n is the cosine of the angle between the wavevector k and the th shell corresponds to the region of a Fourier-space shell with radius in the range

Covariance Matrix
To model the covariance matrix we will make the assumption that the  and -modes of the shape field are gaussianly distributed, which allows us to write the following analytical expression for the covariance matrix of their auto and cross power-spectra in which   is the number of modes falling inside the th shell.
In this expression, we will employ the NLA model to evaluate the power spectra   (k); this choice is motivated by Figure E1, which displays a direct comparison of LA and NLA to a fully numerical calculation of the covariance matrix performed by Kurita et al. (2020) from the Dark Quest simulation suite (Nishimichi et al. 2019).Indeed, one can see from Figure E1 that employing LA greatly underestimates the diagonal terms of the covariance at small scales, while using NLA gives a resonable description for most of the quantities at interest, with the exception of   for which it underestimates the error beyond scales  ≈ 0.5 ℎ/Mpc.
To compute this covariance matrix analytically using NLA, we need to determine the value of the free parameters entering the model, namely   and    .   can be determined from the small-scale limit of  (0)  , where it should be well approximated by a constant matching the value of the stochastic amplitude.Therefore, we take the bins of this power spectrum with wavemodes between 0.9 and 1 ℎ/Mpc, and compute their average and standard deviation, giving us an estimator of    and its error   .If we assume that at large scales the linear alignment model is valid, then the linear bias   can be determined from the expression as can be seen by inspection of equations ( 28), after integrating over  to obtain the monopole.Similarly to what is done for    we compute the mean and standard deviation of the ratio above for  < 0.05 ℎ/Mpc, thus obtaining estimates of   and    .Even though in the majority of the paper we compute the power spectrum in linearly spaced bins of , to compute this ratio we employ the power spectrum computed in logarithmically spaced bins, so that we can access large scales in a more direct way; the precise values obtained can be seen in Table C1.
Once in possession of estimates for   and    , we can evaluate equation ( 43) for each of the mass-bins.As stated earlier, this model does not give a good description to the diagonal of the autocovariances of   at small scales.To account for this, we have numerically estimated the auto-covariance of these spectra at small scales using the jacknife technique.We now parametrize the small-scale behaviour of these covariances for each mass-bin, and add it to the model in equation ( 43).A more detailed account of this procedure can be found in Appendix E.
Finally, a comment is in order regarding the contribution of model error to the covariance matrix.As described in section 3.3, HYMALAIA is built from numerical simulations and hence contains some level of statistical noise, inherently present due to the finite volume of the simulations.This model noise should in principle be added to the covariance matrix.To avoid having to model this term, we compute the model from simulations with volume   = (1440 ℎ −1 Mpc) 3 ; the shape power spectra used as test data are instead computed from simulations with volume   = (512 ℎ −1 Mpc) 3 .This means that the model noise is highly suppressed with respect to the variance of the data, since the former is computed from a volume approximately 20 times larger than the latter.

Reduced 𝜒 2
The reduced chi-squared provides us with a quantitative test to assess the quality of the fits.This quantity is defined by in which ,  range between the types of spectra being fitted, i.e.
,   ,   and  ℓ,ℓ ′ , is the cross-covariance between spectra  ℓ  and  ℓ ′  ; the number of degrees of freedom is defined by  dof =   −  pars − 1, in which   is the number of -bins and  pars is the number of free parameters being fitted; symbols with a hat represent measured quantities, while those without represent the model being adjusted, and Θ are its free parameters.
A comment is in order regarding the interpretation of the values of  2 red to be presented in this work.We expect these values to be generally below 1 for the fits to the dataset described above, since the simulations from which the shape power-spectra are computed were run with Fixed-and-Paired initial conditions.The effect of this technique is to reduce the variance on statistics derived from these simulations.Since the expression for the covariance matrix defined in equation ( 43) assumes Gaussian statistics of the underlying field, it most likely overestimates the variance on the shape power spectra on large scales.As discussed in section 4, the exact effect of this technique on shape power spectra is unknown, but it is reasonable to assume that on small scales the variance will be approximately reduced by half.Hence, to correct for this effect, we use the covariance matrix corresponding to twice the volume of one simulation of  = 512 ℎ −1 Mpc in computing the values of  2 .Given the large uncertainties on the actual value of the covariance matrix at small scales, these values of  2 red should not be interpreted in an absolute manner, but in comparison between different models; a model with a higher value of  2 red is sure to be providing a worse fit, but it is unclear whether this would indicate the breakdown of that particular approach.

Figure of Bias
The figure of bias provides a consistent way to determine whether the model is fitting the data at the cost of returning spurious bias coefficients, or if there is consistency between the values found while using progressively smaller scales.This quantity can be defined as in which  fid  and  fid are the fiducial linear alignment parameter and the error on the determination of this value, respectively.Notice we have chosen not to include   ,  ⊗ nor  ∇ 2 in the calculation of this quantity, and thus will use only   in this calculation.This is the only parameter common to all the models considered, and is also the one expected to be degenerate with cosmological parameters and systematic effects.This is supported by Figure 1, in which we see that the contributions proportional to   are the ones with the highest amplitude, and thus can easily absorb larger parts of the cosmological signal.The contributions proportional to the higher-order bias parameters, on the other hand, are usually highly suppressed.A secondary reason is that we have not renormalized the operators entering our bias expansion, which is generally necessary to correctly interpret the derived values of the bias parameters (McDonald 2006;Assassi et al. 2014;Werner & Porciani 2019).Generally, one would then expect that these bare bias parameters be highly scale-dependent, as they absorb higher order contributions that are degenerate with the -point function they multiply.For the case of   we will empirically show that it does not suffer from this issue in the range of scales analyzed, and up to the statistical precision with which it is determined.

Figure of Merit
The figure of merit (FoM) is a quantitative measure of the amount of information recovered on a chosen set of parameters, and can be defined by the equation in which Θ is the parameter covariance.In this work, however, we will use only   in the calculation of this quantity, so that the expression reduces to in which    is computed after marginalizing over the remaining free parameters.Analogously to what was argued in 5.3.2we do this because   is the parameter expected to be most correlated with cosmological parameters.

Fitting Procedure
We employ the MultiNest algorithm (Feroz & Hobson 2008;Feroz et al. 2009Feroz et al. , 2019) ) through its python interface pymultinest (Buchner et al. 2014) to obtain the best-fitting parameters for each of the models, and explore the multi-dimensional likelihood to obtain confidence regions for each of the parameters of interest.
As in every Bayesian inference method, one must define priors for the free parameters in each explored model.Table D1 displays the ranges for each free parameter in each of the tested models.These prior ranges are chosen so that they are large enough not to have a noticeable effect on the constraints, that is, are uninformative.

RESULTS
In this section, we present the tests made to validate the capability of the model to describe shape power spectra and extract reliable values of the IA biases.These shape power spectra are computed from the set of simulations described in section 4. The validation results will be presented using the results obtained with the Nenya simulations at redshift  = 0.0.We have chosen to display these results because this halo sample has a larger value of the intrinsic-alignment bias parameter, and thus represents a more interesting test to the model, with a larger signal-to-noise ratio.

Validation
The models presented in section (3) were used to fit simultaneously the power-spectrum multipoles (2)  computed from the halo samples described above.An example can be seen in Figure 2, which displays a comparison between measurements and HYMALAIA evaluated at the set of best-fitting parameters for an intermediate-mass bin in the Nenya simulation at  = 0.0.We have chosen to display the result for the mass bin log 10 ( ℎ / ⊙ ) ∈ [12.5, 13], labelled  2 in Table 2, because we expect halos at this mass to be the ones typically populated by galaxies targeted by Stage IV weak lensing surveys such as LSST and Euclid (Korytov et al. 2019).The deviations of the model from the measurements are mostly below 3- and usually below 2- until the maximum scale used in the fit  max = 0.85 ℎ/Mpc, marked by a vertical black line, demonstrating good agreement.
It is also important to contextualize the accuracy of our model, comparing it to the requirements imposed by the precision of Stage-IV survey measurements.Paopiamsap et al. (2024) found that keeping mis-modellings below the level of 10% in the IA contribution to cosmic-shear is sufficient to recover unbiased constraints on cosmology.The lower panel of Figure 2 shows that our models for   are good descriptions of the measured spectra, with deviations at the 10% level in the case of  (0)  , and well beyond that threshold for the other two spectra.These spectra are the ones expected to contribute most relevantly to the cosmic-shear signal and therefore, we can conclude that our simulations can test the intrinsicalignment contribution with a similar precision to that of Stage-IV surveys.

Mass-Samples
Figure 3 shows the values of the reduced  2 ( 2 red ), figure of bias (FoB) and figure of merit (FoM) for the fits obtained with HY-MALAIA for the power-spectrum multipoles computed from the 4 previously defined halo samples.The fiducial linear alignment bias parameters used to compute the FoB can be consulted in Table C1.The first two panels show that the model describes well the halo samples and measures consistent bias parameters until the smallest scales probed,  = 0.85 ℎ/Mpc.In the third panel, one can see that the FoM has an unstable behaviour at the largest scales, but then increases approximately monotonically with the addition of smaller scales beyond  = 0.2 ℎ/Mpc; furthermore, there is no clear behaviour of the FoM with mass.The linear alignment bias of the halo populations increases with the mass, thus increasing the amplitude of the signal and consequently the FoM; however, the number density decreases with mass, thus increasing the level of shape-noise and decreasing the FoM.Since the two effects work against each other, and our mass samples were not chosen to keep one of them fixed, they generate the complex behaviour observed.

Minimal HYMALAIA
Once we have established that HYMALAIA is capable of accurately describing the ensemble of measurements over a wide range of scales, and for several mass-bins, a question naturally arises, whether one can suppress some terms in our model to improve the FoM, while maintaining the accuracy.Figure 4 shows the relevant comparison to answer this question, analyzing the relative importance of the terms entering the bias expansion.From the first panel one can see that the terms accompanying   and    are the most important in the expansion, since the model with only these terms fits the data equally well as its more complex counterparts.In the second panel one can see that, except for the model containing only   , none of the others demonstrate a detectable bias in their recovered values of   .Finally, This scale was chosen since it is the smallest scale we ever use in our fits; this is done to avoid entering a regime which could be afected by the damping of modes  > 1 ℎ/Mpc made to the linear density fields when building HY-MALAIA.Middle Panel Colored lines represent difference between model and data, in units of ; gray-shaded regions indicate 1, 2 and 3- regions.
Lower Panel Shows the fractional difference between model and data.We give special emphasis to these results for  (0,2) and , since these are expected to be the most relevant terms in a realistic 3x2pt analysis; the gray-shaded band shows 10% deviations.
the third panel shows that adding more parameters, in particular  ∇ 2 , can significantly reduce the constraining power of the model.The full picture suggests that the optimal version of the model is the one keeping only   ,    , since its accuracy is the same as the one of the full model, it gives unbiased constraints on   , and is the one with the largest FoM among the models that are still accurate at small scales.This picture is preserved qualitatively for the remaining mass-bins and cosmologies.We shall label this reduced version of the model min-HYMALAIA.

Model Comparison
Having established that HYMALAIA and min-HYMALAIA are capable of accurately describing the set of observables over the probed range of scales, we now wish to compare them to other pre-existing models.Figure 5  better performances than EFT of IA.It also shows that TATT recovers marginally lower values than the 7-parameter EFT of IA beyond  ≈ 0.4 ℎ/Mpc.This is likely due to the fact that the implementation of TATT by Blazek et al. (2019) uses the fully non-linear power spectrum instead of the precise perturbative expressions, so that it extends beyond the 1-loop expansion of the EFTofIA, albeit inconsistently in perturbation theory; Bakx et al. (2023) finds that the 7-parameter EFTofIA always performs better than the pertubatively-consistent implementation of TATT, thus supporting our finding.
The second panel shows that all the models, with the exception of NLA, recover values of   that are less than 1- distant from the fiducial large-scale value.This is particularly important because one generally expects   to be the parameter with the greatest degeneracy with the cosmological parameters of interest, and crucially other systematics such as photo-s (Fischbacher et al. 2023), and thus determining it consistently is essential.
Finally, the third panel displays the values of the figure-of-merit for each model.One can see that HYMALAIA recovers the same information as TATT.Furthermore, at the smallest scales tested, HYMALAIA recovers roughly the same information as LA or NLA on scales  ≈ 0.2 ℎ/Mpc, at which these models start to break down.As for min-HYMALAIA, it recovers as much as twice the information extracted by TATT, depending on the scales of interest, while still maintaining lower values of chi-squared over all the probed range of scales.
It HYMALAIA recovers lower  2 red and higher FOM than TATT.We argue that these most likely arise due to the following motives: (i) HYMALAIA is based on a Lagrangian bias expansion, effectively meaning that higher-order operators than those included are generated by the advection procedure (Schmitz et al. 2018); (ii) the displacements are computed from -body simulations, which give a more accurate description of the non-linear regime than PT.
One final caveat must be taken into account at this point: as mentioned in section 3.2.3, the TATT model employed here is based on the public implementation available in FAST-PT, which outputs the spectra assuming the Limber approximation; the dependence on  can be included exactly for   .To compute the multipoles we then include an approximate angle dependence to these spectra.Nevertheless, we argue this approximation should be accurate, since Bakx et al. (2023) compare the full EFTofIA model to a reduced version of it -equivalent to TATT, but with exact angle dependence -to a very similar conclusion as the one found in this work.

Bias Relations
This sample of halos also allows one to determine a relation between the usual linear halo bias  1 and the linear IA bias,   .In Figure 6 we show the measurements of these values from the set of simulations previously introduced.This points to a strong relationship between  1 and   which furthermore appears to be redshift and cosmology independent, in qualitative agreement with the findings of Akitsu et al. (2021b); a quantitative comparison with the results of that work requires careful analysis of the differences in definition of the bias parameters.Appendix A discusses how one can account for the difference in definition between these works, to make comparable the bias values obtained in the present work, and the ones measured by Akitsu et al. (2021b).This can be seen in Figure 6, in which the solid line in the top panel shows the original fitting function from Akitsu et al. (2021b), and the dashed line is the one made compatible with our definition.We can see this restores concordance between the two measurements, especially at large values of the linear density bias.Notice, however, that the original fitting of Akitsu et al. (2021b) is done over a very large range of values of   1 , and does not extend to values below 1 +   1 ≈ 0.75, hence we expect the fit to be much less accurate in this regime, and the comparison at small values of   1 must be made with care.
The second pannel shows the comparison of the values of c  measured with HYMALAIA, and the prediction for this measure-  ment in the case where the linear Lagrangian model would be valid.One can see that in general we detect a difference between these two values, giving some indication of a non-zero value of the    .

CONCLUSIONS
The results presented in this work show that HYMALAIA is capable of fitting shape power spectra obtained from populations of halos over a wide range of masses, and for three very distinct cosmologies at different redshifts.Particularly remarkable is the comparison of HYMALAIA and TATT, since the two models have the same amount of free parameters in their most complex forms, but the accuracy of the former exceeds that of the latter, even if one restricts HYMALAIA to its optimal case with only three free parameters.
In order to apply the method for the mitigation of IA as a contaminant in WL analyses, it is fundamental to demonstrate that it describes well the signal, so as not to misinterpret part of the apparent alignment signal induced by lensing as being intrinsic.It is also fundamental that it recovers reliable estimates of the bias parameters, particularly   , since this is expected to be degenerate with cosmological parameters and systematic effects of great importance.The results in the previous sections have robustly shown that the values of the linear alignment bias   are consistently recovered, being in excellent agreement with the fiducial values determined using LA at linear scales; besides this, we can see a rough agreement between the values of   found in this work, and the ones reported in (Akitsu et al. 2021b), even though the methodologies used are very distinct, and the focus of that work is on much larger values of the linear Lagrangian bias, so that the comparison at low   1 is very uncertain.
Finally, it is also worthy of notice that, even though other models such as the EFTofIA can extend over a similar range of scales while remaining accurate and unbiased, the small number of parameters of HYMALAIA causes it to recover considerably more information from the same dataset.This can be quantitatively evaluated by the FoM, and results presented previously showed that HYMALAIA has the highest FoM among the methods capable of extending beyond  ≈ 0.2 ℎ/Mpc.Restricting to min-HYMALAIA makes this picture even more optimistic, with FoM values that are ∼ 20% larger than those for HYMALAIA at the smallest scales probed, even if tests must still be made to understand whether min-HYMALAIA provides a good description of galaxy shape correlators, as it has been shown to do for halos.
Since HYMALAIA is based on -Body simulation results, its application to e.g.mitigation of IA in cosmic-shear surveys, would depend on building an emulator for its basis spectra as a function of cosmology, which is beyond the scope of this work.However, even without building such emulators, one can consider using the model for infusing the IA signal into simulations and exploring its effect on higher-order statistics (Harnois-Déraps et al. 2022;Alfen et al. 2023).
in which   is the mean matter density of the Universe, and Δ = 200; this choice is motivated by the fact that this halo boundary definition is the least affected by changes in cosmology(Ondaro-Mallea et al. 2021).

Figure 2 .
Figure 2. Upper Panel Points with error bars indicate the shape power spectrum multipoles measured from the simulations with Nenya cosmology,  = 512 ℎ −1 Mpc, at redshift  = 0. Colored solid lines represent the fits using HYMALAIA.Dashed lines indicate spectra that are originally negative, but are plotted here with the reversed sign for visualization purposes.The black vertical line marks the maximum scale used in the fit,  max = 0.85 ℎ/Mpc.This scale was chosen since it is the smallest scale we ever use in our fits; this is done to avoid entering a regime which could be afected by the damping of modes  > 1 ℎ/Mpc made to the linear density fields when building HY-MALAIA.Middle Panel Colored lines represent difference between model and data, in units of ; gray-shaded regions indicate 1, 2 and 3- regions.Lower Panel Shows the fractional difference between model and data.We give special emphasis to these results for

FoMFigure 3 .
Figure 3. Upper Panel: Values of  2 red obtained from the fits to the multiple halo samples.Different colors indicate the results for the 4 mass bins.A dashed black line marks the value of 1. Middle Panel: Figure of bias values as a function of the maximum scale used in the fit.No clear tendency with mass is detected.Lower Panel: Figure of merit values as a function of the maximum scale used in the fit.The dependency of this quantity with mass is unclear.

FoMFigure 4 .
Figure 4. Figure showing the comparison between different cases of HY-MALAIA, in which we choose to keep only some of the bias parameters free.Blue lines in this plot represent the full model, with all parameters free.Top Panel: Values of  2 obtained from the fits to the intermediate mass halo sample.Labels indicate which bias parameters are used in computing each line.Middle Panel: Figure of bias values as a function of the maximum scale used in the fit.Lower Panel: Figure of merit values as a function of the maximum scale used in the fit.

FoMFigure 5 .
Figure 5.This figure displays comparisons between HYMALAIA and the previously existing models, described in section 3. Top Panel: Values of reduced  2 obtained from the fits to the intermediate-mass halo sample.Middle Panel: Figure of bias values as a function of the maximum scale used in the fit.Bottom Panel: Figure of merit values as a function of the maximum scale used in the fit.

Figure 6 .
Figure6.Top Panel: Coevolution of the linear IA bias parameter with the firstorder density bias parameter of halos.The colorful points show the values obtained directly from applying HYMALAIA to our measurements of the shape power spectra for our three available cosmologies, and the four mass bins within each of them.The black solid line shows the fitting function from(Akitsu et al. 2021b).The black dashed line shows this fitting function after being adapted to be compatible with our definition of the linear alignment bias.The correction is simply to multiply their bias values by 3; see appendix A for more details, especially equation (A8).Bottom Panel: Comparison between measured values of c  , and values expected from assuming only linear Lagrangian alignment, c  =    1 .

Table 2 .
Halo samples utilized in this work.These results are quoted for the Nenya cosmology at redshift  = 0.0.