Impact of tidal environment on galaxy clustering in GAMA

We constrain models of the galaxy distribution in the cosmic web using data from the Galaxy and Mass Assembly (GAMA) survey. We model the redshift-space behaviour of the 2-point correlation function (2pcf) and the recently proposed Voronoi volume function (VVF) -- which includes information beyond 2-point statistics. We extend the standard halo model using extra satellite degrees of freedom and two assembly bias parameters, $\alpha_{\rm cen}$ and $\alpha_{\rm sat}$, which respectively correlate the occupation numbers of central and satellite galaxies with their host halo's tidal environment. We measure $\alpha_{\rm sat}=1.44^{+0.25}_{-0.43}$ and $\alpha_{\rm cen}=-0.79^{+0.29}_{-0.11}$ using a combination of 2pcf and VVF measurements, representing a detection of assembly bias at the 3.3$\sigma$ (2.4$\sigma$) significance level for satellite (central) galaxies. This result remains robust to possible anisotropies in the halo-centric distribution of satellites as well as technicalities of estimating the data covariance. We show that the growth rate ($f\sigma_8$) deduced using models with assembly bias is about 7\% (i.e. $1.5\sigma$) lower than if assembly bias is ignored. When projected onto the $\Omega_m$-$\sigma_8$ plane, the model constraints without assembly bias overlap with Planck expectations, while allowing assembly bias introduces significant tension with Planck, preferring either a lower $\Omega_m$ or a lower $\sigma_8$. Finally, we find that the all-galaxy weak lensing signal is unaffected by assembly bias, but the central and satellite sub-populations individually show significantly different signals in the presence of assembly bias. Our results illustrate the importance of accurately modelling galaxy formation for cosmological inference from future surveys.


INTRODUCTION
On very large cosmological scales the light emitted by galaxies provides an observational window into the underlying structure of our Universe.This large-scale structure (LSS) traced by the distribution of galaxies helps measure cosmological parameters to a very high precision (e.g.Hamana et al. 2020;Alam et al. 2021a;Heymans et al. 2021;Abbott et al. 2022).On large scales, the galaxies tracing the LSS can be treated as point particles sampling the local peaks of the underlying matter distribution, thus tracing the n-point functions of the dark matter distribution up to a hierarchy of galaxy bias coefficients (Bardeen et al. 1986;Cole & Kaiser 1989).In reality, the galaxies themselves are extremely complex objects and there are a number of factors influencing their formation and evolution.One of the important challenges for galaxy formation theory as well as cosmology is to identify the key ingredients influencing the formation and evolution of different types of galaxies, hence allowing a clean interpretation of their n-point correlations.
Past major spectroscopic galaxy surveys (2dFGRS: Colless et al. 2003;6dFGS: Jones et al. 2009) measured ∼ 10 5 redshifts; more ⋆ shadab.alam@tifr.res.in† aseem@iucaa.in‡ jap@roe.ac.uk recent surveys (SDSS-III: Eisenstein et al. 2011;WiggleZ: Blake et al. 2011;DEEP2: Newman et al. 2013;VIPERS: Garilli et al. 2014;GAMA: Baldry et al. 2018b; SDSS-IV: Dawson et al. 2016) have been of the same size or up to a million redshifts; ongoing and future surveys (PFS: Takada et al. 2014;4MOST: de Jong et al. 2019; DESI: DESI Collaboration 2016) will measure 10-50 million galaxy spectra.These data sets will provide exquisitely precise measurements of the two-point clustering and higher order statistics of the galaxy distribution.The improvement in measurements arises not only from the increased volume but also from improved instruments, leading to accurate measurements of small scale statistics (≲ 1 h −1 Mpc) for much fainter galaxies.This is especially important as hierarchical structure formation starts from nearly Gaussian density perturbations, making the largest scales fully described by two-point clustering statistics within linear theory, whereas nonlinear evolution on small scales affects clustering at all orders, beyond simply two-point clustering.Therefore two important challenges need to be overcome so as to extract maximal information from such data sets: (a) Understanding the non-linear impact of gravitational dynamics of dark matter as well as astrophysical processes affecting galaxy formation and evolution at small scales; and (b) Exploiting the additional information on non-linear evolution that is encoded in beyond two-point statistics.
The first challenge in modelling non-linear scales is that we ide-ally require a detailed understanding of galaxy formation physicsand furthermore need to be able to evaluate any such model rapidly and efficiently.Due to the highly non-linear nature of the problem, analytical approaches are limited in scope; it is better to use numerical methods that can accurately track astrophysical hydrodynamics coupled to dark matter through gravity.However, the large dynamic range in the problem requires a number of approximations regarding 'sub-grid' physics to be made, and even then the final solutions are still computationally extremely demanding (e.g.Schaye et al. 2010;Vogelsberger et al. 2014;Dubois et al. 2016;The EA-GLE team 2017).An intermediate approach to this problem is to begin by solving the gravity-only evolution numerically using Nbody simulations; this step can be relatively rapid.One can then build empirical models to connect the gravity-only solution to the galaxy distribution.Again, a number of such empirical methods exist, e.g.semi-analytical galaxy formation models (SAM: White & Frenk 1991); sub-halo abundance matching (SHAM; see e.g.Vale & Ostriker 2004;Conroy et al. 2006); conditional luminosity function (CLF; (Yang et al. 2003;More et al. 2012)); and halo occupation distribution (HOD: Benson et al. 2000;Seljak 2000;Peacock & Smith 2000;White et al. 2001;Berlind & Weinberg 2002;Cooray & Sheth 2002;Zu & Mandelbaum 2015).We adopt the empirical HOD approach in this paper as it has been successfully applied to a number of studies while being efficient enough to be computationally feasible for the current analysis.
A key assumption in the simplest implementation of HOD models is that the distribution of galaxies within a halo only depends on the mass of the host halo.But this may not be valid, and galaxies might preferentially populate certain large-scale halo environments.That would affect summary statistics at all scales, but more strongly on non-linear scales; such environmental effects are known in general as 'galaxy assembly bias'.Note that effects of this sort have two distinct aspects: one is the fact that haloes in different large-scale environments will have different assembly histories and hence will exhibit a different large-scale bias even at fixed mass.This effect is known as 'halo assembly bias' (Wechsler et al. 2002;Sheth & Tormen 2004;Angulo et al. 2009;Hahn et al. 2009;Shi & Sheth 2017), but it is not in any sense a missing ingredient in standard treatments of halo clustering.As originally shown by Kaiser (1984), the bias of objects depends on how rare they are, with the highest peaks being the most strongly biased.Since the highest peaks collapse first, there is obviously a correlation between halo formation redshift and bias, even at a fixed mass scale; but the usual discussion of massdependent halo bias automatically averages over this range of bias values.An issue for the HOD approach therefore only arises when the galaxy populations within a halo also depend on formation time.But there is every reason to expect some effect of the latter sort: galaxy formation processes such as the ability to transport cold gas or supernovae feedback might have different efficiencies at different times.Since haloes in regions of higher large-scale overdensity collapse earlier, there is then indeed scope for galaxy properties to be correlated with large-scale environment (Montero-Dorta et al. 2021;Donnan et al. 2022).These assembly bias effects are important for two reasons.Firstly, measurements of such effects can help us improve our understanding of galaxy formation physics; and secondly, LSS modelling that aims to infer unbiased cosmological parameters risks being systematically in error if we do not properly allow for assembly bias in the analysis (e.g.Zentner et al. 2014).The aim of this paper is therefore to extend the HOD model by equipping it with degrees of freedom that allow us to produce model galaxy distributions that incorporate assembly bias.
The second challenge of accessing the considerable information encoded in beyond two-point clustering can be addressed using a number of summary statistics.The first natural extension is to use three-point clustering, which has been studied in simulations and observed with high precision by several groups (e.g.Szapudi 2004;Gaztañaga et al. 2009).But the measurement of higher n-point statistics is computationally expensive and very quickly becomes a bottleneck (although see Slepian & Eisenstein 2015).Other higher order statistics include counts-in-cells (CIC: Colombi et al. 1995;Bernardeau et al. 2002;Friedrich et al. 2018;Gruen et al. 2018;Uhlemann et al. 2020;Repp & Szapudi 2021); the void probability function (VPF : White 1979;Fry 1986;Maurogordato & Lachieze-Rey 1987;Elizalde & Gaztanaga 1992;Vogeley et al. 1994;Sheth 1996;Croton et al. 2004;Fry & Colombi 2013) ).All of these access the beyond-Gaussian information and in principle depend on all the higher order n-point functions.Recently, PA20 have proposed that the VVF is sensitive to a unique combination of cosmological and galaxy formation physics.The VVF is defined as the distribution of cell volumes in the Voronoi tessellation of a set of galaxy positions observed in redshift space.As discussed by PA20, the VVF is closely connected to the formalism underlying CIC and especially the VPF mentioned above.However, the highly constrained nature of the definition of a Voronoi cell means that non-linear information is packaged by the VVF in a complex manner that cannot easily be disentangled analytically.Using N -body simulations, PA20 showed that the VVF is highly sensitive to the non-linear clustering properties of samples chosen using different selection criteria.Moreover, the measurement of the VVF requires no input in addition to the redshift-space galaxy positions, along with the survey completeness footprint, already routinely used for two-point function analyses.Therefore, we have chosen to use the VVF to include beyond two-point information in our analysis.
In a recent paper (Alam et al. 2021b, henceforth, A21), some of us have studied in detail the redshift-space clustering of galaxies in the GAMA survey (Baldry et al. 2018b).We showed that the largescale signature of the fluctuation growth rate is fairly robust against the details of small scale galaxy physics.We also showed that the distribution of satellite galaxies in a halo display deviations from the distribution of dark matter in a halo (NFW profile), with the effect being larger for for fainter galaxies.The purpose of the present study is to extend the analysis in A21 in three ways.(a) We include VVF measurements from GAMA to include beyond two-point information.(b) We introduce new assembly bias parameters in the model, permitting additional correlation between the tidal anisotropy of the halo (defined later) and the occupation of central and satellite galaxies at fixed halo mass.(c) We use an improved estimate of the covariance matrix based on new simulations, as compared to the conservative covariance used in previous work.The main focus of this work is to look for robust signatures of beyond halo mass effects in the galaxy distribution and their impact on growth rate measurements.
The paper is organized as follows.We first describe the details of our modelling methodologies including assembly bias parameter to capture new aspects of galaxy-halo connection in section 2. We then introduce the GAMA data and how we create magnitude limited samples in section 3.In section 4, we describe our measurements from GAMA data and estimates of covariance matrices.Our results are presented in section 5 and a summary is in section 6.

Number of realizations 1 3 1
Table 1.Specification of the three simulations used in this analysis.We used MDPL2 and L300 simulations for the covariance matrix estimation and Bolshoi for modelling the properties of the observed galaxy sample.

MODELLING NON-LINEAR SCALES
Perturbation theory describing the growth and evolution of large scale structure (LSS) has been successfully used to model linear and quasi-linear scales (Scoccimarro 2004;Taruya et al. 2010;Okamura et al. 2011;Reid & White 2011;Crocce et al. 2012;Vlah et al. 2012).
The only fully reliable way to model non-linear scales, however, is to solve for the exact dark-matter dynamics via N -body simulations.We extend the model described in A21 in this analysis.Below we summarise the base model briefly and refer readers to A21 for more details.

Simulation and HOD model
We assume a flat ΛCDM cosmology with Ωm = 0.27, Ω b = 0.0469, h = 0.7, ns = 0.95 and σ8 = 0.82.This cosmology is motivated by the fiducial cosmology adopted in the N -body simulation (Bolshoi; Klypin et al. 2011) that we employ in our HOD models.
Our model predictions are obtained by populating the dark-matter halo catalogue with galaxies using an extended Halo Occupation distribution model.We use a halo catalogue constructed via the ROCK-STAR1 halo finder (Behroozi et al. 2013), which was applied to the snapshot at redshift z = 0.1 from the publicly available Bolshoi2 simulation (Klypin et al. 2011).Bolshoi is a dark matter only N -body simulation that uses the Adaptive-Refinement-Tree (ART) code (Kravtsov et al. 1997).The simulation covers a periodic box of side 250 h −1 Mpc with 20483 particles.As stated above, it assumes a flat ΛCDM cosmology with Ωm = 0.27, Ω b = 0.0469, h = 0.7, ns = 0.95 and σ8 = 0.82.Additionally, for covariance matrix calculations we will use two more simulations.The first includes three realisations of a 300h −1 Mpc box at z = 0, simulated with 1024 3 particles and a flat ΛCDM cosmology with Ωm = 0.276, Ω b = 0.045, h = 0.7, ns = 0.961 and σ8 = 0.811.Further details of this simulation, which we refer to below as L300, can be found in ?.The second is from the MultiDark simulation suite, specifically the run MDPL2.This uses a periodic box of 1000 h −1 Mpc on a side with 3840 3 particles and assumes a flat ΛCDM cosmology with Ωm = 0.31, ns = 0.96, h = 0.67 and σ8 = 0.82.Both of these simulations were analysed using the ROCKSTAR halo finder.All the simulations are summarised in Table 1.
We populate the halo catalogues in the simulations using an extended HOD framework as described in A21.This model goes beyond the standard 5-parameter HOD, allowing an additional freedom for satellites to populate the dark matter haloes.Such nonstandard degrees of freedom are particularly important when study-ing small-scale redshift space clustering, as they allow for more realistic galaxy populations and may introduce degeneracy with the cosmological parameters for high-order statistics.Below, we briefly summarise the details of the HOD model: where ⟨Ncen⟩ M gives the occupation probability of central galaxies in a halo of given mass M and average number of satellite galaxies is given by ⟨Nsat⟩ M .The central galaxies are placed at the centre of dark matter haloes and given the velocity of dark matter haloes scaled by a free parameter γHV.The satellite galaxies are populated using the NFW profile (Navarro et al. 1996).The satellite distribution has three additional free parameters fc, fvir and γIHV, where fc scales the concentration of satellites, fvir scales the maximum radius of the satellites population in unit of rvir, and γIHV scales the velocity dispersion of satellite galaxies in unit of the halo velocity dispersion.

Assembly bias parameters
One of the main characteristics found in hydrodynamical simulations of galaxy formation is that the galaxy population in a dark matter halo depends on its full evolution history, which is sensitive to the halo environment (Gabor & Davé 2012).There are several secondary variables beyond halo mass shown to be connected to halo assembly bias, such as concentration, tidal environment, local overdensity etc. (e.g.Sheth & Tormen 2004;Ramakrishnan et al. 2019).
Recently Paranjape et al. (2018a) proposed a new secondary variable called tidal anisotropy (αR) which is sensitive to halo assembly bias.The tidal anisotropy is defined as where δR is the dark matter overdensity and q 2 R is the tidal shear defined on the scale of R. Both of these are defined in terms of the eigenvalues (λ1, λ2, λ3) of the tidal tensor smoothed on a scale R with a Gaussian filter, using As demonstrated by Paranjape et al. (2018a) and Ramakrishnan et al. (2019), choosing 3 R = 4R 200b / √ 5 maximises the correlation between the tidal anisotropy αR and large-scale halo bias, as well as between αR and internal halo properties.In fact, as shown by Ramakrishnan et al. (2019), αR thus defined is the primary indicator of halo assembly bias, in that the correlation of large-scale halo bias with a large number of secondary halo properties can be understood as arising from their individual correlations with αR.We use this definition of tidal anisotropy and refer the reader to Paranjape et al. (2018a) for full details concerning the measurement of αR using simulated haloes.
The tidal anisotropy (using a constant smoothing scale) has been measured in data from the Sloan Digital Sky Survey (SDSS) and ; and the right panel is for the model with a reasonable value of assembly bias that is currently allowed by data (αcen = −0.71and αsat = 1.44, see section 5.1).The main feature to see here is that models with assembly bias differ from the base model in that the void regions have fewer galaxies and also that filamentary regions (especially very close to massive objects) are more densely occupied.This is because the assembly bias model with αcen < 0 makes it less likely for the galaxies to reside in haloes with tidally isotropic environments, while making it more likely for haloes of similar mass in tidally anisotropic filamentary regions to be occupied.
shown to generate a large variation in bias, independent of local over-density (Paranjape et al. 2018b;Alam et al. 2019).Note that perturbation theories (Chan et al. 2012;Baldauf et al. 2012) as well as the effective field theory of large scale structure (e.g., Senatore 2015) also include such tidal bias terms.But these are proportional to the tidal shear: for a Gaussian random field, this is independent of over-density but for the non-linear dark matter density field it shows strong correlations with over-density.This is one of the key reasons for us to use tidal anisotropy, which is approximately independent of the over-density field (the primary driver of gravitational growth).As mentioned above, the tidal anisotropy is a strong indicator of halo assembly bias.To investigate whether galaxies inherit any of this environmental dependence, we introduce two new HOD parameters (αcen and αsat), which modify the occupation of dark matter haloes depending on tidal anisotropy.Following Xu et al. (2020), our parametrization of assembly bias is as follows: where α rank R is the rank of tidal anisotropy in fine bins of halo mass divided by the total number of haloes in the respective mass bins, so that α rank R varies between 0 and 1.The above equations modify the Mcut and M1 HOD parameters as a function of tidal environment.For a positive value of the parameters αcen and αsat, this will mean that the haloes with more tidally anisotropic environment will have a higher mass limit for assigning central galaxies and fewer satellites compared to haloes in tidally isotropic environment.Using the ranks of tidal anisotropy rather than the actual values makes us less sensitive to the exact definition and allow us to probe the effect across the whole mass range with a simple mass independent parameterization.It is also important to emphasise that the rank of tidal anisotropy is assigned in narrow mass bins, thus ensuring that any non-zero assembly bias signature is distinct from a model allowing a more complicated mass dependence.
We show a small sub-volume of the simulated galaxy catalogue in Figure 1 in order to build a more intuitive feeling for the assem-bly bias model.The panels show (left to right) a mass only HOD model; then a model with large assembly bias; and finally a model with values of the assembly bias parameters that are consistent with current data.We see that including assembly bias in this case causes void regions to have fewer galaxies, while filamentary regions (especially very close to massive objects) become more densely occupied.This is mainly because we are assuming negative values of αcen: the tidally isotropic regions around void centres will then show a larger cut-off mass for assigning central galaxies.As a result, the probability of having central galaxies in low mass haloes is reduced, and since these haloes predominate in voids the centre of voids will thus have fewer central galaxies.

DATA
Galaxy and Mass Assembly (GAMA) is a flux limited spectroscopic survey described in Liske et al. (2015) and Baldry et al. (2018a).GAMA provides the redshift measurement of approximately 300,000 galaxies selected from SDSS imaging (Abazajian et al. 2009) with target selection defined in Baldry et al. (2010).It covers a total sky area of 230 deg 2 with 98% redshift completeness down to r-band Petrosian magnitude 19.8.We use three 5 × 12 deg 2 GAMA equatorial regions G09, G12 and G15, centred on 9h, 12h and 14.5h in right ascension.We create a magnitude limited sample with Mr < −19 after applying the k-correction and evolution correction (Blanton et al. 2003;Loveday et al. 2012Loveday et al. , 2015)).We use the DR3 data release (Baldry et al. 2018c) in this analysis.We refer the reader to A21 for more details about the sample selection.

Redshift space galaxy clustering
We measure the galaxy auto-correlation function using the minimum variance Landay-Szalay estimator (Landy & Szalay 1993).We  3).The bottom panel shows the ratio of red and black lines, with an error estimated from jackknife sampling, reflecting the noise for a survey of volume [250 h −1 Mpc] 3 .then project the two-dimensional galaxy correlation into two different statistics: the projected correlation function, wp(r ⊥ ), and multipoles (ξ ℓ=0,2,4 ).The projected correlation function is obtained by integrating the correlation function along the line-of-sight at fixed perpendicular separation (see equation 19 in A21).For the projected correlation function we adopt a maximum integration limit of |πmax|= 40 h −1 Mpc.We note that in the limit of πmax → ∞ the projected correlation function becomes independent of redshift space distortions (RSD), which is attractive from a modelling point of view.Since we use a finite πmax, limited by the considerations of signal-to-noise and computing time needed for the model evaluation, some RSD remain in the projected correlation function once we reach rp comparable to πmax, but our model and covariance matrix both account for the finite integration length.The multipoles are ob-tained by integrating over the angle from the line-of-sight, weighted with the appropriate Legendre polynomial (see equation 20 in A21).
The projected correlation function was measured in 25 log-spaced bins between 0.01 h −1 Mpc -30 h −1 Mpc.The multipoles were measured with two components, first the small scale component was measured as three wedges with five log-spaced bins between 0.01 h −1 Mpc -2 h −1 Mpc and the larger scales were measured with six log-space bins centred at 3 h −1 Mpc -30 h −1 Mpc.

Voronoi Volume Function (VVF)
The measurement of the VVF requires us to divide the volume covered by the observed sample of galaxies using a Voronoi tessellation.In principle one could do this using geometric algorithms, but in real surveys one also needs to account for masked regions as well as incomplete regions.These effects are usually accounted for by first generating spatially uniform random points and then applying the survey mask and incompleteness to this original uniform sample.Therefore, another way to measure the VVF is simply by counting the number of randoms associated with each galaxy (Alam et al. 2019;PA20).This is defined by linking each random point to its nearest galaxy, then shifting focus to the galaxies and counting the number of randoms νran(g) linked to each galaxy g.We can estimate the volume V (g) of the Voronoi cell associated with each galaxy as V (g) = Vtotνran(g)/Nran.Where Nran and Vtot are the total number of randoms and total volume of the region we are analysing.The volume associated with each galaxy scaled by the average volume can be written as V (g)/⟨ V ⟩ = νran(g) N gal /Nran, where N gal is the total number of galaxies in the considered volume.This raises the question of the optimum number of random points one should use in performing such measurements.Using too many random points wastes computing time, but having too few randoms will introduce large errors into our estimates of the VVF, especially of its low percentiles, which one needs to avoid.PA20 reported the results of convergence tests for the ratio Nran/N gal .Based on these and our own tests, in this work we set Nran/N gal = 500.The volume assigned to galaxies at the boundary of masks and edge of survey will be incorrectly estimated.Appendix B2 of PA20 quantified this effect and showed that even in the presence of 10% masked area the effect of such errors on the overall VVF percentile is negligible at the current precision.Such effects, however, will need to be dealt with more exactly for future larger samples such as DESI, PFS and 4MOST.
The properties of the VVF for halo populations and abundance matched galaxy catalogues is discussed in detail in our earlier work, PA20.The properties of the VVF as a function of cosmology and HOD model are discussed in Liya et.al. in prep. .Figure 2 shows the distribution of the VVF for different galaxy populations.The black line represents the base model with no assembly bias.The red line is for a model with a degree of assembly bias currently allowed by GAMA, and the blue line is a toy model with unrealistically large assembly bias.These black and red populations are generated such that the two point function as well as the number density are consistent within the GAMA errors in each case, but one uses a mass-only HOD model whereas the other includes additional parameters for assembly bias, as described in section 2, with αcen = −0.71and αsat = 1.44 from Table 3.Note that the overall VVF distribution is very similar in these two models constrained by the number density and redshift space correlation function.In detail, though, we find that the bottom plot shows significant differences for galaxies having large Voronoi volumes (corresponding to underdense, void-like environments).The model with assembly bias produces more galaxies with these larger volumes, so that voids will tend to be emptier.Figure 5. Ratio of signal to noise estimated from mocks to the one estimated from jackknife resampling of the data.The four panels from left to right represent wp, ξ 0 , ξ 2 and ξ 4 .The solid lines shows the estimates from the variance of the mocks and the points with errors show the estimates from mean of jackknife resamples from the mocks, with error bars indicating the variance of the jackknifes between mocks.The red colour is for the mocks using the L300 simulation and the blue corresponds to the one using the MDPL2 simulations.The distribution of the VVF is difficult to use in a likelihood analysis, as accounting for cross-covariance with two-point statistics becomes a computational challenge.Therefore, we reduce the entire distribution to six numbers giving the scaled Voronoi volume (V (g)/⟨ V ⟩) at the 2.5 th , 16 th , 50 th , 84 th and 97.5 th percentiles of the VVF (with the Q th percentile denoted 'yQ' and the collection denoted 'VVFp'), as well as its standard deviation (denoted by 'σVVF').Broadly speaking, lower percentiles such as y2.5 and y16 probe small-scale, highly clustered regions (e.g., satellites in groups or clusters) while higher percentiles such as y97.5 probe large, underdense regions such as voids.
Figure 3 shows the VVF percentile and standard deviation for the same models as in Figure 2. Note that the model parameters are chosen such that the level of assembly bias is allowed by the current data to emphasise that we are looking for small and subtle effects.To highlight the differences between the (very precise) VVF measurements in the two models, the right panel shows the ratio of measurements in the two models along with jackknife errors.We note that the model with assembly bias gives larger values for y2.5 and y97.5 compared to the case without assembly bias.Effectively the VVF distribution is skewed towards higher V /⟨ V ⟩, which also raises σVVF.We also want to highlight that these subtle differences between the VVF distributions of the two models are easily captured in the percentiles and width, and hence these constitute an effective compression of the full VVF distribution.Note that the error bars in Figures 2 and 3 represent the noise in an ideal survey with volume [250 h −1 Mpc] 3 .
We will exclude the y16 percentile from our current analysis, as we have found that it is highly correlated with y84 and y2.5, making the inverse of the covariance matrix (discussed below) ill-defined.

Covariance matrix
The covariance matrix quantifying the correlation between various observed statistics is a crucial component in the measurement of various physical parameters.It is also crucial for detecting any beyond halo mass effects in the observed data.To estimate the covariance matrices we first produce two different sets of mock catalogues mimicking the GAMA survey geometry.We then measure the jackknife covariance matrix from the observed GAMA data and compare it to the mock and jackknife covariance matrices estimated from the two sets of covariance mocks.Salcedo et al. (2022) have recently shown that cosmic variance is important and thus that the jackknife covariance might not be sufficiently precise, providing further motivation for the tests in this section.
To generate mock galaxy catalogues with the GAMA survey geometry, we first take the periodic box simulations and remap them to a cuboid following the algorithm proposed in Carlson & White (2010).The transformations are done such that the z-axis corresponds to the size of the survey along the light-of-sight for the Mr < −19 sample.The ratio of side lengths along the x and y axes is kept as close to the actual value for the GAMA fields as possible.We then divide the cuboid into separate pieces, each presenting one realisation of one of the GAMA field.The fields are also aligned in a close packing by alternating the line-of-sight from the positive to the negative z-axis in consecutive regions.Since the three GAMA fields are equivalent we first count the total number of fields that can be generated from the cuboids and use consecutive regions to generate the same field.This is to avoid generating the different fields by using neighbouring regions of the same realisation and thus introducing a spurious larger scale correlation between the mock fields.We use two sets of mocks that were created via this procedure: the first set consists of 50 realisations using the L300 simulation (with a WMAP7 cosmology); and the second set consists of 267 realisations using the MDPL2 simulation (with a Planck cosmology).We populate the halo catalogue in each simulation using the best fit HOD model obtained in A21 by fitting the clustering statistics.
Figure 4 shows the correlation matrices obtained using the two sets of mock catalogues L300 and MDPL2, as well as the jackknife covariance estimated from data.The two mock covariance matrices are the mean of the jackknife covariance matrices from individual realisations.The jackknife covariance estimated from the observed data is noisier compared to the mock covariances due to the limited number of jackknife regions.The mock covariances using L300 and MDPL2 are fairly consistent with each other, despite using completely different inherent simulations and slightly different cosmological parameters.We can therefore be reassured that the covariance matrices are robust and do not require an unrealistically precise knowledge of the true cosmological model.
Figure 5 shows the ratio of signal-to-noise estimated using mocks to the one estimated using data jackknife, for our various clustering statistics.The columns from left to right show wp, ξ0, ξ2 and, ξ4.We use the ratio of signal-to-noise in this illustration as this will be insensitive to any small differences in the signal itself.We find that the signal to noise estimated from the mean jackknife of the mocks is consistent with direct mock-based estimates.However, both the mock based estimates show larger noise for wp and the monopole at large scales, compared to the one estimated using the jackknife of the data.Figure 6 again shows a ratio of signal-to-noise as in Figure 5, but now for the VVF percentile and galaxy number density.We note that in this plot the signal-to-noise ratio estimated by the variance between mocks disagrees with the one estimated via the mean jackknife.We find that quantities such as number density and σVVF show a larger dispersion between the mocks compared to the noise estimated from jackknife sampling.This might be because the number density in a small volume such as GAMA is sensitive to the presence of any massive structure which can appear and disappear among mock realisations.Since these massive structure are rare in small volumes, they possibly violate the independence assumption for jackknife regions.We therefore felt that it was important to make sure that our covariances reflected this larger variance derived from the dispersion between different mocks.But at the same time, it is also important to keep the noise in the covariance as low as possible.We see no obvious differences between the correlation matrix (Rij = Cij/[CiiCjj] 1/2 ) estimated using different simulations or different methods (e.g. with or without jackknifes).Our covariance matrix is given by the following equation: We therefore decided to adopt the correlation matrix estimated from the mean of the jackknifes for MDPL2 mocks (R MDPL2 ij ), since this appears to have the lowest noise.Our full covariance matrix (C final ij ) is then derived by scaling this correlation matrix using the diagonal terms estimated from the variance between mocks (C MDPL2 ii ) -with a final additional scaling by the ratio of the signal between data (S data ) and the MDPL2 mocks (SMDPL2), to account for any differences in overall fluctuation amplitude.

Analysis methods
We start with the ROCKSTAR halo catalogue of the Bolshoi N-body simulation.We first measure the tidal anisotropy for each host halo and then obtain the rank of tidal anisotropy in fine mass bins as described in section 2.2.The full parameter space used in this analysis is given in Table 2.We sample this parameter space via an MCMC analysis using emcee.We first assign the number of central and satellite galaxies using equation 1 with Mcut and M1 redefined in equations 5 & 6 to include the dependence on tidal anisotropy ranks.We use the centre of the halo as the position for the central galaxy, with velocity given by the halo's core velocity scaled by the parameter γHV.The satellite galaxies are assumed to follow the NFW distribution given by host halo parameters but the concentration of the distribution is re-scaled by the parameter fc.The velocities are sampled from the Gaussian distribution with mean set to the halo velocity and dispersion given by host halo velocity dispersion multiplied by the parameter γIHV.The redshift-space position of each galaxy is obtain by adopting the plane parallel approximation with the z-axis of the periodic box being the line-of-sight .
We now treat this dataset as an observed catalogue and measure all of our observables, namely number density, wp, ξ0,2,4and VVF.These results are then used as our model prediction.We then calculate the likelihood by estimating the χ 2 which results in the posterior

Parameters Description prior
Mcut Halo mass at which probability of having central galaxy is 0.5.
10 11 -10 15 σ M scatter in the halo mass to model the given absolute magnitude limited sample.This should be related to scatter in halo mass and absolute magnitude of galaxies.
0-8 κ This determines the mass at which haloes have no satellite galaxies in units of Mcut This determines the scaling of number of satellite galaxies with halo mass.
10 11 -10 15 α The power law index of number of satellite as the function of halo mass.

0-3 fc
The distribution of galaxies might follow a different concentration than dark matter itself.This parameter scales the concentration of the dark matter halo to determined the concentration of the satellite galaxies by scaling the scale radius Rs of the halo.
γ HV This parameters scales the inter-halo velocity to allow an additional degree of freedom as the growth rate of structure.

0-3
γ IHV This scales the velocity dispersion of the dark matter halo in order to allow the satellite galaxy velocity distribution to be different from dark matter.

0-3 f vir
This scales the maximum distance up to which satellite galaxies are distributed in unit of virial radius of the halo.A corresponding velocity dispersion is also estimated based on the according to the distance.

0.1-5
αcen Assembly bias for central galaxy, correlates the occupation probability with host halo's tidal anisotropy.
−2.0-2.0 αsat Assembly bias for satellite galaxy, correlates the number of satellites with host halo's tidal anisotropy. −2.0-2.0 Table 2. Description of model parameters and their prior range.
distribution of our parameters given the data and covariance matrix.
Our main analysis shows results for four cases as follows: (i) wp + ξ0,2,4: In this case we run chains excluding VVF measurements and fix the tidal anisotropy parameters to zero (αcen = αsat = 0).Hence, no assembly bias is used in this model.
(ii) wp + ξ0,2,4 + (αcen, αsat): Same as the first case, but allowing both tidal anisotropy parameters to be free and hence allowing for assembly bias in the model.
(iii) wp + ξ0,2,4 + VVF: Same as the first case, but including VVF in the measurement.In this case we use the chains from the first case and importance sample them due to the expensive VVF calculation.
(iv) wp + ξ0,2,4 + VVF + (αcen, αsat): We use all the measurements and allow all parameters including assembly bias parameters to be free.In this case we use the chains from the second case and importance sample them due to the expensive VVF calculation.

RESULTS
We measure clustering and VVF statistics for the magnitude limited (Mr < −19) galaxy sample from the GAMA survey.We closely follow the model developed in A21 and extend it to include the VVF statistic, along with assembly bias parameters (see equation 6).Table 3 provides the constraints on all the parameters for different choices of statistics and model parameters.Figure 7 shows the twoand one-dimensional constraints on the base HOD parameters.We only show the Mcut − σM space for the two-dimensional likelihood, in order to highlight the effect of assembly bias; other planes were unaffected and can be seen in Figure 7 of A21. Figure 8 similarly shows the two-and one-dimensional constraints on the extended set of HOD parameters, including assembly bias.We again show only fc − fvir and αcen − αsat space to keep the focus on interesting spaces, while showing one-dimensional likelihoods for all the parameters.In both Figures 7 and 8 the solid and dashed contours represent the 1σ and 2σ confidence limits respectively.The magenta and cyan colours show the fit to only clustering statistics (i.e.wp and ξ0,2,4) with and without assembly bias parameters, respectively.The red and blue contours show the combined fit to clustering and VVF statistics -again, respectively with and without assembly bias parameters.Figure 9 shows the clustering and VVF measurements from the GAMA data along with the best-fit models for different cases considered.We note that the model and measurement may sometimes appear a few sigma apart based on diagonal errors, but a more formal goodness of fit accounting for covariances is given in Table 3.

Assembly bias and HOD parameters
The HOD parameters given in the top block of Table 3 were largely not influenced by whether we include VVF results or allow assembly bias parameters to be free, except for the Mcut − σM plane.As shown in the top panel of Figure 7, there is a degeneracy between Mcut and σM .This has to do with the fact that the model can be adjusted to have the same density of galaxies and amplitude of clustering by increasing the cut-off mass (Mcut) for central galaxy occupation probability while making the cut-off shallower (i.e.increasing σM ).The contours obtained by using clustering only, but with and without assembly bias parameters in the model, follow this degeneracy and overlap with each other.Adding VVF information to the analysis shrinks the contours in this degeneracy direction.Interestingly, adding VVF results without assembly bias parameters slightly displaces the contours towards lower values of Mcut and σM .When VVF data are added with assembly bias freedom, however, the contours shrink towards higher values of Mcut and σM .Therefore, the model with assembly bias prefers a shallower but higher cut-off in the occupation probability for central galaxies, while the one without assembly bias prefers a sharper but lower cut-off.We also note that, in the presence of assembly bias, the cut-off in the central galaxy occupation probability is also a function of tidal environment.Hence the Mcut − σM plane should be inferred with the caution that this is the median relation over tidal environment in presence of assembly bias.
Figure 8 shows the extended HOD parameters related to the phase space distribution of the satellite galaxies and the assembly bias.The top left panel shows the posterior in the fc − fvir space, where the parameter fc indicates the concentration of the radial distribution of satellites relative to the dark matter in the host halo and fvir indicates the maximum distance to which the satellites should be populated in units of the virial radius.The two parameters fc and fvir allow satellite galaxies to deviate from the NFW halo profile, capturing possible effects from baryonic processes.We find that the satellite galaxies prefer to have half the concentration of dark matter haloes and extend up to twice the virial radius, consistent with Alam et al. (2021b).This result is independent of the statistical data used and of whether or not assembly bias parameters are included.However, the posterior is wide for the clustering-only analysis (see the cyan and magenta contours) and shrinks significantly after adding VVF information, making these deviations in the distribution of satellite galaxies highly significant (see blue and red contours).This is consistent with the expectation from PA20 that the VVF is sensitive to galaxy formation physics and has interesting implications for constraining models of galaxy formation if confirmed in forthcoming surveys with much larger volume.Another freedom we have allowed for in modelling satellite galaxies is in terms of γIHV, quantifying the velocity dispersion of the satellite galaxies in units of the halo velocity dispersion.We find the model does prefer a slightly higher velocity dispersion for the satellite galaxies compared to dark matter haloes, but the posterior is wide even after including VVF data and hence any such difference cannot yet be considered statistically significant.
The top right panel in Figure 8 shows the two-dimensional likelihood of our two assembly bias parameters (i.e.αcen and αsat).We find that if only clustering data are used, then the assembly bias parameters have a wide posterior, with best fit values away from no assembly bias, but where the 2σ contour encloses the no assembly bias model.Adding VVF data to the analysis shrinks this contour significantly, with a slight shift in the best fit parameters away from the no assembly bias model, which is now well outside the 2σ confidence region.The one-dimensional likelihoods for these two assembly bias parameters are shown in the bottom of the Figure 8.We find αcen = −0.79+0.29  −0.11 and αsat = 1.44 +0.25 −0.43 when using both clustering and VVF statistics.This is a detection of non-zero assembly bias for both central (2.4σ) and satellite (3.3σ) galaxies.
Figures 10 and 11 illustrate the assembly bias signal in terms of galaxy occupation.We first show the distributions of halo mass and tidal anisotropy for the parent dark matter haloes in Figure 10, which contain the same information as Figure 7 of Paranjape et al. (2018a).
Recall that low values of αR (where R = 4R 200b / √ 5: see section 2.2) correspond to isotropic tidal environments, while large values indicate anisotropic environments.Massive haloes, which dominate their environments, tend to have smaller αR, with a narrow distribution peaking at lower values.Less massive haloes that are isolated would have similarly small αR, but a large fraction of lowmass objects reside near larger haloes or in filaments, and would hence have larger αR.Thus, the αR distribution at low mass is broader and peaks at higher values than that at high mass.
This inherent mass dependence of αR is removed in our model, which uses the ranks of αR in narrow bins of halo mass (see equation 6).The assembly bias parameters in equation ( 6) mean that the occupation of central and satellite galaxies depend not purely on halo mass, but also on tidal anisotropy.Therefore, we show the twodimensional occupation probability of galaxies in Figure 11 for the best fit model in different scenarios.The top row shows the occupation probability of central galaxies and the bottom row shows the mean number of satellite galaxies per halo.The first two columns are without assembly bias (αcen = 0 = αsat) and hence show that the occupation is a function only of halo mass.The first column using clustering shows a slightly higher but shallower cutoff for central galaxy occupation as compared to the second column which also includes the VVF.The last two columns have best fit models with non-zero assembly bias and therefore the occupation is a function of both halo mass and tidal anisotropy.
For the central galaxies, the assembly bias parameter αcen does not have any effect for the most massive haloes, because the occupation probability is always unity and αcen only affects the cut-off mass. 4The best-fit values of the parameter αcen (which are negative in our case) introduce a negative correlation between tidal anisotropy and Mcut.This is visible as the negative correlation for intermediate and low mass haloes.At fixed mass (M 200b ≲ 10 13.5 h −1 M⊙), it is therefore more likely for haloes in tidally anisotropic environments to host a galaxy than those in isotropic environments.This could arise because these intermediate and low mass haloes in tidally anisotropic environments are likely to be connected with the filamentary structure of the cosmic web, providing highways to supply the material for the formation and evolution of galaxies.For the satellite galaxies, the values of tidal anisotropy αR are inherited from their respective host dark matter haloes.This may not accurately represent the tidal environments of satellite galaxies, which is still an open question (see, e.g., Zjupa et al. 2020), with a full treatment being beyond the scope of this analysis.So, following the simplification that satellites have the same tidal environment as their respective host dark matter halo, we show the mean number of satellite galaxies as a function of halo mass and tidal environment in the bottom row of Figure 11.The best fit model using both clustering and VVF shows strong positive correlation of the satellite occupation with the parent halo's tidal environment.This means that, at fixed halo mass, haloes in more tidally anisotropic environments have fewer satellite galaxies compared to haloes in tidally isotropic environments.This suggests that it is harder for satellite galaxies to accrete onto haloes residing in highly anisotropic environments.

Assembly bias and redshift-space distortions
One of the important routes to measuring cosmological information and testing relativistic gravity on large scales is through RSD ob-servations (Peebles 1980;Kaiser 1987;Hamilton 1992).It has been suggested by a number of authors that the presence of assembly bias can possibly affect the RSD measurement and potentially bias the cosmological constraints (Obuljen et al. 2019).We therefore look at the effect on the measurement of the growth rate with and without the assembly bias parameters introduced in this analysis.We do this using the parameter γHV which parameterises the ratio of measured growth rate to the true growth rate in the standard model.This is generally a good approximation at large scales and useful for this preliminary work.Ideally for future studies it will be important to use simulations with different f σ8 in place of using this simple scaling of halo velocities.We also remind the reader that our model includes a tidal anisotropy dependence of the galaxy occupation, but that this is part of the simulation-based modelling: we do not need to estimate the tidal anisotropy of the observed galaxies.
The one-dimensional likelihood of γHV is shown in Figure 8.It shows that the posteriors obtained in our analysis using various combinations of measured statistics and with and without assembly bias change the preferred value of γHV by less than 2σ.The value γHV = 0.9 ± 0.06 from clustering and without assembly bias com- pared to γHV = 0.83 ± 0.07 when allowing for assembly bias are consistent with each other within 1σ.Similarly when comparing the combined fit from clustering and VVF data, the resulting figures of γHV = 0.94 ± 0.05 without assembly bias and γHV = 0.86 ± 0.05 with assembly bias are consistent within 2σ.
We note that γHV, which multiplies all the galaxy peculiar velocities, is the ratio of velocities in the true Universe to the one predicted by simulations.In our case, assuming that velocities are in the linear regime, this translates to a ratio of f σ8 values given by γHV = [f σ8]true/[f σ8] simulation .We then convert the measure- We note that the distribution of tidal anisotropy α R for massive haloes peaks rather sharply at low values, implying that massive haloes dominate their tidal environments, causing them to be isotropic.Conversely, the less massive haloes show a wide distribution of tidal environments.ments of γHV in terms of f σ8 as reported in Table 3: these figures can be directly compared with the Planck prediction, and are found to be between 2-4σ lower (Planck Collaboration 2018).We convert the measurement of f σ8 to constraints in the Ωm-σ8 plane assuming ΛCDM-GR and fixing everything else to the Planck best fit values.This is shown in Figure 12, comparing the RSD constraint from this work with the Planck constraint and the constraint obtained using the combination of CMB-lensing and galaxy clustering in Hang et al. (2021).This illustrates that already for GAMA including assembly bias or not can make a difference between weak (≈ 2σ) tension without assembly bias to strong (≈ 4σ) tension when including assembly bias.Therefore, in order to avoid biases, it will be crucial to look at such details about the galaxy population while interpreting cosmological results.While the effect of assembly bias is at the level of a 1.5σ shift in f σ8, this can lead to significant inconsistency with other experiments as shown.This level of effect for future experiments such as DESI (DESI Collaboration 2016) will be highly significant as the statistical errors are expected to be smaller than for GAMA by a factor of 8-10.Hence such effects may prove to be challenging for RSD studies using deep low redshift samples such as the Bright Galaxy Survey (BGS; Hahn et al. 2022) and the 4MOST Hemispheric Survey (4HS)5 .We note that for future studies it will be important to test any biases in cosmology and the ability to infer the assembly bias parameters on high precision simulated galaxy catalogues in the presence of such assembly bias effects.

Assembly bias and gravitational lensing
We do not use measurements of gravitational lensing in this analysis, although such data can add useful information at small scales with respect to the impact of assembly bias.Therefore, we want to understand if the assembly bias effect we have observed can potentially show a signature in weak lensing observables.In order to do Figure 11.The distribution of galaxies in M halo vs α R space.The top row is for the central galaxies and the bottom row is for the satellite galaxies.The first two columns are for a model without assembly bias parameter but respectively without and with VVF statistics.The last two columns are for the models with assembly bias parameters but without and with VVF statistics.The colours in the top row show the probability of hosting a central galaxy, whereas the colours in the bottom row shows the expected mean number of satellite galaxies.so we measure cross-power spectra between the galaxies with and without assembly bias and the dark matter density field as shown in Figure 13.The top panel shows the cross power spectrum for galaxies without assembly bias in cyan colour and galaxies with assembly bias in red.The dashed, dotted-dashed and solid lines represents the satellite galaxies, central galaxies and central + satellite galaxies cross-power spectrum.The bottom panel shows the ratio of cross power spectrum of galaxies with assembly bias and without assembly bias.We find that the overall galaxy sample with and without assembly bias provides consistent cross-power spectra and hence assembly bias will not show any lensing signature.The lensing around central galaxies, however, shows a ∼ 20% higher amplitude in the model with assembly bias compared to the one without assembly bias.It will be interesting to see if we can identify the central galaxies robustly enough to measure the lensing power spectrum in real observations to constrain the assembly bias effect measured in this paper more precisely.One way to approach this is to use lower mass groups, which for our best model might show a ∼ 20% inconsistency at linear scales between clustering and lensing observables.

Robustness against the shape of dark matter haloes
Our default model assumes that the satellite galaxies are distributed within haloes with spherical symmetry.But it is well known that, in cold dark matter (CDM) N -body simulations, the dark matter haloes are triaxial systems supported by anisotropic velocity disper-sions (Frenk et al. 1988;Jing & Suto 2002;Allgood et al. 2006;Hayashi et al. 2007;Bett et al. 2007).It is then important to consider the effect on the assembly bias parameters if the satellites are assumed to be distributed according to the triaxiality of haloes.On the other hand, we also know that baryonic processes generate additional forces in the system which tend to make the haloes closer to spherical.Figure 4 of Abadi et al. (2010) illustrates this clearly, comparing halo shapes with and without hydrodynamical effects and showing how constant density and potential contours around haloes becomes rounder in the presence of galaxy formation processes.Therefore, in this paper we extend our model to include the shape of haloes in the satellite galaxy distribution.The triaxial shape of the dark matter haloes can be defined by the combination of the major axis ua and the two axis ratios c/a and b/a, which we have measured for individual haloes using the method described in Behroozi et al. (2013).We include the shape of haloes in the satellite distribution by introducing a new parameter (S flat ), which scales the two axis ratios between spherical and halo shape as follows: Here, subscripts gal and halo refer to the shape parameters corresponding to the satellite galaxy distribution and the host halo distribution.The new parameter S flat allows the distribution of satellites to vary between spherical symmetry (i.e S flat = 1) and halo shape (i.e S flat = 0 ).To include this parameter for any chosen value of Figure 12.The effect of assembly bias in the Ωmσ 8 plane.The contours contain 68% and 95% of the total probability, which is equivalent to 1σ and 2σ.The black contours are from Planck 2018, purple is from using the combination of galaxy clustering and CMB lensing from the DESI Legacy Survey as reported in Hang et al. (2021).The red and blue contours are redshift space distortion models with and without assembly bias parameters, using the GAMA sample combining two-point clustering with VVF.We note that the model without assembly bias (blue contours) shows acceptable overlap with Planck, but the model with assembly bias (red contours) is preferred by the GAMA data and shows significantly disjoint posteriors.
S flat , we first determine the shape parameters for the galaxy distribution.We then distribute the satellite galaxies spherically with unit radius following an appropriate NFW profile, which is then transformed to the ellipsoidal distribution using the two axis ratio and assuming the x-axis as the major axis of the ellipsoid.We then apply a rotation matrix to the satellites such that major axis of the ellipsoid aligns with ua, which results in the final satellite positions around the centre of the host halo.
We re-run our full analysis with this additional parameter (S flat ) to allow for freedom in the shape of satellite galaxy distribution.We find that most parameters of the models are unaffected beyond a slight increase in the error.The new constraints on assembly bias parameters while only using clustering are αsat = 0.87 +0.66 −0.51 , αcen = −0.17+0.28 −0.29 , whereas including VVF information improves the constraints to αsat = 1.11 +0.46  −0.33 , αcen = −0.60 +0.22 −0.30 .If we compare these to the case where satellite galaxies are distributed spherically, then we find that allowing the flatness parameter to be free does not affect the 3.3 σ detection significance of the satellite assembly bias parameter (αsat), while the detection significance of the parameter for centrals (αcen) increases slightly from 2.4 σ to 2.7 σ.This shows that our detection of assembly bias signature necessarily requires VVF information and is independent of a simple extension in the model to account for the anisotropy in the satellite spatial distribution.We note, however, that a more complete model which also allows for a velocity anisotropy for the satellite galaxies would be interesting to study, but is beyond the scope of the present analysis.We also note that the posterior of the new flatness parameter (S flat ) appears to be multi-modal, and that the mean and standard deviation of S flat are 0.29 ± 0.22 without VVF and 0.48 ± 0.28 when VVF Figure 13.Cross power spectrum of galaxies and the dark matter density to illustrate the effect of assembly bias on gravitational lensing.The HOD model with no assembly bias is shown in cyan and the one with assembly bias is shown in red.The solid, dashed and dotted-dashed lines show all galaxies; centrals only; and satellites only.We note that the overall cross-power in the two models with and without assembly bias agree at all scales, whereas the contribution of centrals and satellites are very different in the two models.The bottom panel shows the ratio of the cross-power spectrum in the model with assembly bias to the one without assembly bias.information is included.This suggests that the satellite galaxies are less triaxially distributed than the dark matter in their respective host haloes (S flat → 0) while also not preferring a completely spherical distribution (S flat → 1).Future data sets might be able to constrain this parameter more precisely.S flat can potentially help probe the triaxiality of haloes in the presence of galaxy physics, providing interesting constraints on galaxy formation process by comparing with full hydrodynamical simulations.

Robustness against Assembly bias dependent covariance matrix
The covariance matrix is a key component in determining the posterior distributions of our parameters.Therefore, it is important to test if assembly bias has any impact on the covariance matrix, and whether that could reduce the significance of our assembly bias signal.Our default covariance matrix used in the analysis is estimated using mocks without any assembly bias.Therefore we generate a new set of mock catalogues with our best fit parameters including assembly bias.Figure 14 compares the correlation matrix used in the initial analysis with the one estimated using the best fit parameters.We observe significant differences in the structure of the correlation matrix, which will affect the likelihood evaluation as a function of the assembly bias parameters.Therefore, we re-run our analysis using the new co-variance matrix with the aim of understanding whether this can reduce the assembly bias signal by a significant amount.We then find a revised constraint on assembly bias parameters: (αsat = 1.13 +0.74 −0.02 , αcen = −0.80+0.01 −0.38 ).This gives much more asymmetric errors and actually increases the detection significance of the assembly bias signature by a significant amount compared to the fiducial covariance matrix.In principle, the ideal scenario would be to include the parameter dependence of the covariance matrix in the likelihood, but we consider this to be beyond the scope of current work, although it could be an interesting topic to explore further in the future.The main point for the present work is that the detection significance of assembly bias does not reduce while using a covariance matrix that allows for assembly bias.

SUMMARY AND DISCUSSION
We have carried out a parameterised search for quantities other than halo mass that can affect the properties of galaxies in the observed Universe.This is one of the fundamental questions facing the current frontier of cosmology in two seemingly different sub-fields: (a) Cosmology: Unprecedented precision in cosmological measurements has been achieved by studying the spatial distribution of galaxies, but most models are validated on simulations the mass of the host halo is assumed to be the only relevant parameter.If this is not the case then we might misinterpret the cosmological nature of the true Universe, possibly leading to biased conclusions regarding central issues such as the evolution of dark energy.
(b) Galaxy formation: Detailed galaxy formation models are highly non-trivial to solve numerically with current computing capabilities and hence must assume a number of 'sub-grid' approximations.A constraint on such beyond halo mass effects can enlighten us about the possible dominant terms shaping the evolution of galaxies and hence potentially improve the approximations made in hydrodynamical simulations.
In order to detect 'assembly bias' effects that go beyond a simple dependence on halo mass, we have extended the analysis presented in A21 in three ways.(1) We have further extended the model to have additional freedom of assembly bias introducing two additional parameters (αsat, αcen), which correlate the occupation number of central and satellite galaxies with the rank of the tidal anisotropy (α rank R ) environment (see equation 6).We also introduce a third parameter (S flat ) as given in equation 9, which allows the satellite distribution to vary between a spherical form to having the host halo's ellipsoidal shape.(2) We improved the conservative covariance matrix used in A21, by generating a number of simulated galaxy catalogues using the best-fit parameters obtained in the previous analysis and hence obtain a more accurate covariance matrix, with a smaller error compared to the previous analysis.(3) We include the measurements of the Voronoi Volume Function (VVF: PA20) as a way to include information beyond the two-point function, which is especially valuable for assembly bias parameters.
We first present results for four different analysis scenarios: (a) without VVF data and without assembly bias parameters; (b) without VVF data but with two free assembly bias parameters; (c) with VVF data and without assembly bias parameters; and (d) with VVF data and with two free assembly bias parameters.The constraints for these four scenarios are shown in Figures 7 and 8 along with Table 3.We can draw the following conclusions based on these results: (i) Inclusion of VVF information significantly improves the constraints on the assembly bias parameters (αsat, αcen) as well as some of the satellite degrees of freedom such as satellite velocity dispersion (γIHV) and satellite virial radius (fvir).It also marginally reduces the posterior volume for other parameters without affecting the central values.
(ii) Using VVF information along with two-point clustering, we obtained the following constraints on assembly bias parameters: αcen = −0.79+0.29  −0.11 and αsat = 1.44 +0.25 −0.43 .This is a 3.3σ detection of assembly bias for satellite (αsat) and a 2.4σ detection for central galaxies (αcen).Another way to view this is via a likelihood ratio: if we compare the two cases using VVF data but with and without assembly bias, we find ∆χ 2 = 14 for 2 additional degrees of freedom.This has a highly significant p-value of 0.0009.
(iii) Our measurement of the growth rate from the GAMA data is f σ8(z = 0.18) = 0.39 ± 0.02 or f σ8(z = 0.18) = 0.42 ± 0.02 for the models with and without assembly bias, respectively.The growth rate inferred in our analysis depends weakly on the assembly bias but consistently remains lower than the Planck prediction.We note that our constraint assumes a cosmological geometry and thus can only really be used as consistency test.It is interesting to note that our constraint on f σ8 if projected onto the Ωm-σ8 plane shows that the model without assembly bias gives results consistent with Planck, but with assembly bias included it then shows a strong deviation from Planck (see Figure 12).Note that this discrepancy with Planck will be smaller when all other cosmological parameters are varied.
(iv) We also studied the effect on the lensing power spectrum by estimating the cross-correlation of galaxies with dark matter particles.We showed that the overall lensing power spectrum for our best fit model with and without assembly bias is very similar, as shown in Figure 13, but the lensing properties of central and satellite galaxies appear very different.Therefore, simply adding lensing observables in our analysis will not add any additional overall sensitivity to assembly bias, but if it were possible to separate central and satellite galaxies in the observed data (such as galaxy groups: see Yang et al. 2005;Robotham et al. 2011;Tinker 2020;Yang et al. 2021) then lensing can potentially bring in additional information.
The detection of assembly bias implies that HOD modelling should henceforth depend on tidal environment in addition to halo mass.To illustrate this, in Figure 11 we visualised what our best fit assembly bias parameters mean for occupation numbers.We then highlighted that a potential systematic might arise from our assumption that satellite galaxies are distributed spherically around the host halo centre.To test this, we repeated our analysis using a parameter S flat , which allows satellite galaxies to be asymmetrically distributed, finding that this does not affect the inferred assembly bias signals (section 5.4).We ignored the possibility of velocity anisotropy in the satellite distribution (which can potentially be strongly correlated with the tidal anisotropy αR: see Ramakrishnan et al. 2019) and consider it as important ingredient for any future analysis.We also note that we have performed our analysis with a fixed cosmology, and it would be interesting to re-examine the detection of assembly bias when cosmology is also allowed to be free.
Finally, we noted that the covariance matrix of our observables can potentially depend on the assembly bias effect.We therefore repeated the exercise of estimating the covariance matrix by generating new realisations of mock galaxies using the best fit assembly bias parameter values.The two correlation matrices with and without assembly bias are compared in Figure 14.We found differences in the correlation structure; but importance sampling of our MCMC chains using the new covariance matrix indicated that these differences did not alter the measured assembly bias (see section 5.5).We also highlight that, for a small volume survey such as GAMA, a full parameter dependent covariance might be important to consider for any future work.
Ongoing surveys such as DESI (DESI Collaboration 2016) will produce a sample similar to GAMA but with an area ≈ 80 times larger.Therefore the assembly bias signatures that we have detected here are very likely to have an important impact on the inferred fundamental cosmological parameters from this dataset.In the future, surveys such as Euclid, WAVES, 4HS from the 4MOST consortium and PFS on Subaru will increase the statistical precision of the measurements still further, so that our models will need continual improvement in order to confront such samples.ship for Advanced Supercomputing in Europe (PRACE, www.prace-ri.eu)for funding the MultiDark simulation project by providing computing time on the GCS Supercomputer SuperMUC at Leibniz Supercomputing Centre (LRZ: www.lrz.de).

Figure 1 .
Figure 1.Visualisation of galaxies in a 3 h −1 Mpc × 140 h −1 Mpc × 140 h −1 Mpc slice, centred at the galaxy with the largest assigned Voronoi volume in the base model.The y and z axes are shown, while we project along the x axis.Circles indicate central galaxies, with colour indicating the host halo mass; the sizes of the circles are also scaled according to the host halo mass.The red stars indicate the satellite galaxies.The left panel is for the mass-only HOD model; the middle panel is for the model with large assembly bias (αcen = −10 and αsat = 0); and the right panel is for the model with a reasonable value of assembly bias that is currently allowed by data (αcen = −0.71and αsat = 1.44, see section 5.1).The main feature to see here is that models with assembly bias differ from the base model in that the void regions have fewer galaxies and also that filamentary regions (especially very close to massive objects) are more densely occupied.This is because the assembly bias model with αcen < 0 makes it less likely for the galaxies to reside in haloes with tidally isotropic environments, while making it more likely for haloes of similar mass in tidally anisotropic filamentary regions to be occupied.

Figure 2 .
Figure 2. Distribution of relative volume for galaxies, V (g)/⟨V ⟩, from three different simulated galaxy catalogues.The black line in the top panel is for a galaxy catalogue with a mass only HOD model; the blue line is for a model with large assembly bias (αcen = −10 and αsat = 0); and the red line is for a model that depends on both mass and tidal anisotropy (αcen = −0.71and αsat = 1.44 from Table3).The bottom panel shows the ratio of red and black lines, with an error estimated from jackknife sampling, reflecting the noise for a survey of volume [250 h −1 Mpc] 3 .

Figure 3 .
Figure 3.Comparison of VVF percentile and width for the three models, one with a mass only HOD (shown in black) and an alternative in which the occupation also depends on tidal anisotropy (shown in red).The blue points show the model with large assembly bias (αcen = −10 and αsat = 0).The right panel shows the ratio of the two models (red and black) with errors from jackknife resampling for a survey of volume [250 h −1 Mpc] 3

Figure 4 .
Figure 4. Correlation matrices estimated using three different methods.The left and middle panels shows the correlation matrix estimated using the mean of jackknifes of mocks for L300 and MDPL2 simulation and the rightmost panel shows the one estimated using the jackknife of data.The different diagonal blocks are indicated by the black dashed lines.The first blocks shows the VVF in order of σ VVF , y 2.5 , y 16 , y 50 , y 84 , y 97.5 and n.The second block shows the projected correlation function wp and the next three blocks show the multipoles (i.e ξ 0 , ξ 2 and ξ 4 ).

Figure 7 .
Figure 7.The two-dimensional and one-dimensional constraints on the base HOD parameters for the Mr < −19 galaxy sample in GAMA.The The solid and dashed contours of given colour represent the 1σ and 2σ confidence limits respectively.The magenta and cyan colours indicate the fit to only clustering (i.e.wp and ξ 0,2,4 ) with and without assembly bias parameters respectively.The red and blue colours indicate the fit to clustering and VVF statistics (i.e.wp, ξ 0,2,4 and VVF percentiles) with and without assembly bias parameters, respectively.In the two-dimensional contours we only show the Mcut − σ M plane to highlight the main effect; other parameters are consistent with the results in Figure 7 of A21

Figure 8 .
Figure8.The same as Figure7, but going beyond the base HOD parameters.The one-dimensional likelihoods for the two assembly bias parameters (αcen and αsat) do not show much preference for non-zero values in clustering only fits (magenta dashed line), but when including VVF in the fit αsat shows a strong preference for non-zero positive values and αcen for non-zero negative values.

Figure 9 .
Figure 9. Clustering measurements for Mr < −19 galaxy samples in GAMA, together with best fit models.The measurements are shown as points with error bars.We note that the first four bins plotted in the multipole plots are wedges.The best fit models are shown with lines as indicated in the legend.

Figure 10 .
Figure10.The distribution of dark matter haloes in M halo vs α R space.We note that the distribution of tidal anisotropy α R for massive haloes peaks rather sharply at low values, implying that massive haloes dominate their tidal environments, causing them to be isotropic.Conversely, the less massive haloes show a wide distribution of tidal environments.

Figure 14 .
Figure 14.Correlation matrices .The first blocks shows the VVF in order of σ VVF , y 2.5 , y 16 , y 50 , y 84 , y 97.5 and n.The second block shows the projected correlation function wp and the next three blocks shows the multipoles (i.e ξ 0 , ξ 2 and ξ 4 ).