-
PDF
- Split View
-
Views
-
Cite
Cite
Ginevra Favole, Antonio D Montero-Dorta, M Celeste Artale, Sergio Contreras, Idit Zehavi, Xiaoju Xu, Subhalo abundance matching through the lens of a hydrodynamical simulation, Monthly Notices of the Royal Astronomical Society, Volume 509, Issue 2, January 2022, Pages 1614–1625, https://doi.org/10.1093/mnras/stab3006
- Share Icon Share
ABSTRACT
We use the IllustrisTNG100 hydrodynamical simulation to study the dependence of the galaxy two-point correlation function on a broad range of secondary subhalo and galactic properties. We construct galaxy mock catalogues adopting a standard subhalo abundance matching scheme coupled with a secondary assignment between galaxy colour or specific star formation rate and the following subhalo properties: starvation redshift zstarve, concentration at infall, overdensity |$\delta _R^{\rm env}$|, tidal anisotropy αR, and tidal overdensity δR. The last two quantities allow us to fully characterize the tidal field of our subhaloes, acting as mediators between their internal and large-scale properties. The resulting mock catalogues overall return good agreement with the IllustrisTNG100 measurements. The accuracy of each model strongly depends on the correlation between the secondary galaxy and subhalo properties employed. Among all the subhalo proxies tested, we find that zstarve and cinfall are the ones that best trace the large-scale structure, producing robust clustering predictions for different samples of red/blue and quenched/star-forming galaxies.
1 INTRODUCTION
Connecting the properties of galaxies to those of dark-matter (DM) haloes is a fundamental step in the extraction of cosmological information from measurements of galaxy clustering (Wechsler & Tinker 2018) along with a necessary validation for theories of galaxy formation inside haloes. It is today accepted that the mass of the hosting halo correlates with the stellar mass of the galaxy located at its centre (Behroozi, Conroy & Wechsler 2010; Guo et al. 2010; Moster, Naab & White 2013; Matthee et al. 2017), with its size (Rodriguez et al. 2020), and with the total galaxy content or occupation of the halo (e.g. Berlind & Weinberg 2002; Zehavi et al. 2005; Artale et al. 2018; Bose et al. 2019; Hadzhiyska et al. 2020; Xu, Zehavi & Contreras 2021), to name but a few. These and other correlations are nothing but a measurable manifestation of the multiple physical processes that take place inside haloes, which shape the evolution of its baryonic content.
One of the main methods to perform the aforementioned connection is the subhalo abundance matching (SHAM) technique (Conroy, Wechsler & Kravtsov 2006; Behroozi et al. 2010; Trujillo-Gomez et al. 2011; Chaves-Montero et al. 2016; Favole et al. 2016a,2017; Guo et al. 2016a,b; Rodríguez-Torres et al. 2016, 2017; Contreras, Angulo & Zennaro 2021a, b; Hadzhiyska et al. 2021). In SHAM, galaxies from an observational (or synthetic) data set are linked to haloes from an N-body numerical simulation by matching their number densities, assuming a one-to-one correspondence between primary halo and galaxy properties. For haloes, either a halo mass or a velocity-related quantity (in their multiple forms) is assumed, on the basis that they are the main determinants of halo clustering. Analogously, either stellar mass or luminosity is chosen for the galaxy set, since they are easy to measure and are the properties that are known to correlate better with halo mass/velocity. To this simple prescription, a parametrized scatter in the halo–galaxy correspondence is added in order to account for the stochasticity in the way that galaxies populate haloes. The SHAM technique has been employed successfully in a number of clustering works across different redshifts and for several galaxy populations (Favole et al. 2016a, 2017; Guo et al. 2016a; Rodríguez-Torres et al. 2016, 2017; Granett et al. 2019; Jullo et al. 2019).
As mentioned above, the majority of the SHAM modelling is performed on the basis of a single halo/galaxy property. However, at fixed halo mass, the clustering of haloes is known to depend on secondary halo properties such as formation redshift (which encodes the assembly history of haloes), concentration, and spin (see e.g. Gao, Springel & White 2005; Wechsler et al. 2006; Angulo, Baugh & Lacey 2008; Dalal et al. 2008; Salcedo et al. 2018; Johnson et al. 2019; Sato-Polito et al. 2019; Mansfield & Kravtsov 2020; Tucci et al. 2021). This effect, broadly referred to as secondary halo bias, is expected to have a manifestation on the galaxy population (see discussion in e.g. Zhu et al. 2006; Miyatake et al. 2016; More et al. 2016; Montero-Dorta et al. 2017, 2020b; Niemiec et al. 2018; Zehavi et al. 2018; Obuljen, Percival & Dalal 2020; Salcedo et al. 2020), which has forced halo–galaxy connection models to adjust.
Additional dependences of galaxy and halo clustering have already been implemented into the SHAM formalism. Namely, in Hearin & Watson (2013), the SHAM modelling of the differential clustering of red and blue galaxies is performed by including a secondary dependence on the ‘starvation redshift’, related to the formation redshift of haloes. This incarnation of SHAM is called age matching, in reference to the fact that an additional matching at fixed halo mass (or maximum rotational velocity Vmax) is performed based on a halo-age-related quantity.
In the context of secondary halo bias, several works have attempted to provide an explanation for the physical mechanisms behind this convoluted set of trends (e.g. Dalal et al. 2008; Hahn et al. 2009; Borzyszkowski et al. 2017; Musso et al. 2018; Paranjape, Hahn & Sheth 2018; Ramakrishnan et al. 2019; Mansfield & Kravtsov 2020; Paranjape & Alam 2020; Ramakrishnan & Paranjape 2020; Zjupa et al. 2020; Tucci et al. 2021). As a result of these efforts, a common picture is starting to emerge, where halo assembly bias (the secondary dependence on halo accretion history) might be intimately connected to environmental processes that take place in the cosmic web. In essence, assembly bias might be the result of the truncation of accretion history in a population of smaller mass haloes, which could be more likely in certain environments (i.e. filaments) than others (nodes); see an illustrative description of these processes in Borzyszkowski et al. (2017) and Musso et al. (2018).
The cosmic web environment can be characterized in multiple ways. One simple option is to measure the halocentric density in spheres of a certain radius (i.e. ∼ 5 h−1 Mpc). Also informative (and in a sense complementary) is the tidal anisotropy parameter αR, which can be determined from the eigenvalues of the tidal tensor. The tidal tensor describes the gravitational effect exerted by the global distribution of matter around a point, so its anisotropy allows us to characterize the cosmic web: Large values of αR correspond to filaments and sheets, whereas smaller values are associated with regions where matter is accreted from all directions, i.e. nodes. Importantly, in a series of works (Paranjape et al. 2018; Ramakrishnan et al. 2019; Paranjape & Alam 2020; Ramakrishnan & Paranjape 2020), it has been recently claimed that assembly bias correlates directly with αR, which is likely a reflection of the environment-related truncation of accretion mentioned above (Hahn et al. 2009; Borzyszkowski et al. 2017; Musso et al. 2018).
Hydrodynamical simulations are laboratories for galaxy-formation physics, which are ideal to test halo–galaxy linking techniques, since both the DM and the baryonic components of haloes in detail. In this context, the recently released IllustrisTNG (Pillepich et al. 2018; Nelson et al. 2019) suite of hydrodynamical simulations offers some advantages, such as the large size of some of their boxes (up to a side length of 205 h−1 Mpc). IllustrisTNG has already shed light on to crucial aspects of the connection between galaxies and haloes. Montero-Dorta et al. (2020b) showed how secondary halo bias would manifest itself in the clustering of the central galaxy population when this is selected on the basis of several galaxy properties. Bose et al. (2019) and Hadzhiyska et al. (2020) addressed halo occupation in IllustrisTNG, demonstrating that the basic (mass-based) halo occupation distribution ansatz underpredicts the real-space correlation function in the largest IllustrisTNG box. They also discussed several ways of ‘augmenting’ the modelling by including secondary halo dependences, a work that was subsequently extended in Hadzhiyska et al. (2021). Contreras et al. (2021a) used IllustrisTNG to test a novel and flexible modification of SHAM where an arbitrary amount of assembly bias can be incorporated into the modelling.
The main goal of this paper is to build on previous efforts and revisit the SHAM method using IllustrisTNG. The philosophy behind our approach follows that of Hearin & Watson (2013), in that we are interested in modelling galaxy populations selected by colour and star formation rate by including both secondary dependences on halo and galaxy properties. We also test the inclusion of new physically motivated halo properties within the formalism, including the anisotropy tidal parameter αR and the density of the environment around haloes.
The paper is organized as follows. Section 2 provides a brief description of the TNG100 simulation data analysed in this work, including halo/galaxy properties (Section 2.1) and sample selection (Section 2.2). The environmental properties (DM density contrast and tidal-field measurements) are described in Section 2.3. The methodology used to measure galaxy clustering in TNG100 and to estimate the associated uncertainties is detailed in Section 3.1. Our SHAM implementation is described in Section 3.2. Section 4 presents the main results of our analysis: the correlations between secondary halo and galaxy properties (Section 4.1) and the clustering outcomes (Section 4.2). In Section 5, we summarize and discuss our findings.
The IllustrisTNG simulations adopt the standard Λ cold dark matter cosmology (Planck Collaboration XIII 2016), with parameters Ωm = 0.3089, Ωb = 0.0486, |$\Omega _\Lambda = 0.6911$|, and |$H_0 = 100~\, h\, {\rm km\, s^{-1}\,Mpc^{-1}}$| with h = 0.6774, σ8 = 0.8159, and ns = 0.9667.
2 SIMULATION DATA
In this paper, we use the magnetohydrodynamical cosmological simulations IllustrisTNG (Weinberger et al. 2017; Marinacci et al. 2018; Naiman et al. 2018; Nelson et al. 2018; Pillepich et al. 2018; Springel et al. 2018). The IllustrisTNG represent a revised version of the Illustris simulations (Genel et al. 2014; Vogelsberger et al. 2014a, b), run with the arepo moving-mesh code (Springel 2010). Among others, the updates on IllustrisTNG include magnetic fields, a revised scheme for galactic winds, and a new model for active galactic nucleus (AGN) feedback. The simulations include subgrid models that account for star formation, chemical enrichment from SNII, SNIa, and asymptotic giant branch stars, stellar feedback, radiative metal-line cooling, and AGN feedback.
The IllustrisTNG suite includes different sizes and resolutions. In this work, we made use of the IllustrisTNG100-1 (hereafter TNG100) run, and its DM-only counterpart IllustrisTNG100-1-DMO (hereafter TNG100-DMO). These are the second largest simulated boxes and with the highest resolution available on the data base.1 The TNG100 and TNG100-DMO represent a cubic box of side |$L_{\rm box}=75\, h^{-1}$| Mpc, with periodic boundary conditions. The TNG100 run follows the dynamical evolution of initially 18203 gas cells of mass |$9.4 \times 10^5 \, h^{-1} \, {\rm M_{\odot }}$|, and DM particles of mass |$5.1 \times 10^6 \, h^{-1} \, {\rm M_{\odot }}$|. TNG100-DMO was run with Np = 18203 DM particles of mass |$m_{\rm p}=6 \times 10^6 \, h^{-1} \, {\rm M_{\odot }}$|.
The choice of TNG100 is motivated by its high resolution, as compared to the larger TNG300 box (|$L_{\rm box}=205\, h^{-1}$| Mpc). We have in fact checked that although the TNG300 volume is beneficial in terms of the computation of clustering, some of the halo properties that we determine here can be severely affected by its lower resolution.
The TNG and TNG-DMO simulations have been already employed to investigate several features on the galaxy–DM halo connection (Bose et al. 2019; Contreras et al. 2021a; Gu et al. 2020; Hadzhiyska et al. 2020, 2021; Montero-Dorta et al. 2020a, b; Shi et al. 2020) proving that they are a suitable tool for the goal of this paper. In the next section, we describe the set of properties implemented.
2.1 Halo and galaxy properties
We use the galaxy and DM catalogues from TNG100 and TNG100-DMO available on the data base. The DM haloes (also referred to as ‘groups’ in TNG) are identified with a friends-of-friends algorithm using a linking length of 0.2 times the mean inter-particle separation (Davis et al. 1985), while the gravitationally bound substructures (also called ‘subhaloes’ in TNG) are identified using the SUBFIND algorithm (Springel, White & Hernquist 2001; Dolag et al. 2009). The TNG subhaloes can be either central or satellite structures.
In order to develop the SHAM modelling, we combine the subhalo properties from the TNG100 hydrodynamical simulation with those belonging to its DM-only counterpart, TNG100-DMO. In TNG100, subhaloes with non-zero stellar mass component are defined as galaxies. Each DM halo can contain a central galaxy and several satellites, depending on the size and mass of the halo. We use the following properties from the subhaloes of the TNG100 simulation:
M*(|$h^{-1} \, {\rm M_{\odot }}$|): stellar mass computed as the total mass of the stellar particles bound to each subhalo.
(g − i): intrinsic galaxy colour from the IllustrisTNG data base. We note that it does not include the attenuation produced by dust.
SFR (M⊙ yr−1): galaxy star formation rate. It is defined as the sum of the star formation rate of the gas cells in each subhalo.
sSFR (yr−1): specific star formation rate, computed as sSFR = SFR/M*.
From TNG100-DMO, the following properties are employed:
Vpeak (km s−1): maximum circular velocity estimated over the entire history of the subhalo.
Mvir (|$h^{-1} \, {\rm M_{\odot }}$|): virial mass computed as the total mass of DM particles2 within a radius Rvir, where the enclosed density equals 200 times the critical density.
Msubhalo (|$h^{-1} \, {\rm M_{\odot }}$|): subhalo mass, computed as the total number of DM particles times the mass of each individual particle in the subhalo.
zchar: characteristic redshift defined as the redshift at which the subhalo first reaches a mass of 1012 (|$h^{-1} \, {\rm M_{\odot }}$|). For haloes that never attain this mass, zchar = 0.
z1/2: formation redshift, computed as the redshift at which, for the first time, half of the halo mass at |$z$| = 0 has been accreted into a single subhalo.
zacc: accretion redshift defined as the redshift after which a subhalo always remains a subhalo. For parent haloes, zacc = 0.
- zstarve: starvation redshift defined as in Hearin & Watson (2013):(1)$$\begin{eqnarray*} z_{\rm starve} = {\rm Max}(z_{\rm 1/2},z_{\rm char},z_{\rm acc}). \end{eqnarray*}$$
- cinfall: concentration at the time of infall, i.e. when a halo falls within the virial radius of a larger halo, thus becoming a subhalo. As shown by Bullock et al. (2001) and Gao & White (2007), a good proxy for concentration is provided by the ratio between the maximum and virial circular velocities of the halo. We therefore define this asThe concentration at infall is preferable to the virial quantity as it is a better proxy for Vpeak, correctly tracing the halo–subhalo hierarchy.(2)$$\begin{eqnarray*} c_{\rm infall} = \frac{V_{\rm max}}{V_{\rm vir}}. \end{eqnarray*}$$
|$N_{\rm p}^{\rm {(halo)}}$|: number of particles per resolved halo. This corresponds to the ‘SubhaloLen’ TNG100-DMO property.
2.2 Sample selection
Our TNG100 galaxy and subhalo samples are selected by imposing two minimal cuts: |${\log (M_\star /{h^{-1}}\, \mathrm{M}_{\odot })}\,\gt \,8.75$|, |${\log (M_{\rm vir}/{h^{-1}}\, \mathrm{M}_{\odot })}\,\gt \,9.7$|. These conditions eliminate the low-mass end of the subhalo (galaxy) distribution, which is not interesting for the current analysis, making the final catalogue more manageable. They also contribute to discarding a fraction of DM subhaloes that are below the resolution limit we adopt in Section 2.3.2 to calculate the halo tidal properties. We further remove from the resulting sample all subhaloes with non-physical values of the concentration at infall, i.e. cinfall = 0 and ∞. These are spurious objects representing less than 0.2 per cent of the selection.

TNG100 stellar mass function and corresponding Schechter fit given in equation (3).
As Fig. 1 shows, the Schechter fit provides a good analytical description of the TNG100 stellar mass function. This model will be useful in the context of the SHAM prescription implemented in this work. The fit shown here is not smooth as it is the sum of three Schechter functions with parameters given in equation (4). This specific shape has been chosen to maximize the agreement with the TNG100 stellar mass function.
Another important element in our analysis is the distribution of galaxy colours in TNG100. The top panel in Fig. 2 shows the (g − i) colour distribution of the TNG100 galaxy sample with the characteristic blue (left-hand side) and red (right-hand side) peaks. In order to study the galaxy clustering dependence on colour, in the analysis we separate the two populations by imposing a (g − i) = 0.85 cut. Note that this cut is solely based on the TNG100 colour distribution of Fig. 2.

Top: TNG100 colour distribution. The vertical line at (g − i) = 0.85 denotes the cut that we apply to separate the red (right-hand side) from the blue (left-hand side) galaxy population. Bottom: sSFR distribution with the |$\rm log(sSFR/yr^{-1})=-10.7$| cut, which divides star-forming (right-hand side) from quenched (left-hand side) galaxies.
About 23 per cent of the TNG100 galaxies are quenched with |$\rm SFR=0$|. This produces a singularity in the |$\rm log(sSFR/yr^{-1})$| distribution, which hinders the analytical fitting and subsequent modelling shown in Section 3. To circumvent this problem, we randomly scatter the null SFRs using a Gaussian distribution with |$\sigma =4\times 10^{-4}\, ({\rm {M_{\odot }\, yr^{-1}}})$|. This specific value has been chosen so that it returns a quenched peak in the |$\rm log(sSFR/yr^{-1})$| distribution with no overlap with the star-forming one. The final sSFR distribution including the scatter is shown in the bottom panel of Fig. 2. We investigate the clustering dependence on sSFR by separating the quenched from the star-forming population at |$\rm log (sSFR/yr^{-1})=-10.7$|. This value is similar to previous demarcations employed in the context of IllustrisTNG (see e.g. Donnari et al. 2019).
2.3 Environmental properties
One of the main goals of this work is to test the inclusion of additional dependences of subhalo clustering into the SHAM ansatz. Motivated by recent secondary halo bias results (e.g. Paranjape et al. 2018; Ramakrishnan et al. 2019), we choose to focus on halo environment. In particular, we test different prescriptions of environment, based on both the subhalo occupancy number within a certain radius and the halo virial mass. In this section, we describe how to measure such environmental properties, which have been computed by using the TNG100-DMO simulation.
2.3.1 Subhalo overdensity
A fundamental quantity used to characterize the environment of a subhalo is its overdensity |$\delta _R^{\rm env}$|. This is computed as the number density of subhaloes within a sphere of radius R, normalized by the total number density of subhaloes in the box (e.g. Artale et al. 2018; Bose et al. 2019). The subhalo overdensity is also a biased tracer of the DM density contrast.
We compute such density by adopting periodic boundary conditions for three different radii, namely 3, 5, and 8 h−1 Mpc. This environmental property will be used as a secondary subhalo property in the analysis.
Fig. 3 shows the distribution of the subhalo overdensity at three different radii as a function Vpeak, colour-coded by zstarve. The first thing to notice is that, while low-Vpeak (typically low-mass) subhaloes live in all types of environments, there is a preference for higher-Vpeak (typically higher-mass) subhaloes to inhabit slightly denser environments. This trend is progressively washed out as we increase the radius. Regarding the secondary dependence on zstarve, Fig. 3 seems to indicate that higher-Vpeak subhaloes suffer the truncation of their accretion earlier (they are older), as compared to their lower-Vpeak counterparts. At fixed Vpeak, Fig. 3 does not display a visible correlation between zstarve and the subhalo overdensity.

Subhalo overdensity as a function of Vpeak, colour-coded with zstarve for the subhaloes in TNG100-DMO. From left to right, we show the results with a radius of 3, 5, and 8 h−1 Mpc. Here we show the average |$z$|starve value in 30 bins of Vpeak and subhalo overdensity.
The top panel in Fig. 4 displays the spatial distribution of the subhaloes in a TNG100-DMO slice |$10\, h^{-1}$| Mpc thick. The colour code represents the subhalo overdensity for a radius of 3 h−1 Mpc. Subhaloes located in knots and filaments are in redder colours, representing the densest regions of the cosmic web, while those in lower-density regions (i.e. voids) are in blue.

A slice of TNG100, 10 h−1 Mpc thick, colour-coded using the environment overdensity at 3 h−1 Mpc (top), the tidal anisotropy parameter αR (centre), and the tidal overdensity δR (bottom). We show only the low-mass subhaloes, i.e. M|$_{\rm sh}\lt 10^{12}\, h^{-1}$| M⊙. The big points in the bottom panel indicate the most massive (M|$_{\rm sh}\gt 10^{13}\, h^{-1}$| M⊙) knots.
2.3.2 Tidal environment
Following Paranjape et al. (2018), Ramakrishnan et al. (2019), and Zjupa et al. (2020), we smooth the gravitational potential using a range of 15 fixed Gaussian filters log-spaced in |$2\le R\le 5\, h^{-1}\, {\rm Mpc}$|. The minimum value of this range is safely above the lower resolution limit of the TNG100 subhalo size, |$R_{\rm res}=L_{\rm box}/N_{\rm g}^{1/3}=73\,h^{-1}$| kpc, and was chosen to maximize the contrast between filaments and knots in the αR map shown in the middle panel of Fig. 4. We then interpolate the potential in configuration space at each subhalo location (xsh, ysh, |$z$|sh) and the smoothing scale at each subhalo radius, Rsh, to create a subhalo-by-subhalo catalogue of tidal tensor estimates.
In the middle panel of Fig. 4, we present a slice of the TNG100-DMO simulation, 10 h−1 Mpc thick, colour-coded by the anisotropy parameter of the subhalo tidal field, αR. Here we show only the lower-mass subhaloes, i.e. M|$_{\rm sh}\lt 10^{12}\, h^{-1}$| M⊙. The filaments and sheets characterized by higher anisotropy values are shown in typically bluer colours. The knots, which are the densest and most isotropic regions of the large-scale structure (LSS), are highlighted in redder colours. We overplot as big points the most massive (i.e. M|$_{\rm sh}\gt 10^{13}\, h^{-1}$| M⊙) haloes. Despite the differences in the simulation, grid, and sample selection, our result is overall consistent with fig. 9 in Paranjape et al. (2018). The biggest discrepancy is that, in our case, the field is characterized by higher anisotropy values, which might be explained by the different configuration, resolution, and range of smoothing radii adopted.
In the bottom plot, we show the same map colour-coded using the tidal overdensity, δR, which traces particularly well the high-density regions around the knots. Blue and light-blue colours mark the higher-density filaments and sheets, whereas galaxies in the field are, as expected, associated with low values of δR.
3 METHODOLOGY
3.1 Galaxy clustering measurements
3.2 Multipopulation SHAM
The SHAM is a straightforward prescription used to connect galaxies with their host DM subhaloes based on the simple assumption that more massive (or luminous) galaxies reside in more massive subhaloes. The standard SHAM assignment is performed by rank-ordering galaxies and subhaloes according to specific primary properties and by matching their cumulative number densities. In this analysis, we adopt as properties the galaxy stellar mass and the subhalo maximum circular velocity over its entire history, Vpeak, which has been shown to perform better than other subhalo proxies (Contreras et al. 2021a; Hadzhiyska et al. 2020).
In order to ensure that our model is physically plausible, we allow for a constant Gaussian scatter σV in the M⋆–Vpeak relation. In practice, the procedure consists in randomly sampling galaxies from the TNG100 cumulative stellar mass function (see Fig. 1) convolved with a Gaussian probability distribution function (PDF) with fixed amplitude σ = 0.125 dex. This specific value was chosen to match the results obtained by Contreras et al. (2021a).
The aforementioned standard SHAM is extended in order to incorporate secondary subhalo and galaxy properties, following a similar methodology to that laid down by Hearin & Watson (2013). We call this methodology multipopulation SHAM. First, we divide the TNG100 galaxies into three bins of stellar mass designed to have a high enough number density: |$8.75\le {{\rm log}(M_{\star }/h^{-1}\, {\rm M}_{\odot })\,\lt\,9.3}$|, |$9.3\le {{\rm log}(M_{\star }/h^{-1}\, {\rm M}_{\odot })\,\lt\,10.0}$|, and |$10.0\le {{\rm log}(M_{\star }/h^{-1}\, {\rm M}_{\odot })\,\lt\,12.5}$|. In each bin, we define the conditional PDF for a galaxy with a given stellar mass to have a specific secondary property, |$P(X|M_{\star })$|, where X is the secondary property considered. The secondary properties we explore are (g − i) colour and sSFR.
Fig. 5 presents the TNG100 PDFs (solid black lines) in bins of stellar mass for each one of the galaxy secondary properties. Both the colour and sSFR distributions exhibit a clear bimodality with a red (quenched) and a blue (star-forming) peak. We will use these trends in the analysis to split the full sample into two populations with different colour/sSFR and to study the dependence of galaxy clustering on such properties.

Conditional probability distribution of the TNG100 galaxies (solid black lines) as a function of the secondary galactic properties in bins of stellar mass. The dashed magenta curves are the draws for the mocks. From top to bottom, we show the PDFs for the galaxy colour and sSFR.
The question is whether the secondary galaxy dependence introduced above can be accounted for by a secondary subhalo property. In order to test this, we implement the secondary matching through the following steps:
Fit the TNG100 PDFs using composite Gaussian functions and derive the analytic PDFs.
Split the mock catalogue obtained from the basic SHAM into the three stellar mass bins defined above and, in each bin, rank-order the mocks according to the secondary subhalo property we want to match. These are zstarve, cinfall, |$\delta _R^{\rm env}$|, αR, and δR.
For each mock galaxy, we draw a secondary galaxy property from the analytic PDFs defined above.
Rank-order the draws (dashed magenta lines in Fig. 5) and assign them to the mocks. In this way, the correlation between galaxy and subhalo secondary properties at fixed stellar mass is preserved.
4 RESULTS
4.1 Correlations between secondary properties
We study how the TNG100 galaxy and subhalo secondary properties correlate as a function of Vpeak. This quantity is fundamental for our analysis, as it will be used as the main proxy for the subhaloes in the SHAM assignment (Section 3.2).
Fig. 6 summarizes our findings, together with Table 1, where we report the correlation coefficients at fixed Vpeak. We observe a strong correlation between the galaxy colour (g − i) and the subhalo zstarve, with redder galaxies undergoing starvation at higher redshifts. A weaker correlation is observed between the sSFR (i.e. SFR per unit stellar mass) and zstarve. Here we see that starvation happens at lower (higher) redshifts for star-forming (quenched) galaxies.

Secondary properties of the TNG100 galaxies (i.e. colour and sSFR) as a function of Vpeak, colour-coded with the secondary halo properties (i.e. zstarve, cinfall, |$\delta _{\rm 3Mpc/h}^{\rm env}$|, αR, and δR). For the tidal properties, we consider only subhaloes above the resolution limit |$R_{\rm G,eff}^{ (4R_{\rm 200b})}\ge 81\,h^{-1}$| kpc (see Section 2.3.2). Here we show the average value of the secondary halo property in 30 bins of Vpeak and the secondary galaxy property.
Correlation coefficients between the secondary subhalo and galaxy properties at fixed Vpeak in the range 100–200 km s−1, where 68 per cent of the subhaloes lie.
. | (g − i) . | log (sSFR/yr−1) . |
---|---|---|
|$z$|starve | 0.56 | −0.36 |
cinfall | 0.49 | −0.30 |
|$1+\delta _{\rm 3Mpc/h}^{\rm env}$| | 0.46 | −0.26 |
αR | −0.22 | 0.11 |
1 + δR | 0.46 | −0.25 |
. | (g − i) . | log (sSFR/yr−1) . |
---|---|---|
|$z$|starve | 0.56 | −0.36 |
cinfall | 0.49 | −0.30 |
|$1+\delta _{\rm 3Mpc/h}^{\rm env}$| | 0.46 | −0.26 |
αR | −0.22 | 0.11 |
1 + δR | 0.46 | −0.25 |
Correlation coefficients between the secondary subhalo and galaxy properties at fixed Vpeak in the range 100–200 km s−1, where 68 per cent of the subhaloes lie.
. | (g − i) . | log (sSFR/yr−1) . |
---|---|---|
|$z$|starve | 0.56 | −0.36 |
cinfall | 0.49 | −0.30 |
|$1+\delta _{\rm 3Mpc/h}^{\rm env}$| | 0.46 | −0.26 |
αR | −0.22 | 0.11 |
1 + δR | 0.46 | −0.25 |
. | (g − i) . | log (sSFR/yr−1) . |
---|---|---|
|$z$|starve | 0.56 | −0.36 |
cinfall | 0.49 | −0.30 |
|$1+\delta _{\rm 3Mpc/h}^{\rm env}$| | 0.46 | −0.26 |
αR | −0.22 | 0.11 |
1 + δR | 0.46 | −0.25 |
A good correlation is observed also between the galaxy colour and the subhalo concentration at infall, with reddest galaxies in the low-Vpeak end showing the highest concentrations. A similar but weaker trend is also found between sSFR and concentration in the low-Vpeak regime.
A similar correlation, lower than in the previous cases, is appreciable between the galaxy colour and both halo overdensities |$\delta _{\rm 3Mpc/h}^{\rm env}$| and δR, confirming that redder galaxies preferentially occupy densest regions of the cosmic web. This trend is progressively washed out as we increase the radius R at which |$\delta _R^{\rm env}$| is computed. A weaker correlation is observed between sSFR and both subhalo overdensities.
Between the explored subhalo properties, the tidal anisotropy αR is the one that correlates the least with both the galaxy colour and the sSFR. Some trend is visible, but weaker compared to the rest of subhalo proxies. Note that due to the stochasticity in the galaxy-formation process and in the connection between haloes and their LSS environments, a very high level of correlation between the properties under analysis is, of course, not expected. We are, however, looking for signs that could eventually lead to further refinements of the SHAM procedure.
Our results tell us that, overall, there is a tendency for quenched, redder galaxies to inhabit the denser regions in the cosmic web, which are usually more isotropic. The correlations observed between the TNG100 galaxy and subhalo secondary properties, even if mild, will play a crucial role in our SHAM modelling, as they will be used as drivers to correctly draw the secondary properties of the mock galaxies.
4.2 Galaxy clustering results
In what follows, we present the 2PCFs of the TNG100 galaxies in bins of stellar mass modelled using the secondary matching explained in Section 3.2 on top of a standard SHAM based on M⋆ and Vpeak.
Fig. 7 presents the 2PCF results with secondary matching in colour. In the top row, we show the clustering of the full TNG100 galaxy sample (points) modelled using standard SHAM (lines) and, in the rest of panels, the results for the subsamples separated in colour and modelled with a secondary matching with the following subhalo properties: starvation redshift zstarve, concentration at infall cinfall, subhalo overdensity |$\delta _{\rm 3Mpc/h}^{\rm env}$|, tidal anisotropy αR, and tidal overdensity δR.

2PCFs of the TNG100 galaxies (points) and mocks (lines) in three stellar mass bins (columns). In the top row, we show the full galaxy population in each mass bin against the mock resulting from basic SHAM. In the other panels, we show the results for the red/blue galaxy samples and the SHAM including secondary matching performed between the galaxy colour and the halo properties indicated in the first panel of each row. From top to bottom, we show zstarve, cinfall, |$\delta _{\rm 3Mpc/h}^{\rm env}$|, αR, and δR. The shaded error bars are estimated performing 27 jackknife resamplings in TNG100 data.
Even if apparently mild, the correlations observed in Fig. 6, between galaxy and subhalo secondary properties, return good agreement in the clustering amplitude of the red/blue TNG100 populations and our mock catalogues over the entire stellar mass range. In all the cases explored, we are able to recover a clear separation in the amplitude of the red and blue models, consistent with the TNG100 fiducial data sets.
It is noteworthy that in the lowest M⋆ bin, all the full red and blue models underestimate the TNG100 small-scale (|$r\lesssim 0.2\, h^{-1}$| Mpc) clustering amplitude, due to a lack of satellite subhaloes. This effect is also present, even if reduced, in the intermediate-mass bin, in particular in the full and red mock catalogues.
Those subhalo secondary properties more tightly correlated with the galaxy colour and sSFR, e.g. |$z$|starve, cinfall, and δR, perform better in reproducing the 2PCF separation. The tidal anisotropy αR behaves very well in reproducing the clustering of the blue population, which is the densest one, in all three mass bins. The red model performs less well, highlighting the transition between the one- and two-halo term with a pronounced bump and, on larger scales, its fluctuations denote a lack of massive objects. Overall, both subhalo tidal properties demonstrate to be valuable tracers of the LSS. Interestingly, the SHAM correlation functions cross over on large scales for the highest mass bin, where bluer objects are predicted to be more tightly clustered than redder objects. This intriguing result will be investigated in more depth in follow-up work.
It is not surprising that the clustering models based on |$\delta _R^{\rm env}$| are similar but not identical to those based on δR, as the two subhalo overdensities are defined in different ways. In fact, while |$\delta _R^{\rm env}$| is obtained by counting subhaloes within a fixed smoothing scale (we tested the values |$R=3,5,\ {\rm and} \ 8\, h^{-1}$| Mpc), the tidal quantity δR is inferred by diagonalizing the subhalo tidal tensor interpolated at the subhalo positions, and smoothed at the Gaussian equivalent of 4R200 for each subhalo (see Section 2.3.2).
Fig. 8 presents the TNG100 clustering results as a function of sSFR. Also in this case, the subhalo secondary properties that correlate the most with sSFR, and hence return the best clustering models, are zstarve, cinfall, and δR. Overall, the subhalo properties perform equally well when coupled to galaxy colour or sSFR.

Same results as in Fig. 7, but with sSFR as a secondary galaxy property.
5 SUMMARY
In order to model the clustering of multiple populations of galaxies, the standard SHAM prescription needs to be modified to account for the different distributions of their galaxy properties at fixed stellar mass. In this work, we use the TNG100 hydrodynamical simulation to test the SHAM performance in bins of stellar mass as a function of different galaxy and subhalo properties.
First, we have implemented a standard SHAM on the TNG100 galaxy population split into three stellar mass bins using M⋆ and Vpeak as primary proxies for galaxies and subhaloes, respectively. We have chosen Vpeak, that is, the maximum circular velocity over the entire history of the subhalo, because it is proven to perform better as a subhalo proxy in SHAM compared to the halo maximum circular velocity or the infall velocity (Chaves-Montero et al. 2016; Hadzhiyska et al. 2020).
In each stellar mass bin, we have divided the TNG100 galaxies into two colour (red/blue) and sSFR (quenched/star-forming) subsamples. For these subsamples, we have measured the clustering and modelled the results by implementing a decorated SHAM assignment capable of coupling secondary galactic and subhalo properties. This secondary matching is an extension of the age matching prescription by Hearin & Watson (2013), but including several other secondary properties: galaxy sSFR, halo cinfall, |$\delta _{3,5,8{\rm Mpc/h}}^{\rm env}$|, αR, and δR.
As a result, we have overall found good agreement between our mocks and the TNG100 observations. In particular, the accuracy of the models depends on the galaxy and subhalo secondary properties adopted for the matching. We summarize our findings as follows:
Among the secondary subhalo properties studied, at fixed stellar mass, we find that the starvation redshift zstarve and the concentration at infall cinfall qualitatively provide the best clustering results. Our zstarve outcome confirms previous results from Hearin & Watson (2013).
Other physically motivated subhalo properties, such as the tidal overdensity δR and the subhalo overdensity |$\delta _R^{\rm env}$|, perform also well, with slightly larger deviations from the fiducial data set. The tidal anisotropy αR performs well in reproducing the subpopulations with higher number density, while for the lower-density sample, the disagreement with TNG100 is more pronounced.
The accuracy of our clustering models obtained through secondary matching improves when the secondary subhalo and galactic properties are tightly correlated.
Although we find interesting signs of correlation, it is still unclear in light of our results how the tidal anisotropy αR can be efficiently introduced into a SHAM modelling scheme.
The decorated SHAM presented in this work enables robust clustering predictions for different samples of red/blue and quenched/star-forming galaxies. Continuing the development of this methodology is particularly relevant for next-generation surveys, such as DESI4 or Euclid,5 which will collect samples of hundreds of millions of galaxies at high redshifts. Different galaxy populations act as different tracers of the LSS, hence the importance of adapting our techniques to the new era of multitracer cosmology. Our model can be easily extended to match other galaxy/subhalo properties to achieve a more complete vision of the LSS dynamics.
ACKNOWLEDGEMENTS
The authors are thankful to the referee, A. Paranjape, for insightful comments that have been determinant to improve this analysis.
GF acknowledges financial support from the SNF 175751 ‘Cosmology with 3D Maps of the Universe’ research grant. AMD thanks FAPESP and Fondecyt (Fondecyt Regular 2021 grant 1210612) for financial support. MCA acknowledges financial support from the Austrian National Science Foundation through FWF stand-alone grant P31154-N27. SC acknowledges the support of the ‘Juan de la Cierva Formacion’ fellowship (FJCI-2017-33816) and ERC Starting Grant number 716151 (BACCO). IZ acknowledges support by NSF grant AST-1612085.
DATA AVAILABILITY
This work made use of the IllustrisTNG data base available at www.tng-project.org.
Footnotes
For the TNG100 simulation, which includes baryons, the virial mass is computed accounting for all the total mass, i.e. DM, gas, and stars.