Effectiveness of halo and galaxy properties in reducing the scatter in the stellar-to-halo mass relation

The stellar-to-halo mass relation (SHMR) is a fundamental relationship between galaxies and their host dark matter haloes. In this study, we examine the scatter in this relation for primary galaxies in the semi-analytic L-Galaxies model and two cosmological hydrodynamical simulations, \eagle{} and \tng{}. We find that in low-mass haloes, more massive galaxies tend to reside in haloes with higher concentration, earlier formation time, greater environmental density, and earlier major mergers, and to have older stellar populations, consistent with findings in various studies. Quantitative analysis reveals that the significance of halo and galaxy properties in determining the SHMR scatter varies across simulations and models. In \eagle{} and \tng{}, halo concentration and formation time primarily influence the SHMR scatter for haloes with $M_{\rm h}<10^{12}~\rm M_\odot$, but their influence diminishes at high mass. Baryonic processes play a more significant role in \lgal{}. For haloes with $M_{\rm h}<10^{11}~\rm M_\odot$ and $10^{12}~\rm M_\odot<M_{\rm h}<10^{13}~\rm M_\odot$, the main drivers of scatter are galaxy SFR and age. In the $10^{11.5}~\rm M_\odot<M_{\rm h}<10^{12}~\rm M_\odot$ range, halo concentration and formation time are the primary factors. For haloes with $M_{\rm h}>10^{13}~\rm M_\odot$, supermassive black hole mass becomes more important. Interestingly, we find that AGN feedback may increase the amplitude of the scatter and decrease the dependence on halo properties at high masses.


INTRODUCTION
In the standard ΛCDM cosmological framework, galaxies form via the condensation of gas that has cooled into the potential wells of their host haloes (White & Rees 1978). Consequently, various properties of a galaxy, such as its stellar mass, luminosity, or star formation rate, are expected to correlate significantly with the characteristics of the host dark matter halo.
One of the most fundamental galaxy-halo connections is the stellar-to-halo mass relation (hereafter SHMR), which connects the galaxy stellar mass to the virialized dark matter halo mass. This relation implies that the stellar mass of a galaxy is predominantly determined by its host halo mass. Numerous empirical observations and simulations substantiate this proposition (e.g. Zheng et al. 2007; Guo et al. 2010; Wang & Jing 2010; Yang et al. 2012; Moster et al. 2013; Behroozi et al. 2019; Zhu et al. 2020; Girelli et al. 2020; Niemiec et al. 2022; Shuntov et al. 2022).
Nevertheless, the SHMR exhibits non-negligible scatter at a given halo mass, particularly for haloes with lower masses (e.g. Moster et al. 2013; Behroozi et al. 2013; Rodríguez-Puebla et al. 2015; Matthee et al. 2017; Kulier et al. 2019; Cui et al. 2021). This means that halo mass alone cannot accurately predict the stellar mass, and suggests the existence of secondary properties that might contribute to (at least part of) the scatter in the SHMR, such as halo formation time (e.g. Matthee et al. 2017).
★ E-mail: guoqi@nao.cas.cn
The objective of this study is to explore the secondary properties contributing to the scatter of the SHMR. Many studies have focused on two closely related properties, halo concentration and formation time (Wechsler et al. 2002; Zhao et al. 2009; Jeeson-Daniel et al. 2011; Ludlow et al. 2014; Correa et al. 2015). Haloes with high concentration have deep potential wells that can retain more gas, thereby promoting star formation in their associated galaxies. Additionally, haloes that formed earlier have had more time for accretion and star formation. For example, Matthee et al. (2017) found that in the EAGLE simulation (Schaye et al. 2015), concentration and halo formation time contribute substantially to the scatter in the SHMR, particularly in the low halo mass regime. At fixed halo mass, haloes characterized by earlier formation times and higher concentrations tend to host more massive galaxies. The impact of halo concentration and formation time on the scatter around the SHMR is also supported by other studies using either semi-analytic models (SAM) (Wang et al. 2013; Tojeiro et al. 2017; Zehavi et al. 2018; Lyu et al. 2023) or hydrodynamical simulations (Artale et al. 2018; Cui et al. 2021). Some works find that other properties related to the formation time may also modulate the SHMR. One is the environment (Tojeiro et al. 2017; Zehavi et al. 2018; Zhang et al. 2021b), with denser environments hosting more massive galaxies at fixed halo mass, especially in low-mass haloes. Another is the relaxation status (Golden-Marx & Miller 2018), with a larger magnitude gap between the brightest central galaxy (BCG) and its brightest neighbours corresponding to a more massive BCG at fixed halo mass. On the other hand, Matthee et al. (2017) found that halo properties unrelated to halo concentration and formation time (e.g., spin, sphericity, triaxiality, and substructure) have negligible impact on the scatter around the SHMR.
In addition to the halo properties, galaxy properties that are sensitive to certain formation processes could also affect the scatter of the SHMR. For example, the SHMR for star-forming (blue) and passive (red) galaxies could be different, though a consensus has yet to be reached (see Section 6.1 of Wechsler & Tinker 2018, for a review). Several studies have found that at fixed halo mass, red galaxies tend to have larger stellar mass (Lim et al. 2016; Zu & Mandelbaum 2016; Moster et al. 2018; Cowley et al. 2019; Scholz-Díaz et al. 2022), while other investigations have suggested that blue galaxies are more massive (More et al. 2011; Wojtak & Mamon 2013; Rodríguez-Puebla et al. 2015; Mandelbaum et al. 2016; Behroozi et al. 2019; Correa & Schaye 2020; Cui et al. 2021). Hudson et al. (2015) and Taylor et al. (2020) have instead argued that no clear relationship exists between colour and the scatter in the SHMR. Other dependencies of the scatter around the SHMR have also been proposed, such as galaxy age (Kulier et al. 2019; Scholz-Díaz et al. 2022; Oyarzún et al. 2022), metallicity (Scholz-Díaz et al. 2022; Oyarzún et al. 2022), cold gas (Cui et al. 2021), and morphology (Taylor et al. 2020; Correa & Schaye 2020). However, there is still no conclusive agreement on these dependencies. The challenges in reaching definitive conclusions are twofold. Observationally, the direct measurement of halo mass remains difficult, and different observational efforts typically rely on distinct halo mass estimation techniques, each with its own caveats and limitations (e.g., weak lensing (Zu & Mandelbaum 2016), satellite kinematics (More et al. 2011), and group and cluster catalogues (Correa & Schaye 2020; Scholz-Díaz et al. 2022)). Theoretically, the treatment of sub-grid baryonic physics within numerical simulations is nontrivial, and its implementation can significantly affect the interaction between galaxies and haloes.
Although many efforts have been made to study the link between stellar mass and various halo and galaxy properties, a quantitative evaluation and comparison of their impact on reducing the scatter around the SHMR is lacking. Most previous analyses of this kind focused solely on halo properties (Matthee et al. 2017; Martizzi et al. 2020; Bradshaw et al. 2020). Here, we not only analyse the general trends between stellar mass and different halo and galaxy properties but also evaluate and compare how effectively these properties reduce the scatter around the SHMR. Additionally, we examine the discrepancies among the advanced semi-analytic model L-Galaxies (Henriques et al. 2015) and two cosmological hydrodynamical simulations, the Evolution and Assembly of Galaxies and their Environments (EAGLE, Schaye et al. 2015; McAlpine et al. 2016) and The Next Generation Illustris Simulations (IllustrisTNG, Pillepich et al. 2018b; Nelson et al. 2018; Naiman et al. 2018; Springel et al. 2018; Marinacci et al. 2018). The analysis focuses on the influence of different halo and galaxy characteristics such as halo concentration, formation time, major halo mergers, star formation rate, black hole mass, cold gas mass, and others. Specifically, we examine the impact of AGN feedback, which is crucial for regulating galaxy formation at high masses, by running L-Galaxies without AGN feedback on N-body cosmological simulations to evaluate its effect on the SHMR. We employ traditional statistical techniques and machine learning to systematically investigate the correlation coefficients between the scatter in the SHMR and halo/galaxy properties.
This paper is organized as follows. Section 2 reviews the simulations used in this work and describes our sample selection; Section 3 presents our results on the dependence of the scatter in the SHMR on various halo/galaxy properties; Section 4 discusses the influence of AGN feedback; we conclude with a summary and discussion in Section 5.

Table 1. Simulations and sample selection. The columns from left to right represent: the name of the simulation, the number of central galaxies, halo mass cut, dark matter particle mass, and baryonic particle mass.
Simulation   Central galaxies   Halo mass cut   DM particle mass       Baryonic particle mass
MR           531,953            $10^{12}$       $1.4 \times 10^{9}$    -
MRII         74,489             $10^{10.6}$     $1.14 \times 10^{7}$   -
EAGLE        21,387             $10^{10.6}$     $1.81 \times 10^{8}$   $9.7 \times 10^{6}$
TNG100-1     31,990             $10^{10.6}$     $1.4 \times 10^{8}$    $7.5 \times 10^{6}$

SIMULATION DATA AND METHODS
We employ a semi-analytic galaxy catalogue and two cosmological hydrodynamical simulations, EAGLE (Schaye et al. 2015) and TNG100-1 (Weinberger et al. 2017), to explore the origin of the scatter in the SHMR. The semi-analytic galaxy catalogue is generated by running the galaxy formation model L-Galaxies (Henriques et al. 2015) on merger trees extracted from two large dark-matter, gravity-only N-body cosmological simulations, the Millennium Simulation (MR; Springel et al. 2005) and the Millennium-II Simulation (MRII; Boylan-Kolchin et al. 2009). A summary of the simulations is provided in Table 1. In both semi-analytic and hydrodynamical models, gas condenses in the centres of hierarchically assembled dark matter haloes through shock heating and cooling. The mass exchange between stellar components and condensed gas is regulated by star formation and various feedback processes. In general, galaxies can grow through in situ star formation and mergers. Star formation has been shown to be the primary growth mechanism in most galaxies except for the very massive ones, where star formation is suppressed by AGN feedback. For the most massive galaxies, mergers and merger-induced starbursts may be the sole channel of galaxy growth.
The galaxy properties explored here are closely linked to SFR, cold gas, and AGN feedback. We therefore summarize the simulations and their treatment of star formation and AGN feedback below.

L-Galaxies semi-analytic model
The semi-analytic model of galaxy formation, L-Galaxies, was initially introduced in Springel et al. (2001) and subsequently refined in a series of versions (De Lucia et al. 2004; Croton et al. 2006; Guo et al. 2011). These models share a similar philosophy and include physical prescriptions for baryonic processes such as shock heating, gas cooling, star formation, supernova feedback, formation and growth of supermassive black holes, AGN feedback, metal enrichment, etc. In this study, we adopt the model developed in Henriques et al. (2015), which introduced modifications to the treatment of the reincorporation of wind ejecta, star formation thresholds, environmental stripping, and radio-mode feedback. The model is calibrated against the observed galaxy stellar mass function and passive fractions of galaxies at redshift $0 \le z \le 3$. It is run on the merger trees extracted from the Millennium Simulation (MR; Springel et al. 2005) and the Millennium-II Simulation (MRII; Boylan-Kolchin et al. 2009). Both MR and MRII trace $2160^3$ dark matter particles from redshift ∼56.4 to 0. The MR and MRII were carried out in periodic boxes of 480.279 Mpc/$h$ and 96.0558 Mpc/$h$ on a side after rescaling, respectively. The corresponding dark matter particle masses are $9.61 \times 10^{8}~{\rm M_\odot}/h$ and $7.69 \times 10^{6}~{\rm M_\odot}/h$. Details about the simulations and semi-analytic models can be found in Henriques et al. (2015) and references therein.
In L-Galaxies, there are two modes of star formation: the quiescent mode and the burst mode. In the quiescent mode, where no merger occurs, the star formation rate is proportional to the cold gas mass once its surface density surpasses a specific threshold,
$$\dot{M}_* = \alpha_{\rm SF}\,\frac{M_{\rm gas} - M_{\rm crit}}{t_{\rm dyn,disk}}, \tag{1}$$
where $\alpha_{\rm SF} = 0.030$ is a free parameter, $M_{\rm gas}$ is the total mass of cold gas, and $t_{\rm dyn,disk}$ is the dynamical time of the disk. $M_{\rm crit}$ is a threshold mass converted from the critical surface density to form stars (Kauffmann 1996):
$$M_{\rm crit} = M_{\rm crit,0}\left(\frac{V_{\rm 200c}}{200~{\rm km\,s^{-1}}}\right)\left(\frac{R_{\rm gas}}{10~{\rm kpc}}\right), \tag{2}$$
where $M_{\rm crit,0} = 0.24 \times 10^{10}~{\rm M_\odot\,pc^{-2}}$ is another free parameter, $V_{\rm 200c}$ is the virial velocity of the halo, and $R_{\rm gas}$ is the cold gas disk radius. In the burst mode, where a merger occurs, the stellar mass formed is determined by the "collisional starburst" formulation (Somerville et al. 2001):
$$M_{\rm *,burst} = \alpha_{\rm SF,burst}\left(\frac{M_1}{M_2}\right)^{\beta_{\rm SF,burst}} M_{\rm cold}, \tag{3}$$
where $M_1 < M_2$ are the baryonic masses of the two galaxies, and $M_{\rm cold}$ is the total cold gas mass. $\alpha_{\rm SF,burst} = 0.72$ and $\beta_{\rm SF,burst} = 2.0$ are two free parameters.
AGN feedback plays a crucial role in suppressing star formation in massive galaxies, influencing the shape of the SHMR in the high-mass regime. In the L-Galaxies model (Henriques et al. 2015), a black hole of zero mass is seeded when a halo forms. The black hole can then grow through two modes: Quasar mode and Radio mode. The former, the primary SMBH growth mechanism, occurs during galaxy mergers, where violent merger processes funnel a substantial amount of cold gas into the black hole (Zhang et al. 2021a). In contrast, the Radio mode refers to Bondi accretion (Bondi & Hoyle 1944) of hot gas from the host halo. The mass accretion rate is determined by both the SMBH mass and the hot gas mass:
$$\dot{M}_{\rm BH} = k_{\rm AGN}\left(\frac{M_{\rm hot}}{10^{11}~{\rm M_\odot}}\right)\left(\frac{M_{\rm BH}}{10^{8}~{\rm M_\odot}}\right), \tag{4}$$
where $k_{\rm AGN} = 5.3 \times 10^{-3}$, and $M_{\rm hot}$ and $M_{\rm BH}$ are the hot gas mass and the central SMBH mass, respectively. Although Radio mode accretion contributes minimally to the final BH mass, it is assumed to be strongly correlated with the AGN feedback. The thermal energy from AGN feedback is given by:
$$\dot{E}_{\rm radio} = \eta\,\dot{M}_{\rm BH}\,c^2, \tag{5}$$
where $\eta = 0.1$ is the efficiency parameter in the fiducial model and $c$ is the speed of light. The feedback prevents gas from cooling onto the galaxies and consequently suppresses further star formation.
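As an illustration of the recipes described above, the following minimal Python sketch evaluates the quiescent-mode star formation rate and the Radio-mode accretion rate and feedback power. The functional forms follow our reading of Henriques et al. (2015); the function names, the normalisation masses ($10^{11}$ and $10^{8}~{\rm M_\odot}$), and the unit conventions are assumptions of the sketch rather than part of the L-Galaxies code.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def quiescent_sfr(m_gas, m_crit, t_dyn_disk, alpha_sf=0.030):
    """Quiescent-mode SFR: efficiency times the cold gas mass above the
    threshold mass, per disk dynamical time. Zero below the threshold."""
    return alpha_sf * max(m_gas - m_crit, 0.0) / t_dyn_disk


def radio_mode_accretion(m_hot, m_bh, k_agn=5.3e-3,
                         m_hot_norm=1e11, m_bh_norm=1e8):
    """Radio-mode accretion rate, scaling with hot gas mass and BH mass.
    The normalisation masses are assumptions of this sketch."""
    return k_agn * (m_hot / m_hot_norm) * (m_bh / m_bh_norm)


def radio_mode_power(mdot_bh, eta=0.1):
    """Feedback power eta * Mdot * c^2, in units of [Mdot] * (km/s)^2."""
    return eta * mdot_bh * C_KM_S ** 2
```

For example, `quiescent_sfr(3e9, 1e9, 1e8)` gives the rate for a disk holding three times its threshold gas mass, while a disk below threshold returns zero.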
To examine the impact of AGN feedback, we rerun the model presented in Henriques et al. (2015) on both the MR and MRII simulations with AGN feedback switched off; we refer to this run as LGalw/oAGN. The run with the default L-Galaxies model is referred to as LGal. Note that due to the limited box size of MRII and the lower resolution of MR, we choose haloes of mass $[10^{10.6}, 10^{12}]~{\rm M_\odot}$ from MRII and $[10^{12}, \infty)~{\rm M_\odot}$ from MR for the analysis.
LGal and LGalw/oAGN refer to galaxy catalogues on these combined halo merger trees, unless stated otherwise. The galaxy formation model parameters are adjusted to reproduce the stellar mass function from $z = 0$ to 3, both at high masses based on MR and at low masses based on MRII. However, it should be noted that the galaxy formation model and parameters are not calibrated against the scatter around the SHMR (see Section 2.4 for details).

EAGLE simulation
The EAGLE project comprises a suite of hydrodynamical simulations designed to track the formation of galaxies and supermassive black holes, following the evolution of gas, stars and dark matter in cosmologically representative volumes within a standard ΛCDM universe (Schaye et al. 2015; McAlpine et al. 2016). These simulations used a modified version of the N-body Tree-PM smoothed particle hydrodynamics (SPH) code GADGET-3 (Springel 2005). The sub-grid physics includes element-by-element radiative cooling for 11 elements, star formation, stellar mass loss, energy feedback from star formation, gas accretion onto and mergers of supermassive black holes, and AGN feedback. These sub-grid models were calibrated to reproduce the stellar mass function, galaxy sizes, and the relation between galaxy stellar masses and supermassive black hole masses at the present day. A more detailed description of the implemented baryonic processes in EAGLE is provided in Schaye et al. (2015).
In the EAGLE simulation, the star formation rate is determined by pressure rather than density, following the equation:
$$\dot{m}_* = m_{\rm g}\,A\,(1~{\rm M_\odot\,pc^{-2}})^{-n}\left(\frac{\gamma}{G} f_{\rm g} P\right)^{(n-1)/2}, \tag{6}$$
where $m_{\rm g}$ is the mass of a gas particle, $\gamma = 5/3$ is the ratio of specific heats, $G$ is the gravitational constant, $f_{\rm g}$ is the mass fraction in gas, and $P$ is the total pressure. $A = 1.515 \times 10^{-4}~{\rm M_\odot\,yr^{-1}\,kpc^{-2}}$ and $n = 1.4$ are two free parameters, while for $n_{\rm H} > 10^{3}~{\rm cm^{-3}}$, $n = 2$. The probability that a gas particle is converted to a star particle during a time step $\Delta t$ is given by $\min(\dot{m}_* \Delta t / m_{\rm g}, 1)$. Additionally, a metallicity-dependent density threshold and a temperature threshold for gas particles are imposed to regulate star formation.
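The stochastic conversion of gas particles into star particles described above can be sketched in a few lines of Python. This is only an illustration of the stated probability rule; the function names and the use of Python's `random` module are assumptions of the sketch, not the EAGLE implementation.

```python
import random


def conversion_probability(mdot_star, dt, m_gas):
    """Probability that a gas particle becomes a star particle in one step:
    min(mdot_star * dt / m_gas, 1)."""
    return min(mdot_star * dt / m_gas, 1.0)


def converts_to_star(mdot_star, dt, m_gas, rng=random):
    """Draw a uniform random number and compare it with the probability."""
    return rng.random() < conversion_probability(mdot_star, dt, m_gas)
```

Averaged over many particles, the expected stellar mass formed per step is then `mdot_star * dt`, as long as the probability stays below one.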
In the EAGLE model, a black hole with seed mass $m_{\rm BH,seed} = 10^{5}~{\rm M_\odot}/h$ is placed at the centre of the host halo when its mass exceeds $10^{10}~{\rm M_\odot}/h$ for the first time. BH growth can occur through BH-BH mergers and gas accretion. The gas accretion rate is given by:
$$\dot{m}_{\rm acc} = \min\left(\dot{m}_{\rm Edd},\ \dot{m}_{\rm bondi} \times \min\left(C_{\rm visc}^{-1}(c_{\rm s}/V_\phi)^3,\ 1\right)\right), \tag{7}$$
where $\dot{m}_{\rm bondi}$ is the Bondi accretion rate and $\dot{m}_{\rm Edd}$ is the Eddington rate. The factor $(c_{\rm s}/V_\phi)^3/C_{\rm visc}$ is the ratio between the Bondi and the viscous time-scales (Rosas-Guevara et al. 2015). Here $V_\phi$ is the rotation speed of the gas around the BH and $C_{\rm visc}$ is a free parameter. The thermal energy injection rate is assumed to depend on the gas accretion rate:
$$\dot{E} = \epsilon_{\rm f}\,\epsilon_{\rm r}\,\dot{m}_{\rm acc}\,c^2, \tag{8}$$
where $\epsilon_{\rm f} = 0.15$ and $\epsilon_{\rm r} = 0.1$. For details of the model, we refer the reader to Schaye et al. (2015).
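The nested minimum in the accretion rate (Eq. 7) is easy to misread, so a minimal Python sketch may help. The function and argument names are hypothetical, the value of the viscosity parameter must be supplied by the caller, and the treatment of non-rotating gas (no viscous suppression) is an assumption of the sketch.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def eagle_accretion_rate(mdot_bondi, mdot_edd, c_s, v_phi, c_visc):
    """Bondi rate, suppressed by the viscous factor when the gas rotates
    quickly, and capped at the Eddington rate."""
    if v_phi > 0.0:
        visc_factor = min((c_s / v_phi) ** 3 / c_visc, 1.0)
    else:
        visc_factor = 1.0  # assumed: no rotation means no suppression
    return min(mdot_edd, mdot_bondi * visc_factor)


def agn_heating_rate(mdot_acc, eps_f=0.15, eps_r=0.1):
    """Thermal energy injection rate eps_f * eps_r * mdot * c^2,
    in units of [mdot] * (km/s)^2."""
    return eps_f * eps_r * mdot_acc * C_KM_S ** 2
```

With `c_s / v_phi = 0.1` and `c_visc = 1`, the Bondi rate is reduced by a factor of $10^{-3}$; with slowly rotating gas the rate is limited only by the Eddington cap.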

IllustrisTNG simulation
IllustrisTNG comprises a suite of cosmological hydrodynamical simulations focused on galaxy formation. They were carried out using the moving-mesh code AREPO (Springel 2010) with updated treatments of the formation, growth and feedback of black holes, galactic winds, stellar evolution and chemical yields and enrichment. In addition, they include magnetic fields based on the earlier ideal magneto-hydrodynamics implementation (Pakmor et al. 2011; Pakmor & Springel 2013; Pakmor et al. 2014). A comprehensive description of the full physics involved can be found in Weinberger et al. (2017) and Pillepich et al. (2018a). In this study, we use the TNG100-1 simulation for comparison with the L-Galaxies and EAGLE simulations. TNG100-1 traces $2 \times 1820^3$ dark matter and baryon particles within a periodic box of $L = 75$ Mpc/$h$ on a side. The mass resolution is $9.4 \times 10^{5}~{\rm M_\odot}/h$ and $5.1 \times 10^{6}~{\rm M_\odot}/h$ for dark matter and baryonic particles, respectively. It adopts cosmological parameters from the more recent Planck results (Planck Collaboration et al. 2016), which differ from those adopted by L-Galaxies and EAGLE: $\sigma_8 = 0.8159$, $H_0 = 67.74~{\rm km\,s^{-1}\,Mpc^{-1}}$, $\Omega_\Lambda = 0.6911$, $\Omega_{\rm m} = 0.3089$, $\Omega_{\rm b} = 0.0486$ and $n_{\rm s} = 0.9677$.
In the TNG100-1 simulation, stars form stochastically in gas particles when their densities exceed a star formation threshold of $n_{\rm H} \approx 0.1~{\rm cm^{-3}}$. The star formation rate is given by:
$$\dot{\rho}_\star = (1 - \beta)\,\frac{\rho_{\rm c}}{t_\star}, \tag{9}$$
where $\rho_{\rm c}$ is the density of cold clouds, $\beta = 0.1$ is the mass fraction of short-lived massive stars, and $t_\star = 2.2$ Gyr is the time-scale for star formation. In the TNG100-1 simulation, a black hole with an initial mass of $M_{\rm BH,seed} = 8 \times 10^{5}~{\rm M_\odot}/h$ is placed into a host halo when its mass reaches $M_{\rm host} = 5 \times 10^{10}~{\rm M_\odot}/h$ for the first time. The SMBH accretion rate in the model is the minimum of the Eddington accretion rate and the Bondi-Hoyle-Lyttleton accretion rate:
$$\dot{M}_{\rm BH} = \min\left(\dot{M}_{\rm bondi},\ \dot{M}_{\rm Edd}\right). \tag{10}$$
The model incorporates two AGN feedback modes which depend on the accretion rate. At low accretion rates ($\chi = \dot{M}_{\rm bondi}/\dot{M}_{\rm Edd} \le 0.1$), the AGN-driven wind feedback takes effect by blowing out the surrounding gas kinetically:
$$\dot{E}_{\rm kin} = \epsilon_{\rm f,kin}\,\dot{M}_{\rm BH}\,c^2, \tag{11}$$
where a maximum value of $\epsilon_{\rm f,kin} = 0.2$ is set. This involves adding momentum in a random direction by kicking each gas cell in the feedback region. At high accretion rates ($\chi \ge 0.1$), thermal feedback is activated with an injection rate:
$$\dot{E}_{\rm therm} = \epsilon_{\rm f}\,\epsilon_{\rm r}\,\dot{M}_{\rm BH}\,c^2, \tag{12}$$
where $\epsilon_{\rm f} = 0.1$ and $\epsilon_{\rm r} = 0.2$, slightly larger than the parameters used in EAGLE. For more details of the model, readers are referred to Weinberger et al. (2017).
Previous works reveal that although both hydrodynamical simulations and L-Galaxies can reproduce the overall galaxy stellar mass function, they might differ in detail. For example, Guo et al. (2016) investigated the galaxies in EAGLE and L-Galaxies and found that L-Galaxies predicts more star-forming galaxies and a higher star formation rate density than EAGLE. By running L-Galaxies on the DMO merger trees of TNG100-1, Ayromlou et al. (2021) found significant differences in the sSFR and cold gas component between the semi-analytic model and the hydrodynamical simulation, which are primarily due to the different stellar feedback and AGN feedback.
In all three simulations, the halo virial mass is defined as the total mass enclosed within a sphere whose mean density is 200 times the critical density of the Universe, denoted as $M_{\rm 200c}$. It is worth noting that in L-Galaxies, the dark matter properties are directly inherited from dark-matter-only (DMO) merger trees, which are unaffected by baryonic processes. However, in EAGLE and TNG100-1, the dark matter properties are derived from the hydrodynamical simulations, rather than the DMO versions used in Matthee et al. (2017). Therefore, these halo properties could be influenced by baryonic processes. Galaxy properties such as total stellar mass, star formation rate (SFR), and black hole mass are determined by the corresponding quantities within subhaloes, accessible directly from the catalogues. The cold gas mass can also be obtained from the LGal and LGalw/oAGN output catalogues. In EAGLE, we use the fitting formula from Appendix A of Rahmati et al. (2013), which links cosmological simulations with full radiative transfer calculations to predict the neutral gas fraction, to define the cold gas mass. In TNG100-1, the cold gas mass is the total mass of star-forming particles plus that of other gas particles weighted by the "NeutralHydrogenAbundance" factor.
The merger trees in these simulations are all constructed following the method outlined in Springel et al. (2005). In practice, at each snapshot, the FOF and SUBFIND algorithms (Springel et al. 2001) are used to identify groups and subhaloes. Each subhalo is associated with one and only one descendant. The merger trees are then constructed by linking each subhalo with its descendant at the following snapshot. For further details regarding the construction of merger trees, we refer readers to Springel et al. (2005) and the references therein.
We focus on central galaxies in this study. Selecting host haloes with $M_{\rm h} > 10^{10.6}~{\rm M_\odot}$, we find a total of 21,387 central galaxies in EAGLE and 31,990 in TNG100-1 at $z = 0$. Central galaxies are chosen from LGal (LGalw/oAGN) on MRII and MR based on their host halo masses of $[10^{10.6}, 10^{12}]~{\rm M_\odot}$ and $[10^{12}, \infty)~{\rm M_\odot}$, resulting in 74,489 and 531,953 central galaxies at $z = 0$, respectively.
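The combination of the two L-Galaxies catalogues can be sketched as below. The dictionary keys (`log_mh`, `is_central`) are hypothetical field names, and the half-open treatment of the boundary at $10^{12}~{\rm M_\odot}$ is an assumption made here to avoid double counting; the actual catalogues and selection code may differ.

```python
def select_lgal_centrals(mrii_galaxies, mr_galaxies,
                         log_mh_min=10.6, log_mh_split=12.0):
    """Combine centrals from MRII (low-mass haloes) and MR (high-mass haloes).

    Each galaxy is a dict with a log10 halo mass 'log_mh' (in solar masses)
    and a boolean 'is_central'; both keys are illustrative.
    """
    low = [g for g in mrii_galaxies
           if g["is_central"] and log_mh_min <= g["log_mh"] < log_mh_split]
    high = [g for g in mr_galaxies
            if g["is_central"] and g["log_mh"] >= log_mh_split]
    return low + high
```

The same cut with a single catalogue and `log_mh_split` set to infinity would reproduce the simpler selection used for the hydrodynamical simulations.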

Stellar mass vs. halo mass in different simulations
The top panel of Figure 1 depicts the median stellar mass as a function of halo mass for central galaxies in the L-Galaxies, EAGLE, and TNG100-1 simulations. The central galaxies in L-Galaxies consist of those in low-mass haloes ($M_{\rm h} < 10^{12}~{\rm M_\odot}$) in MRII (L-Galaxies on MRII) and those in higher-mass haloes ($M_{\rm h} > 10^{12}~{\rm M_\odot}$) in MR (L-Galaxies on MR). Notably, the SHMR in LGal with AGN feedback overlaps with the result in LGalw/oAGN without AGN feedback for low-mass haloes ($M_{\rm h} < 10^{12}~{\rm M_\odot}$). However, the difference in the SHMR between these two models increases with increasing halo mass, with the stellar mass in galaxy clusters ($M_{\rm h} \sim 10^{14}~{\rm M_\odot}$) in LGalw/oAGN being 1 dex higher than in their LGal counterparts. This result is consistent with previous works such as Hlavacek-Larrondo et al. (2022), suggesting that AGN feedback is most efficient at suppressing star formation in galaxy groups and clusters.

Figure 1. Top panel: median stellar mass as a function of halo mass for central galaxies. LGal (LGalw/oAGN) denotes L-Galaxies with (without) the inclusion of the AGN model. Blue and cyan lines show the results from the hydrodynamical simulations, EAGLE and TNG100-1, respectively. For comparison, the dashed line shows the result from the semi-analytic model by Guo et al. (2010), and the shaded regions show the results obtained from abundance matching techniques (Moster et al. 2018; Behroozi et al. 2019). We only show data points whose mass bins contain at least 20 galaxies. Bottom panel: the 1σ scatter of stellar mass as a function of halo mass. The red dotted line shows the result from LGal_on_MRII. Observational results (Yang et al. 2009; More et al. 2009; Zu & Mandelbaum 2015) are presented for comparison.
Comparing the semi-analytic catalogue and the hydrodynamical results, it is evident that the SHMR in the hydrodynamical simulations (EAGLE and TNG100-1) is consistent with the LGal predictions at $M_{\rm h} \sim 10^{12}~{\rm M_\odot}$, but systematically higher than LGal at both higher and lower masses. At low masses ($M_{\rm h} < 10^{11}~{\rm M_\odot}$), the excess could be attributed to backsplash galaxies that once resided in more massive haloes and whose own haloes were disrupted as a result. The discrepancy is more extreme in the high-mass regime. For example, for a halo with $M_{\rm h} \sim 10^{13}~{\rm M_\odot}$, the stellar mass in TNG100-1 or EAGLE is roughly 3 times larger than the LGal prediction. This discrepancy suggests that the AGN feedback models employed in the hydrodynamical simulations are less effective in massive haloes than that in LGal. Despite such differences, the SHMR in all simulations agrees with those obtained using the abundance matching method (Moster et al. 2018; Behroozi et al. 2019).
The bottom panel of Figure 1 illustrates the 1σ scatter of the SHMR for LGal, LGalw/oAGN, TNG100-1, EAGLE and abundance matching methods. Generally, the scatter varies with host halo mass at low masses and becomes more constant at $M_{\rm h} > 10^{12}~{\rm M_\odot}$. For comparison, many abundance matching methods assume a constant scatter (e.g. Guo et al. 2010; Leauthaud et al. 2012; Moster et al. 2013; van Uitert et al. 2016). A detailed comparison shows that in TNG100-1 and EAGLE the scatter decreases with increasing host halo mass, while in LGal it increases with halo mass, reaches its maximum at $M_{\rm h} \sim 10^{12.2}~{\rm M_\odot}$, and then decreases. The decrease of the scatter with halo mass in the hydrodynamical simulations is not influenced by the limited size of the simulation boxes (see Appendix C for details). Compared to TNG100-1 and EAGLE, the scatter in LGal is larger at high masses and smaller at low masses. The larger scatter at low masses in TNG100-1 and EAGLE could be attributed to baryonic effects on the dark matter halo, which are absent in semi-analytic approaches. At high masses, the small scatter in TNG100-1 and EAGLE could result from less effective AGN feedback, as seen in LGalw/oAGN. We note that the scatter in L-Galaxies on MRII, denoted by the red dotted curve, is much smaller than that of L-Galaxies on MR at high masses and is more in line with the scatter found in TNG100-1 and EAGLE. This could be attributed to different SMBH growth histories, which are closely related to mergers and thus sensitive to the resolution of the simulation. Such a dependence of the scatter on resolution is not present in TNG100-1 (Appendix C). In the mass range between $10^{11}~{\rm M_\odot}$ and $10^{12}~{\rm M_\odot}$, where AGN feedback is not effective and the MR resolution is marginally sufficient, the scatter is indeed very similar between L-Galaxies on MR and L-Galaxies on MRII (see Appendix A). Since L-Galaxies is calibrated using MR for massive objects, high-mass galaxies from MRII are not included in this study. In the absence of AGN feedback, the SHMR and its scatter are similar between LGalw/oAGN on MR and on MRII across the entire mass range.
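For concreteness, a binned median relation and its 1σ scatter of the kind plotted in Figure 1 can be computed as in the following sketch. It assumes that the scatter is the standard deviation of $\log_{10} M_*$ within each 0.1 dex halo mass bin and that bins with fewer than 20 galaxies are dropped, as for the plotted data points; the function name and return format are illustrative only.

```python
import math
import statistics
from collections import defaultdict


def binned_median_and_scatter(log_mh, log_mstar, bin_width=0.1, min_count=20):
    """Return (bin centre, median log M*, std of log M*) per halo mass bin."""
    bins = defaultdict(list)
    for mh, ms in zip(log_mh, log_mstar):
        bins[math.floor(mh / bin_width)].append(ms)

    rows = []
    for idx in sorted(bins):
        values = bins[idx]
        if len(values) < min_count:
            continue  # skip sparsely populated bins
        centre = (idx + 0.5) * bin_width
        rows.append((centre, statistics.median(values),
                     statistics.pstdev(values)))
    return rows
```

The population standard deviation (`pstdev`) is used here; for the bin occupancies considered, the difference from the sample standard deviation is small.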
The model scatter is in good agreement with various observational results for most simulations. The only exception is the result based on MR, shown by the red curve at high masses, which is much larger than the observed values. However, the results of L-Galaxies on MRII are in good agreement with Zu & Mandelbaum (2015). This suggests that the diversity of SMBH growth in L-Galaxies on MR might be overpredicted due to the poor resolution. The scatter in the SDSS group catalogue (Yang et al. 2009) and in groups whose mass is estimated from satellite kinematics (More et al. 2009) has a mean value of 0.15 dex, similar to that used in the abundance matching method (Moster et al. 2018; Behroozi et al. 2019) and to those predicted in TNG100-1 and EAGLE. In LGal the scatter at MRII resolution is slightly higher than those in TNG100-1 and EAGLE. However, it is important to note that selecting complete samples based on halo mass is observationally very challenging. For instance, in Yang et al. (2009), the group mass may be underestimated or overestimated, especially when the number of satellite members is small.
In the next section, we further investigate the relation between the scatter in the SHMR and general halo/galaxy properties, such as halo concentration, formation time, merger history, the mass of the black hole ($M_{\rm BH}$), the mass of cold gas ($M_{\rm coldgas}$), the star formation rate (SFR), and the mass-weighted age. We particularly focus on the effect of AGN feedback on the SHMR in simulations.

RESULTS
To enhance the precision of estimating the impact of various properties, we employ the residual $\Delta M_*$-halo mass relation instead of the traditional SHMR. This is achieved by fitting the SHMR in log-log space with three parameters for each simulation (Eq. 13). Recognizing that low-mass galaxies far outnumber massive ones, we bin our samples by the mass of their host haloes, with a bin size of 0.1 dex. Bins containing fewer than 10 galaxies are excluded. We then use the median stellar mass and halo mass within each bin to fit Eq. 13. The resulting fitting parameters are listed in Table 2. $\Delta M_*$ is defined as the deviation of the stellar mass from the fitted value at any given halo mass.
Consequently, $\Delta M_*$ signifies the deviation around the median value of the SHMR. Within each halo mass bin, we compute the standard deviation of $\Delta M_*$, denoted as $\sigma(\Delta M_*(M_{\rm h}))$ (hereafter $\sigma(\Delta M_{*,\rm h})$), which represents the scatter of the SHMR. We also quantitatively assess the contribution of each property. Within each halo mass bin, we fit a linear relation between $\Delta M_*$ and each halo/galaxy property (Eq. 14), where Prop refers to halo or galaxy properties, such as halo concentration, formation time, star formation rate, etc. The results obtained through linear fitting are comparable to those obtained using higher-order polynomial fits. The residual remaining after accounting for a secondary property, $\Delta M_{*,\rm prop}$, is the deviation of $\Delta M_*$ from this fitted relation (Eq. 15). The standard deviation of $\Delta M_{*,\rm prop}$, denoted $\sigma(\Delta M_{*,\rm prop})$, provides a quantitative measure of the scatter after accounting for the influence of the secondary property. A smaller $\sigma(\Delta M_{*,\rm prop})$ indicates a stronger dependence on this property.
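The procedure for a single halo mass bin can be illustrated with a short Python sketch. It assumes an ordinary least-squares straight-line fit of the stellar mass residual against one property and measures the scatter as a population standard deviation; the function names are hypothetical and the exact fitting choices in the paper may differ.

```python
import statistics


def linear_fit(x, y):
    """Ordinary least-squares fit y = a * x + b; returns (a, b)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0.0 else 0.0  # constant property: no trend
    return slope, mean_y - slope * mean_x


def scatter_before_after(delta_mstar, prop):
    """Scatter of the residuals before and after removing the linear
    dependence on one halo or galaxy property, within one halo mass bin."""
    a, b = linear_fit(prop, delta_mstar)
    residual = [d - (a * p + b) for d, p in zip(delta_mstar, prop)]
    return statistics.pstdev(delta_mstar), statistics.pstdev(residual)
```

A property that fully determines the residual drives the second value to zero, whereas an uncorrelated property leaves it essentially unchanged; the ratio of the two values is one way to rank properties by their effectiveness.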
In each subsection, we first examine whether $\Delta M_*$ increases with, decreases with, or is independent of the properties. We then measure the effectiveness of these properties in reducing $\sigma(\Delta M_*)$.

Dependence on halo properties
In this subsection, we investigate the impact of different halo properties on the scatter around the SHMR. Unless stated otherwise, all parameters are based on the total mass, its distribution, and its evolution within $R_{200}$.

Halo concentration and formation time
The most heavily explored candidates for a secondary dependence of the SHMR are halo concentration and formation time (e.g. Hearin & Watson 2013; Hearin et al. 2015; Rodríguez-Puebla et al. 2015; Saito et al. 2016; Matthee et al. 2017). We compare the impact of halo concentration and formation time using consistent definitions across the various simulations. Following the method in Prada et al. (2012), we adopt $c = V_{\rm max}/V_{\rm vir}$ as the concentration of the halo, where $V_{\rm max}$ is the maximum rotational velocity of the halo and $V_{\rm vir}$ is the virial velocity.
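A minimal sketch of this concentration proxy is given below. It assumes that the virial velocity is the circular velocity at the radius enclosing 200 times the critical density, for which $V^3 = 10\,G\,H\,M_{\rm 200c}$, evaluated at $z = 0$; the default Hubble constant is the TNG100-1 value quoted earlier and should be replaced by the value appropriate to each simulation. The function names are illustrative.

```python
G_KPC = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def virial_velocity_200c(m200c, hubble_km_s_mpc=67.74):
    """Circular velocity (km/s) at R_200c for a halo of mass m200c (Msun).

    From M = 100 H^2 R^3 / G and V^2 = G M / R it follows that
    V^3 = 10 G H M.
    """
    hubble_km_s_kpc = hubble_km_s_mpc / 1000.0
    return (10.0 * G_KPC * hubble_km_s_kpc * m200c) ** (1.0 / 3.0)


def concentration_proxy(v_max, m200c, hubble_km_s_mpc=67.74):
    """Velocity-ratio concentration c = V_max / V_vir."""
    return v_max / virial_velocity_200c(m200c, hubble_km_s_mpc)
```

For a halo of $10^{12}~{\rm M_\odot}$ this gives a virial velocity of roughly 143 km/s, so a halo with $V_{\rm max} = 170$ km/s would have $c \approx 1.2$ under these assumptions.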
Figure 2 shows the Δ * -halo mass relation for the four simulations.We categorize galaxies based on their halo masses using a bin size of 0.1 dex.Within each halo mass bin, we further divide their delta stellar mass into a bin size of 0.1 dex, and then the cells are colour-coded based on the average concentration of the galaxies in this cell, as illustrated by the colour bar.This approach effectively compresses the dynamical range of concentration and highlights the dependence on concentration in each halo mass bin.The solid/dashed/dotted lines in each panel represent the 50% / 16%(84%) / 2%(98%) values of the Δ * - h relation.For a fixed halo mass, highly concentrated haloes tend to host more massive galaxies than haloes with low concentrations.This trend is relatively strong at low masses in all the simulations and is weak at high masses except for LGalw/oAGN.
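The cell-averaging used for the colour coding can be illustrated as follows. This is only a sketch under assumptions: the input lists (log halo mass, offset from the median SHMR, and the property to average) and the function name are hypothetical, and both axes use the 0.1 dex bin width quoted above.

```python
import math
from collections import defaultdict


def cell_means(log_mh, delta_mstar, prop, bin_width=0.1):
    """Average `prop` in cells of (log halo mass, Delta M_*), both binned in dex."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for mh, dm, p in zip(log_mh, delta_mstar, prop):
        # The small epsilon guards against floating-point edge effects such as 0.3 / 0.1.
        cell = (
            math.floor(mh / bin_width + 1e-9),
            math.floor(dm / bin_width + 1e-9),
        )
        sums[cell] += p
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Toy sample: the first two galaxies share a cell, the others do not.
log_mh = [11.02, 11.07, 11.05, 12.31]
delta = [0.12, 0.18, -0.25, 0.03]
conc = [1.2, 1.4, 1.0, 1.1]
means = cell_means(log_mh, delta, conc)
```

Each key of `means` is a pair of integer bin indices, so the cell covering $11.0 \le \log M_{\rm h} < 11.1$ and $0.1 \le \Delta M_* < 0.2$ is `(110, 1)`.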
The formation time ($z_{\rm f}$) is also found to be crucial in determining galaxy properties. Indeed, previous studies (Gao et al. 2004; Wechsler et al. 2002; Zhao et al. 2009; Jeeson-Daniel et al. 2011; Ludlow et al. 2014; Correa et al. 2015) have demonstrated a robust correlation between halo formation time and concentration, where higher concentration corresponds to earlier formation time. We find a similar overall trend with $z_{\rm f}$ as with concentration, as shown in Appendix Figure B1: early-formed haloes host more massive galaxies at fixed halo mass in all simulations and across all halo mass ranges.
The dependence on halo concentration and formation time has a physical basis: i) host haloes that form earlier have more time to gather gas and create stars; and ii) highly concentrated haloes have steeper potential wells that can retain more material against feedback, which can raise the thermal and kinetic energies of the gas.
At low masses, similar dependencies on halo concentration and formation time have been reported in many previous works, utilizing semi-analytical models (e.g. Wang et al. 2013; Tojeiro et al. 2017; Zehavi et al. 2018) or hydrodynamical simulations (e.g. Matthee et al. 2017; Artale et al. 2018; Xu & Zheng 2020; Montero-Dorta et al. 2020; Cui et al. 2021), indicating that haloes which are highly concentrated and formed earlier tend to host more massive galaxies. However, the dependency at high masses is still a topic of debate. Zu et al. (2021) found that for galaxy clusters with $M_{\rm h} \sim 10^{14.24}~\rm M_\odot$, samples with higher stellar mass also have higher halo concentration. In our study, at high masses, the dependency on halo concentration and formation time is overshadowed by AGN feedback. However, Bradshaw et al. (2020) reported that the impact of formation time continues to rise, being most significant at the highest halo masses in empirical models like the UniverseMachine (UM, Behroozi et al. 2019). These differences stem from the different treatments of baryonic processes employed in the various models and simulations.
We further show the standard deviation of $\Delta M_*$ as a function of halo mass in the bottom panel of Figure 2. The dashed lines show the standard deviation of $\Delta M_*$ (Eq. 14), for which only halo mass is used to fit stellar mass. The solid lines show the standard deviation of $\Delta M_{*,\rm prop}$ from Eq. 16, which takes both halo mass and a secondary property, here concentration, into consideration. The greater the difference between the solid and dashed lines, the stronger the dependence on this property. We observe that removing the dependency on concentration could significantly decrease the scatter around the SHMR. At low masses, around $M_{\rm h} \sim 10^{11}~\rm M_\odot$, the scatter could potentially be decreased by about 40%. At higher masses, the effect varies among simulations, with more pronounced impacts seen in EAGLE and TNG100-1 compared to LGal. This aligns with the findings of Matthee et al. (2017), who report a stronger correlation with halo concentration when utilizing simulations incorporating full baryonic processes. This is because including baryonic processes tends to steepen the density profile (Schaller et al. 2015), resulting in higher concentrations compared to DMO simulations. This amplifies the reliance on halo properties in hydrodynamical simulations.
Interestingly, we observe a notable difference in the halo concentration dependence at high masses, $M_{\rm h} > 10^{12}~\rm M_\odot$, between simulations with and without AGN feedback. LGalw/oAGN shows a stronger dependence on concentration at high masses, similar to that at low masses, while the other simulations show a much weaker dependence at high masses. Detailed comparisons in the bottom panel of Figure 2 show that for LGalw/oAGN, the scatter is reduced by 40% at high masses, much more than in EAGLE, TNG100-1, and LGal. Further, it suggests that the relatively strong effect of halo concentration on the scatter around the SHMR is probably attributable to the less effective AGN feedback adopted in EAGLE and TNG100-1. This is because, in simulations with AGN feedback, the feedback effect is a highly nonlinear process that effectively modifies star formation activity in massive haloes. Some galaxies experiencing strong AGN feedback are quenched and cease further star formation, leading to a constant stellar mass since the quenching event. However, the host haloes of these galaxies continue to grow, causing the stellar mass to decouple from the halo mass and other halo properties. In LGalw/oAGN, where there is no AGN feedback, all galaxies naturally grow following the halo properties, resulting in a stronger dependence on halo properties and a smaller scatter compared to LGal at high masses. Such dilutions are also observed in the dependence on other halo properties.

Environmental overdensity
The influence of large-scale environments on galaxy formation is widely acknowledged (e.g. Peng et al. 2010; Liao & Gao 2019; Xu et al. 2020). While the environmental effect may be linked to halo concentration and formation time, we do not intend to distinguish their relative impacts, as measuring the environment is easier than measuring the latter two. To explore its impact on the SHMR, we define $\delta = \rho/\bar{\rho} - 1$ as the local overdensity for each halo in the simulations, where $\rho$ represents the local density and $\bar{\rho}$ is the mean density of the Universe. In practice, we first partition the simulation box into cubes with a side length of 1 Mpc/h and calculate the mean density for each cubic cell. Subsequently, following the cloud-in-cell method (Sefusatti et al. 2016), the overdensity of each halo at a given position is computed from the surrounding cells with a Gaussian smoothing kernel that has a full width at half maximum of 2.1 Mpc/h. We have investigated cell sizes of 0.5 Mpc/h, 1 Mpc/h, 2 Mpc/h, and 5 Mpc/h, and our primary findings remain unaffected by this choice.
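The ingredients of this measurement can be sketched as below. This is a simplified illustration, not the implementation used in the paper: it deposits particle masses onto a periodic grid with cloud-in-cell weights, converts the grid to $\delta = \rho/\bar{\rho} - 1$, and only shows the conversion from a Gaussian full width at half maximum to a standard deviation rather than applying the smoothing itself. The function names, the toy particles, and the small box are assumptions made for the example.

```python
import math
from itertools import product


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def cic_overdensity(positions, masses, box_size, n_cells):
    """Deposit particles on a periodic grid with cloud-in-cell weights.

    Returns a dict mapping (i, j, k) to delta = rho / rho_mean - 1.
    """
    cell = box_size / n_cells
    grid = {idx: 0.0 for idx in product(range(n_cells), repeat=3)}
    for pos, m in zip(positions, masses):
        # Position in cell units, measured from the centre of cell 0.
        scaled = [p / cell - 0.5 for p in pos]
        base = [math.floor(s) for s in scaled]
        frac = [s - b for s, b in zip(scaled, base)]
        # Share the mass among the eight nearest cell centres.
        for offset in product((0, 1), repeat=3):
            weight = 1.0
            for f, o in zip(frac, offset):
                weight *= f if o else 1.0 - f
            idx = tuple((b + o) % n_cells for b, o in zip(base, offset))
            grid[idx] += m * weight
    mean_mass = sum(masses) / n_cells ** 3
    return {idx: mass / mean_mass - 1.0 for idx, mass in grid.items()}


sigma = fwhm_to_sigma(2.1)  # kernel standard deviation in Mpc/h
delta = cic_overdensity(
    positions=[(0.5, 0.5, 0.5), (1.0, 0.5, 0.5), (3.2, 2.7, 1.1)],
    masses=[1.0, 2.0, 1.5],
    box_size=4.0,
    n_cells=4,
)
```

Because all cells have equal volume, the ratio of the deposited mass to the mean mass per cell equals $\rho/\bar{\rho}$, and the overdensity averaged over the grid is zero by construction.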
In Figure 3, we colour-code the SHMR by the average $\delta$. The broad patterns resemble those observed for halo concentration at low masses. A discernible trend is apparent across all simulations, indicating a tendency for massive galaxies to reside in regions of higher density at a fixed halo mass. At higher masses, all galaxies tend to inhabit dense regions, exhibiting little trend at a fixed halo mass. This finding suggests that a denser environment can promote star formation slightly, with the influence of the environment saturating at approximately $10^{12}~\rm M_\odot$. At higher masses, the local environment is sufficiently dense that a higher overdensity contributes minimally to the stellar mass.
Utilizing data from the GAMA survey (Liske et al. 2015), Tojeiro et al. (2017) found that at low masses, galaxies in voids have a lower stellar-to-halo mass ratio than those in knots, while this trend reverses at high masses. Zhang et al. (2021b) also find this reversed trend by combining SDSS DR7 data with the ELUCID simulation. Wu et al. (2024) report that using information about the local environment significantly increases the accuracy of stellar mass predictions with machine learning in TNG300.
When AGN feedback is removed, as illustrated for LGalw/oAGN, this positive relationship with environment persists up to higher halo masses of $10^{13}~\rm M_\odot$, and is stronger than in the other simulations at high masses. The impact of AGN on the environmental dependence resembles that on concentration/formation time, indicating that the inclusion of AGN breaks the link between the growth of stellar components and halo accretion.
A detailed examination of the dependence on environment is presented in the bottom panel of Figure 3. It shows that even at the lowest masses, the scatter is only reduced by 5%, while at high masses the influence of density almost vanishes. The effect is slight compared to the visible distinctions indicated by the coloured pixels in the upper panels because, within the $1\sigma$ range, the density does not significantly contribute to the $1\sigma$ scatter, although it becomes more noticeable at the $2\sigma$ level. Such a weak dependence could be reduced further: Matthee et al. (2017) demonstrated that the dependence on environment diminishes once the role of concentration/formation time is considered. In conclusion, the reliance on environment is notably weaker than that on halo concentration and halo formation time.

Halo major merger
Halo mergers play an important role in galaxy formation and evolution, particularly when a galaxy devours a massive satellite. Here, we focus on haloes that have undergone major mergers throughout their entire assembly history, where a major merger event is defined as one in which a satellite falls into a host halo with a mass ratio exceeding 1/3. We denote $M_{\rm major}$ as the sum of the infall mass of all major-merged satellites for each host.
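As a concrete reading of this definition, the snippet below sums the infall masses of satellites that pass the 1/3 mass-ratio threshold. It is a minimal sketch: the input layout (pairs of host and satellite mass at infall), the choice of the host mass at infall as the denominator of the ratio, and the function name are assumptions made for the example.

```python
def major_merger_mass(host_and_satellite_masses, threshold=1.0 / 3.0):
    """Sum the infall masses of satellites whose mass ratio exceeds `threshold`.

    Each entry is (host mass at infall, satellite mass at infall); this
    layout is assumed for illustration only.
    """
    return sum(
        m_sat
        for m_host, m_sat in host_and_satellite_masses
        if m_sat / m_host > threshold
    )


# Toy merger history in arbitrary mass units: only the first and third
# events exceed the 1/3 threshold.
mergers = [(10.0, 4.0), (30.0, 2.0), (60.0, 25.0)]
m_major = major_merger_mass(mergers)
ratio = m_major / 100.0  # normalised by an assumed host mass of 100 at z = 0
```

Here `m_major` is 29.0 and `ratio` is 0.29, the analogue of the normalised major merger mass used for the colour coding.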
In Figure 4, the scatter of the SHMR is colour-coded by the average $M_{\rm major}/M_{\rm h,0}$ in each cell, where $M_{\rm major}/M_{\rm h,0}$ represents the total major merger halo mass normalized to the host halo mass at $z\sim0$. The influence of major mergers appears to be relatively weak, if present at all. In general, most haloes tend to have a similar fraction of major merger halo mass, approximately 10% to 20% of the total halo mass at redshift 0, regardless of halo mass in all the simulations. Previous studies (e.g., Guo & White 2008; Oser et al. 2010; Rodriguez-Gomez et al. 2016; Qu et al. 2017; Pillepich et al. 2018b; Moster et al. 2018; Bradshaw et al. 2020) find that mergers have much less impact on galaxy growth at low masses compared to star formation. Therefore, a clear dependence on mergers is not anticipated. At low masses, Figure 4 shows a slightly negative correlation within the $1\sigma$ region, which might reverse for outliers in EAGLE and TNG100-1. In LGal, there is a positive relationship between stellar mass and accreted mass fraction below $10^{10}~\rm M_\odot$ within the $1\sigma$ range, but this changes at higher masses. The trends in the outskirts are more intricate. At high masses, there is a weak positive association between $M_{\rm major}/M_{\rm h,0}$ and the scatter in all models. This is expected because mergers could play a significant role in the growth of galaxies at these masses in simulations and models (Guo & White 2008; Qu et al. 2017; Pillepich et al. 2018b; Moster et al. 2018; Bradshaw et al. 2020).
The quantitative impact of $M_{\rm major}$ is presented in the bottom panel of Figure 4. Overall, halo major mergers do not significantly affect the scatter in the SHMR at low masses for all simulations. The largest difference in scatter is only 5%, for LGal in the mass range $10^{12} < M_{\rm h} < 10^{13}~\rm M_\odot$.
In addition to $M_{\rm major}/M_{\rm h,0}$, the last major merger redshift, $z_{\rm major}$, could be relevant. $z_{\rm major}$ represents the redshift at which the halo underwent its last major merger. A larger $z_{\rm major}$ indicates that the galaxy underwent its last major merger earlier and has had a fairly steady growth history since then. This could be linked to the halo formation time. Figure B2 demonstrates that at lower masses, as anticipated, massive galaxies exhibit a higher $z_{\rm major}$. The influence of $z_{\rm major}$ weakens significantly at high masses due to AGN feedback. A detailed discussion of the influence of $z_{\rm major}$ is presented in Appendix B2.

Dependence on galaxy properties
Star formation is not solely determined by the potential of the host dark matter halo, but also depends on more complex baryonic physics closely related to halo formation history and interactions between dark matter and baryons. In this section, we investigate how the SFR, cold gas mass ($M_{\rm coldgas}$), black hole mass ($M_{\rm BH}$) and mass-weighted age contribute to the scatter in the SHMR.

Star formation rate
The star formation rate is closely linked to the stellar components within galaxies. Figure 5 depicts the SHMR colour-coded by the SFR. In LGal and LGalw/oAGN, a positive correlation between SFR and stellar mass is observed at fixed halo mass across the full halo mass range. At higher masses, this relationship is enhanced by AGN feedback, as seen when comparing LGal and LGalw/oAGN. A similar positive correlation between SFR and stellar mass is also found in the EAGLE and TNG100-1 simulations for low-mass systems. However, this trend weakens significantly for $M_{\rm h} > 10^{12}~\rm M_\odot$, indicating a complex interplay with halo accretion history and baryonic processes. A mildly positive relationship with SFR is identified at $M_{\rm h} \sim 10^{12.8}~\rm M_\odot$ in TNG100-1, possibly influenced by the AGN feedback model. Lyu et al. (2023) report that at fixed halo mass, star-forming galaxies have larger stellar mass than passive galaxies in L-Galaxies. Wang et al. (2023) also find that star-forming galaxies are more massive at fixed halo mass, based on stellar mass from Bell et al. (2003) and halo mass from Yang et al. (2007). These results are consistent with our findings. Nevertheless, a recent study by Scholz-Díaz et al. (2024) suggests that, at fixed halo mass, massive galaxies exhibit lower SFR compared to smaller galaxies in the CALIFA survey (Walcher et al. 2014).
The bottom panel of Figure 5 shows the quantitative impact of SFR. The correlation between SFR and the scatter around the SHMR varies with halo mass. At $M_{\rm h} \sim 10^{12.2}~\rm M_\odot$, the scatter decreases significantly when SFR is incorporated in LGal, indicating a strong dependence of $\Delta M_*$ on SFR. This dependency weakens towards both higher and lower halo masses and eventually disappears at the extreme mass ends in LGal. At the very massive end, AGN feedback predominates in nearly all cases, mitigating the impact of SFR. Conversely, in both EAGLE and TNG100-1, a clear correlation with SFR is noticeable at lower masses and disappears for haloes exceeding $\sim 10^{11.5}~\rm M_\odot$. The different dependence on SFR may be attributed to discrepancies in the treatment of star formation and AGN feedback among the simulations. Notably, in LGalw/oAGN, which has no AGN feedback, the SFR continues to be strongly associated with the scatter around the SHMR towards the very massive end.

Age
A galaxy property closely related to the whole star formation history is the mass-weighted stellar age. Figure 6 depicts the SHMR colour-coded by age. At low masses, all simulations indicate a positive correlation between age and stellar mass, in the sense that older galaxies are more massive. Galaxies formed earlier tend to have more time to accumulate stellar mass. This dependence is rather weak in EAGLE. On the contrary, at high masses, all simulations with full physics demonstrate an inverse relationship between stellar mass and age, indicating that more massive galaxies tend to be younger. A possible explanation is that in this mass range, galaxies quenched earlier are older and less massive, while galaxies quenched later or remaining active are younger and more massive. When AGN feedback is deactivated and galaxies do not get quenched, as in LGalw/oAGN, galaxy stellar mass growth in massive halos follows formation mechanisms similar to those in low-mass halos. Therefore, older galaxies have had more time to form stars, leading to a consistent correlation between galaxy age and the scatter around the SHMR across all mass ranges.
The relationship between age and stellar mass is still far from conclusive in the literature. Several studies (Gallazzi et al. 2021; Shuntov et al. 2022; Scholz-Díaz et al. 2024) have reported that galaxies with older stellar populations are more massive at fixed halo mass. However, Oyarzún et al. (2022) found that low-mass galaxies are older at $M_{\rm h} < 10^{12}~\rm M_\odot$, while high-mass galaxies are older at large halo masses, which is contrary to our predictions. Kulier et al. (2019) found that age does not correspond to the scatter in the SHMR; instead, it is related to the scatter in the stellar mass-$V_{\rm max}$ relation.
The age dependence of the scatter around the SHMR is depicted in the bottom panel of Figure 6. In LGal, including age leads to a significant decrease in scatter across all halo mass ranges except for haloes at $10^{12}~\rm M_\odot$. This decrease amounts to up to 30% at low masses and 15% at high masses, indicating a tight correlation between age and scatter. In LGalw/oAGN, the contribution of age decreases with increasing halo mass, down to approximately 5% of the scatter at high masses. Age barely contributes to the scatter in EAGLE but has a stronger effect in TNG100-1, of up to 20%, indicating the strong influence of different sub-grid physics.

Cold gas mass
Another galaxy property of interest is cold gas. The star formation rate in all simulations is related to the cold gas mass, although different thresholds and formulas are adopted. Figure 7 depicts the SHMR colour-coded by $M_{\rm coldgas}$. It is important to note that the definition of cold gas differs among simulations, which may introduce additional differences in the cold gas dependence. At low masses, almost all simulations predict no cold gas mass dependence for galaxies at fixed halo mass, while EAGLE shows a slightly positive trend. At high masses, galaxies below the median stellar mass exhibit significantly lower $M_{\rm coldgas}$ in LGal, in line with the findings of Cui et al. (2021) for the SIMBA simulation. Galaxies with lower cold gas masses also exhibit lower SFR, as illustrated in Figure 5, suggesting a connection between the decline in SFR and cold gas caused by AGN feedback efficiency. In contrast, in EAGLE and TNG100-1 no evident dependence of the scatter on cold gas is observed, which indicates inefficient AGN feedback in these two simulations. This is further supported by the vanishing relation between the scatter and the cold gas in AGN feedback-free simulations, as shown for LGalw/oAGN.
The bottom panel of Figure 7 assesses the scatter in the SHMR as it relates to cold gas. The degree of scatter varies across the simulations. In LGal, the scatter is significantly reduced, by up to 15% at intermediate and high masses. Conversely, in the EAGLE and TNG100-1 simulations, the reduction in scatter is less pronounced, amounting to less than 10%, and is only observable at the extreme ends of the SHMR. These differences may stem from different star formation recipes, given the varying relationships between SFR and cold gas in each simulation. Additionally, we employ slightly different definitions of cold gas among these simulations, which introduces additional complexities.

Black hole mass
We have undertaken a similar analysis on the Supermassive Black Holes (SMBHs) in the simulations, considering the close relationship between SMBH mass and AGN feedback in all simulations (see Section 2 for details). Figure 8 shows the SHMR colour-coded by $M_{\rm BH}$ in LGal, EAGLE and TNG100-1. It does not include results from LGalw/oAGN because $M_{\rm BH} = 0$ for all galaxies in this simulation. In LGal, galaxies with more massive SMBHs usually have higher stellar masses at both low and high masses, but this trend reverses at intermediate masses. In the EAGLE simulation, there is no significant trend between the scatter and SMBH mass, as also reported by Correa & Schaye (2020). A positive relationship is evident in TNG100-1 at lower masses, suggesting a co-evolution of the stellar components and the SMBH, which disappears at higher masses. The discrepancy among the simulations could be due to the different SMBH growth and feedback models in each simulation. This relationship suggests a complex co-evolution between galaxies and their central supermassive black holes (see Kormendy & Ho 2013, for a review). While such a dependence is expected at lower masses, its counter-intuitive manifestation at higher masses is noteworthy. In massive systems, potent feedback from AGN counteracts gas cooling, maintaining these galaxies in a quenched state. Stellar and SMBH mass growth predominantly stems from mergers and merger-related events. Previous studies indicate that at higher masses, mergers drive stellar mass growth in galaxies (Guo & White 2008; Oser et al. 2010; Rodriguez-Gomez et al. 2016; Qu et al. 2017; Pillepich et al. 2018b; Moster et al. 2018; Bradshaw et al. 2020), and both mergers and merger-induced gas inflow drive SMBH growth, thereby sustaining the co-growth dynamic between stellar mass and SMBH mass. However, at intermediate masses where AGN feedback sets in, the relationship between SMBH mass and stellar mass becomes more intricate. In certain systems, AGN feedback strongly suppresses star formation, while in others, it may not be sufficiently effective. Therefore, the association between stellar mass and SMBH mass in haloes of a given mass may depend on the specific models of baryonic physics.
The bottom panel in Figure 8 quantifies the relationship between the scatter in the SHMR and the SMBH mass in the simulations. This dependence closely mirrors that observed with cold gas in both the LGal and EAGLE simulations. The dependence on SMBH mass is considerably more pronounced in the TNG100-1 simulation. Specifically, at intermediate masses, the dependence on SMBH mass can account for up to 40% of the scatter, while at high masses, the SMBH contributes approximately 20% of the scatter. This indicates substantial co-evolution of the SMBH and its host galaxy, particularly in the TNG100-1 simulation.

Figure 7. Same as Figure 2, but each bin is coloured by the average cold gas mass of the central galaxies. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers cold gas mass.

Relative dependence on halo and galaxy properties in various simulations
In this section, we first investigate the impact of individual halo and galaxy properties on the scatter around the SHMR, as well as their variations with halo mass across a range of simulations and models. Following this, machine learning techniques are employed to evaluate the collective influence of potential halo and galaxy parameters and to ascertain the importance of different properties in regulating the scatter throughout all mass categories.

Diverse impact of individual halo and galaxy properties on various halo mass levels
Figure 9 shows $\sigma(\Delta M_{*,\rm prop})$ as a function of halo mass for various properties. The black dashed lines represent the standard deviation of $\Delta M_*$ (Eq. 14), where only halo mass is considered in determining stellar mass. Solid lines show the standard deviation of $\Delta M_{*,\rm prop}$ (Eq. 16), where both halo mass and a secondary property are taken into account in determining stellar mass.
The distribution of the scatter around the SHMR is influenced by both halo properties and galaxy properties, with the extent of their contribution varying with halo mass and differing across simulations. In LGal, galaxy properties primarily influence the scatter at both low and high masses, while halo properties such as concentration and formation time significantly affect the scatter at intermediate masses, $10^{11.5}~\rm M_\odot < M_{\rm h} < 10^{12}~\rm M_\odot$. For $M_{\rm h} < 10^{11}~\rm M_\odot$, galaxy age exhibits a greater impact than halo concentration and formation time. In the ranges $10^{12}~\rm M_\odot < M_{\rm h} < 10^{12.5}~\rm M_\odot$ and $10^{12.5}~\rm M_\odot < M_{\rm h} < 10^{13}~\rm M_\odot$, SFR and age dominate the scatter, respectively. At higher masses, supermassive black hole mass (and consequently AGN feedback) becomes the driving factor.

[Figure residue: panels LGal, LGalw/oAGN, EAGLE, TNG100-1; legend entries $\log_{10}(c)$, $\log_{10}(1+z_{\rm f})$, $\log_{10}(\delta+1)$, $\log_{10}(M_{\rm major}/M_{\rm h,0})$, $\log_{10}(1+z_{\rm major})$, $\log_{10}({\rm SFR})$, $\log_{10}(M_{\rm coldgas})$, $\log_{10}(M_{\rm BH})$, $\log_{10}({\rm age})$.]
Since AGN feedback primarily impacts high-mass galaxies in LGal, we observe that at higher masses, without AGN feedback, halo concentration and formation time are the main factors affecting the scatter around the SHMR. This highlights a co-evolution between the SFR and halo growth rate. Interestingly, at lower masses, $M_{\rm h} < 10^{12}~\rm M_\odot$, galaxy properties like age also play a significant role in the variability in stellar mass, even in the absence of AGN feedback, showcasing the substantial influence of baryonic processes in regulating stellar mass growth in galaxies.
Significant differences are found in EAGLE and TNG100-1 compared to LGal. In both the EAGLE and TNG100-1 simulations, none of the properties is able to significantly diminish the scatter at high masses. At low masses, halo concentration and formation time are always the primary factors contributing to the scatter. A detailed comparison reveals that in EAGLE, accounting for the dependence on concentration is more effective in reducing the scatter than accounting for formation time. Concentration remains the dominant and effective factor in reducing scatter in halos up to $10^{12.5}~\rm M_\odot$. In TNG100-1, the advantage of using concentration over formation time is less pronounced, and at around $10^{11}~\rm M_\odot$, the dependence on concentration and formation time is similar. Additionally, concentration ceases to effectively reduce scatter in halos more massive than $10^{12}~\rm M_\odot$, a lower threshold than in EAGLE. In EAGLE, none of the explored galaxy properties was found to effectively reduce scatter, while in TNG100-1, SMBH mass, which is related to AGN feedback, could also contribute to the scatter. Using the UM (Behroozi et al. 2019), Bradshaw et al. (2020) find that halo formation time is the primary contributor to the scatter in the SHMR for haloes larger than $10^{13.5}~\rm M_\odot$, while halo accretion rate also plays a significant role. Moreover, they note that the last major merger time contributes to the scatter, particularly for haloes with $M_{\rm h} \sim 10^{14.4}~\rm M_\odot$, indicating that the halo assembly history in UM is different from that in L-Galaxies, EAGLE and TNG100-1. Additionally, Martizzi et al. (2020) observe that halo formation time predominantly influences the scatter in the stellar mass to peak halo mass relation, whereas environmental factors exert a weaker influence in TNG100.
In summary, it is only in the intermediate mass range that halo concentration and formation time dominate the scatter in all full physics models. At low and high masses, galaxy properties such as SFR, age, and SMBH mass are more closely related to the scatter in LGal. In contrast, in EAGLE and TNG100-1, halo properties, especially concentration and formation time, emerge as the primary contributors to the scatter, a contribution which disappears in the most massive systems. In general, galaxy properties have a greater impact on the scatter around the SHMR than halo properties in LGal, whereas the opposite holds in EAGLE and TNG100-1.

General dependence on halo and galaxy properties
In this section, we employ a machine learning approach, similar to the methodology of He et al. (2022), to investigate the correlation between the scatter in the SHMR and the halo/galaxy properties discussed above. For this purpose, we use the Lasso (least absolute shrinkage and selection operator, Tibshirani 1996) algorithm, as implemented in Scikit-Learn (Pedregosa et al. 2011), to compute the coefficients relating the SHMR scatter to the various features. Lasso is a regression analysis technique originally devised for linear models. It incorporates a penalty term proportional to the sum of the absolute values of the coefficients, so that the loss function takes the form
$$\frac{1}{2N}\,\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1 ,$$
where $y$ is the target, $X$ the feature matrix, $w$ the vector of coefficients, $N$ the number of samples, and $\alpha$ the regularization strength (the notation here follows the Scikit-Learn convention). Lasso stands out for its dual functionality in variable selection and regularization. It effectively shrinks less important coefficients towards zero, promoting sparsity and facilitating model interpretability.

Figure 11. The results of machine learning with only the halo properties as input ("Halo" model) for all the simulations. The "MR" and "MR without AGN" panels only contain haloes larger than $10^{12}~\rm M_\odot$, while "MRII" and "MRII without AGN" only contain haloes smaller than $10^{12}~\rm M_\odot$. The left column shows the predicted $M_*$ vs. the true $M_*$ for the validation set. The right column shows the absolute coefficients from the Lasso regression for the features used in the training. Since we have normalised the features into units, the coefficients represent the contributions of each feature.

Figure 12. Same as Figure 11, but all the halo/galaxy properties are used ("All" model).

Two distinct machine-learning models have been developed. The first relies only on halo properties ("Halo" model), taking halo concentration $c$, halo formation time $z_{\rm f}$, overdensity $\delta$, and major merger halo mass $M_{\rm major}$ as inputs; the second also includes galaxy properties ("All" model).
We exclude $z_{\rm major}$ because its influence is weak at all masses and many systems never undergo major mergers. $M_{\rm BH}$ is also excluded in simulations without AGN feedback. To evaluate the contribution of each feature to the predictions, we normalize all features to a common unit scale.
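To make the role of the penalty concrete, the following is a small, dependency-free sketch of Lasso fitted by coordinate descent. It is not the Scikit-Learn implementation used for the results; it assumes the objective $(1/2N)\sum(\text{residual})^2 + \alpha\sum|w|$, standardises each feature to zero mean and unit variance (one possible reading of the normalisation), and uses toy data with hypothetical feature names and an arbitrary value of $\alpha$.

```python
import statistics


def standardise(column):
    """Rescale a feature to zero mean and unit standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]


def soft_threshold(value, alpha):
    """Shrink `value` towards zero by `alpha`; this is the effect of the L1 penalty."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_fit(columns, target, alpha, n_iter=200):
    """Minimise (1 / 2n) * sum(residual^2) + alpha * sum(|w|) by coordinate descent."""
    n = len(target)
    mean_t = statistics.fmean(target)
    y = [t - mean_t for t in target]  # centring the target removes the intercept
    weights = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Residual with the current contribution of feature j left out.
            partial = [
                y[i] - sum(weights[k] * columns[k][i]
                           for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(c * r for c, r in zip(col, partial)) / n
            norm = sum(c * c for c in col) / n  # equals 1 for standardised features
            weights[j] = soft_threshold(rho, alpha) / norm
    return weights


# Toy data: the target follows the first feature strongly and the second weakly.
log_c = [0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.2, 0.2]
log_delta = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
delta_mstar = [-0.37, -0.27, -0.33, -0.23, 0.23, 0.33, 0.27, 0.37]
features = [standardise(log_c), standardise(log_delta)]
weights = lasso_fit(features, delta_mstar, alpha=0.05)
```

In this toy case the weak feature receives a coefficient of exactly zero while the strong one is shrunk but retained, which is the sparsity property that makes the absolute coefficients usable as a ranking of feature importance.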
For each simulation, we randomly choose a sample of 20,000 central galaxies, shuffle them, and split them into a training set (75% of the sample) and a validation set (25% of the sample). We also experiment with a different sample selection method, picking the same number of randomly chosen galaxies in each halo mass bin, and obtain consistent results. All halo masses are considered for EAGLE and TNG100-1. For LGal and LGalw/oAGN, we perform machine learning in MR ($M_{\rm h} > 10^{12}~\rm M_\odot$) and MRII ($10^{10.6} < M_{\rm h} < 10^{12}~\rm M_\odot$) separately. These are referred to as LGal_on_MR, LGal_on_MRII, LGalw/oAGN_on_MR, and LGalw/oAGN_on_MRII, in order to maintain consistency with previous sections. As a consequence, a potential bias toward low masses might exist in MRII, as well as a potential bias toward high masses in MR. Due to their better statistics at high masses, we place more reliance on the results from MR, despite the more pronounced resolution effect at these masses. It should be acknowledged that incorporating less massive systems in MR could alter the machine learning results to some extent.
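The shuffle-and-split step can be written compactly. The sketch below is illustrative only: the function name and the fixed random seed are assumptions, and integer identifiers stand in for the 20,000 randomly chosen central galaxies.

```python
import random


def train_validation_split(sample, train_fraction=0.75, seed=0):
    """Shuffle a copy of `sample` and split it into training and validation sets."""
    shuffled = list(sample)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


galaxy_ids = range(20000)  # stand-in for the selected central galaxies
train, validation = train_validation_split(galaxy_ids)
```

With these inputs the training set holds 15,000 galaxies and the validation set 5,000, and every galaxy appears in exactly one of the two sets.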
The correlation coefficients among the eight parameters in the "All" model are presented in Figure 10. Apart from the well-known correlation between $c$ and $z_{\rm f}$, the halo properties do not show strong correlations with each other. The coefficients among galaxy properties are higher in semi-analytical models than in hydrodynamical simulations, indicating a degree of degeneracy in L-Galaxies. In particular, the age, cold gas mass, black hole mass and SFR are strongly correlated in LGal_on_MR, while these connections vanish in LGal_on_MRII. This is mainly due to the different halo mass ranges adopted in MR and MRII. We have explored the correlation coefficients for haloes with $M_{\rm h} > 10^{12}~\rm M_\odot$ in LGal_on_MRII and find similar connections between these galaxy properties. To mitigate the effect of degenerate parameters, we exclude one of the features from our input if the correlation coefficient exceeds 0.75.
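A simple way to apply such a cut is sketched below. The rule for choosing which member of a degenerate pair to drop is not specified in the text, so this example assumes that features are visited in input order and that a feature is dropped when its Pearson coefficient with an already kept feature exceeds 0.75 in absolute value. The function names and the toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_degenerate(features, threshold=0.75):
    """Keep features in input order, skipping any that is too correlated with a kept one."""
    kept = {}
    for name, values in features.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept


# Toy features: the first two are nearly degenerate, the third is not.
features = {
    "log_age": [0.10, 0.20, 0.30, 0.40, 0.50],
    "log_mbh": [0.12, 0.19, 0.33, 0.38, 0.52],
    "log_delta": [0.30, -0.10, 0.20, -0.20, 0.10],
}
kept = drop_degenerate(features)
```

Here `log_mbh` is removed because it is almost perfectly correlated with `log_age`, while `log_delta` is retained.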
Results based on the "Halo" model are presented in Figure 11.The model exhibits satisfactory predictions overall, with better performance in MRII, EAGLE, TNG100-1 and LGalw/oAGN_on_MR compared to those in LGal_on_MR with full physics, reflected by their relatively lower Root Mean Square Errors (RMSE).This aligns with expectations that at high masses galaxy growth deviates from halo growth when AGN feedback comes into play.The lower RMSE in LGal_on_MRII compared to hydrodynamical simulations indicates that baryonic effects impact halo properties, complexing the influence of halos on galaxy formation.The absolute coefficients in the right column show that the halo concentration and formation time are identified as the most influential features in all simulations.The environment overdensity  has a relatively minor influence on the scatter, except for TNG100-1 where the coefficient drops to zero.The impact of  major is negligible for all the simulations.These findings align with those presented in Figure 9, highlighting halo concentration and formation time as the key secondary properties that significantly reduce the scatter in the SHMR.In contrast,  and  major have minimal effect on reducing scatter.Furthermore, we expand our analysis to include massive haloes in MRII (all haloes with  h > 10 10.6 M ⊙ ) and smaller haloes in MR (all haloes with  h > 10 11 M ⊙ ), finding that the main outcomes of the "Halo" model remain stable, irrespective of the selected mass ranges.
Figure 12 illustrates the results obtained using the "All" model. The overall fit improves when galaxy properties are included. The RMSE of LGal_on_MR decreases from 0.26 in the "Halo" model to 0.2 in the "All" model, indicating a non-negligible improvement in predicting stellar mass within high-mass dark matter haloes when galaxy properties are incorporated. The other simulations show a decrease of approximately 0.02 in RMSE, a slight improvement in performance in the "All" model.
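The comparison between a halo-only and an all-feature model can be illustrated with a short sketch. The regression algorithm is not specified in this section, so ordinary least squares on standardized features is used here purely as an assumption; the function name `fit_standardized_linear`, the mock features and the mock coefficients are hypothetical and do not reproduce the values in Figures 11 and 12.

```python
import numpy as np

def fit_standardized_linear(X, y):
    """Least-squares fit on z-scored features.
    Returns the absolute coefficients (a simple importance measure)
    and the RMSE of the prediction on the training data."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    A = np.column_stack([Xs, np.ones(len(y))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmse = np.sqrt(np.mean((y - A @ coef) ** 2))
    return np.abs(coef[:-1]), rmse

# Mock sample (hypothetical): the target depends on one halo property
# and, more strongly, on one galaxy property.
rng = np.random.default_rng(1)
n = 4000
conc = rng.normal(size=n)
age = rng.normal(size=n)
log_mstar = 0.10 * conc + 0.15 * age + 0.05 * rng.normal(size=n)

halo_coef, halo_rmse = fit_standardized_linear(conc[:, None], log_mstar)
all_coef, all_rmse = fit_standardized_linear(np.column_stack([conc, age]), log_mstar)
print(f"RMSE halo-only: {halo_rmse:.3f}, all features: {all_rmse:.3f}")
```

Because the features are standardized before fitting, the absolute coefficients can be compared directly, which is the sense in which one feature is said to dominate another.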
In the "All" model, at high masses, age dominates the scatter in LGal_on_MR, while concentration has a slightly smaller coefficient than age. This is consistent with Figure 9, where SFR, age, and $M_{\rm BH}$ dominate the scatter at high masses; all these properties are strongly degenerate. For LGalw/oAGN_on_MR, halo concentration becomes the primary factor influencing the scatter around the SHMR, again consistent with Figure 9. The cold gas mass, $M_{\rm gas}$, is the second most important factor, with an effect on the estimated stellar mass comparable to that of halo concentration. As a test, we also train the "All" model after adding haloes of relatively low mass, $[10^{11}, 10^{12}]~\rm M_\odot$, from MR (Figure D1). As shown in Figure 9, within $[10^{11}, 10^{12}]~\rm M_\odot$, halo formation time and concentration are the key secondary properties. Given the similar importance of halo concentration/formation time and galaxy age, incorporating these lower-mass haloes shifts the most relevant properties from galaxy characteristics to halo features, specifically the formation time. $M_{\rm gas}$ then has a slightly lower but comparable effect on the estimated stellar mass relative to halo formation time. In summary, galaxy properties play a dominant or subdominant role in determining the scatter around the SHMR in MR, whether the less massive systems are excluded or included. Nevertheless, in both scenarios, their role is comparable to that of halo formation time and concentration.
At low masses, halo concentration and formation time are the most effective properties in reducing the scatter in all the simulations, including L-Galaxies with and without AGN on MRII, as well as EAGLE and TNG100-1. $M_{\rm gas}$ is the second most important contributor to the stellar mass, but its coefficient is only 1/3 that of concentration in L-Galaxies with and without AGN on MRII. This indicates that the stellar mass in MRII is predominantly determined by halo properties, and the contribution of galaxy properties is much weaker. Figure 9 indicates that galaxy age could reduce the scatter significantly at very low masses in LGal and LGalw/oAGN, a trend that does not appear in Figure 12. This could be attributed to the broader mass ranges used in machine learning, over most of which halo properties are significant for the scatter. We also train the "All" model for all haloes larger than $10^{10.6}~\rm M_\odot$ in MRII, and find that halo concentration remains the dominant factor in determining the scatter around the SHMR (Figure D1).
In conclusion, our machine learning analysis reveals that halo concentration/formation time dominates the scatter around the SHMR at low masses. These findings align with results obtained by Matthee et al. (2017) in the EAGLE simulation and Martizzi et al. (2020) in TNG100. In particular, Bradshaw et al. (2020) noted that in the UM, halo formation time contributes significantly in galaxy clusters with $M_{\rm h} \sim 10^{14.4}~\rm M_\odot$, indicating that the detailed dependence relies on the underlying sub-grid physics. Galaxy properties play a pivotal role in MR, while they contribute less in MRII. The influence of galaxy properties in hydrodynamical simulations is even weaker. In general, the role of baryonic processes varies among simulations and galaxy formation models, with baryonic processes playing a more crucial role in semi-analytical galaxy formation models in MR at high masses.

INFLUENCE OF AGN FEEDBACK
One of the most significant findings of our study is the substantial decrease in the scatter of the SHMR at the high-mass end observed in L-Galaxies when the AGN feedback is turned off. The introduction of AGN feedback doubles the scatter at high masses, while it has no impact on the scatter at low masses. This increase in scatter is also reported by Rosas-Guevara et al. (2015). At high masses, a significant number of galaxies show low stellar mass, low SFR, and low cold gas mass, indicating that AGN feedback inhibits star formation activity in these galaxies, resulting in the appearance of a population of low-mass galaxies that contribute significantly to the scatter. This is in line with the conclusions of Cui et al. (2021) in the SIMBA simulation, where jet-mode and X-ray AGN feedback suppresses galaxies, leading to the formation of red, lower-mass galaxies.
Incorporating AGN feedback additionally decreases the influence of factors such as halo concentration and formation time on the scatter around the SHMR. At lower masses ($< 10^{12}~\rm M_\odot$), where AGN feedback is ineffective in all simulations and models, star formation activity aligns with halo growth. Consequently, halo concentration and formation time serve as secondary factors in determining stellar mass (major contributors to the scatter around the SHMR). In contrast, at higher masses, AGN feedback impacts the entire star formation process, decreasing the influence of halo concentration and formation time in EAGLE and TNG100-1. The AGN feedback effect is more prominent in LGal, becoming the main driver and resulting in an increase in the scatter around the SHMR.
In summary, AGN feedback significantly affects not only the amplitude of the scatter but also the impact of halo properties on the scatter around the SHMR, especially at high masses. While models that link galaxies and halos using halo mass and concentration are reliable at low masses, more caution should be exercised in establishing such a link in massive systems.

CONCLUSION
Despite a well-established correlation between the stellar mass and halo mass, substantial variations in stellar mass persist at a given halo mass. In this work, we analyze data from three simulations (L-Galaxies, EAGLE, and TNG100-1) and explore a range of halo properties, encompassing halo concentration, formation time, environment, and mass assembly history. Additionally, we consider galaxy properties such as star formation rate, age, cold gas mass and supermassive black hole mass. To discern the influence of AGN feedback, we compare results from simulations with and without this process. Overall, our findings are consistent across simulations, with minor divergences attributed to distinct sub-grid physical recipes. Our main conclusions can be summarised as follows:
(i) For low-mass haloes ($M_{\rm vir} < 10^{11.5}~\rm M_\odot$), the scatter in the SHMR is almost constant ($\sigma \approx 0.15$) in L-Galaxies, while the scatter is much larger in the EAGLE and TNG100-1 simulations. For high-mass haloes ($M_{\rm vir} > 10^{12}~\rm M_\odot$), the scatter is 2 times larger in LGal (with AGN feedback on) compared to LGalw/oAGN (with AGN feedback off), EAGLE, and TNG100-1.
(ii) The scatter in the SHMR strongly depends on the concentration and the formation time of the host haloes, especially at lower masses. For a given halo mass, galaxies in early-formed high-concentration haloes exhibit an excess of stellar mass compared to those in late-formed low-concentration haloes.
(iii) Baryonic processes play a more critical role in determining the scatter around the SHMR in LGal compared to EAGLE and TNG100-1. In LGal, SFR, age, and BH mass dominate the scatter around the SHMR in most halo mass ranges except for $10^{11.5}~\rm M_\odot < M_{\rm h} < 10^{12}~\rm M_\odot$, where halo concentration and formation time take precedence. Conversely, in EAGLE and TNG100-1, halo concentration and formation time govern the scatter at lower masses, with galaxy properties never being the primary driver of the scatter.
(iv) AGN feedback plays a crucial role in the scatter around the SHMR at high masses. It dominates the scatter in LGal and diminishes the dependencies of scatter on halo concentration and formation time in EAGLE and TNG100-1.
(v) Environmental overdensities and mergers do not significantly affect the scatter in the SHMR. Their impact is noticeable only in low-mass halos. Galaxies in high-density areas have slightly higher stellar masses than those in low-density areas; galaxies in halos that experienced an early major merger have slightly larger stellar masses compared to galaxies in halos with a late major merger.
In summary, our analysis reveals that the source of the scatter in the SHMR exhibits a transitional mass at around $10^{12}~\rm M_\odot$ and varies among simulations and models. At lower masses and at fixed halo mass, more massive galaxies are characterized by higher halo concentration, earlier formation time, denser environment, and earlier last major merger time. In LGal, SFR and age are more important than in EAGLE and TNG100-1. At higher masses, AGN feedback plays a prominent role in increasing the scatter, and in diminishing the influence of halo concentration and formation time on the growth of stellar mass. In general, baryonic processes play a more important role in semi-analytical models than in TNG100-1 and EAGLE.

Here we show the results from MR and MRII separately, and select galaxies from MR with $M_{\rm h} > 10^{11}~\rm M_\odot$ and from MRII with $M_{\rm h} > 10^{10.6}~\rm M_\odot$. The vertical dashed lines show the point where $M_{\rm h} = 10^{12}~\rm M_\odot$. We find that in the mass range $10^{11} < M_{\rm h} < 10^{12}~\rm M_\odot$, both LGal_on_MR and LGalw/oAGN_on_MR show a $2\sigma$ scatter similar to that of LGal_on_MRII and LGalw/oAGN_on_MRII, indicating that L-Galaxies converges well in this mass range. The outlier population is much larger in MR than in MRII due to the larger sample size in MR. However, LGal_on_MR shows a steep increase in scatter around $10^{12}~\rm M_\odot$, which is absent in the other three L-Galaxies runs. This could be attributed to different SMBH growth histories, which are closely related to mergers and thus sensitive to the resolution of the simulation. The good overall agreement between LGalw/oAGN_on_MR and LGalw/oAGN_on_MRII also indicates that the increasing scatter in LGal_on_MR is related to SMBH and AGN feedback. Both EAGLE and TNG100-1 show a decreasing scatter with increasing halo mass. At low masses, they predict higher scatter than LGal. In general, the overall behaviour of the scatter is similar to Figure 1.
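The scatter as a function of halo mass discussed here can be measured with a simple binned estimator, sketched below on mock data. The function name `binned_scatter`, the bin width, and the use of half the 16th to 84th percentile range as the $1\sigma$ scatter are assumptions made for illustration; the minimum of 20 galaxies per bin follows the selection quoted in the caption of Figure 1. The mock amplitudes are illustrative only.

```python
import numpy as np

def binned_scatter(log_mh, log_mstar, bin_edges, min_count=20):
    """Median and 1-sigma scatter of log stellar mass in halo mass bins.
    The scatter is taken as half the 16th-84th percentile range;
    bins with fewer than min_count galaxies are skipped."""
    centres, medians, sigmas = [], [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (log_mh >= lo) & (log_mh < hi)
        if in_bin.sum() < min_count:
            continue
        p16, p50, p84 = np.percentile(log_mstar[in_bin], [16, 50, 84])
        centres.append(0.5 * (lo + hi))
        medians.append(p50)
        sigmas.append(0.5 * (p84 - p16))
    return np.array(centres), np.array(medians), np.array(sigmas)

# Mock SHMR (hypothetical): the scatter doubles above log10(M_h) = 12.
rng = np.random.default_rng(2)
log_mh = rng.uniform(11.0, 14.0, size=20000)
true_sigma = np.where(log_mh < 12.0, 0.15, 0.30)
log_mstar = 10.0 + 0.5 * (log_mh - 12.0) + true_sigma * rng.normal(size=log_mh.size)

edges = np.arange(11.0, 14.01, 0.25)
centres, medians, sigmas = binned_scatter(log_mh, log_mstar, edges)
for c, s in zip(centres, sigmas):
    print(f"log Mh = {c:.3f}: sigma = {s:.3f}")
```

With finite bin widths the slope of the relation inside each bin adds slightly to the measured scatter, so narrow bins or residuals about the median relation are preferable when the relation is steep.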

APPENDIX B: THE DEPENDENCE ON HALO FORMATION TIME AND LAST MAJOR MERGER TIME

B1 Halo formation time
The halo formation time is tightly related to its concentration, as early-formed halos are more concentrated (Wechsler et al. 2002; Zhao et al. 2009; Jeeson-Daniel et al. 2011; Ludlow et al. 2014; Correa et al. 2015). Our work also finds that the correlation between the scatter in the SHMR and halo formation time resembles that with concentration. The $\Delta M_*$-$M_{\rm h}$ relation, colour-coded by halo formation time, is shown in Figure B1, similar to Figure 2. Galaxies in earlier-formed halos tend to be more massive, with this trend being particularly pronounced at lower masses across all these simulations. At high masses, the trend is relatively weak except for LGalw/oAGN, indicating that AGN feedback may decouple the growth of stellar components from halo accretion. All these findings mirror the dependence on concentration, revealing the strong correlation between halo formation time and concentration. Further discussion and comparison with prior studies are provided in Section 3.1.1. The bottom panel in Figure B1 quantifies the dependence on halo formation time as a function of host halo mass. At low masses, the scatter is reduced by approximately one-third after considering formation time. However, at high masses, the role of formation time weakens, with AGN influence becoming predominant. For LGalw/oAGN in particular, the scatter can be reduced by 30% across all mass ranges. The overall dependence on formation time mirrors that on concentration (illustrated in the lower panel of Figure 2), affirming the robust correlation between halo concentration and formation time. Furthermore, the reduction in scatter obtained with formation time is slightly smaller than that obtained with concentration, especially for LGal and EAGLE, suggesting that concentration may have a greater influence on the scatter than formation time.
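The fractional reduction in scatter quoted above can be illustrated with a minimal sketch. The second fitting formula used in this work is not reproduced in this appendix, so a linear fit of the stellar mass residual against the secondary property is assumed here; the function name `scatter_reduction`, the mock formation redshift and the mock amplitudes are hypothetical.

```python
import numpy as np

def scatter_reduction(delta_mstar, prop):
    """Scatter before and after removing a linear trend with a
    secondary property, plus the fractional reduction."""
    sigma_h = np.std(delta_mstar)                    # scatter about the SHMR
    slope, intercept = np.polyfit(prop, delta_mstar, 1)
    residual = delta_mstar - (slope * prop + intercept)
    sigma_prop = np.std(residual)                    # scatter after the second fit
    return sigma_h, sigma_prop, 1.0 - sigma_prop / sigma_h

# Mock residuals (hypothetical): half of the variance is tied to
# the formation redshift, half is intrinsic.
rng = np.random.default_rng(3)
n = 10000
formation_z = rng.normal(1.0, 0.3, size=n)
delta_mstar = 0.4 * (formation_z - 1.0) + 0.12 * rng.normal(size=n)

sigma_h, sigma_prop, reduction = scatter_reduction(delta_mstar, formation_z)
print(f"sigma_h = {sigma_h:.3f}, sigma_prop = {sigma_prop:.3f}, "
      f"reduction = {100 * reduction:.0f}%")
```

In practice the same measurement would be repeated in bins of halo mass, which is how a mass-dependent reduction such as the one in the bottom panel of Figure B1 is obtained.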

B2 Last major merger time
We also consider the last major merger redshift, $z_{\rm major}$, to further investigate the impact of major mergers. For a central galaxy, $z_{\rm major}$ represents the redshift at which it underwent its last major merger. A larger $z_{\rm major}$ indicates that the galaxy experienced its last major merger earlier and has had a relatively smooth growth history since then. In Figure B2, we present the SHMR coloured according to $z_{\rm major}$. If a halo does not experience a major merger in the given simulation, we set $z_{\rm major} = 127$. The impact of $z_{\rm major}$ is similar across all simulations at low masses. At fixed halo mass, more massive galaxies have a higher $z_{\rm major}$, implying that galaxies with a more gradual growth history accumulate more stellar mass. This is also related to the positive correlation between $z_{\rm major}$ and halo formation time. The influence of $z_{\rm major}$ weakens significantly at high masses. As with halo concentration and formation time, at these masses AGN feedback can suppress star formation while the host haloes keep growing, decoupling the stellar mass growth history from the halo mass growth history.
We show the quantitative impact of $z_{\rm major}$ in the bottom panel of Figure B2. At all masses, the contribution of $z_{\rm major}$ is almost negligible. The influence is slightly larger at $10^{12} < M_{\rm h} < 10^{12.5}~\rm M_\odot$, where it can reach up to 10% in LGal, and diminishes towards both lower and higher halo masses.

APPENDIX C: DEPENDENCE ON BOX SIZE AND RESOLUTION IN HYDRODYNAMICAL SIMULATIONS
It is worth noting that IllustrisTNG might have some convergence problems in simultaneously reproducing the stellar mass function with the same sub-grid physics and parameters in simulations with different box sizes and particle masses, like TNG100 and TNG300 (Pillepich et al. 2018b). In this section, we test whether the scatter in the SHMR as a function of halo mass is influenced by such a phenomenon. We examine the scatter in the TNG50, TNG100, and TNG300 simulations and present the results in Figure C1. The left panel shows the SHMR in these simulations. We observe an increasing stellar mass with decreasing box size at fixed halo mass, indicating that IllustrisTNG tends to predict more massive galaxies with smaller box sizes and higher resolutions. The right panel shows the scatter in the SHMR as a function of halo mass in these simulations. We find a general decreasing trend towards higher halo mass. However, simulations with smaller boxes and higher resolution predict lower scatter at fixed halo mass for $M_{\rm h} < 10^{12}~\rm M_\odot$. The scatter appears to be independent of the size of the simulation box at high masses. For the EAGLE simulation, Matthee et al. (2017) compared the scatter in EAGLE100 and EAGLE50 and found similar results (see their Appendix B for details). They also report that higher resolution tends to give a smaller scatter in the SHMR.

APPENDIX D: MACHINE LEARNING ON THE WHOLE MASS RANGE OF MR AND MRII
We further expand our machine learning method to include massive haloes in MRII (all haloes with $M_{\rm h} > 10^{10.6}~\rm M_\odot$) and smaller haloes in MR (all haloes with $M_{\rm h} > 10^{11}~\rm M_\odot$) for both the "Halo" model and the "All" model. The main findings in the "Halo" model remain stable, irrespective of the selected mass ranges. The main

and the Millennium-II Simulation (MRII; Boylan-Kolchin et al. 2009), and rescale the simulations to those with the first-year Planck cosmology parameters (Planck Collaboration et al. 2014): $\sigma_8 = 0.829$, $H_0 = 67.3~\rm km~s^{-1}~Mpc^{-1}$, $\Omega_\Lambda = 0.685$, $\Omega_{\rm m} = 0.315$, $\Omega_{\rm b} = 0.0487$ ($f_{\rm b} = 0.155$) and $n = 0.96$.

Figure 1. Top panel: the median values of the stellar-halo mass relation (SHMR) in different simulations. Red (orange) lines show the results of L-Galaxies with (without) the inclusion of the AGN model. Blue and cyan lines show the results from the hydrodynamical simulations, EAGLE and TNG100-1, respectively. For comparison, the dashed line shows the result from the semi-analytical model by Guo et al. (2010) and the results (shaded regions) obtained from abundance matching techniques (Moster et al. 2018; Behroozi et al. 2019). We only show the data points whose mass bins contain at least 20 galaxies. Bottom panel: the $1\sigma$ scatter of stellar mass as a function of halo mass. The red dotted line shows the result from LGal_on_MRII. Observational results (Yang et al. 2009; More et al. 2009; Zu & Mandelbaum 2015) are presented for comparison.
Figure 2. Lines with different colours represent different simulations. The dashed lines show the standard deviation of $\Delta M_{*,\rm h}$ from Eq. 14, which only adopts halo
[Horizontal axis: $\log_{10}(M_{\rm h}/\rm M_\odot)$]

Figure 2. The distribution of delta stellar mass for central galaxies as a function of halo mass in the LGal, LGalw/oAGN, EAGLE, and TNG100-1 simulations. Black solid lines correspond to the medians of the distributions, while the black dashed (dotted) lines show the 16th (2nd) and 84th (98th) percentiles. The colours indicate the concentration of haloes in each halo mass and delta stellar mass bin. The grey vertical dotted lines in the top panels mark the transition mass, $M_{\rm h} = 10^{12}~\rm M_\odot$, between MRII (left) and MR (right). Bottom panel: the standard deviation of $\Delta M_*$ as a function of halo mass. Different colours represent different simulations. Dashed lines show $\sigma(\Delta M_{*,\rm h})$ from the SHMR, while solid lines show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers halo concentration. See the text for details.

Figure 3. Same as Figure 2 but each bin is coloured by the mean environmental overdensity of the haloes. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers the environmental overdensity.

Figure 4. Same as Figure 2 but each bin is coloured by the mean major-merged satellite mass fraction, $M_{\rm major}/M_{\rm h}$. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers $M_{\rm major}/M_{\rm h}$.

Figure 5. Same as Figure 2 but each bin is coloured by the average SFR of central galaxies at $z = 0$. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers SFR. The subplot in LGalw/oAGN highlights the differences at high masses with a different colour-bar range.

Figure 6. Same as Figure 2 but each bin is coloured by the mass-weighted age of the central galaxies. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers age.

Figure 8. Same as Figure 2 but each bin is coloured by the average mass of the supermassive black holes in LGal, EAGLE, and TNG100-1. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers black hole mass.

Figure 9. The standard deviation of $\Delta M_*$ from the fitting formula. Black dashed lines show $\sigma(\Delta M_{*,\rm h})$ from the SHMR, while solid lines show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers secondary properties. Different panels represent different simulations, as labelled in the upper-right corner. Different properties are shown in different colours, as labelled at the top.

Figure 10. Pearson correlation coefficients of the halo/galaxy properties used in machine learning in different simulations. The colour bar shows the absolute correlation coefficient between the two features. Larger coefficients mean stronger correlation.

Figure A1. The SHMR in different simulations. We show the results from MR and MRII separately. Different panels represent different simulations; the black solid lines represent the median value, and the solid/dashed/dotted coloured lines in each panel represent the 68%/96%/99.8% regions, respectively. The vertical dashed lines show the point where $M_{\rm h} = 10^{12}~\rm M_\odot$. Grey points denote galaxies that fall beyond the $3\sigma$ boundaries.

Figure B1. Same as Figure 2 but each bin is coloured by the formation time of the haloes. Solid lines in the bottom panel show $\sigma(\Delta M_{*,\rm prop})$ from the second fitting, which considers the formation time.
with star formation rate SFR, cold gas mass $M_{\rm gas}$, black hole mass $M_{\rm BH}$, and galaxy age. Note that here