The impact of gas accretion and AGN feedback on the scatter of the mass-metallicity relation

The gas-phase metallicity of galaxies encodes important information about galaxy evolution processes, in particular star formation, feedback, outflows and gas accretion, the relative importance of which can be extracted from systematic trends in the scatter of the mass-metallicity relation (MZR). Here, we use a sample of low redshift (0.02<z<0.055) galaxies from SDSS to investigate the nature of the scatter around the MZR, the observables and physical processes causing it, and its dependence on galaxy mass. We use cold gas masses inferred from optical emission lines using the technique of Scholte&Saintonge (2023) to confirm that at fixed stellar mass, metallicity and gas mass are anti-correlated, but only for galaxies up to M*= 10^{10.5} Msun. In that mass regime, we find a link between the offset of a galaxy from the MZR and halo mass, using the amplitude of the two-point correlation function as a proxy for halo mass; at fixed stellar mass, the most gas-poor galaxies reside in the most massive halos. This observation is consistent with changes in gas accretion rates onto galaxies as a function of halo mass, with environmental effects acting on satellite galaxies also contributing. At higher stellar masses, the scatter of the MZR does no longer correlate with gas or halo mass. Instead, there is some indication of a link with AGN activity, as expected from models and simulations that metallicity is set by the interplay between gas in- and outflows, star formation, and AGN feedback, shaping the MZR and its scatter.


INTRODUCTION
The gas-phase metallicity of a galaxy is directly linked to the metal yield of the star formation process and the cycling of gas in and out of the interstellar medium.The metallicity can for example be reduced by accretion of metal-poor gas or the ejection of preferentially metalenriched gas.Given this fundamental connection with all aspects of the baryon cycle, many of which are challenging to observe directly, the gas-phase metallicity is a powerful tool to constrain galaxy formation and feedback models (e.g.Tinsley 1980;Edmunds 1990;Davé et al. 2012;Ma et al. 2016).
The mass-metallicity relation (MZR) is the scaling between the stellar mass of galaxies and their gas-phase metallicity (e.g.Lequeux et al. 1979;Tremonti et al. 2004, see also Fig. 1,left).The MZR has been characterised up to  ∼ 3 (e.g.Erb et al. 2006;Mannucci et al. 2009;Zahid et al. 2013;Sanders et al. 2021), and over 5 orders of magnitude in stellar mass at  ∼ 0 (e.g. Lee et al. 2006;Berg et al. 2012;Andrews & Martini 2013;Jimmy et al. 2015;James et al. 2017).The shape of the MZR is determined by the relative balance between metal production, loss and dilution, as a function of galaxy mass and redshift.The shape is characterised by a flattening above a mass of M * ∼ 10 10−10.5 M ⊙ , with the value of the threshold ★ E-mail: nancy.yang@physics.ox.ac.uk † E-mail: a.saintonge@ucl.ac.uk dependent on metallicity calibration and redshift (Kewley & Ellison 2008;Zahid et al. 2013).
Various models have been invoked to explain the shape of the MZR.Early investigations considered closed-box chemical evolution models (Searle & Sargent 1972), or variations thereof that allow for inflow of metal-poor gas and outflow of metal-rich gas (e.g.Larson 1972Larson , 1974)).Several studies identify outflows specifically as the main mechanism shaping the MZR: as galaxies get more massive, the deeper potential wells and/or less efficient outflows mean that galaxies retain a higher fraction of the metals produced (Dekel & Silk 1986;Dalcanton 2007;Davé et al. 2011;Chisholm et al. 2018).Equilibrium models, based on the regulation of star formation by gas inflows and outflows, specifically propose that the gas-phase metallicity is determined by the instantaneous properties of the galaxies, in particular their metal production and cold ISM mass (Finlator & Davé 2008;Lilly et al. 2013).In the model of Forbes et al. (2014), it is the recent gas accretion history that sets the presentday gas-phase metallicity of a galaxy.Therefore, while the shape of the MZR encodes information about metal production and feedback processes, its scatter should be related to the natural galaxy-to-galaxy variations at fixed stellar mass in formation/accretion histories and present-day gas contents.
This expectation from theory has now been verified, in the form of an observed anti-correlation between cold gas mass (both atomic and molecular) and offset from the MZR: at fixed stellar mass, the most gas-rich galaxies are the most metal-poor (and vice-versa) (Hughes et al. 2013;Lara-Lopez et al. 2013;Bothwell et al. 2013;Jimmy et al. 2015;Bothwell et al. 2016;Brown et al. 2018;Chen et al. 2022;Scholte & Saintonge 2023).The same result is also recovered in numerical simulations (e.g.Torrey et al. 2019;De Lucia et al. 2020;van Loon et al. 2021), where it arises from the self-regulation of the star formation activity via gas accretion and feedback.This important three-way relation between stellar mass, gas mass and metallicity also gives rise to the observed correlation between MZR scatter and star formation rate (e.g.Ellison et al. 2008;Lara-López et al. 2010;Mannucci et al. 2010), via the link between gas contents and star formation activity.This relation is commonly referred to as the "fundamental metallicity relation" (FMR), although both observations and simulations now show that gas mass, rather than SFR, is the more fundamental observable linked to the scatter of the MZR (e.g.Bothwell et al. (2013); Brown et al. (2018);Torrey et al. (2019); Scholte & Saintonge (2023), see however Baker et al. (2023) for the respective roles of SFR and gas mass as the third parameter in the resolved MZR).
The aim of this paper is to go one step further and seek observational evidence for the physical drivers of the relation between MZR scatter and gas mass.The most common interpretation invokes variations in gas accretion: at fixed stellar mass, galaxies which have recently accreted a significant amount of (metal-poor) gas will have both lower metallicities and higher gas masses.Conversely, galaxies where gas accretion has been inefficient would have depleted gas reservoirs and higher metallicites.Multiple studies (e.g.Yates et al. 2012;De Rossi et al. 2017;Trayford & Schaye 2019;Torrey et al. 2019;van Loon et al. 2021) have found through simulations that gas accretion does explain most of the scatter at lower stellar mass, but that at M * ≳ 10 10.5 M ⊙ , AGN feedback may be a more important factor.This is the scenario we are testing in this study.In particular, we use information about the characteristic halo mass of different galaxy sub-samples (using the amplitude of the two-point correlation function as a proxy), since simulations show that accretion rates onto galaxies depend on their large scale environment, including factors such as the mass of their dark matter halo, whether they are the central or a satellite galaxy in this halo, and on the position of the system with respect to the cosmic web (Kereš et al. 2005;Dekel & Birnboim 2006;van de Voort et al. 2011).Some studies have so far looked at the impact of large scale structure on the gas-phase metallicites.Donnan et al. (2022) considered specifically the influence of the position of a galaxy with respect to the cosmic web on its gas phase metallicity, finding that galaxies closest to nodes of the cosmic web are more metal-enriched.Similarly, but with a much smaller sample of galaxies in a specific cosmic filament at  ∼ 0.5, Darvish et al. (2015) report higher metallicities for galaxies closest to the filament.Other authors have reported an increase in metallicity with the local density of galaxies (Mouhcine et al. 2007;Petropoulou et al. 2012;Peng & Maiolino 2014).The challenge is in the interpretation of these results, in particular disentangling the relative contributions of the intrinsic and environmental factors, a process that can be eased by drawing upon the results of numerical simulations and models (e.g.Peng & Maiolino 2014;Genel 2016).
In this work, we provide new insights into the joint relation between stellar mass, gas mass, and metallicity, and the role that gas accretion and halo mass have in shaping it.This is achieved by using a large sample of galaxies with both metallicites and cold gas masses derived from their SDSS optical spectra.The sample selection and physical parameter derivations are explained in Sec. 2. The sizeable sample allows us to use galaxy clustering as a proxy for halo mass; these methods are presented in Sec.2.5, with the results presented and discussed in Sec. 3. Unless stated otherwise, physical quantities measured and used in this work assume a ΛCDM cosmology with Ω  = 0.3 and H 0 = 70 km s −1 Mpc −1 , and a Chabrier IMF.

SDSS sample selection and data products
The sample is selected from SDSS Data Relesase 8 (Aihara et al. 2011), using emission line fluxes, stellar masses and star formation rates from the MPA-JHU catalogue (Brinchmann et al. 2004;Kauffmann et al. 2003a;Tremonti et al. 2004).From the Main Galaxy Sample (MGS; Strauss et al. 2002) of SDSS, we select galaxies with 0.02 <  < 0.055 and M * > 10 9 M ⊙ .At  = 0.055, the SDSS spectroscopic sample is complete for galaxies with M * > 10 9.4 M ⊙ .We have tested that the results are consistent between the two M * thresholds for the sample selection, and adopt M * > 10 9 M ⊙ since the resulting increase in sample size reduces the uncertainties on the correlation function, especially at small separations (see Sec. 2.5).
As we will be measuring correlation functions, we further restrict the sample to the contiguous sky area delineated by 130 Finally, as we wish to study gas-phase metallicity, we remove quiescent galaxies from the sample by rejecting galaxies located more than 0.5 dex below the main-sequence, using the definition of Saintonge & Catinella (2022).In order to measure robust metallicities, we require galaxies to have an H line flux measured with S/N> 15.Such a high threshold ensures the reliable detection of the four strong lines required to calculate robust metallicities (see Sec. 2.2), without introducing metallicity-dependent biases, as would be the case if instead we used the approach of requiring a specific S/N level in each of the four strong lines (Cid Fernandes et al. 2010;Juneau et al. 2014).Given our selection based on distance from the main sequence, the requirement of having S/N> 15 in the H line only rejects 3% of the galaxies, and therefore does not bias the sample.
After applying these selection cuts, the sample consists of 36000 galaxies.
Based on their position in the [NII]-BPT diagram (Baldwin et al. 1981), we identify star forming galaxies using the Kauffmann et al. (2003b) criterion.This "SF sample" includes 31000 galaxies and is used in the analysis in Sec.3.1 and 3.2.
To study the possible impact of AGN activity on the scatter of the MZR, in Sec.3.3 we use as our "LINER sample" the 2600 galaxies above the Kauffmann et al. (2003b) line in the BPT diagram, excluding only Seyfert galaxies (identified as those galaxies with log([NII]/H)> −0.22 and log([OIII]/H)> 0.48, Ho et al. (1997)) due to the complications in calculating their metallicities via strong emission lines -see Sec.2.2 for more on this.

Gas-phase metallicities
The observed behaviour of the mass-metallicity relation, in terms of its exact shape, normalisation, scatter and redshift evolution, can be sensitive to the method employed to calculate the metallicity (e.g.Kewley & Ellison 2008;Yates et al. 2012;Curti et al. 2020).The main results of this study make use of the metallicity calibration of Pettini & Pagel (2004, hearafter PP04): To test the robustness of the results against metallicity measurement techniques, we also calculate metallicities using the empirical calibration of Denicoló et al. (2002) which, unlike Eq. 1, only makes use of the [NII]/H ratio.We also derive metallicities from CLOUDY photoionisation modeling using the implementation of Scholte & Saintonge (2023).All results presented in this paper, while using the PP04 calibration, have been verified to be robust when using the other calibrations.Note that while there is relatively good agreement between PP04 and D02 (with differences in 12+log(O/H) of ∼0.1 dex, (Kewley & Ellison 2008)), other metallicity calibrations can have larger discrepancies (∼0.4-0.7 dex (e.g.Kewley & Ellison 2008;Teimoorinia et al. 2021).With larger samples of galaxies with robust metallicity measurements from temperature-sensitive indicators, as are now becoming available from the DESI survey for example (Zou et al. 2023), we will in the future be able to validate the results in this study with yet more varied metallicity indicators.
Caution must be used when estimating gas-phase metallicites in situations where line emission is mostly triggered by shocks or AGN activity rather than star formation.As strong line methods for determining metallicity are calibrated using H II regions, they may not result in a reliable metallicity when the target is not a star forming region.Sanders et al. (2017) studied the bias introduced by contamination of emission lines by regions of diffuse interstellar gas (DIG) and flux weighting effects.In the strong-line relations, biases up to Δ3 = +0.3dex were found between spectra of mock galaxies and HII regions.Kumari et al. (2019) used HII-DIG region pairs to study the bias that diffuse gas introduces in the calibration of strong line metallicities.Based on these, they then proposed the following correction term to be subtracted from the metallicities of data points beyond the Kauffmann line (Kauffmann et al. 2003b) on the [NII]-BPT diagram: which we have done for our LINER sample.As this correction has not been validated in the regime of Seyfert galaxies, these objects have been excluded from our sample.This method has previously been used successfully by Kumari et al. (2021) to include LINERclassified galaxies in a study of the FMR.We note however that there is not a clear consensus on the optimal way to calculate metallicties in LINERs (see e.g.Oliveira et al. 2022, in contrast to Kumari et al. (2019)), therefore the results obtained in this study for the LINER sample (see Sec. 3.3) should be considered as encouraging, but requiring confirmation in the future through alternative metallicity measurements, for example using stellar continuum rather than emission lines (e.g.Thorne et al. 2022) or detailed photoionisation modeling (e.g.Vidal-García et al. 2022).

Gas masses
Despite significant progress in the past decade, cold atomic and molecular gas masses from direct HI and/or CO observations are to this day not available for very large and complete samples such as those that can be assembled from optical samples like SDSS (see review by Saintonge & Catinella 2022).Here, we therefore make use of an additional SDSS dataset that includes molecular gas masses (M H 2 ) derived using optical emission lines and photoionisation modeling (Scholte & Saintonge 2023).The uncertainly on individual cold gas mass measurements obtained through optical emission lines at ∼ 0.25 dex is large compared to other more direct methods, however, the sample size for which gas mass measurements can be derived with this technique is orders of magnitude larger.The principle is that the optical lines encode information about the dust contents of the galaxies, which can be turned into a gas mass estimate (Brinchmann et al. 2013;Concas & Popesso 2019;Yesuf & Ho 2019;Piotrowska et al. 2020).We refer the reader to Scholte & Saintonge (2023) for details of the method, but in short, the gas masses are derived from the attenuation of the Hydrogen Balmer lines, in combination with metallicity estimates to constrain the dust-to-gas ratio.Calibration is done against the xCOLD GASS galaxies with CO-based M H 2 measurements (Saintonge et al. 2017).To allow for comparison with the results of Brown et al. (2018) in Fig. 1 and Sec.3.1, we infer M HI values from these M H 2 estimates using the relation between M H 2 /M HI and M * from Catinella et al. (2018).It should be noted that the dataset used in Fig. 1 and Sec.3.1 is slightly different from the dataset mentioned previously and used throughout the rest of the work.The Scholte & Saintonge (2023) dataset does not include cuts in  J2000 and  J2000 , but does come with a redshift cut of 0.027 <  < 0.055 (compared to 0.020 <  < 0.055 used in the rest of this study), explaining the very small differences in the MZRs shown in Figures 1 and 2.

AGN properties
To study the possible effect of AGN feedback on the scatter of the MZR (Sec.3.3), we calculate for our LINER sample the quantity: which serves as proxy for the Eddington ratio, and therefore the rate of accretion onto the supermassive black hole.Note that this equation is only applicable to the inner few kpc of a galaxy.As the SDSS' fibers only covers the innermost 3 kpc for the highest redshift galaxies in our sample, we may use it here.To account for the resolution limit of the SDSS, a velocity dispersion cut of  > 70 km s −1 was applied.Following Wild et al. (2007) and Graves et al. (2009), the [O III] luminosity was corrected for dust extinction using the Balmer decrement, and the SDSS fiber stellar velocity dispersion was corrected for aperture effects as: where  • =  deV √︁ / deV is the circularised galaxy radius in arcseconds, and  deV and / deV are the radius and axis ratio of the r-band de Vaucouleurs model.For the SDSS spectra,  fib = 1.5 ′′ .

Clustering amplitude; a proxy for halo mass
As introduced above and discussed in more detail in Sec. 3, an aim of this study is to link the gas-phase metallicity to the gas contents of galaxies, and specifically with the rate of gas accretion onto the system.We cannot directly measure gas accretion rates with the data available, but simulation shows that it correlates strongly with halo mass (e.g.van de Voort et al. 2011).There are on the other hand multiple methods to infer the dark matter halo masses of galaxies, each with its set of advantages and disadvantages: abundance matching (Conroy et al. 2006;Behroozi et al. 2010), dynamical mass calculations, weak lensing (Kaiser et al. 1995;Mandelbaum et al. 2006;van Uitert et al. 2016), and most recently even machine learning (Calderon & Berlind 2019).In this study, we use the clustering amplitude of different sub-populations, as measured through the two-point correlation function.Since higher mass halos are more strongly clustered than lower mass halos (Sheth & Tormen 1999), at fixed stellar mass, the relative clustering amplitude of different galaxy sub-samples tells us which characteristically reside in the most/least massive halos.In Sec.3.2, we also validate our results by considering the abundance matching-derived halo masses and central/satellite classifications from SDSS group catalogs (Yang et al. 2007;Lim et al. 2017).
Correlation functions are computed in Corrfunc (Sinha & Garrison 2020), which uses the Landy & Szalay (1993) estimator for the redshift-space correlation function.DD, DR and RR are the pair counts of data-data, data-random and random-random.The random catalog was generated to have a random uniform distribution in RA and sin(Dec), with redshift values assigned based on the distribution in the real catalog.These are then converted within Corrfunc into the projected correlation function to minimise the effects of redshift space distortion.Here   and   are separations perpendicular and parallel to the line of sight.The upper limit of the summation was set to The uncertainty on the projected correlation function is estimated by bootstrapping, selecting 80% of the data at random in each of the   draws.The error on the mean of the bootstrapped distribution was then calculated as
To compare the strength of the clustering across different galaxy sub-samples, we calculate the relative bias,    (  ), by comparing the correlation function of a specific sub-sample,   (  ), with the correlation function of the entire sample,   (  ): Following Berti et al. (2021), we calculate the relative bias on a "one-halo" scale of 0.1 ℎ −1 Mpc <   < 1 ℎ −1 Mpc and a "two-halo" scale of 1 ℎ −1 Mpc <   < 10 ℎ −1 Mpc.The calculations are made using the default cosmology in Corrfunc which has Ω  = 0.302.By selecting these scales, we can analyse separately effects relating to the relation between satellites and their central galaxies (one-halo scales), and those connected to the clustering of galaxies pairs in different halos (two-halo scales).
The relative bias is calculated for each bootstrap resampling.We adopt the mean over these repetitions as our measured    and calculate the error on this value to be , where again   is the number of bootstrap repetitions, and  2   is the variance of the relative bias across the   repetitions.

Gas mass and the scatter of the mass-metallicity relation: observations and expectations
The mass of the cold ISM of typically star-forming galaxies, both atomic and molecular, has been observed to correlate with the scatter of the MZR; at fixed stellar mass, the more gas-rich a galaxy, the lower its metallicity (Bothwell et al. 2013(Bothwell et al. , 2016;;Brown et al. 2018).Analytic models and simulations suggest that this correlation between metallicity and gas mass emerges from the self-regulation of the star formation activity and gas contents of galaxies via accretion of typically metal-poor gas, and outflows of metal-enriched gas (Davé et al. 2012;Lilly et al. 2013;Torrey et al. 2019;De Lucia et al. 2020;van Loon et al. 2021).The scatter of the MZR therefore contains important information about many components of the baryon cycle; in this study we aim to go one step further and identify which specific processes are most responsible for driving the scatter of the MZR via the gas mass dependence.
In Figure 1 we show the median atomic gas mass as a function of stellar mass, in five different ranges of offset from the massmetallicity relation, ΔZ.The first important observation is that the results obtained here with optically-derived gas masses (see Sec. 2.3), are qualitatively consistent with the measurements of Brown et al. (2018) from stacking of Hi spectra from the ALFALFA survey (Haynes et al. 2018).In both sets of results, we observe an anticorrelation between gas mass and metallicity at fixed M * , a trend that is strongest at the lowest stellar masses probed (∼ 10 9 M ⊙ ), gradually weakening as M * increases.The weakening of the anticor-relation between gas mass and metallicity over the stellar mass range 10 9 − 10 10.5 M ⊙ is also seen in the GAEA semi-analytic model (De Lucia et al. 2020).We have checked that the weakening of the anticorrelation in our observations presented in Fig. 1 is not due to the reduced dynamic range in absolute values of metallicity for a given ΔZ at high M * .The results are unchanged after repeating the analysis using ΔZ bins of fixed width instead of fixed quantiles.We therefore show the version of the figure with the fixed quantiles binning; this is consistent with the Brown et al. (2018) methodology.
Beyond ∼ 10 10.5 M ⊙ , the dependence of M HI on ΔZ disappears, with even indications of a reversal of the trend, with the most metalpoor galaxies at fixed mass now being the most gas-poor.The reversal in the role of gas mass in driving the scatter of the MZR above 10 10.5 M ⊙ is seen in the EAGLE simulations (van Loon et al. 2021), where it can be attributed to AGN feedback (De Rossi et al. 2017).The effect is also reminiscent of the reversal in the role of SFR on the MZR scatter seen by Yates et al. (2012) and Yates & Kauffmann (2014) in both observations (SDSS) and semi-analytic models (L-Galaxies).
The main aim of this study is to look for the physical drivers behind the gas mass-metallicity relation.The results shown in Fig. 1 not only confirm the existence of this relation, but also strongly suggest that we may be looking for different mechanisms at low and high stellar masses.It is instructive to look at the results of simulations to guide us in this search.

Low/intermediate mass galaxies
In the low-to intermediate-mass regime (i.e. up to M * ∼ 10 10.5 M ⊙ ), there is broad consensus that the scatter in the MZR and its dependence on gas mass are due to fluctuations in accretion rates.Using the IllustrisTNG simulation suite, Torrey et al. ( 2019) find that galaxies alternate between being enrichment-and accretion-dominated: periods of increased accretion causing high gas mass and low metallicities are followed by periods of more active star formation that both increase metallicity and decrease gas contents.In this regard, the simulations fully support the simple principles of gas-centric analytic models, where galaxies naturally restore their equilibrium gas mass and metallicity after perturbations to their gas reservoirs (Davé et al. 2012;Lilly et al. 2013;Forbes et al. 2014).In the EAGLE simulations, van Loon et al. ( 2021) also find that in this M * range most of the scatter of the MZR comes from systematic variations in gas accretion rates.It is however important to note that these results typically only hold for galaxies whose evolution is dominated by steady gas accretion and secular processes, and not for those held out of equilibrium for an extended period of time by processes such as major mergers and AGN feedback.
In order to know where to look for an observational signature of these gas accretion rate variations, it is necessary to consider their origins and timescales.Generally speaking, the variability in lowmetallicity gas accretion rates onto the central star-forming regions can have causes that are either external (accretion of gas onto the galaxy from the large-scale environment) or internal (displacement of gas from extended low density HI envelopes to the denser starforming disc).In either case, the timescales are expected to be long, on the order of several Gyr, at least at the present time.For example, the timescale expected for the natural up-and-down changes in gas mass and star formation activity under the equilibrium model are long, at 0.4  ∼ 5 Gyr (Lilly et al. 2013;Tacchella et al. 2016).Similarly the timescale to transport gas radially from the outer disk to the central region of a galaxy is set by the dynamical friction timescale, which approaches the Hubble time at  = 0 (Tacconi et al. 2020), with simulations suggesting a timescale of ∼ 5 Gyr to transport gas radially by 10kpc (Okalidis et al. 2021).There is also evidence from simulations that MZR offset, and the systematic variations in gas accretion rate, are long-lived (timescales of several Gyr), although we note that Torrey et al. ( 2019) report shorter timescale variations (on the order of ∼ 1Gyr) in the gas mass, SFR and metallicity of low mass galaxies.
Our aim here is to look for evidence of this scenario, and investigate whether any changes in gas supply are due to internal or external factors.On the one hand, we expect galaxies in overdense regions of the cosmic web to have less access to fresh gas supply, and therefore to display higher metallicities.This is seen for example by Donnan et al. (2022) as an increased metallicity in galaxies closest to nodes in the cosmic web, and in a range of studies that look for the impact of environment on metallicity, which is particularly strong for satellite galaxies (see summary in e.g.Maiolino & Mannucci 2019).On the other hand, if the link between gas mass and metallicity is caused either by feedback or by internal gas displacement (rather than external accretion), then at fixed stellar mass we might not expect a strong link between gas, metallicity, and halo properties/environment.
To shed light on the possible roles of internal and external factors, we perform a test in Sec.3.2, using two different methods: first we use the clustering amplitude measured from the two-point correlation function as a proxy for the typical halo mass of different galaxy subsamples, and secondly, by using the central/satellite classification and halo masses from the group catalog of Lim et al. (2017).

High mass galaxies
In the high mass regime (M * > 10 10.5 M ⊙ ), both EAGLE and Illus-trisTNG simulations find that the scatter of the MZR is no longer driven by systematic variations in gas inflow rate, but instead dominated by the impact of AGN feedback.There should be no strong correlation between clustering and MZR offset, if this is the case.

The role of gas accretion: clustering analysis
We use the "SF sample" to calculate the correlation function and compare the clustering amplitude of different galaxy sub-samples as a function of their offset from the MZR.Based on the observed dependence of MZR scatter on gas mass (see Fig. 1) and expectations from simulations, we calculate the correlation function over two different stellar mass intervals.To ensure enough statistics in the high-M * sample, we consider the mass ranges 10 9 − 10 10 M ⊙ and 10 10 − 10 11 M ⊙ .We have verified that changing the transition point between the two subsamples to 10 10.5 M ⊙ does not affect the results other than increasing the uncertainty in the high-M * sample.In each stellar mass interval, the galaxies are divided in equally populated subsamples based on the offset from the MZR (see Fig. 2 and Table 1; 6 bins for the low-M * subsample, 4 for the high-M * subsample).
The correlation function for the low-and high-M * subsamples are shown in the middle and right panels of Fig. 2, respectively.The correlation functions show that for the low-M * subsample, galaxies in the two highest metallicity bins cluster more strongly.This is confirmed by the relative bias for both the one-and two-halo terms.Conversely, the most metal-poor galaxies are less clustered than the average population, especially on two-halo term scales (Table 1).
For the high-M * subsample, the correlation functions of the different metallicity sub-samples are more similar.The two-halo relative bias values for the four metallicity bins are consistent with the overall population, within uncertainties (Table 1 1.SF sample binned by offset from the MZR, for samples in low and high stellar mass.Given are the metallicity offset, number of galaxies in the bin and the minimum, mean and maximum metallicity of the bin.Relative biases are given on the one-halo and two-halo scale, as well as for a centrals-only dataset.The dashed lines represent the correlation function calculated for the full sample within the two stellar mass ranges, with respect to which the relative biases from Table 1 were calculated. amplitude as a proxy for the mass of the main halo (Sheth & Tormen 1999), then the conclusion is that the scatter in the MZR at high M * is not connected to a physical process that depends strongly on halo mass.
An important consideration is that the galaxy sample used so far includes both central and satellite galaxies.This somewhat complicates the interpretation of the clustering results, because the evolutionary pathways of central and satellite galaxies are different, resulting in different galaxy-halo connections (Watson & Conroy 2013).It has also been shown that satellite galaxies have on average higher metallicities than centrals of the same mass (Pasquali et al. 2012;Peng & Maiolino 2014;De Rossi et al. 2015).We therefore need to see whether the difference in clustering observed in Fig. 2 can be explained by variations in the fraction of satellite galaxies.To perform this test, we use the SDSS group catalog of Lim et al. (2017), which itself is based on the method of Yang et al. (2007).For each galaxy in our "SF sample", the group catalog provides an estimate of the dark matter halo mass,  ℎ , and an indication of whether the galaxy is the most massive in its halo (in which case it is labeled as the central), or else a satellite.
First, we look in Figure 3   stellar mass, galaxies with the higher/lowest metallicities reside on average in more/less massive halos.
Secondly, in Figure 4, we show the fraction of satellite galaxies in each ΔZ subsample.At low stellar masses (M * < 10 10 M ⊙ ), the satellite fraction is significantly enhanced for the most metal-rich subsamples, while at higher stellar masses there is no strong dependency of satellite fraction of ΔZ.This result is perfectly consistent with the expectations from equilibrium galaxy models: Davé et al. (2012) make the argument that satellite galaxies are expected to have lower gas content and higher metallicities than centrals at fixed mass, because gas accreted onto the system goes preferentially to the centres of the halos, bypassing the satellites.
To check whether the results of Fig. 2 are entirely caused by these variations in the satellite fraction, we calculate the correlation function for central galaxies only, and report the two-halo relative bias values in Table 1 and Figure 5.The amplitude of the bias is reduced, but there is still a statistically meaningful result for the highest and lowest ΔZ bins.
In summary, as shown in Fig. 5, we find that at fixed stellar mass, galaxies with the highest metallicities are significantly more clustered than average.This effect is partly caused by a significantly higher fraction of satellite galaxies amongst these metal-rich and gas-poor galaxies where environment-specific processes such as stripping and strangulation are at play, but the result still holds when considering only central galaxies.Conversely, galaxies with the lowest metallicities at fixed stellar mass are the least clustered.For galaxies within the 1 scatter of the MZR, there is a mild trend between metallicity and clustering amplitude for low mass galaxies (M * < 10 10.0−10.5 M ⊙ ), especially on one-halo scales.
For high mass galaxies (M * > 10 10 M ⊙ ), there is no correlation between ΔZ and clustering.In the next section, we investigate what might instead be responsible for the scatter of the MZR at high masses.

AGN influence
The L-Galaxies, IllustrisTNG and EAGLE simulations (Yates et    the MZR scatter of high stellar mass galaxies ( * > 10 10.5  ⊙ ) and AGN feedback.In Sec.3.2 we used the projected correlation function to show that there is very little correlation between halo mass (and therefore gas accretion, following the line of arguments presented above) and the scatter of the MZR in massive galaxies.To test whether a relation can be found instead with AGN activity, we show in Figure 6 our proxy for the Eddington ratio,  [OIII] / 4 , as a function of M * for the different ΔZ bins.We consider the "LINER sample" here, which, same as the SF sample from the previous section, does not include galaxies that are more than 0.5 dex below the main-sequence.
We see a strong correlation at high M * between AGN activity and ΔZ: galaxies with the most actively accreting black holes have the lowest metallicity.Under the assumption that AGN with the highest accretion rates also have stronger feedback, there is then some consistency between the simulation predictions and the results shown in Figure 6.De Rossi et al. (2017) interpret the low metallicities of the galaxies with the most active black holes as the result of ejection of preferentially metal-enriched gas.Yates & Kauffmann (2014) provide an explanation for the low metallicities of massive galaxies with large black holes that relates to a slow dilution of the ISM, but that mechanism is less likely to be important in our sample as we have excluded galaxies with low SSFRs where this process dominates.
We also see in Figure 6 an intriguing effect: the galaxies with the highest metallicities at fixed mass also have elevated AGN accretion rates (although to a lesser extent), compared to the bulk of the galaxies located within 1 of the MZR.A possible interpretation is that we are seeing the impact of the different timescales regulating AGN activity, star formation and ISM enrichment interacting in an intricate way, but a more thorough analysis would require in depth comparisons with simulations and a larger observational sample to control for additional parameters and bin more finely as a function of MZR offset.

SUMMARY OF OBSERVATIONS AND DISCUSSION
We have used a sample of ∼ 36, 000 main-sequence galaxies from SDSS DR8 to study the mass-metallicity relation in the local Universe (0.02 <  < 0.055), and especially the nature of its scatter.The specific aim of this work was to look for the possible physical drivers of this connection between the scatter of the MZR and gas mass.Our main findings are as follows: • The scatter of the mass-metallicity relation correlates with gas mass below log  * ∼ 10 10.5  ⊙ ; at fixed stellar mass the most gas-rich galaxies are also the most metal-poor, as already found by previous works (e.g.Hughes et al. (2013); Bothwell et al. (2013).This link between gas and metallicity is strongest at the lowest stellar masses probed in this work (M * = 10 9 M ⊙ ) and vanishes at M * ∼ 10 10.5 M ⊙ , in agreement with simulations (De Lucia et al. 2020) and results obtained from stacking of HI spectra (Brown et al. 2018).The novelty here is that our results were obtained from cold gas masses inferred from optical emission lines.While such indirect measures of gas mass have significant uncertainty for individual galaxies, they are robust for statistical studies such as this (see also e.g.Piotrowska et al. 2020;Scholte & Saintonge 2023).
• Given the change in both the shape and scatter of the MZR at M * = 10 10.5 M ⊙ , we investigate separately low-and high-mass galaxies.At low stellar masses, we find that the most metal-rich galaxies are found on average in more massive dark matter halos (and vice versa), as inferred both by the clustering amplitude in the two-point correlation function and using the halo masses from the Lim et al. (2017) group catalog.This seems to be in contradiction to the simulations ran by De Lucia et al. (2020).Looking separately at centrals and satellites shows a contribution of both environmental effects on satellites and the halo mass itself.Since the accretion rate of gas onto galaxies is dependent on halo mass, our results support the picture emerging from simulations and analytic models, which find that the scatter of the MZR and its dependence on gas mass are linked to fluctuations in accretion rates.
• At higher stellar masses, our observations support the prediction from various simulations such as the IllustrisTNG and EAGLE simulations (De Rossi et al. 2017;Torrey et al. 2019;van Loon et al. 2021) that it is no longer gas accretion that drives the scatter of the MZR, but rather AGN feedback.In our sample of LINERs, we indeed find a correlation between the Eddington ratio and MZR offset, with galaxies with the most active black holes having the lowest metallicities.However, the reverse was not shown to be the case for galaxies with the highest metallicities.While we applied a correction to the metallicity measurements to extend their range of applicability into the regime of LINERs, this result should be revisited in the future with alternative metallicity calibrations that are even more robust to the presence of AGN activity.
There is abundant evidence that gas mass is the observable that most closely correlates with the scatter of the MZR, at least at low stellar masses (M * < 10 10.5 M ⊙ ) and in comparison to other parameters such as SFR.This has been observed in the nearby universe using HI-and CO-derived gas masses (e.g.Hughes et al. 2013;Bothwell et al. 2013Bothwell et al. , 2016;;Brown et al. 2018), and now in this work using optically-derived cold gas masses (see also Scholte & Saintonge 2023).Differences in gas accretion rate are most often invoked to explain this observation, with high (low) gas accretion rate resulting in high (low) gas fraction and decreased (increased) gas-phase metallicity, since the inflowing gas is assumed to be more metal-poor than the ISM.This scenario is supported by simulations and analytic models, but as of yet very little observational evidence has been provided for it, given the considerable challenge of directly measuring gas accretion rates onto galaxies.
In this study, we showed that at fixed stellar mass, galaxies with the highest (lowest) metallicities are found in dark matter halos that are more (less) massive than average.Above a threshold mass of  ℎ ∼ 10 12 M ⊙ , simulations show that the rate of gas accretion onto galaxies (as opposed as into the halo) decreases, the more massive the halo is (van de Voort et al. 2011).Under this premise, our observation that the most metal-rich galaxies are found in more massive dark matter halos fully supports the idea that these galaxies have lower accretion rates, leaving them with both lower gas masses and higher metallicities, as observed.Recently, Donnan et al. (2022) tackled a similar question and report that galaxies closest to nodes in the cosmic web tend to have higher metallicities than average, due to the reduction in low-metallicity gas accretion.They find that their result persists even after controlling for halo mass.The combination of this work and that of Donnan et al. (2022) therefore suggests that both halo mass itself, and the location of the halo with respect to the cosmic web have an influence on the availability of fresh gas, and therefore on the gas-phase metallicity in the ISM.
When interpreting our results (in particular the correlation between offset from the MZR and halo mass), we have chosen to draw a link between halo mass and gas accretion rate.There are however other physical processes or galaxy properties that correlate with halo mass.For example, at fixed stellar mass, there are systematic variations in halo mass as a function of galaxy morphology and color (Mandelbaum et al. 2006;van Uitert et al. 2011;Taylor et al. 2020), as a result of differences in formation histories, gas accretion, and internal secular processes.With larger and more complete spectroscopic samples, such as those now produced by DESI and soon by surveys such as 4MOST-WAVES, it will be possible to repeat the kind of analysis performed here while controlling for some of these additional properties, in order to disentangle the relative impact of external (gas accretion from the cosmic web) and internal processes (radial flows, disc instabilities).
This study was focused on the local Universe, but even more powerful conclusions will be drawn in the future from tracking changes in the relative roles of environment, gas accretion and feedback in setting the MZR and its scatter as a function of redshift.There is already evidence that the relation between metallicity and environment disappears or even reverses at  ≳ 2 (Kacprzak et al. 2015;Chartab et al. 2021).Tracking systematic changes in the nature of the scatter of the MZR with both mass and redshift will be made easier with the next generation of spectroscopic surveys.Additionally, with JWST allowing us to measure robust metallicities at significantly higher redshifts and in lower mass galaxies than previously possible (e.g.Li et al. 2022;Curti et al. 2022;Sanders et al. 2022), it will be interesting to push the study of the scatter of the MZR and its dependence on both gas mass and halo mass/environment to these large lookback times where galaxies are expected to be out of equilibrium.

Figure 1 .
Figure 1.Relation between mean gas mass and stellar mass (right panels) for star-forming galaxies binned by offset from the MZR (as defined in the left-hand panels).The results are shown for two sets of measurements: the optically-derived gas masses from Scholte & Saintonge (2023) (see Sec. 2.3 for details on this dataset) and the results obtained by Brown et al. (2018) from stacking of ALFALFA HI spectra.

Figure 2 .
Figure 2. Left: Mass-metallicity relation for the "SF sample", showing the binning adopted as a function of ΔZ, over two different stellar mass ranges.Center and right: projected correlation functions for each ΔZ subsample, for stellar mass ranges of 9 < log(  * ) < 10 ⊙ and 10 < log(  * ) < 11 ⊙ respectively.The dashed lines represent the correlation function calculated for the full sample within the two stellar mass ranges, with respect to which the relative biases from Table1were calculated.

Figure 3 .
Figure 3. Cumulative distribution of dark matter halo masses (from the Lim et al. (2017) group catalog) for the low-M * (solid lines) and high-M * (dotted lines) subsamples.In each case, the distribution is given for different bins of offset from the MZR, ΔZ.

Figure 4 .
Figure 4.The satellite galaxy fraction as a function of stellar mass, for different sub-samples defined by their offset from the median MZR.The satellite/central galaxy classification is obtained from the Lim et al. (2017) SDSS group catalog.

Figure 5 .
Figure 5. Relative bias on one-and two-halo scales for different ΔZ subsamples as compared to the entire "SF sample".The results for the low mass subsample (10 9 − 10 10 M ⊙ ) are shown with filled circles and solid lines, the high mass subsample (10 10 − 10 11 M ⊙ ) with open circles and dotted lines.

Figure 6 .
Figure6.The Eddington ratio for galaxies binned by their offset from the median MZR, as a function of stellar mass.The high/low metallicity galaxies are defined as those that fall above/below one standard deviation from the mean MZR at fixed setellar mass.