Training Custom Light Curve Models of SN Ia Sub-Populations Selected According to Host Galaxy Properties

Type Ia supernova (SN Ia) cosmology analyses include a luminosity step function in their distance standardization process to account for an observed yet unexplained difference in the post-standardization luminosities of SNe Ia originating from different host galaxy populations (e.g., high-mass ($M \gtrsim 10^{10} M_{\odot}$) versus low-mass galaxies). We present a novel method for including host-mass correlations in the SALT3 light curve model used for standardising SN Ia distances. We split the SALT3 training sample according to host-mass, training independent models for the low- and high-host-mass samples. Our models indicate that there are different average Si II spectral feature strengths between the two populations, and that the average SED of SNe from low-mass galaxies is bluer than the high-mass counterpart. We then use our trained models to perform a SN cosmology analysis on the 3-year spectroscopically confirmed Dark Energy Survey SN sample, treating SNe from low- and high-mass host galaxies as separate populations throughout. We find that our mass-split models reduce the Hubble residual scatter in the sample, albeit at a low statistical significance. We do find a reduction in the mass-correlated luminosity step but conclude that this arises from the model-dependent re-definition of the fiducial SN absolute magnitude rather than the models themselves. Our results stress the importance of adopting a standard definition of the SN parameters ($x_0, x_1, c$) in order to extract the most value out of the light curve modelling tools that are currently available and to correctly interpret results that are fit with different models.


INTRODUCTION
Type Ia supernovae (SNe Ia) are standardisable candles whose distances can be measured to within ∼5% via the classical two-parameter standardisation process (Tripp 1998).$^{1}$ These distance measurements can be made out to $z \sim 2.2$ with current observing facilities (Jones et al. 2013; Graur et al. 2014; Rodney et al. 2014; Riess et al. 2018; Hayden et al. 2021), making SNe Ia a uniquely useful class of object for studying the history of the expansion of the Universe and phenomena such as dark energy (Guy et al. 2010; Suzuki et al. 2012; Betoule et al. 2014; Abbott et al. 2019; Scolnic et al. 2018; Jones et al. 2019; Brout et al. 2022a). Low-redshift SNe Ia are also useful for making local measurements of the Hubble constant (Branch 1998; Riess et al. 2016; Galbany et al. 2022; Riess et al. 2022).
★ E-mail: georgina.taylor@anu.edu.au
$^{1}$ Some studies have suggested that SN observations made in the near-infrared are consistent enough to yield distance measurements without the need for standardisation parameters (Mandel et al. 2009; Johansson et al. 2021); however, the scarcity of near-infrared data means that this technique has not yet been widely adopted.
The SN parameters necessary for standardising distances are obtained by fitting a light curve model to observed time-series photometry of a sample of SNe Ia. For the last decade, SALT2 has been the most widely used light curve model (Guy et al. 2007, 2010; Mosher et al. 2014; Betoule et al. 2014; Taylor et al. 2021; Brout et al. 2022b; Taylor et al. 2023). Recently, Kenworthy et al. (2021) introduced SALT3, a modern refactored version of the SALT2 model. SALT3 has a number of advantages over SALT2 (Kenworthy et al. 2021; Pierel et al. 2022; Taylor et al. 2023). In particular, SALT3's actively maintained, well-documented, open-source training procedure makes it straightforward for the SN community to use and update SALT3 as we improve training data sets and gain a deeper understanding of the relationship between the model inputs and cosmological constraints (e.g., Jones et al. 2023; Dai et al. 2022).
An open question in the measurement of current SN distances (and moreover, cosmological constraints) is the true nature of the "mass step". The mass step is an observed ∼0.06 magnitude shift in standardised SN Ia distance moduli as a function of host galaxy mass (Kelly et al. 2010; Lampeitl et al. 2010; Sullivan et al. 2010; Smith et al. 2020; Kelsey et al. 2021). It is observed to be more significant for redder SNe Ia (Brout & Scolnic 2021; Kelsey et al. 2023). The astrophysical origins of this effect are as yet unconfirmed; the step in luminosity has also been observed to correlate with other parameters, such as progenitor metallicity (D'Andrea et al. 2011; Hayden et al. 2013; Rose et al. 2021), progenitor ages (Childress et al. 2014), specific star formation rate (i.e., the star formation rate per unit of stellar mass, be it local: Rigault et al. 2013, 2015; Kim et al. 2018; Rigault et al. 2020; Briday et al. 2022; Hand et al. 2022; or global: D'Andrea et al. 2011; Hand et al. 2022), age of the stellar population (Rose et al. 2019), host galaxy morphology (Pruzhinskaya et al. 2020), local host colour or mass (Roman et al. 2018; Jones et al. 2018; Kelsey et al. 2021), local dark matter density (Steigerwald et al. 2022), or host galaxy dust (Brout & Scolnic 2021; Popovic et al. 2021; Wiseman et al. 2022). However, it is still most common for SN cosmology analyses to standardise their distances with a mass step (Conley et al. 2011; Betoule et al. 2014; Scolnic et al. 2018; Abbott et al. 2019; Jones et al. 2019; Brout et al. 2022a), although Brout & Scolnic (2021) suggest that the observed step in SN luminosities is actually caused by differences in galactic dust laws that happen to correlate with host galaxy mass, and that by accounting for the diversity of dust properties, the need for the mass step disappears.
Currently, the mass step is accounted for in SN cosmology analyses by deriving distances from a model (such as SALT2 or SALT3) that does not parameterise any host galaxy relationship, and then measuring how the Hubble residuals correlate with host properties in order to make a luminosity correction. For example, in the Dark Energy Survey 3-Year Supernova Analysis (DES-SN3YR; Abbott et al. 2019) the host-mass correction is added to the standard Tripp equation (Tripp 1998) to measure distance moduli:
$$\mu = m_B + \alpha x_1 - \beta c - M + \delta\mu_{\rm host} + \delta\mu_{\rm bias}, \qquad (1)$$
where $\mu$ is the distance modulus of a SN Ia in magnitude units; $M$ is the absolute magnitude of a fiducial ($x_0 = 1$, $x_1 = 0$, $c = 0$) SN Ia (and is degenerate with the Hubble constant $H_0$);$^{2}$ $\alpha$ and $\beta$ are global dimensionless nuisance parameters representing the amplitude of the stretch-luminosity and colour-luminosity relationships; $m_B$, $x_1$ and $c$ are respectively the apparent magnitude at peak, stretch and colour parameters of an individual SN (recovered from the SALT3 light curve fit); and $\delta\mu_{\rm bias}$ accounts for bias corrections. The host-mass correction is:
$$\delta\mu_{\rm host} = \gamma \times \left[ 1 + e^{(\log M_{\rm step} - \log M_{\rm host}) / \tau_{M_\star}} \right]^{-1} - \frac{\gamma}{2}, \qquad (2)$$
where $\log M_{\rm host} = \log_{10}(M_\star / M_\odot)$ represents the stellar mass of the host galaxy, $\log M_{\rm step}$ is the chosen "split" between high-mass and low-mass galaxies (typically set to be 10), $\tau_{M_\star}$ is the width of the mass split, and $\gamma$ is the mass step which is fit (e.g., Brout et al. 2019b).
Regardless of its origins, it is now possible to explore the cosmological impact of the mass step in the underlying light curve model. By leveraging its training code accessibility and the large SNe Ia training samples that are now available, SALT3 can offer insight into the mass step by revealing differences in the spectral energy distributions (SEDs) between high-host-mass and low-host-mass populations of SNe Ia. A SALT3-based investigation into the impact of the mass step was first made in Jones et al. (2023) (hereafter J23). They added a mass step parameterization to the model, finding differences in some spectral features, most notably in the equivalent widths of Ca H&K and Si II (at ∼$2\sigma$ significance). While using the J23 model parameterisation for light curve fitting does reduce the mass step by 0.021 ± 0.002 mag, they find little change to the Hubble residual dispersion. In order to ensure robust host galaxy masses, J23 only used 278 low-$z$ ($z < 0.15$) SNe Ia to train its models, just 30% of the SALT3 training sample used in Kenworthy et al. 2021 (hereafter K21).
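As an illustration of how a Tripp distance with a host-mass correction is evaluated, the following minimal Python sketch assumes the conventional form $\mu = m_B + \alpha x_1 - \beta c - M + \delta\mu_{\rm host} + \delta\mu_{\rm bias}$ and a smooth (sigmoid) mass step that tends to $+\gamma/2$ for high-mass hosts and $-\gamma/2$ for low-mass hosts. The function names, the default step width `tau`, and the example nuisance values are hypothetical choices for illustration; this is not the SNANA implementation.

```python
import numpy as np

def host_mass_correction(log_m_host, gamma, log_m_step=10.0, tau=0.01):
    """Smooth mass step: about +gamma/2 above the split, -gamma/2 below it.

    tau (the step width, in dex) is an illustrative default, not a value
    taken from the analysis.
    """
    log_m_host = np.asarray(log_m_host, dtype=float)
    # Clip the exponent so that a very sharp step (small tau) cannot overflow.
    arg = np.clip((log_m_step - log_m_host) / tau, -700.0, 700.0)
    return gamma / (1.0 + np.exp(arg)) - gamma / 2.0

def tripp_distance_modulus(m_b, x1, c, alpha, beta, abs_mag,
                           delta_host=0.0, delta_bias=0.0):
    """Tripp standardisation with optional host and bias correction terms."""
    return m_b + alpha * x1 - beta * c - abs_mag + delta_host + delta_bias

# Example with made-up inputs: a SN in a log(M) = 11 host and gamma = 0.06 mag.
delta = host_mass_correction(11.0, gamma=0.06)
mu = tripp_distance_modulus(19.0, 1.0, 0.1, alpha=0.15, beta=3.0,
                            abs_mag=-19.3, delta_host=delta)
```

With a narrow `tau` the correction behaves like a sharp step at the split; a wider `tau` blends the two populations near $\log M_{\rm step}$.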
Here, we build on the investigation of J23 by splitting the complete K21 SALT3 training sample of 1083 SNe Ia on host galaxy mass. We take a different approach to J23, choosing to use the traditional two-parameter SALT3 model (without any host galaxy considerations) and simply train separate SALT3 surfaces (i.e., model iterations) from a "high-host-mass" ($M_{\rm host} > 10^{10} M_{\odot}$) training sample and a "low-host-mass" sample. In this way, we can see how all SALT3 components change with host-mass, rather than parameterising the impact of host-mass within a single SED component. We focus on comparing the results on a cosmological level by performing a reanalysis of the spectroscopically confirmed Dark Energy Survey 3 Year SN data (Brout et al. 2019a; Brout et al. 2019b; Abbott et al. 2019).
The SALT3 model training procedure underlying this work is outlined in §2. Our methods for training SALT3 models on samples with different host-mass distributions are given in §3. We present and discuss all results in §4. Our conclusions are presented in §5.

SALT3 MODEL TRAINING
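The SALT3 model spectral flux density at phase $p$ and wavelength $\lambda$ is defined as:
$$F(p, \lambda) = x_0 \left[ M_0(p, \lambda) + x_1 M_1(p, \lambda) \right] \exp\left( c \cdot CL(\lambda) \right), \qquad (3)$$
where $x_0$, $x_1$ and $c$ are the parameters of a particular supernova and $M_0$, $M_1$, and $CL$ are the global model components. These components are iteratively fit using a $\chi^2$ minimization process during the model training.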
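To make the structure of the model concrete, the sketch below evaluates a SALT-like flux surface on a phase-wavelength grid, assuming the standard SALT form in which an amplitude scales the sum of a mean component and a stretch-weighted component, multiplied by an exponential colour law. The array names and toy values are hypothetical; real SALT3 surfaces are spline-interpolated and are not built this way in SALTShaker.

```python
import numpy as np

def salt_flux(x0, x1, c, m0, m1, colour_law):
    """Evaluate x0 * (M0 + x1 * M1) * exp(c * CL) on a grid.

    m0, m1     : arrays of shape (n_phase, n_wave), the two SED components
    colour_law : array of shape (n_wave,), the phase-independent colour law
    """
    m0 = np.asarray(m0, dtype=float)
    m1 = np.asarray(m1, dtype=float)
    cl = np.asarray(colour_law, dtype=float)
    # The colour law depends on wavelength only, so broadcast it over phase.
    return x0 * (m0 + x1 * m1) * np.exp(c * cl)[np.newaxis, :]

# Toy grid: 2 phases x 3 wavelengths, with a flat (zero) colour law.
m0 = np.ones((2, 3))
m1 = 0.5 * np.ones((2, 3))
flux = salt_flux(x0=2.0, x1=1.0, c=0.1, m0=m0, m1=m1, colour_law=np.zeros(3))
```

Because the stretch term enters linearly and the colour term exponentially, a change in how $x_1 = 0$ or $c = 0$ is defined for a given surface changes what the mean component represents.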
The SALT3 training process is outlined in K21. This process requires an input training sample of photometric and spectroscopic data from well-measured SNe Ia, along with information about the systems on which the photometry was observed, the redshifts, and a set of starting guesses for the SALT3 model components and training sample light curve parameters. We use the public SALTShaker training code with the default public K21 training configuration in this work, and select our training subsamples from the K21 training sample.$^{3}$ Following Taylor et al. (2023), we use the photometric calibration presented in Brout et al. (2022b).$^{4}$ We make the following modifications to the default SALT3 configuration:
• The number of component fitting iterations between each error fitting iteration is changed from 5 to 10, for computational speed. The error estimation is the most computationally intensive part of the training process, and this choice halves the time spent in that training stage. This is not anticipated to yield any noticeable difference to the model surfaces (Jones & Kessler 2018).
• Following Dai et al. (2022), we reduce the maximum number of spectral calibration parameters from 10 to 5. During the training, spectra are iteratively recalibrated to match the best-fit model. This accounts for any poor-quality calibration in the spectral training sample, which often has large uncertainties. However, using too high an order for the recalibration polynomial may fit out real spectral features.
• The spectral $\chi^2$ scaling, which ensures roughly equal photometric and spectroscopic contributions to the final model reduced $\chi^2$, is fine-tuned for each surface such that the contributions are within 10%. This procedure, which follows Guy et al. (2007) and Kenworthy et al. (2021), ensures that the spectroscopic and photometric training data are considered with equal weighting, despite the spectroscopic uncertainties often being much larger.
• SALTShaker defines the training convergence to occur when the change in $\chi^2$ between consecutive iterations is less than 1. We find that for some of the models trained in this work, this criterion is never reached. In this case, we consider a model to have "converged" if the change in $\chi^2$ between consecutive iterations is less than 0.01%.
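The relaxed convergence test in the last item can be sketched as follows. This is an illustrative stand-alone function rather than SALTShaker code: it accepts a fit as converged if the absolute change in $\chi^2$ is below 1 or, failing that, if the fractional change is below 0.01%. The function and argument names are hypothetical.

```python
def has_converged(chi2_prev, chi2_new, abs_tol=1.0, rel_tol=1e-4):
    """Return True if chi^2 changed by < abs_tol, or by < 0.01% (rel_tol)."""
    delta = abs(chi2_prev - chi2_new)
    if delta < abs_tol:
        # Default criterion: absolute change smaller than 1.
        return True
    # Fallback criterion: fractional change smaller than 0.01%.
    return chi2_prev != 0 and delta / abs(chi2_prev) < rel_tol

# A chi^2 of ~1e6 that changes by 50 fails the absolute test
# but passes the fractional one (5e-5 < 1e-4).
converged = has_converged(1.0e6, 1.0e6 - 50.0)
```

For a large training sample the total $\chi^2$ is large, so a fixed absolute threshold can be much stricter than a fractional one.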

INVESTIGATING HOST GALAXY MASS IN THE SALT MODEL
Here, we investigate the possibility that separate intrinsic populations of SNe Ia exist, and are correlated with host galaxy mass. We perform separate light curve modelling for SNe Ia in "low-mass" and "high-mass" host galaxies. We split our populations using host stellar mass, as this is the most commonly used proxy for the host galaxy-luminosity relationship in SN cosmology analyses: SNe in high-mass hosts are observed to have brighter post-standardization luminosities than those in low-mass hosts.$^{5}$ The methods developed in this work could be applied to other host galaxy parameters of interest (e.g., host galaxy colour, age, morphology, specific star formation rate), provided that there exists sufficient data to train the models.

SALT3 Training Samples and Resultant Surfaces
We use host-masses from the public K21 data files, which were provided for 662 of the 1083 SNe. The missing masses were primarily from SNe in the Sloan Digital Sky Survey (SDSS, Holtzman et al. 2008), Supernova Legacy Survey (SNLS, Astier et al. 2006), and various low-$z$ surveys (Jha et al. 2006). We supplement the K21 host-mass information with data from SNLS (M. Sullivan 2021, private communication) and the Joint Light Curve Analysis (JLA, Betoule et al. 2014). This gives us mass information for all but 20 SNe in the K21 sample. We split the K21 sample to construct two training sub-samples: TS-HIGHMASS, consisting of SNe in "high-mass" ($\log M > 10$) host galaxies, and TS-LOWMASS, consisting of SNe in "low-mass" ($\log M < 10$) host galaxies. We apply the mass-split at $\log M = 10$ (where $\log M = \log_{10}(M_\star / M_\odot)$) for consistency with the literature; this is slightly higher than the mean $\log M = 9.483$ of the full sample. Summary statistics of the training samples are given in Table 1.
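The sample split described above amounts to a simple threshold on $\log M$. The sketch below shows one way to express it, assuming a mapping from SN name to host $\log_{10}(M_\star/M_\odot)$ with `None` (or NaN) marking SNe without a mass estimate. The function names, the "no-mass" label, and the choice to place a host at exactly $\log M = 10$ in the low-mass sample are illustrative assumptions.

```python
import math

def assign_training_sample(log_mass, split=10.0):
    """Return the sub-sample label for one SN, or None if no mass is known."""
    if log_mass is None or math.isnan(log_mass):
        return None
    # Assumption: a host exactly at the split is treated as low-mass.
    return "TS-HIGHMASS" if log_mass > split else "TS-LOWMASS"

def split_training_sample(host_log_masses, split=10.0):
    """Group SN names by sub-sample; SNe without masses are kept apart."""
    samples = {"TS-HIGHMASS": [], "TS-LOWMASS": [], "no-mass": []}
    for name, log_mass in host_log_masses.items():
        label = assign_training_sample(log_mass, split)
        samples[label if label is not None else "no-mass"].append(name)
    return samples

# Hypothetical SN names and masses, for illustration only.
samples = split_training_sample({"sn_a": 10.4, "sn_b": 9.1, "sn_c": None})
```

Because the mass is used only for this binary assignment, a mass estimate needs to be accurate only relative to the split value.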
Unlike J23, we have not derived our own host galaxy masses. It is therefore possible that using multiple sources of host galaxy mass could introduce some errors. However, as we are only applying a broad cut at $\log M = 10$ and do not use the host galaxy mass information in the actual training process, our masses need only be accurate enough to assign each SN to the appropriate high- or low-host-mass bin. The K21 training sample has a mean low-host-mass error of $\sigma_{\log M}^{\rm low} = 0.661$ and a mean high-host-mass error of $\sigma_{\log M}^{\rm high} = 0.469$.$^{6}$ Making use of publicly available host-masses allows us to use nearly all of the K21 training sample SNe, while J23 could only use a fraction of the low-$z$ SNe. It is important to include as many SNe in the SALT3 training as one reliably can. The training of SALT models will only converge if there is adequate phase-wavelength coverage in the training sample. Even if the training does converge, low-coverage regions of the phase-wavelength space may be poorly constrained or suffer from training artefacts that require tailored regularization. We do not adjust the regularization parameters in this work, instead choosing to only focus on results obtained from well-constrained regions of the trained models.
In order to assess the reliable regions of our trained models, we provide phase-wavelength coverage plots for each of the training samples (Figure 1). TS-HIGHMASS largely follows the phase-wavelength space footprint of the K21 sample, with only minor gaps in some late-phase spectral coverage. The density of the data coverage is at least as good as that of the Joint Light Curve Analysis training sample from Betoule et al. (2014) (shown in Figure 10 of K21), which was used to produce the SALT2.4 "JLA" model, the most widely used model in the literature (e.g., Betoule et al. 2014; Abbott et al. 2019; Scolnic et al. 2018). We are therefore confident that our high-host-mass training sample is sufficient to produce a reliable SALT3 model ("SALT3.HIGHMASS").
While the relative phase and wavelength coverage of TS-LOWMASS broadly matches that of the K21 sample, the number of SNe is lower. It is even lower than the JLA sample, with the maximum density bins reduced by ∼50%. This follows directly from the ∼50% reduction in the number of SNe in TS-LOWMASS. The spectral coverage is also slightly poorer: there are gaps at multiple phases in the rest-frame ultraviolet (UV, $\lesssim 4000$ Å) and near-infrared (NIR, $\gtrsim 9000$ Å) regions, though the spectroscopic coverage matches JLA reasonably well for the other regions.$^{7}$ Given this limited data, we carefully inspect the resultant surface ("SALT3.LOWMASS") components for ringing artefacts. We find evidence of significant ringing in the $M_1$ component (and mild ringing in $M_0$) at many phases, including peak, for $\lambda \lesssim 4000$ Å (e.g., Figure 2). At longer wavelengths, our SALT3.LOWMASS surface appears to be smoothly constrained (there is a peculiar $M_1$ feature around 6000 Å, but we find no evidence to suggest this is a training artefact). Ameliorating the UV ringing artefact would require a refined SALT3 regularization scheme or a supplementary training sample of SNe in low-mass host galaxies, which are beyond the scope of this paper. Instead, we quantify the impact of the UV ringing by performing two identical cosmological analyses over different rest-frame wavelength ranges that either include or exclude the affected region of the SALT3.LOWMASS SED.
Figure 1 (caption fragment): The relative distribution of data between samples is broadly similar, with slightly higher spectroscopic coverage in SALT3.HIGHMASS (which has a higher proportion of low-redshift SNe). These plots were generated using SALTShaker.
A further limitation in our analysis relates to the scaling employed by SALT3.$^{8}$ The SALT3 training rescales the SN parameters so that their distributions in the training sample satisfy fixed normalisation conditions (see Dai et al. 2022). Similarly, the SALT3 components are defined relative to the training sample demographics: $M_0$ is the average SED for an $x_1 = 0$, $c = 0$ SN, and $M_1$ scales with $x_1$. Assuming our input training sample distributions are representative of the true $x_1$ and $c$ populations of SNe in high- and low-mass host galaxies as they occur in the Universe, any differences in the $M_0$ components should represent the average astrophysical differences between SNe originating from high- and low-mass host galaxies. However, any biases in the input training sample distribution will complicate this interpretation. The end result is that the fitted light curve parameters from each model cannot be meaningfully compared without a scaling correction. We therefore compare the SN parameters that are determined from fitting a single SALT3 surface to the entire training sample. We plot these parameter distributions for each of our mass-split training samples in Figure 3. While the distributions are largely consistent across our training samples, there is (unsurprisingly) a significant change in the $x_1$ distributions: SNe Ia that occur in more massive galaxies tend to be faster declining, and so have lower $x_1$ values (Hamuy et al. 1995; Branch et al. 1996; Hamuy et al. 2000; Howell 2001; Gallagher et al. 2005).$^{9}$ Moreover, the low-mass training sample has a narrower range of $x_1$ values than the high-mass or full training samples.

Methods to Assess the Impact on Distances and Cosmology
We use the Dark Energy Survey's 3-year spectroscopically confirmed SNe Ia sample and its companion low-$z$ sample (DES-SN3YR; Brout et al. 2019a) to test the end-to-end cosmological impact of treating SNe Ia from different host populations as intrinsically different. We fit the DES-SN3YR light curves with our SALT3.HIGHMASS and SALT3.LOWMASS surfaces, as well as with a control surface, SALT3.FRAG (Taylor et al. 2023), which was trained with the same inputs but uses the entire K21 training sample. We split the DES-SN3YR sample SNe according to the surface demographics: for example, we fit the entire sample with SALT3.FRAG, but only the $\log(M_{\rm host}/M_{\odot}) > 10$ subsample with SALT3.HIGHMASS. We use the published SVA1-Gold DES-SN3YR host-masses to select these subsamples (Abbott et al. 2019), as these were the masses used in the K21 training sample. The high-host-mass subsample contains 91 DES SNe and 98 low-$z$ SNe; the low-host-mass subsample contains 116 DES SNe and 24 low-$z$ SNe. There are 18 SNe in the DES sub-sample for which hosts could not be adequately identified. Following Brout et al. (2019b), these SNe are assumed to be in low-mass hosts (as high-mass hosts would likely be detectable). There are 10 "high-host-mass" DES SNe within $1\sigma_{\log M}$ of our chosen mass boundary ($\log M = 10$), and 6 "low-host-mass" DES SNe within $1\sigma_{\log M}$ of the boundary. There are 28 high-host-mass, low-$z$ SNe within $1\sigma_{\log M}$ of our chosen mass boundary, and 11 low-host-mass, low-$z$ SNe within $1\sigma_{\log M}$ of the boundary. We use SNANA (Kessler et al. 2009) to fit the light curves over $3000 < \lambda_{\rm eff} < 8000$ Å, where $\lambda_{\rm eff}$ is the rest-frame effective mean wavelength of the photometric filter. In this range, the bluest-band observations of low-$z$ SNe (and of SNe at $z \gtrsim 0.3$) will be impacted by the UV ringing artefact seen in the SALT3.LOWMASS surface (Figure 2). We explore the impact of this ringing by performing a similar analysis over a truncated rest-frame wavelength range ($4000 < \lambda_{\rm eff} < 8000$ Å); this is presented in Appendix A.
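The subsample bookkeeping in this paragraph can be sketched in a few lines. The example assumes each SN carries a host $\log M$ and its uncertainty, with `None` for SNe whose host could not be identified; such SNe are assigned to the low-mass surface, following the assumption quoted above. The function names are hypothetical and the treatment of a host exactly at the boundary is an illustrative choice.

```python
def choose_surface(log_mass, split=10.0):
    """Pick the mass-split surface used to fit one SN."""
    if log_mass is None:
        # Unidentified hosts are assumed to be low-mass.
        return "SALT3.LOWMASS"
    return "SALT3.HIGHMASS" if log_mass > split else "SALT3.LOWMASS"

def within_one_sigma_of_split(log_mass, log_mass_err, split=10.0):
    """True if the host mass lies within 1 sigma of the mass boundary."""
    if log_mass is None or log_mass_err is None:
        return False
    return abs(log_mass - split) <= log_mass_err

# Hypothetical hosts: (log M, uncertainty in dex).
hosts = [(10.3, 0.5), (11.2, 0.5), (9.8, 0.1), (None, None)]
surfaces = [choose_surface(m) for m, _ in hosts]
n_near_boundary = sum(within_one_sigma_of_split(m, e) for m, e in hosts)
```

Counting the SNe that sit within one uncertainty of the boundary gives a rough sense of how many could plausibly be assigned to the other surface.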
We apply standard light curve cuts of $-0.3 < c < 0.3$ and $-3 < x_1 < 3$, following Brout et al. (2019b) and Jones et al. (2023). We use the SALT2mu procedure of Marriner et al. (2011) to fit nuisance parameters and distance moduli for three sets of light curve fit results: the full DES-SN3YR sample (including low-$z$) fit with SALT3.FRAG, the high-host-mass SNe fit with SALT3.HIGHMASS, and the low-host-mass SNe fit with SALT3.LOWMASS. For these three sets of results, we include 1D bias corrections but do not include any host-mass correction ($\delta\mu_{\rm host}$, Equation 1). We fit an additional set of distance moduli (labelled SALT3.FRAG+Mass Step) that uses light curve parameters from the full DES-SN3YR sample fit with SALT3.FRAG and does include the fit of a mass step term, with $\log M_{\rm step} = 10$. These results use the same set of bias correction simulations as the nominal SALT3.FRAG results, and yield a $\gamma$ value (Equation 2) that is comparable to the 1D mass step found in Smith et al. (2020) ($\gamma_{\rm Smith2020} = 0.066 \pm 0.020$ mag, $\gamma_{\rm Taylor2023} = 0.079 \pm 0.021$ mag).
$^{9}$ A relationship with colour has also been found, though as we see in Figure 3, this is weaker than the stretch relationship (Smith et al. 2020).
The 1D bias corrections are calculated using large sets of realistic simulations of the DES-SN3YR data, which are generated following Kessler et al. (2019). Separate simulations are created for the low-$z$ and DES SNe, for each SALT3 surface. The simulated SNe are drawn from parent populations that are calculated according to predicted distributions from each individual surface, following the methods of Popovic et al. (2021). While there are no explicit mass-luminosity relationships included in our simulations, each set selects SNe from parent populations that are calculated according to the corresponding real distribution of fitted DES-SN3YR data. For example, the SALT3.HIGHMASS simulations are generated from stretch and colour distributions that are calculated using only the high-host-mass DES-SN3YR SNe fit with the SALT3.HIGHMASS model. Each set of bias correction simulations also uses the corresponding SALT3 model as the input model to generate the rest-frame SED. We use the G10 intrinsic scatter model of Kessler et al. (2013), based on Guy et al. (2010), and assume an underlying flat ΛCDM cosmology with $\Omega_M = 0.315$, $\Omega_\Lambda = 0.685$, $H_0 = 70$ km/s/Mpc. These assumptions introduce a small systematic uncertainty in the recovered cosmological parameters (Table 8 of Brout et al. 2019b), though we only include statistical uncertainties in our results here.
For each set of results, we fit $\Omega_M$ and $w$ assuming a flat $w$CDM model and applying a CMB prior, tuned such that the constraining power is similar to that of Planck (Planck Collaboration et al. 2016). This is performed with SNANA's fast cosmology fitting program, wfit. We verify our SALT3.HIGHMASS and SALT3.LOWMASS surfaces' ability to recover accurate cosmological parameters by using them to fit simulated DES-SN3YR light curves that are generated with a known cosmology. In these tests, we recover the input $\Omega_M$ and $w$ values to within $2\sigma$ (when applying a CMB prior).
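For reference, the model distance modulus that such a fit compares against can be computed directly. The sketch below evaluates the standard flat $w$CDM luminosity distance (constant $w$, radiation neglected) by numerical integration; it is a textbook calculation, not the wfit implementation, and the function name and integration settings are illustrative.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def distance_modulus_flat_wcdm(z, omega_m=0.315, w=-1.0, h0=70.0, n_steps=2000):
    """Distance modulus at redshift z > 0 for a flat wCDM cosmology."""
    zs = np.linspace(0.0, z, n_steps + 1)
    # Dimensionless expansion rate E(z) for matter plus constant-w dark energy.
    e_z = np.sqrt(omega_m * (1.0 + zs) ** 3
                  + (1.0 - omega_m) * (1.0 + zs) ** (3.0 * (1.0 + w)))
    integrand = 1.0 / e_z
    dz = zs[1] - zs[0]
    # Trapezoidal rule for the comoving distance, in Mpc.
    comoving = (C_KM_S / h0) * dz * (integrand.sum()
                                     - 0.5 * (integrand[0] + integrand[-1]))
    d_lum_mpc = (1.0 + z) * comoving
    return float(5.0 * np.log10(d_lum_mpc) + 25.0)

mu_lcdm = distance_modulus_flat_wcdm(0.5)           # w = -1
mu_wcdm = distance_modulus_flat_wcdm(0.5, w=-0.9)   # slightly smaller distance
```

Setting `w=-1` recovers flat ΛCDM, so the same function can be used for both the simulation cosmology and the fitted $w$CDM model.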
We also test how much of the difference in results from our three surfaces can be attributed to the surfaces themselves, versus the use of different subsamples of DES-SN3YR SNe. We do this by fitting the high/low-host-mass DES-SN3YR subsample light curves with SALT3.FRAG, and using each set of results to obtain separate distance measurements and cosmological fits. The results of these tests (given in § 4.3) are labelled "SALT3.FRAG (HIGH)" and "SALT3.FRAG (LOW)". There is no explicit mass step applied in this test, as we expect any luminosity difference between the subsamples to be absorbed into the $M$ parameter of Equation 1.

SALT3 Model Components
The combined $M_0$ and $M_1$ components of SALT3.HIGHMASS and SALT3.LOWMASS at different $x_1$ values are shown in Figure 2. The difference in the $M_0$ components is shown in Figure 4. These plots show that fiducial SNe Ia in low-mass galaxies appear to be bluer than those in high-mass galaxies. We observe this trend at all phases. As $M_0$ is by definition the average SED of an $x_1 = 0$, $c = 0$ SN, this result confirms that the definition of $c = 0$ changes slightly between the two surfaces.
Apart from the difference in the colour, there are subtle differences in some of the spectral features. The Si II features in SALT3.HIGHMASS (at 4000, 5770 and 6100 Å) are slightly blueshifted compared to SALT3.LOWMASS. The blueshift reflects the velocity of ejecta coming towards us along the line of sight; higher ejecta velocities are associated with brighter SNe Ia (Benetti et al. 2005). The Si II feature at 5770 Å is noticeably deeper in SALT3.HIGHMASS. The depth of this line relative to the stronger Si II feature at 6100 Å correlates with SN Ia luminosity: higher ratios correspond to fainter SNe Ia (Nugent et al. 1995). This is inconsistent with the blueshifts seen in the Si II lines but consistent with the redder colours of SALT3.HIGHMASS.
Figure 5 (caption fragment): For reference, we also plot the Milky Way laws from Cardelli et al. (1989) for different values of $R_V$. For much of the wavelength range, the SALT3.HIGHMASS and SALT3.FRAG colour laws directly overlap. The shaded regions indicate the wavelengths where the SALT3 colour law is linearly extrapolated, rather than fit as a polynomial.
The $c \times CL(\lambda)$ components (i.e. the exponential term from Equation 3) for a $c = -0.1$ SN are shown in Figure 5. We find that SNe occurring in low-mass galaxies have a steeper colour law in the UV. J23 do not train separate phase-independent colour laws for low-host-mass or high-host-mass SNe, but they do find that the phase-dependent colours exhibit a change with host-mass. Their results disagree with our findings (though we reiterate that we use different training samples); the J23 model of high-host-mass SNe is slightly bluer than the low-host-mass model, both at peak phase and in the Lira law tail (Lira et al. 1998; Phillips et al. 1999). They find that these phase-dependent colour differences were previously captured by the $M_1$ component and encompassed by the $x_1$ parameter.
Our results may indicate that SNe in low-mass galaxies are intrinsically more luminous than those in high-mass galaxies. Alternatively, the change in colours between SNe from high- and low-mass hosts may be caused by changes in the extrinsic dust component. However, at the current level of modelling it is not possible to disentangle the intrinsic SN colour from the effects of host galaxy dust.

Light Curve Fits
As the SN parameters and nuisance parameters are defined on independent scales for each SALT3 surface, we cannot directly compare these results. For example, Figures 6-7 show that the definitions of $c$ and $x_1$ have systematically changed between the three surfaces used in our analysis. We find that $x_1$(SALT3.LOWMASS) is systematically lower than $x_1$(SALT3.FRAG), and $x_1$(SALT3.HIGHMASS) is systematically higher. This is consistent with the findings of J23, as shown in their Figure 8 (where their $x_1$ parameter is redefined to only include host-mass independent variability). We also find a (weaker) redefinition of $c$ such that for the same SN, $c$(SALT3.LOWMASS) appears to be systematically higher than $c$(SALT3.FRAG), and $c$(SALT3.HIGHMASS) appears to be systematically lower. This follows directly from the rescaling expected based on Figure 3, in which the mean $c$ (i.e. the model $c = 0$ point) is lower for SALT3.LOWMASS and higher for SALT3.HIGHMASS. In contrast, the results from J23 (where $c$ is defined on the same scale for all models) indicate that in reality, $c$(SALT3.LOWMASS) is systematically lower than $c$(SALT3.K21), and $c$(SALT3.HIGHMASS) is slightly higher. This highlights the complications introduced by the floating definitions of $x_1$ and $c$.
Similarly, the nuisance parameters $\alpha$ and $\beta$ (which quantify the global stretch-luminosity and colour-luminosity relationships of SNe Ia) are defined on different scales for each surface, though this effect is very slight for $\beta$. The nuisance parameters are presented in Table 2.
The intricacies of comparing fitted light curve parameter values that we have demonstrated here will also apply when comparing any fitted light curve parameters from models trained on different samples (e.g. the Dark Energy Survey's 3-year versus 5-year SN analyses, which use light curve models from Betoule et al. 2014 and Taylor et al. 2023 respectively).
During light curve fitting, we lose four SNe to light curve fitting cuts related to the $x_1$ and $c$ bounds across the LOWMASS and HIGHMASS samples. A further five fail the requirement of having at least one observation before peak $B$-band maximum (as their estimated peak MJDs have shifted when fit with our new surfaces). We lose 56 SNe due to the fit probability > 0.01 criterion, the majority of which are from the HIGHMASS low-$z$ sample and have high signal-to-noise ratios.$^{10}$ Our SALT3.HIGHMASS model has lower model uncertainties than SALT3.FRAG or SALT3.LOWMASS, which leads to an increased reduced $\chi^2$ for these data, thus causing them to fail fit probability cuts. This may be remedied in future work by increasing the number of light curve fit iterations when using models with low levels of uncertainty.

Distances and Cosmology
We compute the distance moduli for each set of light curve and nuisance parameter fits. In Figure 8, the distance modulus values do not display any systematic offset between the SALT3.FRAG and SALT3.HIGHMASS surfaces (left panel). However, the right panel shows a slight slope, such that more distant SNe are measured to be (on average) further away with SALT3.LOWMASS than SALT3.FRAG. This trend is not observed when repeating our analysis over a reduced rest-frame wavelength range of $4000 < \lambda_{\rm eff} < 8000$ Å (Figure A1). This abridged range restricts the use of the less reliable UV region of SALT3.LOWMASS, which is more impactful for distant SNe.
The spread of the Hubble residuals (HR) represents the level of remaining scatter in the standardised SNe Ia luminosities and measurement errors. We calculate Hubble residuals as ${\rm HR}_i = \mu_{{\rm fit},i} - \mu_{{\rm cosmology},i}$ for each set of results obtained from our different SALT3 surfaces. In Table 3, we report the root-mean-squared (RMS, i.e., spread of the Hubble residuals) for each set of results. We also report the weighted RMS. We define the weight of a particular HR to be $w = 1/\sigma^2_{\mu_{\rm fit}}$. We note that the mean measured uncertainty in the distance modulus is lower for the SALT3.HIGHMASS results ($\bar{\sigma}_{\mu_{\rm fit}} = 0.146$ mag) than those from SALT3.FRAG ($\bar{\sigma}_{\mu_{\rm fit}} = 0.156$ mag) or SALT3.FRAG + Mass Step ($\bar{\sigma}_{\mu_{\rm fit}} = 0.155$ mag). The mean uncertainty for the SALT3.LOWMASS results is slightly higher ($\bar{\sigma}_{\mu_{\rm fit}} = 0.157$ mag). For each set of results, we estimate the uncertainty in ${\rm RMS}_{\rm HR}$ using bootstrapping, resampling the HR values 1000 times and reporting the standard deviation of the resulting distribution.
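The scatter statistics described above can be sketched as follows. The example assumes the RMS is taken about zero and that the weighted RMS uses inverse-variance weights normalised by their sum; these are common conventions, but the exact definitions used for Table 3 may differ. The function names, the fixed random seed, and the toy residuals are illustrative.

```python
import numpy as np

def rms(hr):
    """Root-mean-square of the Hubble residuals (about zero)."""
    hr = np.asarray(hr, dtype=float)
    return float(np.sqrt(np.mean(hr ** 2)))

def weighted_rms(hr, sigma_mu):
    """RMS with inverse-variance weights w = 1 / sigma_mu**2."""
    hr = np.asarray(hr, dtype=float)
    w = 1.0 / np.asarray(sigma_mu, dtype=float) ** 2
    return float(np.sqrt(np.sum(w * hr ** 2) / np.sum(w)))

def bootstrap_rms_error(hr, n_boot=1000, seed=0):
    """Standard deviation of the RMS over bootstrap resamples of hr."""
    rng = np.random.default_rng(seed)
    hr = np.asarray(hr, dtype=float)
    n = hr.size
    # Resample with replacement, keeping the sample size fixed.
    values = [rms(hr[rng.integers(0, n, size=n)]) for _ in range(n_boot)]
    return float(np.std(values))

# Toy residuals (mag) and distance-modulus uncertainties, for illustration.
hr = np.array([0.10, -0.05, 0.20, -0.15, 0.02])
sigma = np.array([0.15, 0.14, 0.16, 0.15, 0.13])
summary = (rms(hr), weighted_rms(hr, sigma), bootstrap_rms_error(hr))
```

The bootstrap spread reflects only sampling noise in the residuals, so it is most informative when comparing surfaces applied to the same set of SNe.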
We split the SALT3.FRAG results into low-and high-hostmass subsamples and calculate the HR results for each, using the best-fit cosmology of the full sample.We also report the HR results for the low-and high-host-mass subsamples fit with SALT3.FRAG (a.k.a., SALT3.FRAG (HIGH/LOW)), where the best-fit cosmology parameters are calculated for each subsample independently.
We find that measuring the HR of a subsample of SNe from high-mass hosts lowers the ${\rm RMS}_{\rm HR}$ by 0.5-1.2$\sigma$ (depending on weighting, application of mass step, and subsample used to calculate the best-fit cosmology) compared to using the full DES-SN3YR sample fit with SALT3.FRAG. When fitting the high-host-mass sample with SALT3.HIGHMASS, we find a reduction of 0.1-0.9$\sigma$ in the ${\rm RMS}_{\rm HR}$ relative to the same sample fit with SALT3.FRAG.
On the other hand, using a subsample of SNe from low-mass hosts increases ${\rm RMS}_{\rm HR}$ by $0.5$-$1.2\sigma$ compared to using the full DES-SN3YR sample fit with SALT3.FRAG, when both samples use the SALT3.FRAG best-fit cosmology. When fitting the low-host-mass sample with SALT3.LOWMASS, we instead find a reduction of $0.5$-$0.7\sigma$ in ${\rm RMS}_{\rm HR}$ relative to the same sample fit with SALT3.FRAG.
These results indicate that the DES-SN3YR SNe in high-mass hosts are better standardisable candles than those in low-mass hosts, and that the standardisability in the DES-SN3YR sample is slightly improved by using custom SALT3.HIGHMASS and SALT3.LOWMASS surfaces (albeit at low statistical significance). This is consistent with the results of J23, who also find negligible changes in the HR scatter. The SALT3.HIGHMASS and SALT3.LOWMASS surfaces yield similar cosmology results, with the estimated value of $w$ shifting by an insignificant $0.6$-$1.3\sigma_{\rm stat}$ compared to the SALT3.FRAG results (Table 2). Combining the binned Hubble diagrams of the mass-split samples in order to fit a single cosmological model from all the data is not possible with the current fitting infrastructure, but would be necessary if our methods were to be adopted by future cosmology analyses.

The Mass Step
In Figure 9, we plot the Hubble residuals as a function of host galaxy mass, $\log(M/M_{\odot})$, for each surface. We also show the binned weighted average of the Hubble residuals in the two mass step bins, and approximate the mass step as the difference in weighted averages between the high- and low-host-mass sets of SNe. We bootstrap our samples 1000 times to obtain the uncertainty on the mass step.
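The mass step estimate described above can be sketched as follows. This is an illustrative example under stated assumptions, not the code used for Figure 9: the function names are hypothetical, the weights are assumed to be $1/\sigma_{\rm fit}^2$, the split is assumed to sit at $\log(M/M_{\odot}) = 10$, and the step is taken as the high-mass weighted mean minus the low-mass weighted mean.

```python
import random
import statistics

MASS_SPLIT = 10.0  # assumed split in log10(M / M_sun)


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean (weights assumed to be 1 / sigma**2)."""
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def mass_step(log_mass, hr, sigma_fit, split=MASS_SPLIT):
    """Weighted-mean HR of high-mass hosts minus that of low-mass hosts."""
    rows = list(zip(log_mass, hr, sigma_fit))
    high = [(r, s) for m, r, s in rows if m >= split]
    low = [(r, s) for m, r, s in rows if m < split]
    if not high or not low:
        raise ValueError("both mass bins must contain at least one SN")
    return weighted_mean(*zip(*high)) - weighted_mean(*zip(*low))


def bootstrap_mass_step(log_mass, hr, sigma_fit, n_boot=1000, seed=0):
    """Standard deviation of the mass step over bootstrap resamples."""
    mass_step(log_mass, hr, sigma_fit)  # fail early if a bin is empty
    rng = random.Random(seed)
    n = len(hr)
    steps = []
    while len(steps) < n_boot:
        idx = rng.choices(range(n), k=n)
        try:
            steps.append(mass_step([log_mass[i] for i in idx],
                                   [hr[i] for i in idx],
                                   [sigma_fit[i] for i in idx]))
        except ValueError:
            continue  # this resample emptied one bin; draw again
    return statistics.stdev(steps)


# Toy inputs (illustrative only, not real data).
log_mass = [9.2, 9.7, 9.9, 10.3, 10.8, 11.1]
hr = [0.05, 0.01, 0.03, -0.02, -0.04, -0.01]
sigma_fit = [0.15, 0.12, 0.18, 0.10, 0.14, 0.11]
print(mass_step(log_mass, hr, sigma_fit),
      bootstrap_mass_step(log_mass, hr, sigma_fit))
```

Redrawing resamples that leave a bin empty is a convenience for small toy samples; with a sample the size of DES-SN3YR this case should essentially never occur.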
We find that our method of treating SNe from low- and high-mass host galaxies as unique populations in the SALT3 framework removes the observed mass step as efficiently as our explicit fitting for a mass step, from step = $-0.044 \pm 0.019$ mag (SALT3.FRAG) to step = $0.006 \pm 0.019$ mag (SALT3.LOWMASS and SALT3.HIGHMASS). However, the fitted fiducial SN ($x_0 = 1$, $x_1 = 0$, $c = 0$) absolute magnitude from Equation 1 changes by $+0.14$ mag between SALT3.LOWMASS and SALT3.HIGHMASS, such that SNe in high-mass galaxies are fainter. This shift in absolute magnitude, $\Delta M$, arises from both the redefinition of a fiducial SN between different surfaces and the absorption of the host-mass luminosity step. For our SALT3.FRAG (HIGH) and SALT3.FRAG (LOW) results, which are based on the same surface and so have a consistently defined fiducial SN, we find $\Delta M = -0.04$ mag. In this case, the observed mass step is also "removed" (step = $-0.0005$ mag). This result also holds when calculating HR with respect to a reference ΛCDM cosmology from Planck Collaboration et al. (2020), as was done in J23.
Without enforcing a consistent definition of the parameters across different light curve models, we cannot claim that the treatment of the mass step benefits from adopting bespoke SALT3.HIGHMASS and SALT3.LOWMASS light curve models for sub-populations of SNe in our analysis. The analysis performed by J23 (in which the parameters are defined on a consistent scale, and the host-mass contribution to the SN flux is modelled and then excluded from the calculation of distances) suggests that improved SALT3 modelling should at least partially reduce the mass step (by $0.021 \pm 0.002$ mag).
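To illustrate why the parameter definitions depend on the surface, recall the standard form of the SALT flux model (the notation here is the conventional one and is assumed to match the model definition given earlier in this paper, with $p$ the rest-frame phase and $CL(\lambda)$ the colour law):

$$F(p, \lambda) = x_0 \left[ M_0(p, \lambda) + x_1 M_1(p, \lambda) \right] \exp\left[ c \, CL(\lambda) \right]$$

The model flux is unchanged under the rescaling

$$M_0 \rightarrow a M_0, \quad M_1 \rightarrow a M_1, \quad x_0 \rightarrow x_0 / a,$$

and under the shift

$$M_0 \rightarrow M_0 - b M_1, \quad x_1 \rightarrow x_1 + b,$$

for arbitrary constants $a > 0$ and $b$. The values of $x_0$ and $x_1$ that describe a given SN are therefore only meaningful relative to the conventions fixed during the training of each surface, and independently trained surfaces need not share those conventions. A rescaling by $a$ shifts the peak magnitude derived from $x_0$ by $2.5 \log_{10} a$, which is degenerate with a change in the fitted absolute magnitude.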

Future Work
Here, we have used host-masses from a variety of sources, relying on the broad, binary nature of the observed mass step to mitigate any biases from inconsistent mass estimates. While this has allowed us to use the full constraining power of the K21 training sample without having to derive our own masses for ∼ 1000 host galaxies, this analysis should be repeated with more consistently derived masses for both the training and cosmology samples, in order to develop this method into a robust tool for current SN cosmology analyses. For example, we use the DES-SN3YR SVA1-Gold catalogue host-masses described in Brout et al. (2019b) for the DES-SN training and cosmology samples, but more recent work by Wiseman et al. (2020) and Smith et al. (2020) updates these masses using the full 5-yr DES deep-stack photometry. With the rederived masses, 19 DES SNe change "mass bins". Similarly, 107 SNe in our K21 training sample are assigned to different mass bins in the Pantheon+ analysis (Scolnic et al. 2022). Future work should also consider the distributions of host galaxy mass in the training and cosmology samples; many of the SNe used in this work were discovered by targeted surveys, which biases our sample. Adopting larger training and cosmology samples would reduce the statistical uncertainties on SN distance moduli, providing more clarity on the statistical significance of the potential reduction in HR scatter hinted at in this work.
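A check of how many SNe move between mass bins when the host-masses are rederived can be sketched as below. The catalogue layout (a mapping from SN name to $\log(M/M_{\odot})$), the SN names, and the function names are hypothetical, and the split at $\log(M/M_{\odot}) = 10$ is assumed; the masses shown are toy values.

```python
def mass_bin(log_mass, split=10.0):
    """Assign a host to the 'high' or 'low' mass bin."""
    return "high" if log_mass >= split else "low"


def changed_bins(old_masses, new_masses, split=10.0):
    """Names of SNe, present in both catalogues, whose mass bin differs."""
    common = old_masses.keys() & new_masses.keys()
    return sorted(
        name for name in common
        if mass_bin(old_masses[name], split) != mass_bin(new_masses[name], split)
    )


# Toy catalogues (hypothetical names and masses, not real data).
old_catalogue = {"sn_a": 9.8, "sn_b": 10.4, "sn_c": 11.0}
new_catalogue = {"sn_a": 10.1, "sn_b": 10.3, "sn_d": 9.0}

moved = changed_bins(old_catalogue, new_catalogue)
print(len(moved), moved)
```

Only SNe present in both catalogues are compared, so objects missing from either list are not counted as having changed bins.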
An interesting extension to this work would be to apply our methods to other host galaxy parameters, such as the star formation rate. Childress et al. (2014) predict that there is a higher number of young SNe in low-mass galaxies, and that this relationship appears to evolve with redshift. Childress et al. (2014) conclude that SNe Ia in low-mass, actively star-forming galaxies represent a more standardisable population for cosmology analyses. We have already seen promising results for SN standardisability when splitting the populations according to host galaxy mass, despite the limitations of our SALT3.LOWMASS model (e.g., inconsistent sources of mass, poor coverage of the training sample, rest-frame UV ringing artefacts). Modelling separate populations of SNe according to parameters such as specific star formation rate (sSFR, i.e. the star formation rate normalised by the stellar mass) may not only help to identify any unique physical processes in the SN Ia explosions, but also reduce the intrinsic scatter remaining in SN cosmology analyses (Rigault et al. 2015, 2020).
Our straightforward method could be extended to other models that include additional SN Ia variability (e.g., SNEMO, Saunders et al. 2018; SUGAR, Léget et al. 2019; BayeSN, Mandel et al. 2020). Using models that include near infra-red data (e.g., BayeSN or SALT3.NIR; Pierel et al. 2022) may be a promising avenue to break the degeneracy between host galaxy mass (or other properties), intrinsic colour, and extrinsic dust effects. However, further use of this method must develop a solution for the pervasive light curve and nuisance parameter scaling problem. Without a standardised, surface-independent definition of $x_0$, $x_1$ and $c$, any results derived using different SALT surfaces will be difficult to interpret. Extensions to this work would also benefit from a program to combine the Hubble diagrams from multiple, independently fit samples into one input for cosmological model fitting (for example, an extension of the current Pippin pipeline for SN cosmology; Hinton & Brout 2020). Without this infrastructure, the method of J23 is currently the clearest way to incorporate spectrophotometric modelling of SNe Ia sub-populations into cosmology analyses.

SUMMARY AND CONCLUSIONS
For decades, SN studies have observed relationships between the luminosity of a SN Ia and its host galaxy properties. These relationships may indicate the existence of intrinsically unique sub-populations within the class of SNe Ia. If that is the case, the overall distance measurements and standardisation (i.e. the distance moduli dispersion/Hubble residual scatter) of SNe Ia could be improved by modelling each intrinsic population separately.
To test this hypothesis, we have produced and tested separate SALT3 light curve models for SNe Ia that explode in high- or low-mass host galaxies. This straightforward method utilises publicly available training programs and data samples, as an alternative and complement to the method of J23 (which alters the SALT3 model to include a host galaxy mass component).
In either method, the resulting high-resolution model spectrum may provide astrophysical clues as to the origins of the observed mass step that would not be obvious when looking at the population photometry or single-object spectra. We find evidence of minor differences in Si II spectral features between low host-mass and high host-mass samples, in agreement with Foley & Kasen (2011). While we find that fiducial SNe in low-mass hosts are intrinsically bluer than those in high-mass hosts, J23 find the opposite. The inconsistency in the fiducial SN SED (which is defined relative to the mean $x_0$, $x_1$, $c$ values of a SALT3 training sample) highlights the sensitivity of the SALT3 model to the training sample demographics. This simple finding reinforces the importance of a quality, unbiased light curve training sample (which must be comprehensive and representative of the natural SN populations) for SN Ia analyses (Dai et al. 2022).
Here, the fitted SN parameters $x_0$, $x_1$, $c$ are defined independently for each surface, leading to unique definitions of the fitted nuisance parameters of Equation 1 for each surface. We assume that the standardised distance moduli are defined consistently across all surfaces, and so can be fairly compared. We recover consistent cosmological parameters for a flat $w$CDM model (with CMB priors from Planck Collaboration et al. 2016) from our low host-mass and high host-mass analyses. We find that the high host-mass subsample of SNe Ia from DES-SN3YR is more standardisable than the low host-mass subsample when both are fit with the same SALT3.FRAG surface, with a $\sim 2\sigma$ difference between the Hubble residual RMS of these subsamples. Using our custom SALT3.LOWMASS and SALT3.HIGHMASS surfaces to fit these subsamples reduces the Hubble residual RMS of both by $< 1\sigma$. Though J23 suggest that the observed mass step in the Hubble residuals should be reduced when including host mass considerations in the SALT3 modelling, the varying definitions of the nuisance parameters between surfaces mean we cannot conclusively claim this is the case in our analysis. Standardising the definitions of $x_0$, $x_1$ and $c$ (and therefore, the nuisance parameters of Equation 1) across different light curve models is a critical step in realising the full potential of our method. This is also important for comparing any results generated using different light curve models; the nuance of the redefined parameters may not be obvious, but it can be impactful.
We have demonstrated a novel application of the SALT3 light curve model training procedure that has the potential to improve SN Ia standardisability. Our straightforward method may be easily used to incorporate new, high-quality training data or explore other host-galaxy relationships. Moreover, this work demonstrates how far the light curve model training infrastructure has developed in recent years: the SALTShaker training code we use is publicly available, well documented and tested, and actively maintained, making it a useful tool for the entire SN community.
the Deutsche Forschungsgemeinschaft, and the Collaborating Institutions in the Dark Energy Survey. The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas-Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenössische Technische Hochschule (ETH) Zürich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciències de l'Espai (IEEC/CSIC), the Institut de Física d'Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig-Maximilians Universität München and the associated Excellence Cluster Universe, the University of Michigan, the National Optical Astronomy Observatory, the University of Nottingham, The Ohio State University, the OzDES Membership Consortium, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, and Texas A&M University. Based in part on observations at Cerro Tololo Inter-American Observatory, National Optical Astronomy Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation.

Table A1. The best fit nuisance parameters (from Equation 1) and cosmology parameters for the DES-SN3YR sample when fit with different SALT3 surfaces, when using a reduced wavelength range for light curve fitting.

Figure 1. Data density plot showing the phase-wavelength coverage of the full K21, SALT3.HIGHMASS, and SALT3.LOWMASS training samples, applying a consistent heatmap scaling. The upper panels show the coverage of the photometric data, and the lower panel shows the spectral coverage. The relative distribution of data between samples is broadly similar, with slightly higher spectroscopic coverage in SALT3.HIGHMASS (which has a higher proportion of low-redshift SNe). These plots were generated using SALTShaker.

Figure 2. The combined $M_0$ and $M_1$ components of SALT3 at various $x_1$ values for the trained SALT3.LOWMASS and SALT3.HIGHMASS surfaces, as well as SALT3.FRAG (Taylor et al. 2023). There is clear evidence of ringing (rapid fluctuations in phase or wavelength space) in the UV region of the SALT3.LOWMASS $M_1$ component, which limits its usefulness in this analysis.

Figure 5. The colour law component of our trained SALT3.LOWMASS and SALT3.HIGHMASS surfaces, as well as SALT3.FRAG, for a $c = -0.1$ SN. For reference, we also plot the Milky Way laws from Cardelli et al. (1989) for different values of $R_V$. For much of the wavelength range, the SALT3.HIGHMASS and SALT3.FRAG colour laws directly overlap. The shaded regions indicate the wavelengths where the SALT3 colour law is linearly extrapolated, rather than fit as a polynomial.

Figure 8. As in Figure 6, for distance moduli. We plot the difference in distance moduli on the y-axis.

Figure 9. Correlations between host galaxy masses and Hubble residuals for the DES-SN3YR sample when fit with the SALT3.FRAG (upper plot), SALT3.HIGHMASS, and SALT3.LOWMASS surfaces. Here, the Hubble residuals are taken from the best-fit cosmology for each model; a similar result is obtained when calculating Hubble residuals from a constant reference cosmology (i.e., ΛCDM from Planck Collaboration et al. 2020). The mass step for each set of results is approximated as the difference in weighted mean Hubble residual between the high- and low-host-mass samples, with uncertainties estimated using bootstrapping. We plot a truncated mass range for visual clarity, but use the full range for the mass step calculations. The Hubble residuals are offset on the y-axis for each set of results. In the lower plot, the mass step is ∼ 0 by design; the change in luminosity is absorbed by the SALT3 model and the absolute magnitude term of Equation 1 (see Table 2).

Figure A1. Comparison of the distance moduli for the DES-SN3YR sample when fit over an abridged range of $4000 \leq \lambda_{\rm rest} \leq 8000$ Å with SALT3.FRAG (including a mass step) versus SALT3.HIGHMASS (left panel, pink points) and SALT3.LOWMASS (right panel, blue points). We plot the difference in distance moduli on the y-axis.

Table 1. Host galaxy mass statistics for the mass-split SALT3 training samples used in this work. The low-redshift subsample tends to include more high-mass host galaxy SNe, while the untargeted high-redshift subsample favours low-mass hosts.

Table 2. The best fit nuisance parameters (from Equation 1) and cosmology parameters for the DES-SN3YR sample when fit with different SALT3 surfaces. All results are obtained using the fast cosmology fitting program, wfit, and include a Planck Collaboration et al. (2016) cosmic microwave background (CMB) prior. As we have used a different set of surfaces to the original DES-SN3YR analysis (which used SALT2.JLA from Betoule et al. 2014), and apply 1D rather than 5D bias corrections, we do not expect our results to exactly match those presented in Brout et al. (2019b) and Abbott et al. (2019). The SALT3.FRAG (HIGH/LOW) results that are split by host galaxy mass have this selection applied prior to the SALT2mu distance fitting stage. The cosmology fit $\chi^2$ values have been calculated without including contributions from the SALT3 model uncertainties, which change between surfaces.

Table 3. Scatter in the Hubble residual (from best-fit cosmology) for each SALT3 surface. All values are in units of magnitude. The best-fit cosmology is unique for each entry in the "Result" column. The SALT3.FRAG (HIGH) and (LOW) results are split by host galaxy mass prior to the SALT2mu fitting stage, to isolate the effects of fitting cosmology with a subsample of DES-SN3YR SNe versus the effects of changing surfaces.

Table A2. Scatter in the Hubble residual (from best-fit cosmology) for each SALT3 surface, when using a reduced wavelength range for light curve fitting. All values are in units of magnitude. The best-fit cosmology is unique for each entry in the "Result" column.