On the impact of spectral template uncertainties in synthetic stellar populations

Uncertainties in stellar population models, both in terms of stellar evolution and stellar spectra, translate into uncertainties in our interpretation of stellar populations in galaxies, since stars are the source of most of the light we receive from them. Observations by JWST are revealing high-redshift galaxies in great detail, which must then be compared to models. One significant source of uncertainty is in the stellar spectra used to generate composite spectra of stellar populations, which are then compared to data. Confidence in theoretical models is important to enable reliable determination of the properties of these galaxies such as their ages and star formation history. Here we present a comparison of spectral synthesis carried out with 6 different stellar spectral libraries using the Binary Population and Spectral Synthesis (BPASS) framework. In photometric colours, the differences between theoretical libraries are relatively small (<0.10 mag), similar to typical observational uncertainties on individual galaxy observations. Differences become more pronounced when detailed spectroscopic properties are examined. Predictions for spectral line indices can vary significantly, with equivalent widths differing by a factor of two in some cases. With these index strengths, some of the libraries yield predictions of ages and metallicities which are unphysical. Many spectral libraries lack wavelength coverage in the ultraviolet, which is of growing importance in the era of JWST observations of distant galaxies, whose flux is dominated by hot, young stars.


INTRODUCTION
Fitting observed galaxy spectra, or less detailed spectral energy distributions (SEDs), to stellar population synthesis (SPS) models is a commonly used technique for determining the age, composition and star formation history of the stellar population in a distant galaxy where the individual stars cannot be resolved (for a review of the SED fitting technique, see Conroy 2013). However, this method is sensitive to the choice of SED fitting code and underlying theoretical models, meaning that there is an intrinsic level of uncertainty associated with the results (e.g. Pacifici et al. 2022). As a consequence, these inferred properties are sensitive to the uncertainties in the population synthesis models, including the nature of the stellar population (single vs multiple stars), the stellar evolution models used (uncertainties in massive star evolution, convection, etc.) and the choice of stellar spectra (wavelength coverage, spectral resolution, atomic line lists, choice of radiative transfer code, etc.). Understanding how these uncertainties propagate through stellar population synthesis models and the degeneracies they introduce in inferred E-mail: conor.byrne@warwick.ac.uk (CMB) galaxy properties has implications for our understanding of galaxy evolution.
The use of SPS models to interpret the appearance of galaxies is an area of research which was pioneered by Tinsley (1968). SPS models start with a set of stellar evolution models, weighted according to an initial mass function (IMF) which describes the proportion of stars with a given set of initial conditions that are expected to form in the stellar population. The evolution of these stars is then followed simultaneously to trace the evolution of the stellar population. Stellar spectra which match the atmospheric parameters of the stellar evolution models are combined in accordance with their relative contribution to the total luminosity of the stellar population at a given age and composition to produce a composite spectrum. Composite spectra for galaxies with complex star formation histories can be produced by combining simple stellar populations of different ages. Many different population and spectral synthesis models exist in the literature, each using a different combination of stellar evolution models and stellar spectral template libraries (e.g. Bruzual & Charlot 2003;Eldridge et al. 2017;Maraston et al. 2020).
In young stellar populations, such as those in the distant Universe, massive stars are the dominant contributors to the spectrum. Massive stars are expected to be found often in multiple star systems (Sana et al. 2012). Stellar multiplicity fractions also show an anti-correlation with stellar metallicity (e.g. El-Badry & Rix 2019; Moe et al. 2019;Donada et al. 2023). Binary stellar interactions can significantly alter the composite spectrum of a stellar population . For example, mass transfer can lead to rejuvenation of some stars, whereby accreting hydrogen-rich material spins up the star, mixing fresh hydrogen into the core. This allows them to stay on the Main Sequence longer, leading to a population appearing bluer than an equivalent population comprised solely of single stars of the same age. Therefore it is important to use SPS models which include binary effects when examining young galaxies.
There are multiple approaches to producing stellar spectra for population synthesis models. In some cases, empirical spectra are used, whereby spectra of observed stars are used to calculate the composite spectrum of the population (e.g. Sánchez-Blázquez et al. 2006). Using empirical spectra has the advantage of using real observations, and thus a spectrum of a stellar population is produced using a combination of stars as they actually appear in the sky. The limitation of this method is that it relies on detailed spectroscopic observations of bright, nearby stars and cannot necessarily be used to understand populations which have compositions or temperatures which are considerably different from stars close to the Sun. A related option is the use of semi-empirical spectra (e.g. Knowles et al. 2021), which modify the empirical spectra by applying corrections based on the changes in flux between slightly varied theoretical spectra. This expands the region of parameter space in which spectra based on empirical data can be used, but is still limited to conditions which are at least somewhat similar to the observed stars.
Another option is to solely use theoretical spectral libraries. These spectra are produced by determining the flux emitted from a model stellar atmosphere using a radiative transfer calculation, and can in principle be constructed for any arbitrary composition and effective temperature (e.g. Kurucz 1970). This is more advantageous than empirical libraries for studying young stellar populations at high redshift, where the typical composition, mass and temperature of the stars in the population differ considerably to those commonly seen in the local environment and used in empirical libraries. A limitation of theoretical libraries is that they rely on tabulated atomic physics data (opacities and atomic absorption line strengths), some of which are not well constrained. Different software packages follow a variety of approaches to model atmosphere construction and radiative transfer calculation, and use specific linelists of atomic and molecular transitions, and thus can produce varied predictions for a theoretical spectrum, despite having the same grounding in atomic physics.
An evaluation of the differences between theoretical and empirical spectral libraries in population synthesis has recently been carried out by Coelho et al. (2020). They find that photometric colours are sensitive to the parameter space coverage of the spectral grid, particularly for bluer filters. This is an unsurprising result given that the high-temperature portion of parameter space is poorly covered by the empirical spectra. One key derived quantity which differs between empirical and theoretical spectra is the metallicity determination of a stellar population. Theoretical libraries systemati-cally infer a lower metallicity than those inferred by fitting with empirical libraries.
The abundance of chemical elements increases at different rates, as a consequence of the many processes underlying their formation (e.g. Kobayashi et al. 2020). Large quantities of α-process elements such as oxygen and magnesium are synthesised in core collapse supernovae, an end state of massive star evolution, which happens on a comparatively short timescale. Meanwhile, Type Ia supernovae, which are an end state of low mass stellar evolution produce the majority of iron group elements (e.g. iron and nickel) on a much longer timescale. As a result, pristine, primordial gas will first become enriched in α-process elements before there is significant growth in the abundance of iron group elements. This process, referred to as α-enhancement, has been observed in deep spectroscopy of distant populations of galaxies at z ∼ 2 (Steidel et al. 2016;Strom et al. 2022) and z ∼ 3.4 (Cullen et al. 2021).
This non-uniform chemical evolution changes the behaviour and appearance of the stars which form in such an environment, adding another source of uncertainty to the interpretation of stellar populations. Young stellar populations in the early Universe, which will be observed in great numbers by JWST , are one of the environments where this effect will be most pronounced. As a result, it is important that efforts to produce theoretical composite spectra of stellar populations formed in such an environment take into account the difference in the chemical abundance pattern, rather than simply scaling the patterns seen in nearby, high-metallicity environments.
Studies have been done on the effects that α-enhanced compositions have on low-mass stellar evolution models, and the implications this has for stellar population synthesis. For example, Pietrinferni et al. (2021) calculated a grid of αenhanced stellar evolution models using the BaSTI stellar evolution code (Pietrinferni et al. 2006;VandenBerg et al. 2014). That study set out to explore the impact of old, lowmetallicity stellar populations in the Milky Way and the nearby environment, another regime where stars tend to be αenhanced. These are effectively the same stellar populations, just at vastly different ages. The young stellar populations in the distant Universe still contain the massive stars which are not present in old stellar populations, the evolution of which is significantly impacted by binary star interactions. In these works using the BaSTI code, the maximum initial stellar mass of the evolution models was 15 M , and only single star evolution was considered. This limits the applicability of such models to young stellar populations in the distant Universe.
The effect of α-enhancement on stellar spectra has also been explored in the context of stellar population synthesis. Vazdekis et al. (2015) used α-enhanced stellar evolution isochrones up to 10 M at a resolution of ∼ 2.5Å to measure the effect it has on intermediate and old stellar populations at optical wavelengths. A key finding was that α-enhanced populations appear bluer at blue wavelengths than their Solarscaled counterparts. Percival et al. (2009) also investigated α-enhanced stellar populations. They did so in the context of Galactic globular clusters, and found that such models could successfully reproduce the spread of ages and compositions seen in a large sample of Milky Way globular clusters. These studies neither accounted for the role of binary stellar evolu-tion, nor explored the far-ultraviolet spectrum, both of which are key considerations for understanding young stellar populations. Stellar population synthesis models including binary stars and incorporating a single grid of α-enhanced stellar spectra were calculated by Byrne et al. (2022). These models gave good agreement with the earlier studies, while also providing spectra which cover the UV spectrum. Several theoretical stellar spectral libraries have been produced which account for α-enhancement. A number of these will be outlined in Section 3.1. Each library uses differing methodology to generate its spectra, and thus the output spectra for a given set of stellar parameters will not be identical.
In general, each spectral library has been used in conjunction with a specific population synthesis model. To date, there has been no systematic study of different spectral libraries using a fixed binary stellar population synthesis model to evaluate how the choice of spectral library can change the predicted appearance of the same, fixed assortment of stars.
In this paper, we explore the impact that choices of spectral library can have on the observed properties of a stellar population, and the implications that they have for inferred properties of distant galaxies. Section 2 outlines the population synthesis methods used, while Section 3 outlines the various spectral template libraries considered and the methods used to normalise them for best comparison. Results are presented in Section 4, with further discussion in Section 6, future developments outlined in Section 7 and concluding remarks in Section 8.

POPULATION SYNTHESIS
Binary Populations and Spectral Synthesis (bpass, Eldridge et al. 2017;Byrne et al. 2022) is a framework of stellar population and spectral synthesis codes, notably including binary stellar evolution and the impact of binary interactions, tracing the evolution of simple stellar population from an age of 1 Myr up to 100 Gyr using a grid of detailed single and binary stellar evolution models.
Individual stellar evolution models are combined using a Kroupa (2001) initial mass function extending from 0.1 to 300 M to produce a stellar population of mass 10 6 M . Empirical estimates from Moe & Di Stefano (2017) of the period distribution, mass ratio distribution and mass-dependent binary fraction are used to weight the stellar models. The stellar evolution models are then matched according to their temperature, surface gravity and composition to an appropriate stellar atmosphere model to produce an integrated spectrum for the stellar population. This is usually achieved by selecting the appropriate template library based on the type of star (Main Sequence/Giant Branch, White Dwarf, Wolf-Rayet star, etc.) and interpolating between spectra in the library which bracket the stellar evolution model in effective temperature and surface gravity. Models which lie slightly outside the parameter coverage of the spectral template library are assigned the spectrum of the nearest template in the spectrum, rather than using extrapolation. The stellar evolution models assume a Solar abundance of Grevesse & Noels (1993) with a metal mass fraction Z = 0.02 and a hydrogen mass fraction scaled by X = 0.75−2.5Z. The detailed stellar models trace the radial distribution and nucleosynthesis of H, He, C, N, O, Ne, Mg, Si and Fe. Detailed descriptions of the stellar evolution physics and construction of the models can be found in Eldridge et al. (2017).
bpass has been used extensively to predict and interpret the appearance of massive star populations (e.g. Pahl et al. 2021;Cullen et al. 2021;Vijayan et al. 2021;Neugent 2021;Marshall et al. 2020;Stevance et al. 2020b;Neugent et al. 2020;Shivaei et al. 2020;Topping et al. 2020;Cullen et al. 2020;Dorn-Wallenstein & Levesque 2020;Wilkins et al. 2020;Chrimes et al. 2020;Sanders et al. 2020) as well as a number of studies of low-mass stellar populations (e.g. Eldridge et al. 2017;Byrne et al. 2021). The inclusion of binary stellar evolution pathways makes it a powerful tool for interpreting stellar populations of all kinds, but particularly those seen in distant galaxies. By keeping the properties of the stellar population fixed and changing the stellar spectral template library, it is possible to systematically evaluate a selection of spectral template libraries and the differences they make to the composite spectrum of a given stellar population.

Stellar Spectral Libraries
Six spectral libraries were selected for evaluation within bpass. These spectral libraries were chosen based on a number of criteria, such as being publicly available, having broad parameter space coverage, having spectra with non-Solar [α/Fe], and use in current or previous versions of bpass and other population synthesis codes. A brief description of each of these libraries and their properties are provided below, with important characteristics shown in Table 1. In each case, the surface gravity range covered refers to the widest possible range. In reality the low gravity limit is not fixed, and increases with increasing effective temperature.

AP -Allende Prieto et al. (2018)
The Allende Prieto et al. (2018) (AP) spectral library covers a broad range of composition parameter space, with spectra covering temperatures from 3 500 K to 30 000 K, [Fe/H] spanning 5.5 dex from -5.0 to 0.5 in 0.25 dex increments and a 2 dex variation in [α/Fe] from -1 to +1, also in increments of 0.25 dex. The wavelength coverage extends from the near ultraviolet (2000 Å) to the near infrared (2.5 µm). The Solar composition of Asplund et al. (2005) is used as the base elemental mixture.
These spectra were computed using the atlas9 (Kurucz 1992) model atmospheres of Mészáros et al. (2012). Spectral synthesis was carried out with the ass t radiative transfer code (Koesterke et al. 2008;Koesterke 2009).
The wide parameter space coverage of this spectral library makes it a useful direct comparator to the existing CKC14 and C3K libraries used in bpass (see below), while the fine increments of [α/Fe] also make it convenient for exploring the effects of α-enhancement in the context of population synthesis.
It should be noted that while some spectra in this grid cover temperatures between 25 000 K and 30 000 K, bpass presently uses a grid of OB stellar spectra calculated using WM-BASIC (Pauldrach et al. 1998) for main sequence stars  hotter than 25 000 K, so the hot spectra in the AP library are not used in the spectral synthesis in this work.

BaSeL -Westera et al. (2002)
The BaSeL spectral libraries (Lejeune et al. 1998;Westera et al. 2002) have been a popular choice in population synthesis codes. For example, the widely used Bruzual & Charlot (2003) single-star population synthesis models make use of the BaSeL spectra. The BaSeL spectra were also used in bpass up to and including v2.1 (Eldridge et al. 2017). While these spectra cover a broad region of parameter space, the spectral resolution is rather poor in comparison to the outputs of more recent stellar atmosphere models. BaSeL spectra were only used up to an effective temperature of 25 000 K in previous versions of bpass so this is maintained here.

CKC14 -Conroy et al.(2014)
The CKC14 spectral library (Conroy et al. 2014) was used as the Main Sequence and Giant Branch spectral library in bpass v2.2 . These stellar atmospheres and spectra were constructed using the atlas12 (Kurucz 2005; Castelli 2005) and synthe (Kurucz 1993) codes respectively. The effective temperature and surface gravity coverage is comparable to the BaSeL library, but is likewise restricted to a uniform abundance scaling of the Solar mixture. The CKC14 spectra use the Asplund et al. (2009) mixture as the base element distribution. As well as bpass, the CKC14 spectra have been implemented in the fsps population and spectral synthesis code (Conroy & Gunn 2010) and the related tool prospector (Johnson et al. 2021).

C3K -Choi et al. (2016)
The C3K spectra (Choi et al. 2016) are a successor to the CKC14 spectra, with updated Kurucz linelists, once again using atlas12 and synthe. The parameter space coverage and resolution are comparable to the CKC14 spectra, but cover a broader range of metallicities ([Fe/H] down to -4) and are notable for the inclusion of α-enhanced spectra ([α/Fe] from −0.2 to +0.6 in 0.2 dex increments). As with the CKC14 library, the Asplund et al. (2009) solar mixture is used as the base composition for this library. These spectra were introduced to bpass in v2.3 (Byrne et al. 2022). As with the AP library, the hottest stars in this library are not used in this work.

Coelho -Coelho (2014)
Coelho (2014) produced a small number of spectral libraries exploring the impact of α-enhancement on the stellar spectra. Model atmospheres were produced using atlas9, and spectral synthesis was carried out with synthe. A lower temperature limit of 3000 K was adopted by the author owing to the presence of dust in stars cooler than this and the lack of VO molecular lines in their line lists, which would make such spectra unreliable.
Spectra were generated at both [α/Fe]=0 and [α/Fe]=0.4, with the α-enhanced spectra being produced at 'fixed iron' and 'fixed Z', whereby the α-enhanced compositions are chosen such that they match [Fe/H] or Z of the [α/Fe]=0 spectra respectively. In these libraries, the Solar abundance mixture of Grevesse & Sauval (1998) is adopted as the base composition.
3.1.6 sMILES theoretical models - Knowles et al. (2021) The MILES (Medium Resolution Isaac Newton Telescope Library of Empirical Spectra, Sánchez-Blázquez et al. 2006) stellar spectral library consists of ∼ 1000 flux calibrated stars between 3500-7500Å. The empirical spectra provide good coverage of effective temperature and surface gravity, but are naturally constrained in composition by the composition of the stars which were observed in the programme. In order to explore the effects of arbitrary abundance changes, the sMILES (semi-analytical MILES, Knowles et al. 2021) study produced a number of theoretical spectral libraries to derive semi-analytical corrections to the observed spectra and generate pseudo-realistic observed spectra with a modified composition. These spectra were generated with the same underlying atlas9 stellar atmosphere models and ass t radiative transfer code as Allende Prieto et al. (2018), using the Asplund et al. (2005)  The major limiting factor of the sMILES spectral library is that the maximum effective temperature is 10 000 K. This is sufficient for the studies of old, low-mass stellar populations, where most stars are cool. However, to study younger stellar populations, which generally contain hotter stars, a wider effective temperature coverage is needed. References to the sMILES library throughout the remainder of the paper will refer to the theoretical spectral templates generated for the sMILES study, and not the resultant semi-empirical stellar spectra.

Normalisation
These spectral libraries use varying definitions of Solar metallicity and composition mixtures, cover different wavelength ranges, and have differing coverage of temperature and surface gravity parameter space. It is important to find a method to normalise all libraries to an equivalent bolometric luminosity. In the first instance, the total metallicity mass fraction, Z, is calculated for each composition sub-grid in each spectral library. This is computed based on the adopted choice of Solar metallicity and the stated values of [Fe/H] and [α/Fe]. Interpolated libraries of spectra are then produced to match the values of Z which are used in the stellar evolution models in bpass. This ensure that the value of Z in the stellar spectra matches the underlying stellar evolution models they are applied to. Given the different choices of Solar abundance profile by each grid, the relative proportions of individual elements will vary somewhat. All spectra were, where necessary, converted to vacuum wavelengths for consistency with the rest of bpass.
The next decision is a choice of common normalisation so that each library will return a spectrum with the same bolometric flux for a given value of temperature, gravity and composition. The CKC14 and BaSeL spectra, with their large parameter space coverage and wavelength range were normalised to have a total flux of 1 L , as is canonically the case in bpass, and then multiplied by the luminosity of the star to get the total flux. Without an additional normalisation step, the other spectral libraries, would have a larger implied luminosity since 1 L of luminosity would be contained within a narrower wavelength range.
To account for this, the flux for a given spectra in the Johnson-Cousins V, I and J filters were computed, and the AP, C3K, Coelho and sMILES spectra were normalised using the following scheme.
(i) All spectra with T eff > 6500 K are normalised to match the V-band flux of the equivalent CKC14 spectrum.
(ii) AP and C3K spectra with T eff ≤ 6500 K are normalised to match the J-band flux of the equivalent CKC14 spectrum.
(iii) C14 and sMILES spectra with 4500 < T eff /K ≤ 6500 are normalised to match the I-band flux of the equivalent CKC14 spectrum, since their wavelength coverage does not cover the J band.
(iv) C14 and sMILES spectra with T eff ≤ 4500 K are normalised to match flux of the equivalent CKC14 spectrum, in three specific wavelength bands (7450-7550 Å, 8075-8175 Å and 8750-8850 Å) owing to the strong molecular absorption bands in these cool star spectra and the challenge this presented in choosing a single filter profile for normalisation.
(v) An interpolated CKC14 spectrum is computed to determine the normalisation required for spectra in the other libraries which have T eff and log(g) combinations not already present in the CKC14 library.
This ensures that the integrated spectra sum to a value of less than 1, owing to the shorter wavelength coverage, but that the bolometric flux should be comparable between different libraries, allowing the best comparison of outputs. Fig 1 shows some example Z = 0.01, [α/Fe]=0 stellar spectra at a high (20,000 K), moderate (10,000 K) and low (3,500 K) effective temperature in the upper, middle and lower panels respectively. The two warmer spectra show the near ultraviolet and blue optical portion, while the cool stellar spectrum shows the red and near infrared portion of the spectrum. These fragments of spectra illustrate the overall success in normalisation of the spectra to match their fluxes in the wavelength ranges that are common to each spectral library, while also highlighting the issues and challenges in getting an exact match. In most cases, the underlying continuum flux is in agreement, but there can be significant differences in the strength of specific spectral line features.
The low spectral resolution of the BaSeL spectra is evident in the upper two panels, with most spectral features being heavily smoothed. The short wavelength limits of the AP and Coelho libraries are also evident in these two panels, while the absence of the sMILES library in the top panel is indicative of its narrow effective temperature coverage. The lower panel highlights the difficulties in normalising and comparing cool star spectra. Different predictions for the strength of the molecular absorption bands lead to considerably different fluxes. Additionally, there is a very sudden break seen in the AP spectrum at ∼9500Å, which is not seen in the other spectral libraries at this temperature. This break gets weaker with increasing effective temperature and is barely noticeable above an effective temperature of 4,000 K. This does pose a challenge in producing an effective bolometric normalisation. Normalising to the flux in the Johnson-Cousins J filter, which was the option chosen for this work, gives a good match to the infrared continuum of the CKC14 and C3K spectra, but leads to a lower predicted flux in the optical than the other libraries. Alternatively, normalising to match the flux maxima between the molecular absorption bands (as was done for the sMILES and Coelho cool star spectra due to the lack of coverage of the J filter) would better match the optical flux, but would significantly overpredict the infrared flux. The physical origin of this spectral break is unclear. There will be a noticeable divergence between spectral synthesis results in stellar populations where the flux is dominated by cool stars, either a small number of very luminous red supergiants at young stellar ages, or a large number of cool, low-mass Main Sequence stars at late ages. and cool (3,500 K, lower panel) stellar spectra with a total metallicity mass fraction of Z = 0.01 and a surface gravity of log(g/cm s −2 ) = 4.5 from each spectral library after the normalisation procedures have been carried out and the spectra have been resampled at a fixed 1 Å resolution. The two uppermost panels show the near-ultraviolet and optical spectrum, while the lower panel shows the optical and near-infrared.
The same bolometric normalisation procedure was applied to the α-enhanced libraries, using the [α/Fe]=0 CKC14 as the reference spectra. While this is not an ideal match, in the absence of full (1-100 000Å) α-enhanced spectra, this was deemed the most appropriate choice of normalisation, as the overall flux in the covered wavelength ranges is similar to the [α/Fe] case for the same stellar parameters.

Combining Population and Spectral Synthesis
Having constructed standardised libraries of stellar spectra, each one was used to produce bpass spectral synthesis outputs, for each of the bpass metallicities which are within the range of the metallicity of the original spectral libraries. bpass spectral synthesis outputs were also calculated for an α-enhancement of 0.4 dex for the spectral libraries where this was a possibility, namely the AP, C3K, Coelho and sMILES libraries. Aside from the varying spectral library, each calculation uses the fixed fiducial set of bpass input parameters (10 6 M initial population mass, Kroupa (2001) IMF from 0.1 to 300 M and initial binary parameter distributions from Moe & Di Stefano (2017)). A standard set of bpass outputs is generated, one for each metallicity, spectral library choice and [α/Fe] value, giving a total of 116 variants.

Photometric Colours
Photometric colours can be obtained for a very large number of objects simultaneously, with a relatively small cost in observing time compared to spectroscopy. Fig 2 uses colourcolour diagrams to compare the spectral synthesis results using the different libraries to photometric observations of nearby (z < 0.02) spectroscopically-confirmed galaxies identified by the Sloan Digital Sky Survey (SDSS DR7, Abazajian et al. 2009) whose properties have been characterised by the MPA-JHU analysis. We restrict the sample to galaxies with photometric uncertainties less than 0.05 mag. In particular, two observational populations were selected; dwarf starforming galaxies (log(M /M ) < 10 8 and assigned 'starforming' or 'starburst' spectroscopic subclasses in the SDSS data) and passive galaxies (galaxies not assigned a starforming or AGN subclassification). This allows us to evaluate the models against galaxies which are dominated either by young or old stellar populations respectively.
The stellar mass and star formation rates of these galaxies were extracted from the MPA-JHU data tables on the SDSS database. As such, these values, which are derived through a combination of star formation rate indicators and spectral energy distribution fitting, have been calibrated using models other than bpass. The stellar masses were determined by the MPA-JHU team using the Bayesian methodology and model grids described in Kauffmann et al. (2003), and they computed the star formation rates from nebular emission lines following the method described by Brinchmann et al. (2004).
These derived values will differ from the estimates that would be made had they been fitted with bpass. As Jones et al. (2022), Pacifici et al. (2022) and others have shown, the stellar masses recovered through different SED fitting templates are highly robust, showing systematic offsets of ∆log(M/M )<0.2, while star formation rate indicators can show a somewhat larger variation.
The left-hand panels of Fig 2 compare the galaxy photometry to the colours of a simple stellar population with metallicity Z = 0.008 as it ages from 20 Myr to 16 Gyr. The stellar population is bluer at young ages, so the time evolution of each line typically proceeds from the left-hand side to the right-hand side of the diagram. In the right-hand panels, the spectral synthesis results for all libraries and all metallicities (including α-enhanced compositions) are plotted, to illustrate the full spread in colours predicted by the models. In the case of the u − g vs r − i colour-colour space, the colours predicted by bpass cover the locus occupied by the observations very well. With the exception of the sMILES library at young ages (whose flux in u will be underestimated due to the absence of hot star spectra), there is good agreement between the different libraries, with a spread of less than 0.1 mag (0.08 mag in u − g, 0.09 mag in r − i), which is similar to typical uncertainties and spread in the observations.
In the g − r vs i − z case, the long wavelength limit of 9000 Å in the sMILES and Coelho libraries leads to spurious values for i − z since the z filter extends across wavelengths up to ∼ 10500 Å, as can be seen clearly in the lower-left panel. These libraries are excluded from the lower right-hand panel for clarity. For the remaining 4 libraries, the spread in colour-colour space is slightly larger (∼0.1 mag, occasionally a little more) but is still generally consistent with observational scatter. Looking at both of the lower panels, it is clear that the theoretical results from bpass consistently lie above the observations in i−z. Extinction vectors (AV = 1.0) are shown in each panel; invoking a modest amount of dust extinction leads to better agreement between the theoretical models and the observations.
A possible contributing factor to the offset in i − z is the theoretical uncertainty in stellar evolution predictions for red supergiant stars (e.g. Beasor et al. 2021;Davies & Plez 2021;Massey et al. 2021). As these stars are very luminous, uncertainties in their evolution could make a perceptible shift in the red and infrared flux of the stellar population.
In addition the bolometric normalisation of cool stellar spectra is challenging, given that their optical luminosities are strongly dependent on the assumed absorption species line-lists and line strengths. As was discussed in Section 3.2, this makes comparative cross-normalisation of spectra in the optical extremely challenging, while many observational constraints and theoretical atmospheres for these stars do not extend significantly into the infrared. It should be noted that ground-based observations in the i and z filters are also challenging due to the poor quantum efficiency of CCD detectors and reliably accounting for atmospheric contamination.

Mass-to-light ratios
The observed stellar mass-to-light ratio (M/L) can be a useful quantity, particularly for galaxy surveys in which photometric and spectroscopic coverage is insufficient to permit robust model fitting. Instead luminosity in a single photometric band can be used as a convenient proxy for stellar mass (e.g. Glazebrook et al. 2004). Fig 3 shows the mass-to-light ratio in the SDSS g and Johnson-Cousins B band filters for stellar populations generated using each spectral library as a function of age and metallicity. Each panel also reproduces the lines corresponding to our reference CKC14 library in light grey. In general, the differences in M/L are very small (< ±2.5 per cent from the reference library at ages of 1 Gyr or more) at a given age and metallicity, indicating that this is a robust quantity generally independent of the choice of spectral library.
For ages of 1 Gyr or more, the C3K and Coelho libraries differ from the CKC14 reference library by no more than 1.25 per cent at any metallicity. At the highest metallicities under consideration (Z = 0.040), the AP library predicts a slightly higher M/L (up to 7 per cent in g, 6 per cent in B at the highest age considered, 16 Gyr) than the CKC14 library, with the offset increasing with age. By contrast the SMILES library predicts a lower M/L at high metallicities and late ages (15 per cent lower in g and 18 per cent in B at 16 Gyr). The same pattern is seen in both g-and Bband derived values. The BaSeL library-derived values show a smaller offset from CKC14 (the largest offsets are about 6 per cent in both bands, seen at late ages and/or low metallicities).

4000 Angstrom break
The spectral break at 4000 Å is a powerful diagnostic of stellar population age. It captures the combined impact of hydrogen and metal absorption edges seen in the spectra of mid-sized stars. Thus it emerges as the dominance of more massive stars in young stellar populations wanes. We demonstrate the impact of assumed stellar library on this quantity in Fig 4, where we have calculated the break strength, D4000 defined as the ratio of the flux in the spectrum between 4000 − 4100 Å and 3850 − 3950 Å. The differences between the stellar libraries are less than 2.5 per cent at ages less than 100 Myr. The ex-ception to this is the sMILES library, which is temperaturelimited to 10,000 K, and thus unreliable for young stellar populations. At ages later than 100 Myr, when the population is dominated by cooler stars, spectra derived from the sMILES library are consistent with most of the other libraries. The BaSeL library consistently predicts the smallest break, while the AP and C3K libraries show the largest values for D4000, with the other libraries lying in between these values. At an age of 10 Gyrs, the spread between BaSeL-and C3K-derived values is ∆D4000 = 12 per cent at Z = 0.030, while all other libraries lie within ∆D4000 = 4.7 per cent of each other.
Since the difference between spectral libraries cannot arise from different temperatures or luminosities of stars in the population, it can only reflect a different treatment of the hydrogen, iron and other metal absorption line complexes that contribute to the break. In particular this is affected by the completeness and calibration of the metal species line lists incorporated in the spectral synthesis code used to generate each set of underlying stellar templates.

Spectral indices as indicators of age and metallicity
Spectral indices are pseudo-equivalent width line strength measurements that permit information on the stellar population to be extracted from low spectral resolution, or low signal-to-noise spectroscopy for which a full spectral fit is impossible. In each case, a pseudo-continuum region is defined and a ratio taken against the integrated flux of a region deemed to contain one or more absorption features. The spectral index list defined by the Lick Observatory has traditionally been used to classify and characterise nearby stars, and old simple stellar populations such as globular clusters (Faber et al. 1985;Burstein et al. 1986;Worthey et al. 1994;Worthey & Ottaviani 1997). Over time, additional indices have been added to this list by other authors (e.g. Fanelli et al. 1992;Kuntschner et al. 2010;Du et al. 2012;Vidal-García et al. 2017;Calabrò et al. 2021).
While many modern high-multiplex spectral surveys now adopt a full-spectrum fitting approach, the Lick indices continue to encode much of the key information defining the stellar population contributions to the integrated light and are thus appropriate tools in which to compare the impact of assumed spectral library.  largest outliers are the BaSeL and sMILES libraries. This is not particularly surprising given their respective limitations; the undersampled low resolution spectrum of the BaSeL templates leads to an underestimate of the equivalent widths of absorption line features, while the normalisation of cool stars in the sMILES templates presented challenges due to their limited wavelength coverage, and these stars are important at ∼ 10 Gyr. The remaining spectral template libraries all follow a broadly similar trend with metallicity. Nonetheless, the Coelho library produces index strengths which are con-  Kuntschner et al. 2010). This index was explicitly constructed to minimise the impact of uncertainty in α-element enhancement on total metallicity estimates. The H β index measured the absorption line equivalent width of the Balmer series and hence is closely related to stellar population age. The tracks indicate the time evolution of a simple stellar population from log(age/years)=9-10.2, while dashed lines connect tracks of different metallicity at constant age. The CKC14 library is shown as a reference in light grey on each of the subsequent panels. The Schiavon et al. (2012) measurements of these indices in M31 and Milky Way globular clusters are plotted as black points.
Indices calculated using the C3K library span broadly the same region of parameter space as those using the CKC14 library, with a slight shift towards higher MgFe50 strengths at a given age (i.e. an observation placed on the diagram would have a lower implied metallicity than the CKC14 case), with changes in H β being typically less pronounced. The offsets are most extreme in the youngest, and highest metallicity, populations considered, reaching |∆MgFe50|=0.5Å and |∆Hβ|=0.1Å.
The sMILES and Coelho libraries both show a significant shift to lower values of the MgFe50 index, (i.e. an observation placed on the diagram would have a higher implied metallicity than the CKC14 case) but the H β index remains comparable. For a population with an age of 4 Gyr and Z = 0.008, the MgFe50 index would have values of 3.67, 2.85 and 1.65 Å in the CKC14, sMILES and Coelho libraries respectively, a difference of more than a factor of two between the lowest and highest values. Objects identified as Solar metallicity with CKC-derived indices would lie outside the modelled parameter range of both the Coelho and sMILES models and be interpreted as lying amongst the most metal rich of local systems.
Indices derived using the AP library give MgFe50 strengths which are comparable to the CKC14 and C3K libraries, but show an offset to higher H β line strengths (i.e. an observation placed on the diagram would have a higher implied age than the CKC14 case). Indeed, a number of the observed globular cluster population would have implied ages exceeding the age of the Universe. The Basel library-inferred indices cover a much smaller region of the parameter space. While the calculated indices broadly span the full range of observations (except for extreme examples in MgFe50), the implied ages show a strong trend with implied metallicity, spanning the full range of both quantities modelled. As was the case for the Coelho and sMILES libraries, the inferred metallicity of some systems would substantially exceed Solar.
Additional visualisation of the differences between the spectral libraries in the photometric colours and selected spectral features are provided in the supplementary online material (labelled Appendix A) as a function of population age and metallicity. All differences are measured with reference to the results obtained using the CKC14 library, which was the fiducial spectral template library used in BPASS v2.2 .

EFFECT OF α-ENHANCEMENT ON OBSERVABLES
Thus far, we have focused on results comparing the impact of spectral libraries with Solar-scaled compositions. As discussed earlier, α-enhancement is widely observed and its underlying mechanism is an important process to consider, particularly in low-metallicity environments. Four of the six spectral libraries examined in this work, C3K, AP, sMILES and Coelho, provide α-enhanced template spectra. While some libraries offer a number of different [α/Fe] options, [α/Fe]=0.4 permits a direct comparison between all four of these libraries. In order to do so, the sMILES and AP α-enhanced spectra were interpolated between adjacent [α/Fe] values. In terms of photometric colours, changes are typically small (less than ±0.08 mag for a given age and metallicity), except for in the bluest filters at late times, where the α-enhanced models appear bluer. This is in agreement with previous studies of the impact of α-enhancement on colours (Vazdekis et al. 2015;Byrne et al. 2022). In general, these differences are independent of library choice. Illustrations of the differences in colour predictions with α-enhancement can be found in the supplementary online material (labelled as Appendix B). The mass-to-light ratio is mostly insensitive to changes in [α/Fe], with variations on the 2 − 3 per cent level. These differences are generally most pronounced at late ages and/or high metallicities. The 4000Å break is unchanged at young ages, but becomes up to ∼ 10 per cent weaker for a given Z at late ages. This is consistent with old α-enhanced populations appearing bluer, as seen in the colour information.
As might be expected, the impacts on probes of spectral line absorption are more pronounced. These manifest as changes in the behaviour of the Lick indices. be easily seen that these α-enhanced models agree much better with the observations than the scaled-Solar composition models. This is not unexpected, since these old, metal-poor populations are expected to be enriched in α-process elements. As before, indices derived using the sMILES library predicts a lower Fe5270 index strength at fixed MgB than the other models. The Coelho library spectral templates lead to a larger shift in the line indices with α-enhancement than is seen in the CKC14 and C3K libraries. The size of the difference between the theoretical results using different libraries is typically smaller than or comparable to the uncertainties in the observational measurements. The H β line index contains no contribution from αelements or other strong metal line features and so should not be affected by α-enhancement. Since the MgFe50 index was explicitly constructed to minimise the impact of αenhancement on its constraint of metallicity from the line shifts with α-enhancement, the magnitude of which depends on the stellar spectral template library adopted. The smallest impact is seen in the Coelho library-derived MgFe50 index in which the largest positive offset seen is ∆MgFe50=4.6 per cent at Z = 0.002 and age=16 Gyr, while the largest negative offset seen is ∆MgFe50=-4.5 per cent at Z = 0.01 and age=1 Gyr. The changes are less than 2.5 per cent for low metallicities and young ages (Z ≤ 0.006, age ≤4 Gyr) and high metallicities at all but the youngest ages considered (Z ≥ 0.008, age≥1.6 Gyr). The AP template library leads to offsets at young ages at all metallicities and at low metallicities across most ages. The largest offset is |∆MgFe50| = 7.4 per cent at Z=0.020 and age=1 Gyr. Only a small offset (|∆MgFe50| < 2.5 per cent) is seen in models with Z ≥ 0.04 and an age ≥4 Gyr. The sMILES library predicts indices which span a narrower range of MgFe50 than the other libraries, somewhat masking the impact of α-enhancement. However the line strength is increased in the [α/Fe]=0.4 case at all ages considered in this figure, and particularly at low metallicity. The MgFe50 index is typically 10 to 15 per cent stronger in the α-enhanced case for ages greater than 1 Gyr, increasing with age. This increase in strength is largest at low metallicity (Z ≤ 0.004).
The C3K template library generates indices which show the largest consistent offset in MgFe50 with α-enhancement. At all ages and metallicities, models with [α/Fe]=0.4 show a ∆MgFe50 towards lower values, with the offset ranging from 4.7 per cent at z=0.020 (age=10 Gyr) to 6.5 per cent at Z=0.002 (at the same age). The MgFe50 index combines a measure of the α-sensitive magnesium absorption triplet at λλ5167, 5173, 5184 with a measure of the low ionization iron line blanketing of the continuum at λ ≈ 5000Å. The index construction assumes that the strengthening of the former can be offset by the weakening of the latter for [α/Fe] =0. The C3K model grid was generated with updated absorption line lists and line strengths, based on the best available information in 2016. Compared to some other synthetic spectral templates, the iron absorption complexes have a bigger effect on modifying the detailed shape of the continuum. As a result, the iron component of the index may evolve more rapidly with composition than has been the case in older spectral libraries. Hence the fine balance between the two offsets which was defined using one spectral template set does not necessarily apply to all, and the scale and the direction of the offsets are determined by the line lists and atmosphere radiative transfer code adopted. These differences are illustrated graphically for H β and MgFe50 in Appendix B of the supplementary online material.

DISCUSSION
6.1 Impact of Spectral Library Uncertainties on Observables

Photometric Properties
The results presented here illustrate that the choice of stellar spectral library can have an impact on the predicted observational properties of a stellar population. Some observables such as the 4000 Å break and the mass-to-light ratio are relatively insensitive to the spectral templates adopted, their values and trends being dominated by the age and total metallicity of the stellar population. M/L varies by no more than 7 per cent from the reference CKC14 spectral library (with differences typically less than 2 per cent). The 4000 Å break, an important proxy for stellar population age, varies by no more than 4.7 per cent from the reference in the case of libraries with reliable results across the full age range. We note that this variation is consistent with the offset noted by Coelho et al. (2020) between values derived using synthetic and empirical spectra. Photometric colour predictions show good agreement both between spectral template libraries and with SDSS observations of galaxies dominated by either a young or an old stellar population. Photometric offsets between libraries are of order 0.08 mags in u − g and 0.09 mags in r − i. As discussed in Section 5, the variation in colour with α-enhancement is similar in magnitude regardless of choice of spectral template library.
That this is a < 10 per cent variation is to be expected, since the photometric colours of stellar populations are to first order determined by the thermal blackbody temperature of the most massive surviving stars at a given age, while the details of composition and resulting absorption lines provide a second order correction to the integrated flux on ≈ 2000Å broad-band scales. Hence colours are relatively insensitive to both spectral template library and [α/Fe] selection, compared to the underlying population synthesis. All spectral libraries examined show similar trends with stellar metallicity.
However the offsets between libraries found here are a fac-tor of 2-3 larger than those determined by Coelho et al. (2020) for the difference between synthetic and empirical libraries implemented in the galaxev spectral synthesis code. We note that a key difference between this study and the previous work, is the use of a full binary population synthesis, which precludes the implementation of smoothly varying, but overly simplistic, isochrones to describe the stellar population. As a result, the colour shows a stronger dependence on age and differences between libraries are amplified at ages associated with mass transfer or other binary interactions in massive stars (as can be seen, for example, in Fig. 2). The offsets are also comparable to or slightly larger than the range in optical colour arising from binary fraction distribution uncertainty as evaluated by Stanway et al. (2020). The fact that neither spectral nor population uncertainties clearly dominate, emphasises the need to quantify these in both the population and spectral synthesis separately and using multiple techniques, before assuming any absolute 'truth' in the relation between colour and population age.
The offsets between the colours predicted by different population and spectral synthesis models at a given age, are also somewhat larger than the photometric precision now attainable for galaxies at most redshifts at z < 7, where a photometric uncertainty of a few percent or better is not unreasonable. Since this photometry is used for spectral energy distribution template fitting, the uncertainties on derived population parameters from such fits may be dominated by the differences in either spectral libraries used or population synthesis technique, rather than data limited. More work is clearly needed to understand and reduce the uncertainties on the theoretical side.

Spectroscopic Properties
We have limited the discussion of spectral properties in this work to well-defined spectral indices, rather than attempting full-spectrum fitting analysis. However these exemplify the trends and behaviour shown by the different libraries.
As shown in Section 4, the model indices derived from all spectral libraries show a similar qualitative behaviour. Outliers in the properties of line indices occur due to undersampled low resolution spectra and the normalisation of cool stars resulting from limited wavelength coverage. Quantifying these offsets presents a challenge, since the line indices are, by design, highly sensitive to age, metallicity and chemical enrichment.
However, while the trends in all libraries are similar, models derived using the Coelho library are consistent with the highest metallicity for a given line index strength. By contrast, those derived from the AP library would lead measured indices to be interpreted as the oldest stellar populations, in places exceeding the age of the Universe. Coelho et al. (2020) demonstrated how these line indices depend on the choice of empirical vs theoretical spectral libraries, in the simple single-star isochrone case. They found a typical offset between the libraries consistent with 0.07 Å equivalent width in Hβ, and -0.294 Å in the Fe5270 index (which probe age and metallicity respectively). This suggests that iron line blanketing may differ substantially between the spectral libraries considered, while the treatment of hydrogen recombination is, perhaps unsurprisingly, very similar. They noted that the synthetic spectra typically yielded lower metallicity estimates for a given line index than their library of empirical spectra.
All the spectral libraries modelled in this work are theoretical. Compared to those considered by Coelho et al. (2020), they show comparable scatter in the Fe5270 index (see Fig.  5), but a larger scatter in the MgB index. This may reflect variation in the abundance pattern (rather than total metallicity) between the libraries considered. We note that all of the theoretical libraries would render low estimates of metallicity for a fixed Fe index by the criteria of Coelho et al. (2020), but caution that the empirical spectra are themselves, inevitably, calibrated against models to determine their metallicities, and the uncertainties inherent in this process must also be considered.
Any comparison of line index predictions against observational data must necessarily be conducted at the population statistics level, since multiple stellar populations which vary in age and metallicity substantially affect each measurement, and since the flux uncertainties on these narrow features are typically large. The uncertainty on a single globular cluster data point, for example, will span several metallicity increments and age steps on our model grid. However, the differences between theoretical models exceed even these large uncertainties, as Fig. 6 demonstrates.
Of all the spectral libraries considered, the CKC14 and C3K libraries are the only ones to predict grids of spectral line indices which span the full range of observed data in the local Globular Cluster systems.

Spectral Library Evaluation
The C3K spectral library is the only library tested in this work which includes all three of: (i) wavelength coverage extending from the far-ultraviolet to the far-infrared, (ii) broad coverage of T eff -log(g) parameter space, and (iii) multiple variations of [α/Fe]. As this work has demonstrated, the C3K library produces comparable predictions for stellar observables to other libraries tested, in regimes where the parameter coverage of all the libraries overlap. It predicts colours, D4000 and other photometric properties which lie in the midst of other library predictions and which agree with observations. It produces spectral diagnostics which are broadly consistent with other libraries, but which demonstrate a slightly higher density of iron line blanketing in the stellar continuum. Again this is consistent with observational constraints. As a result, the robustness and parameter coverage gives it the best suitability to the diverse range of applications that bpass is used for.
The AP spectral library also has significant parameter coverage, however issues such as the spectral break at ∼ 9000 Å in cool stars (which has no obvious physical origin), and the wavelength coverage (which stops short of the far-UV) limit the possibilities for exploring the uncertainties in young stellar populations. This shortcoming in UV wavelengths, particularly in the ionizing continuum of the far-UV, is common to all of the other spectral libraries.
The BaSeL and CKC14 libraries present different issues when it comes to population and spectral synthesis. The BaSeL library has a far lower spectral resolution than all other libraries considered here, and is also based on a sincesuperseded version of the stellar atmosphere modelling algorithms, which omit key improvements in our understand-ing. While the CKC14 library does not have these problems, it shares with the BaSeL grid an absence of any stellar spectra with non-Solar scaled abundance patterns. By contrast, all of the other libraries have one or more non-Solar α-enhancements.
The absence of near-infrared wavelength coverage in the sMILES and Coelho libraries present substantial difficulties, in particular regarding the normalisation of cool star spectra, as shown in Section 3.2. This is only one of several issues regarding spectral templates for low temperature stars, which are typically faint and extremely red, presenting empirical challenges, and also have turbulent, optically thick atmospheres, presenting theoretical challenges.
Given the large variance in spectral templates of the coolest stellar models, one potential source of offsets is the different low-temperature limits adopted by the individual spectral template libraries. As explained in section 2, in these cases, the bpass spectral synthesis adopts the nearest available (i.e. typically hotter) stellar template at the same surface gravity. This particularly affects the sMILES and AP stellar libraries.
Arbitrarily restricting the libraries to only use stellar models at temperatures common to all libraries (i.e. T eff ≥ 3500 K) should remove this effect, while only affecting relatively faint and low mass or extremely cool giant stars. In such a modified model, these will all be assigned the higher temperature, regardless of whether the correct temperature spectrum exists in the selected model library. We do this in a trial spectral synthesis grid, and find that it makes only a small difference to line indices and photometric colours, becoming more noticeable at high metallicity and at some ages, for example, the photometric difference between the AP and C3K libraries at Z=0.020 shows a noticeable shift towards bluer colours due to restricted parameter grid coverage at ages around 10 8 years (when red supergiants can dominate the integrated light) and at ages above 10 10 years (where cool dwarfs dominate). At Z=0.008, the differences only appear for red supergiants, while at Z=0.002 the models are indistinguishable. The differences between spectral libraries follow the same pattern as before, and remain larger than can be explained by this effect. This indicates that the differences between the libraries is more fundamental than a small artificial shift in the effective temperature assumed for the spectra of some cool stars.

UV and Young Stellar Populations
An area not explored in any of the results and comparisons presented here is the range of spectral features associated with very young stars. In the era of JWST , it has become possible to probe to much higher redshifts than hitherto, observing these very young stellar populations. In these, the flux is dominated by hot stars, which emit a significant portion of their flux in the ultraviolet. Such populations will also typically be α-enhanced. As very few of these spectral libraries extend into the UV (and only C3K both considers αenhancement and attempts to predict the ionizing spectrum) it was not possible to extend the detailed comparison carried out here down to shorter wavelengths.
The prediction of the ionizing spectrum is essential since young stars are typically embedded in a relatively dense circumstellar medium, leading to the formation of H ii regions (Xiao et al. 2018). These regions, dominate the spectrum all the way into the optical, through a combination of nebular continuum and strong line emission. This nebular emission must be incorporated into any comparison between synthetic spectra and observations in the young stellar population regime. It is, on the other hand, largely irrelevant to older stellar populations, and thus has been neglected here.
We note that weaknesses remain even in the C3K spectral library version of bpass (v2.3). In particular, the hottest stars require specialist atmosphere models which account for their radiative, diffuse and often-stripped envelopes. At present bpass adopts the Main Sequence/Giant Branch spectral libraries considered in this paper only for T eff ≤ 25000 K, even though some of these libraries include spectra for stars hotter than this. The comparison made here uses the same subgrid of custom O-star and WR spectra for these hot stars. The inclusion of the hotter stars from each template library would go some way toward exploring uncertainties in young, hot stellar populations. New hot star spectra, analogous to the existing O star spectral library generated with wmbasic (Pauldrach et al. 1998), have recently been published (e.g. Hainich et al. 2019), while others may need to be generated. This would enable a direct comparison with the current hot star spectra used in bpass.

JWST
The arrival of data from JWST opens our eyes to the highredshift Universe in unprecedented detail. Numerous high redshift galaxy candidates have already been identified in the Early Release Science data (e.g. Adams et al. 2023;Atek et al. 2023;Castellano et al. 2022;Donnan et al. 2023;Finkelstein et al. 2022;Fujimoto et al. 2023;Naidu et al. 2022;Yan et al. 2023) and in time, high resolution spectroscopic follow up will become available for many of these candidates. However, the spectroscopic subset will always be smaller than the photometric sample. As a result, the fitting of observed spectral energy distributions to template libraries will remain key to the interpretation of bulk galaxy properties including mass, age, star formation history and metallicity. Many SED fitting codes continue to use spectral templates tuned to reproduce galaxies in the local Universe accurately. Conditions in the distant Universe are known to differ substantially from those in the mature, nearby Universe . As such, it is important that any SED fitting for these galaxies uses spectral templates which are appropriate for the expected properties (incorporation of α-enhancement, young stellar populations, inclusion of binary stellar evolution) so that derived properties are reliable. The early, very young galaxies seen in JWST and deep HST data will provide crucial constraints on the evolution of galaxies as a function of redshift, so it is important that the modelling uncertainties are accounted for, so that any conclusions drawn from such work is robust. This work provides a foundation for such modelling.
In addition to young stellar populations observed in situ at high redshift, JWST will also enable precision photometry and spectroscopy of middle-aged and old stellar populations at lower redshift. With sufficient data precision, the star formation histories of these galaxies can be explored, and galactic archaeology used to constrain the distant Universe and the evolution of galaxies in more detail. The nearto mid-infrared wavelength coverage of JWST permits, for the first time, detailed studies of the rest-frame optical at intermediate redshifts, and the rest-frame infrared in local galaxies. Thus the wavelength coverage of stellar spectral libraries, which in turn determine the wavelength coverage of synthetic stellar population SEDs, is crucial to this work.
While SED fitting will be required for many distant galaxies, the sensitivity of JWST 's spectrographs will also be key to the detailed interpretation of galaxy properties across a broad redshift range. In particular, the metallicity, abundance indicator and age-sensitive lines targetted by the Lick and other spectral indices will be accessible for the first time beyond Cosmic Noon (z > 3). As we showed in section 4.4, the calibration of such indicators is subject to substantial variation depending on the adopted stellar library. This emphasises both the difficulty in obtaining an absolute calibration for these properties, and the importance of applying uniform methodologies when comparing different data sets and published results. While pixel-by-pixel full-spectrum fitting methods provide additional constraints, they will be sensitive to the same variation in line strengths between different template model libraries. These uncertainties are particularly pronounced when studying the oldest, and most metal rich, populations at any given redshift.

Future space telescopes
Future space telescopes, such as the Nancy Grace Roman Space Telescope (Roman 1 ) and Euclid 2 , will be well equipped to examine distant galaxies in great quantities. Euclid is predicted to produce images of around 1.5 billion galaxies and low resolution, near-infrared spectra of an estimated 25 million galaxies at redshifts of 1 < z < 3, covering 15 000 deg 2 of sky to (Sorce et al. 2022). Meanwhile, predictions by Drakos et al. (2022) suggest that 1 deg 2 ultra-deep field observations by Roman (depth of ∼ 30 mag) will detect more than 1 million galaxies photometrically, over 10 000 of which will lie at redshifts consistent with the Epoch of Reionisation (z > 7). Given the lack of high resolution spectroscopy in both of these instruments, the robustness of spectral energy distribution modelling, and the model-dependence of line strength indices, will continue to be an issue well into the next few decades.
These instruments will significantly increase the number of observed galaxies at intermediate and high redshifts, providing constraints on galaxy evolution and permitted metallicity enrichment and star formation histories, but only if the bulk properties of the stellar populations can be determined with due precision, and devoid of systematic uncertainties. The results presented in this work suggest that we are not yet in a regime where this is the case. It is therefore of paramount importance that theoretical models continue to improve and sources of uncertainty are understood.

Theoretical Advances
This study highlights a number of areas in which improvements (to bpass specifically and the field in general) can be made to further understand and reduce the uncertainties in the spectra of synthetic stellar populations. This relies on improvements in the basic stellar template libraries employed for synthesis.
As discussed above, significant disagreement exists between different synthetic models of cool star spectra. These spectra are highly sensitive to the atomic and molecular line-list data included in the model, as there are very large molecular absorption bands present in the high opacity atmospheres of these stars. Further work, such as the consideration of non-LTE effects and calibration against nearby observed spectra should be done to bring these models into closer agreement.
Here JWST spectroscopy may help improve our understanding of low temperature stellar atmospheres through providing improved constraints in the near-infrared. Large differences in these spectra can play a role in the composite spectrum of a stellar population in regimes where cool stars dominate (i.e. red supergiants at intermediate ages and red dwarfs in old populations).
The development of new tools for calculating the spectra of stars with atmospheres of arbitrary compositions (e.g., mps-atlas, Witzke et al. 2021) provides an opportunity to further examine the effects of non-Solar-scaled compositions, and refine spectral synthesis models for stellar populations in the early Universe, in line with best estimates for galactic chemical evolution. Along with considerations of αenhancement in stellar spectra, some recent studies have also explored the impact of non-Solar-scaled compositions on stellar evolution models (e.g. Grasha et al. 2021;Farrell et al. 2022). The bpass stellar evolution code has recently been used to calculate small grids of α−enhanced stellar evolution models, and analyses of these are under way. However the recalculation of the ∼250,000 binary interaction models which are combined in a BPASS population synthesis will take some time. Combining stellar evolution models and stellar atmosphere models which both use non-Solar-scaled compositions will allow for self-consistent modelling of stellar populations in low-metallicity environments.
As mentioned above, forthcoming observations by JWST in particular will produce high-resolution rest-frame spectra of distant galaxies. This will eventually enable detailed statistical studies of galaxies at a broad range of redshifts, provided there are robust theoretical models with which to fit them. In particular, the high resolution in the redshifted rest-frame ultraviolet is likely to present a problem for most extant stellar population synthesis model suites.
Presently, bpass adopts a fixed 1 Å resolution for its output composite spectra, as a compromise to account for the varying resolutions of the numerous spectral libraries. Most modern theoretical spectral libraries (see e.g. Table 1) have a resolution much greater than 1Å at UV and optical wavelengths. Indeed, this is one of the strengths of using theoretical spectral libraries over empirical ones. Future developments of bpass will seek to increase the resolution of synthetic population spectra, so that more detailed spectral features can be studied in the output composite spectra. With the rise of full spectral fitting as a technique, it is important that the models have a similar or better resolution than expected from observations, to optimise the amount of science which can be extracted from the observations.

Future Approaches
In this paper we have concentrated on simple stellar populations in order to provide clear, direct comparisons between different libraries. Another way in which the impact of theoretical uncertainties on interpretation can be demonstrated is through constructing the spectrum of an entire galaxy by combining multiple simple stellar populations. This must account for the age, metallicity and star formation history of the galaxy. This can be extracted, for example, from each pixel of a galaxy in observed data, or from the known properties of each particle in a cosmological simulation.
By using a synthetic galaxy from a cosmological simulation, the results produced by using fitting to recover the inferred properties, based on each spectral template library can be evaluated. This approach will form the basis of a future publication from our team.
Such an approach may also incorporate an analysis of the impact of non-stellar emission components. The models presented in this work have only considered the stellar component of an observed population. No contribution from the nebular gas which occurs ubiquitously in the neighbourhood of hot, young stars has been included. Similarly, no account has been taken of the presence in old stellar populations of diffuse ionized gas emission, powered by evolved stars. These introduce further uncertainties in the interpretation of observed data.

CONCLUSIONS
The choice of stellar spectral library can lead to differences in the photometric and spectroscopic properties of a synthetic stellar population. This will directly impact determinations of derived ages, metallicities, and star formation histories of composite stellar populations from SED and photometric fitting of observations.
(i) Photometric properties such as colours and mass-tolight ratios, are perhaps the least sensitive observables to changes in the spectral library. Colours for simple stellar populations vary by up to 0.1 mag at a given age. The scatter in colour predictions as a function of age and metallicity is consistent with the distribution of both dwarf star-forming galaxies and massive passive galaxies, as observed by SDSS. The magnitude of the uncertainty is comparable to the differences seen between empirical and synthetic spectra, and also to that produced by variations in the population synthesis model, showing that no one particular effect dominates the uncertainty.
(ii) Variations in the mass-to-light ratio are typically less than 2.5 per cent for young stellar populations and less than 7 per cent for older stellar populations. Excluding outliers, most libraries predict values of D4000 which differ by less than 5 per cent.
(iii) A lot of uncertainty remains in the spectral synthesis of cool stars, especially those with tenuous atmospheres, as different radiative transfer codes, opacity and line-list data can produce considerably different spectra of these stars. This can have a knock-on impact on the modelling of stellar populations in the age ranges where cool stars are the main contributor to flux, either as red supergiants at ages ∼ 10 7 years or at late ages (∼ 10 10 years) when low mass Main Sequence stars dominate.
(iv) When looking at spectral line indices, the differences between stellar libraries can be quite considerable. Focusing on the H β and MgFe50 indices, which are sensitive primarily to age and metallicity respectively, large offsets are seen when comparing different libraries. The Coelho and sMILES libraries would imply a higher observed metallicity than the others, while the AP library would estimate older stellar populations than the other libraries. The low-resolution BaSeL library does cover the observed parameter space, but would estimate a much broader range of ages and metallicities than the other libraries.
(v) The effect of α-enhancement is noticeable in all libraries, with the behaviour generally being qualitatively similar. Colour index shifts with a change in [α/Fe] are very similar, regardless of library choice. Mass-to-light ratios and D4000 show consistent changes also. The spectral line indices show a variation in behaviour, attributable to different treatment of iron line blanketing and atomic data.
The C3K library is the only spectral template library used in this work which simultaneously provides the broad coverage of wavelengths, effective temperatures and [α/Fe] compositions necessary to begin to evaluate the young stellar populations that JWST has observed at high redshift. However this study demonstrates that uncertainties remain in stellar templates in the optical, and these are likely still larger at longer and shorter wavelengths. Care must be taken when attempting an absolute rather than relative calibration of galaxy properties based on stellar population synthesis.

Uncertainties in Population Synthesis: Spectra i APPENDIX A: VISUALISATION OF THE DIFFERENCES BETWEEN LIBRARIES RELATIVE TO THE CKC14 REFERENCE LIBRARY
The figures in this appendix illustrate the differences between the spectral libraries for some of the key observable quantities present in the figures in the main body of the paper. Fig A1 illustrates the difference in colour-colour values from the CKC14 values for each spectral library and each adjacent pair of SDSS colour filters as a function of metallicity and logarithmic age bin in the bpass output for ages between 16 Myr and 16 Gyr. The difference in colour, ∆ C, is determined as C Library − C CKC , so positive (negative) values indicate that a data point is redder (bluer) than the CKC grid at the corresponding age and metallicity. All panels use a colour scale extending to ±0.08 mag, which covers most of the variation seen. The sMILES library at young ages in u − g and the sMILES and Coelho libraries in i − z show much larger variations due to the absence of spectra hotter than 10000 K and incomplete flux in the z filter respectively, so these panels contain no meaningful data. Fig A2 illustrates the differences in the g-band mass-to-light ratio (first column), the strength of the 4000Å break (fourth column), and the strength of the MgFe50 and H β line indices (second and third columns respectively) from 1 Gyr to 16 Gyr. The differences in M/Lg and D 4000 are shown in terms of percentage change from the results predicted by the CKC14 spectra The differences in the line index measurements, X Library − X CKC , are given in units of Å. The range covered by the colour scale is indicated at the top of each column.

APPENDIX B: VISUALISATION OF THE DIFFERENCES PRODUCED BY α-ENHANCEMENT ON A PER LIBRARY BASIS
The figures in this appendix illustrate the predicted differences between the [α/Fe]=0 and [α/Fe]=+0.4 stellar spectral templates within each spectral library. Note therefore that these difference values are not in reference to the CKC14 library as in Appendix A, but, for example, the difference between the α-enhanced AP predictions and the non-α-enhanced AP predictions and so on. Fig B1 shows the differences in SDSS colour-colour space ∆ Cα = C Library,α/Fe=0.0 − C Library,α/Fe=0.4 . This means that positive (negative) values indicate that the α-enhanced spectra are bluer (redder) than the equivalent Solar composition spectra. All panels use a colour scale extending to ±0.08 mag, which covers most of the variation seen, and the known result that α-enhanced spectra appear bluer at blue wavelengths and late times is readily apparent. Fig B2 shows the differences in mass-to-light ratio in the g band, M/Lg (first column) and D 4000 (fourth column) in terms of percentage difference: The differences in the MgFe50 and H β line indices, X Library,α/Fe=0.4 − X Library,α/Fe=0.0 (second and third columns), are given in units of Å. The range covered by the colour scale is indicated at the top of each column. Note that the ranges spanned by these colour scales differ from those in Fig A1. ii Byrne et al. Figure A1. Differences between the predictions of the CKC14 spectral library and each other library for colours u − g (first column), g − r (second column), r − i (third column) and i − z (fourth column). Data is plotted as a function of logarithmic age (x-axis) and metallicity (y-axis, labels correspond to Z × 100) for a simple stellar population synthesised using each of the spectral template libraries (one row per library, as indicated by the titles on each panel). Differences are stated relative to the CKC14 spectral library. The values of ∆(i − z) for the Coelho and sMILES libraries (fourth column, bottom two panels) show values far exceeding the colour scale due to the wavelength range of thez filter exceeding the maximum wavelength of these spectra.
Uncertainties in Population Synthesis: Spectra iii Figure A2. Differences between the predictions of the CKC14 spectral library and each other library for M/Lg (first column), the MgFe50 line index (second column), H β line index (third column) and D 4000 (fourth column). Differences are calculated as outlined in the text, and refer to a simple stellar population between the ages of 1 and 16 Gyr. Each spectral library occupies a single row, as indicated by the titles on each panel. Figure B1. Differences between the predictions of [α/Fe]=0.0 and [α/Fe]=+0.4 spectra for colours u − g (first column), g − r (second column), r − i (third column) and i − z (fourth column) for each spectral template library (indicated by panel titles). Data is plotted as a function of logarithmic age (x-axis) and metallicity (y-axis, labels correspond to Z × 100) for a simple stellar population synthesised using each of the spectral template libraries (one row per library). Differences are stated relative to the [α/Fe]=0.0 case, as outlined in the text. Figure B2. Differences between the predictions of [α/Fe]=0.0 and [α/Fe]=+0.4 spectra for M/Lg (first column), the MgFe50 line index (second column), H β line index (third column) and D 4000 (fourth column). Differences are calculated as outlined in the text, and refer to a simple stellar population between the ages of 1 and 16 Gyr. Each spectral library occupies a single row, as indicated by the titles on each panel.